PCA RDL Verification

Message
Author
vvagr
Posts: 282
Joined: Mon Feb 27, 2012 11:01 pm
Location: Moscow, Russia
Contact:

PCA RDL Verification

#1 Post by vvagr »

In this topic I'll collect the results of verification tests performed on the PCA reference data library (RDL). The .15926 Editor code snippets used to run the tests will be published as well.

Root classes of individual

#2 Post by vvagr »

Reference data items in the PCA RDL are supposed to be organised in a single taxonomy with "ISO 15926-4 THING" as its root.

A subset of the reference data is selected for further analysis:

- only classes of individual (no classes of classes);
- except classes of EXPRESS information representation.

The following code shows all classes in this subontology that are not subclasses of any class and are therefore not connected to "ISO 15926-4 THING" by any chain of specializations. Some of them have their own subtaxonomies.

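Code:

# Classes of individual, including all subtypes
cois = find(type=part2.any.ClassOfIndividual)
# Classes of EXPRESS information representation, to be excluded
expr = find(type=part2.any.ClassOfInformationRepresentation)
subont = cois - expr
# Every class that appears as the subclass in some Specialization
nonroot = find(type=part2.Specialization, hasSubclass=out)
# Classes of the subontology that are never a subclass of anything
roots = subont - nonroot
show(id=roots)

There were 823 such classes at the time of this publication. The list of URIs can be downloaded here:
roots.rar (2.38 KiB)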
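
For readers without the Editor at hand, the same set arithmetic can be sketched in plain Python. This is only an illustration: `find_roots` is a hypothetical helper, not part of the .15926 Editor API, and the class names and specialization pairs are made-up toy data rather than RDL content.

```python
def find_roots(classes, specializations):
    """Return classes that never appear as the subclass of a specialization.

    classes: set of class identifiers in the subontology.
    specializations: iterable of (subclass, superclass) pairs.
    """
    nonroot = {sub for sub, _sup in specializations}
    return classes - nonroot


# Toy data (hypothetical, not from the PCA RDL).
classes = {"THING", "PUMP", "CENTRIFUGAL PUMP", "ORPHAN", "ORPHAN CHILD"}
specializations = [
    ("PUMP", "THING"),
    ("CENTRIFUGAL PUMP", "PUMP"),
    ("ORPHAN CHILD", "ORPHAN"),
]

roots = find_roots(classes, specializations)
print(sorted(roots))
```

In the toy data "ORPHAN" heads its own subtaxonomy, which is the situation the test above is meant to expose.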

Cycle identification

#3 Post by vvagr »

The reference data library was checked for specialization cycles. The following code traces specialization chains down from "ISO 15926-4 THING" until no new classes are found. The last iteration of this search yields a set of classes suspected of cycle membership (this set may also contain classes reachable by more than one specialization chain). Each member of this set is then checked to see whether it is a subclass of itself.

Code:

# URI of "ISO 15926-4 THING"
root = 'http://posccaesar.org/rdl/RDS398732751'

frontier = set([root])
seen = set()
cycles = set()

# Walk down the taxonomy one specialization level at a time
while True:
   frontier = find(type=part2.Specialization, hasSubclass=out, hasSuperclass=frontier)
   # Stop when a level contains no class that has not been seen before
   if not frontier - seen:
      break
   seen |= frontier

# All classes of the last level were reached earlier: these are the cycle suspects
found = frontier
show(id=found)

# For each suspect, walk down from it and check whether it reaches itself
for item in found:
   frontier = set([item])
   seen = set()
   while True:
      frontier = find(type=part2.Specialization, hasSubclass=out, hasSuperclass=frontier)
      if item in frontier:
         cycles.add(item)
      if not frontier - seen:
         break
      seen |= frontier

show(id=cycles)

Three classes are identified:

"CHP 406.4 X 9.53 ASME B36.19M" http://posccaesar.org/rdl/RDS22601516315, which is declared a subclass of itself by the specialization relationship http://posccaesar.org/rdl/RDS2260812411

"SFT CLASS 200" http://posccaesar.org/rdl/RDS978526101 and "WATER" http://posccaesar.org/rdl/RDS1012769, which are each declared a subclass of the other.

This test doesn't cover taxonomies that are not connected to "ISO 15926-4 THING" (see the root classes test described above). It should be run again after the taxonomy structure is unified.
Last edited by vvagr on Fri Apr 26, 2013 12:02 am, edited 3 times in total.
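
The second stage of the check (is a class reachable from itself?) can be sketched outside the Editor on a plain list of specialization pairs. This is a minimal illustration under assumptions: `classes_on_cycles` is a hypothetical helper, and the class names are toy data that only mimic the two kinds of cycle reported above (a self-loop and a mutual pair).

```python
from collections import defaultdict


def classes_on_cycles(specializations):
    """Return every class that is reachable from itself via subclass links.

    specializations: iterable of (subclass, superclass) pairs.
    """
    subclasses = defaultdict(set)
    for sub, sup in specializations:
        subclasses[sup].add(sub)

    on_cycle = set()
    # A class on a cycle is always the superclass of something,
    # so the keys of the mapping are the only candidates.
    for start in list(subclasses):
        frontier = {start}
        seen = set()
        while frontier:
            # One step down the taxonomy, dropping classes already visited.
            step = set()
            for cls in frontier:
                step |= subclasses.get(cls, set())
            frontier = step - seen
            if start in frontier:
                on_cycle.add(start)
                break
            seen |= frontier
    return on_cycle


# Toy data (hypothetical): A is its own subclass, B and C specialize each other.
specializations = [
    ("A", "A"),
    ("B", "C"),
    ("C", "B"),
    ("A", "THING"),
    ("B", "THING"),
    ("D", "THING"),
]

cycle_members = classes_on_cycles(specializations)
print(sorted(cycle_members))
```

Unlike the Editor script, this sketch checks every class rather than only the suspects from the first pass, so it does not depend on the classes being connected to a particular root.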

Quantification of properties

#4 Post by vvagr »

The first query shows all instances of Property:

Code:

show("all_prop", type=part2.Property)

The second query shows all properly quantified instances of Property:

Code:

show("quant_prop", type=part2.PropertyQuantification, hasInput=out, hasResult=find(type=part2.RealNumber))

The third query shows quantified instances of Property that have a Scale properly defined:

Code:

show("scaled_prop", hasInput=out, id=find(type=part2.Classification, hasClassifier=find(type=part2.Scale), hasClassified=out==check(type=part2.PropertyQuantification, hasResult=find(type=part2.RealNumber))))

The query:

Code:

show('non_quant', id=all_prop - quant_prop)

returns all instances of Property that are not properly quantified, that is, not related to numbers. The list of 32 URIs can be downloaded below:
non_quant.rar (265 Bytes)

The query:

Code:

show('non_scaled', id=quant_prop - scaled_prop)

returns instances of Property that are quantified but for which the Scale of the quantification is missing. There are only 4:
Last edited by vvagr on Thu Apr 25, 2013 11:41 pm, edited 3 times in total.
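
The relation between the five result sets can be illustrated with ordinary Python sets. The identifiers below are invented for the example and the dictionaries are a simplifying assumption about how the relationships might be held in memory; only the two set differences at the end correspond to the queries above.

```python
# Toy data (hypothetical identifiers, not from the PCA RDL).
all_prop = {"P1", "P2", "P3", "P4"}

# PropertyQuantification relationships whose result is a real number:
# quantification id -> the Property it takes as input.
quantifications = {"Q1": "P1", "Q2": "P2", "Q3": "P3"}

# Quantifications that are classified by some Scale.
scaled_quantifications = {"Q1", "Q2"}

quant_prop = set(quantifications.values())
scaled_prop = {
    prop
    for quant, prop in quantifications.items()
    if quant in scaled_quantifications
}

non_quant = all_prop - quant_prop      # properties never related to a number
non_scaled = quant_prop - scaled_prop  # quantified, but without a Scale

print(sorted(non_quant), sorted(non_scaled))
```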

Specialization errors

#5 Post by vvagr »

This query shows specialization relationships where classes of classes of individuals are specialized from classes of relationships:

Code:

# Subclass is a class of class of individual, superclass is a class of relationship
show(type=part2.Specialization, hasSubclass=find(type=part2.any.ClassOfClassOfIndividual), hasSuperclass=find(type=part2.any.ClassOfRelationship))

There are 16 such relationships.

This query shows specialization relationships where classes of relationships are specialized from classes of classes of individuals:

Code:

# Subclass is a class of relationship, superclass is a class of class of individual
show(type=part2.Specialization, hasSubclass=find(type=part2.any.ClassOfRelationship), hasSuperclass=find(type=part2.any.ClassOfClassOfIndividual))

There are 8 such relationships.

The list of 24 URIs can be downloaded below:
wrongspec.rar (237 Bytes)
Last edited by vvagr on Thu Apr 25, 2013 11:39 pm, edited 2 times in total.
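
Both queries look for the same kind of mismatch in opposite directions, so they can be combined into one pass. The sketch below assumes the specializations are available as plain (subclass, superclass) pairs and that each class has a single known kind; `cross_kind_specializations` and the identifiers are hypothetical.

```python
KINDS = {"ClassOfClassOfIndividual", "ClassOfRelationship"}


def cross_kind_specializations(specializations, kind_of):
    """Return pairs where one end is a class of class of individual
    and the other end is a class of relationship, in either direction."""
    suspicious = []
    for sub, sup in specializations:
        if {kind_of.get(sub), kind_of.get(sup)} == KINDS:
            suspicious.append((sub, sup))
    return suspicious


# Toy data (hypothetical identifiers).
kind_of = {
    "A": "ClassOfClassOfIndividual",
    "B": "ClassOfRelationship",
    "C": "ClassOfRelationship",
    "D": "ClassOfClassOfIndividual",
}
specializations = [("A", "B"), ("C", "B"), ("B", "D"), ("A", "D")]

wrong = cross_kind_specializations(specializations, kind_of)
print(wrong)
```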

Missing definitions

#6 Post by vvagr »

The following code looks for all entities without a hasDefinition property, except instances of ClassOfInformationRepresentation (including subtypes), Relationship (including subtypes), ClassOfRelationship (including subtypes), ClassOfClassOfRelationship (including subtypes), Property, ArithmeticNumber (including subtypes) and PropertyRange.

Code:

# Everything in the RDL
alls = find(type=part2.any.Thing)
# Types excluded from the check (part2.any.X includes subtypes of X)
expr = find(type=part2.any.ClassOfInformationRepresentation)
rels = find(type=part2.any.Relationship)
corels = find(type=part2.any.ClassOfRelationship)
cocorels = find(type=part2.any.ClassOfClassOfRelationship)
props = find(type=part2.Property)
numbs = find(type=part2.any.ArithmeticNumber)
ranges = find(type=part2.PropertyRange)
subont = alls - expr - rels - corels - cocorels - props - numbs - ranges
# Entities of the remaining subontology that have no hasDefinition value
show('nodef', id=subont, hasDefinition=void)

There are 1358 such entities. The list of URIs can be downloaded below:
nodefs.rar (4.13 KiB)
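
The filtering logic can be sketched in plain Python as follows. The record layout, the identifiers and the `entities_without_definition` helper are assumptions made for the example. Unlike the Editor query, the sketch matches excluded types by exact name without expanding subtypes, and it treats an empty definition string as missing.

```python
def entities_without_definition(entities, excluded_types):
    """Return identifiers of entities that lack a definition.

    entities: dict mapping identifier -> record with a "type" key and an
              optional "definition" key.
    excluded_types: type names that are skipped entirely.
    """
    return {
        uri
        for uri, record in entities.items()
        if record["type"] not in excluded_types
        and not record.get("definition")  # missing or empty counts as undefined
    }


# Toy data (hypothetical identifiers and definitions).
entities = {
    "ex:1": {"type": "ClassOfInanimatePhysicalObject", "definition": "A pump."},
    "ex:2": {"type": "ClassOfInanimatePhysicalObject"},
    "ex:3": {"type": "Property"},
    "ex:4": {"type": "ClassOfActivity", "definition": ""},
}
excluded = {"Property", "PropertyRange"}

nodef = entities_without_definition(entities, excluded)
print(sorted(nodef))
```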

Classification mistakes

#7 Post by vvagr »

This is the most complex test, and the one that really uses the Part 2 data model.

Part 2 contains many restrictions on class membership. Unfortunately, they are mostly expressed in natural language, and only some of them have found their way into the formal OWL model, as described at https://www.posccaesar.org/wiki/ISO1592 ... Membership . The available formal restrictions are imported into the data model and can be accessed in the .15926 Editor environment, which makes it possible to generate tests for correct class membership automatically.

Unfortunately, class membership restrictions are clearly missing for certain Part 2 types, and we had to exclude instances of these types from testing. For example, it is not formally represented that instances of Class can classify anything, that instances of EnumeratedSetOfClass can classify instances of Class, or that instances of Scale can classify instances of PropertyQuantification. Likewise, it is not formally defined that instances of PropertyRange or SinglePropertyDimension can classify instances of Property.

Types of classifiers excluded from testing are:
• Class
• EnumeratedSetOfClass
• PropertyRange
• SinglePropertyDimension
• Scale
• DocumentDefinition
• RepresentationForm


The following code cycles through all classifier classes in the RDL except instances of the types listed above.

For each classifier class, it takes the type of the class and finds the set of all types allowed for its members, using the internal representation of Part 2 membership restrictions encoded in the Editor. It then takes the set of types of the actual members of the classifier class and checks whether it is a subset of the allowed types. If it is not, there is a classification or typing mistake.

Code:

import iso15926.kb as kb
# graphlib here is presumably the module shipped with .15926 Editor;
# the standard-library module of the same name has no URI helpers
from graphlib import compact_uri, expand_uri, curi_head, curi_tail

def GetClassified(uri):
   # URIs of the types that instances of the given Part 2 type may classify
   result = set()
   curi = compact_uri(uri)
   name = curi_tail(curi)
   entry = kb.part2_itself["part2:" + name]
   classified = entry.get("classified")
   if classified:
      for v in classified:
         entry = kb.part2_itself[v]
         result.add(expand_uri(curi_head(curi)) + entry['name'])
   return result

def GetSubtypes(uriset):
   # URIs of all subtypes of the given Part 2 types
   result = set()
   for uri in uriset:
      name = uri[max(uri.rfind('#'), uri.rfind('/'))+1:]
      uris = getattr(part2.any, name).uri
      # Skip types whose .uri is a single value rather than a set
      if isinstance(uris, set):
         result |= uris - set([getattr(part2, name).uri])
   return result

# All classes used as classifiers, and those whose type is excluded from testing
all_classifiers = find(type=part2.Classification, hasClassifier=out)
cls = find(type=part2.Classification, hasClassifier=out==find(type=part2.Class))
esocs = find(type=part2.Classification, hasClassifier=out==find(type=part2.EnumeratedSetOfClass))
propr = find(type=part2.Classification, hasClassifier=out==find(type=part2.PropertyRange))
spds = find(type=part2.Classification, hasClassifier=out==find(type=part2.SinglePropertyDimension))
scls = find(type=part2.Classification, hasClassifier=out==find(type=part2.Scale))
docds = find(type=part2.Classification, hasClassifier=out==find(type=part2.DocumentDefinition))
rfrms = find(type=part2.Classification, hasClassifier=out==find(type=part2.RepresentationForm))

classifiers = all_classifiers - esocs - propr - scls - cls - spds - docds - rfrms
show(id=classifiers)

mist = set()
counter = 0
for classifier in classifiers:
   counter += 1
   print(counter)  # progress indicator
   classifier_type = find(id=classifier, type=out).pop()
   # Types allowed for members: the restricted types plus all their subtypes
   classified_type_must = GetClassified(classifier_type)
   classified_types_must = GetSubtypes(classified_type_must)
   classified_types_must |= classified_type_must
   # Types of the actual members of this classifier
   classified_types = find(id=find(type=part2.Classification, hasClassifier=classifier, hasClassified=out), type=out)
   # Any member type outside the allowed set marks the classifier as suspicious
   if (classified_types & classified_types_must) != classified_types:
      mist.add(classifier)
show(id=mist)

There were 230 suspicious classifiers with inappropriately typed members at the time of this publication. The list of URIs can be downloaded below:
badclassif.rar (959 Bytes)
Last edited by vvagr on Mon Feb 17, 2014 2:05 pm, edited 1 time in total.
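
The core of the test is a subset check between the types of the actual members and the types allowed by the restriction. The sketch below shows that check in isolation, outside the Editor. All names in it are assumptions: `find_suspicious_classifiers` is a hypothetical helper, the dictionaries stand in for data the Editor obtains through `find` and `kb.part2_itself`, and the single restriction used is toy data, not a statement about the formal Part 2 model.

```python
def find_suspicious_classifiers(classifier_type, allowed_member_types,
                                memberships, entity_type):
    """Return classifiers that have at least one member of a disallowed type.

    classifier_type: classifier id -> its type name.
    allowed_member_types: type name -> set of type names its members may have
        (assumed to be already expanded with subtypes).
    memberships: classifier id -> set of member ids.
    entity_type: member id -> its type name.
    """
    suspicious = set()
    for classifier, members in memberships.items():
        allowed = allowed_member_types.get(classifier_type[classifier])
        if allowed is None:
            # No formal restriction known for this type: excluded from testing.
            continue
        actual = {entity_type[m] for m in members}
        if not actual <= allowed:
            suspicious.add(classifier)
    return suspicious


# Toy data (hypothetical identifiers and a toy restriction).
classifier_type = {"C1": "ClassOfPerson", "C2": "ClassOfPerson", "C3": "Class"}
allowed_member_types = {"ClassOfPerson": {"Person"}}
entity_type = {"m1": "Person", "m2": "Person", "m3": "Organization", "m4": "Person"}
memberships = {"C1": {"m1", "m2"}, "C2": {"m3", "m4"}, "C3": {"m3"}}

mistakes = find_suspicious_classifiers(
    classifier_type, allowed_member_types, memberships, entity_type
)
print(sorted(mistakes))
```

"C3" is skipped rather than flagged because its type has no restriction in the toy table, mirroring the exclusion of types such as Class from the test.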

Re: PCA RDL Verification

#8 Post by vvagr »

Part of our RDL verification work was done during the Ontology Summit 2013 Hackathon event. We also calculated some ontology quality metrics for the RDL. If you are interested in this part, the full report on the Hackathon event is published on the Ontology Summit mailing list; please navigate to http://ontolog.cim3.net/forum/ontology- ... 00038.html to see the archived posting. All links to the metric methodology are there.

Re: PCA RDL Verification

#9 Post by vvagr »

If you have problems unpacking the .rar files, here are all URI lists from the verification tests in one ZIP archive. File names correspond to the .rar archive names posted above.
verif_res.zip (10.8 KiB)

Re: PCA RDL Verification

#10 Post by vvagr »

The following SPARQL query searches the PCA RDL for RDS entities that have R-numbers assigned via the rdsWipEquivalent predicate but have no other data about them at the endpoint (no type, no label, nothing).

Code:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX RDL: <http://posccaesar.org/rdl/>

SELECT ?y
WHERE {
  GRAPH <http://irm.dnv.com/ontologies/iring.map> {
    ?x RDL:rdsWipEquivalent ?y .
  }
  OPTIONAL { ?y ?p ?z } .
  FILTER (!bound(?z))
}

There are 13403 such entities. See the URIs in the attached file.
VoidWithRNumb.zip (59.98 KiB)
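
The OPTIONAL plus FILTER (!bound(...)) pattern selects values that never occur as the subject of any triple. The same idea can be sketched in plain Python over two lists of triples. The helper name, the split into "mapping" and "data" triples and all identifiers are assumptions made for the illustration; no real endpoint is queried.

```python
WIP_EQUIVALENT = "RDL:rdsWipEquivalent"


def undescribed_equivalents(mapping_triples, data_triples):
    """Return objects of rdsWipEquivalent that are never a subject in the data.

    Both arguments are iterables of (subject, predicate, object) tuples.
    """
    targets = {o for _s, p, o in mapping_triples if p == WIP_EQUIVALENT}
    described = {s for s, _p, _o in data_triples}
    return targets - described


# Toy data (hypothetical identifiers).
mapping_triples = [
    ("ex:x1", WIP_EQUIVALENT, "ex:y1"),
    ("ex:x2", WIP_EQUIVALENT, "ex:y2"),
    ("ex:x3", "ex:other", "ex:y3"),
]
data_triples = [
    ("ex:y1", "rdf:type", "ex:SomeClass"),
]

void_entities = undescribed_equivalents(mapping_triples, data_triples)
print(sorted(void_entities))
```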
