Mapping Procedure

latest update: 2022-03-10   

Introduction

This topic describes a mapping procedure: the steps to take, and the rationale for each, when mapping any data to ISO 15926-7 templates in ISO 15926-8 Turtle format.

As an example, a small table, AREA, from the CFIHOS data model will be used:



In general any mapping comes in three parts:
  1. Declarations
  2. Interrelationships
  3. Attributes (Information)
This can be illustrated as follows:


In the image of the AREA table above we find these aspects:
  1. We declare the given Area,
  2. relate it to the given Plant,
  3. type it as PLANT AREA,
  4. and give it the given Name.

Procedure

Step # 1 - Collect the table representation with the exact names for the data elements
Step # 2 - Collect discipline-written definitions of the data elements
Step # 3a - Define the baseOOI - Object Of Interest about which the data element is
Step # 3b - Define other OOIs that are interrelated to the baseOOI
Step # 4 - Write a Narrative in '15926-ish' that defines the semantics of the data element
Step # 5 - Choose the applicable template from the list of Template Specifications based on that Narrative
Step # 6 - List which data values have to be fetched to declare the OOI(s) and the selected template type
Step # 7 - Declare all required OOIs, if not already done (this is handled by the ISO 15926 Mapper tool)
                   Using an ETL (Extract-Transform-Load) tool enter these values in the ISO 15926 Mapper tool

Step # 1 - Collect the table representation with the exact names for the data elements

The representation, in whatever form, is required for an audit trail, not least because table layouts don't remain the same forever.
Obviously those exact names are required for proper addressing by the ETL tool.
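
One way to keep such a representation for the audit trail is a small, immutable record of the table layout. The sketch below is only an illustration: the class names SourceColumn and SourceTable, the foreign-key flag and the capture date are assumptions, not part of CFIHOS or of any ISO 15926 Mapper tool. The column names are the ones used in this topic.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SourceColumn:
    name: str                     # exact name as it appears in the source table
    is_foreign_key: bool = False  # marked "(FK)" in the table representation


@dataclass(frozen=True)
class SourceTable:
    name: str
    columns: tuple
    captured_on: date             # when this layout was recorded (audit trail)

    def column_names(self):
        """Exact column names, in table order, as the ETL tool must address them."""
        return [column.name for column in self.columns]


AREA_TABLE = SourceTable(
    name="AREA",
    columns=(
        SourceColumn("Plant code", is_foreign_key=True),
        SourceColumn("Area code"),
        SourceColumn("Area name"),
    ),
    captured_on=date(2022, 3, 10),  # assumed capture date, for illustration only
)

print(AREA_TABLE.column_names())  # → ['Plant code', 'Area code', 'Area name']
```

Freezing the records means that a later change in the table layout results in a new record rather than a silent overwrite of the old one.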

EXAMPLE
 

Step # 2 - Collect discipline-written definitions of the data elements

These are definitions written by the discipline involved. They could be taken from the data dictionary (if one exists at all and if it contains semantically meaningful definitions).

Having no precise definitions at the source guarantees mismatches. Proper definitions are essential for proper documentation of the mapping process.

EXAMPLE
AREA
    A geographical surface occupied by a Plant. A plant area is created to identify the physical location of an equipment within a plant.
Plant code (FK)
    The Plant the Area is a part of
Area code
    A code that uniquely identifies the Area within the Plant
Area name
    A name describing the location of an Area within the Plant
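
A simple completeness check can flag data elements that still lack a definition before the mapping continues. This is a minimal sketch; the function name missing_definitions and the dictionary layout are assumptions made for illustration, and the definitions are the ones listed above.

```python
def missing_definitions(column_names, definitions):
    """Return the data elements that have no (non-empty) definition."""
    return [
        name
        for name in column_names
        if not definitions.get(name, "").strip()
    ]


AREA_DEFINITIONS = {
    "Plant code": "The Plant the Area is a part of",
    "Area code": "A code that uniquely identifies the Area within the Plant",
    "Area name": "A name describing the location of an Area within the Plant",
}

columns = ["Plant code", "Area code", "Area name"]
print(missing_definitions(columns, AREA_DEFINITIONS))  # → []
```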

Step # 3a - Define the baseOOI - Object Of Interest about which the data element is

Information is useless if it is not known what it is about. Determine what that baseOOI is in the source data.

EXAMPLE
In the above image of the AREA table the OOI is in the heading and isn't a foreign key (FK). Actually it is the thing with the 'Area code'.

See further Step # 7 below.

Step # 3b - Define other OOIs that are interrelated to the base OOI

Often such OOIs have already been declared, but in certain cases it may lead to a discovery of things missing.

EXAMPLE
In the above image of the AREA table such an OOI is anything that is a foreign key (FK). Here it is the thing with the 'Plant code (FK)'.

For the rest it follows the rules of Step # 7 below.
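
The rule of thumb from Steps 3a and 3b (the table heading names the baseOOI, each foreign key points to a related OOI) can be sketched as follows. The function name find_oois and the shape of its result are assumptions for illustration; real source tables may need a more careful analysis than this.

```python
def find_oois(table_name, columns):
    """Split a table into its baseOOI and the keys of related OOIs.

    columns is a list of (column_name, is_foreign_key) pairs.
    """
    related_keys = [name for name, is_fk in columns if is_fk]
    own_columns = [name for name, is_fk in columns if not is_fk]
    return {
        "base_ooi": table_name,            # the thing named in the table heading
        "own_columns": own_columns,        # data elements about the baseOOI itself
        "related_ooi_keys": related_keys,  # each foreign key points to another OOI
    }


area_columns = [("Plant code", True), ("Area code", False), ("Area name", False)]
result = find_oois("AREA", area_columns)
print(result["base_ooi"])          # → AREA
print(result["related_ooi_keys"])  # → ['Plant code']
```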

Step # 4 - Write a Narrative in '15926-ish' that defines the semantics of the data element

Based on the definition in Step # 2 above, the person doing the mapping creates a definition using the terminology of ISO 15926.

EXAMPLE

DECLARATION OF: AREA
    Declare AREA, which is a member of dm:SpatialLocation, dm:WholeLifeIndividual, dm:ActualIndividual, rdl:RDS7151497, where the latter is the 'EssentialType', with its 'Area code' as its rdfs:label
INTERRELATIONSHIP: AREA and PLANT
    AREA is relatively located 'INSIDE' the PLANT
TYPE OF AREA(space): AREA
    The exact type of AREA(space) is PLANT AREA
IDENTIFICATION: AREA
    AREA is described with an 'Area name' in English

When deemed necessary additional rationale can be added.

Step # 5 - Choose the applicable template from the list of Template Specifications based on that Narrative

This is the difficult step. With the definitions at hand, a selection has to be made from a list of some 200 template types.
The number of candidates is reduced because there are separate templates for Classes and for Individuals. In Step # 2 that has been taken care of.
The templates come in sets per subject:

TEMPLATE SUBJECTS
for CLASS           | for INDIVIDUAL      | EXPLANATION
ACTIVITY            | ACTIVITY            | participation, involvement-by-reference, recognizing, IDEF0 like but more possibilities
CLASSIFICATION      | CLASSIFICATION      | a Class is a member of a ClassOfClass, an Individual is a member of a Class
DEFINITION          | -                   | only Classes can be defined, Individuals are what they are
DESCRIPTION         | DESCRIPTION         | describe anything, but a description of a Class does not define it
DOCUMENT            | DOCUMENT            | NOTE - "document" is a Class, often with many member Individuals (e.g. the newspaper copy in your mail box)
EXISTENCE           | EXISTENCE           | an Activity or Event can cause something to begin or end its existence
FUNCTION            | FUNCTION            | define or apply a function
IDENTIFICATION      | IDENTIFICATION      | give an identifier of some kind
LOCATION            | LOCATION            | either relative or in coordinate system
MATERIAL            | MATERIAL            | solids and fluids
NUMBER              | -                   | for the initiated
PROPERTY-CLASSOF    | -                   | about property ranges
PROPERTY-INDIRECT   | PROPERTY-INDIRECT   | a property that cannot be measured but is calculated, e.g. max, min, average, rated, design
PROPERTY and STATUS | PROPERTY and STATUS | a property that can be measured
RELATIONSHIP-OTHER  | RELATIONSHIP-OTHER  | relationships that are not covered by ISO 15926-2, e.g. IS MANUFACTURED BY
SET OPERATIONS      | -                   | e.g. union, intersection, disjointness, enumeration
SHAPE               | SHAPE               | cylinder, sphere, cube, etc
SHAPE DIMENSION     | -                   | shape with dimensions
SPECIALIZATION      | -                   | = subClassOf ; members of a subclass are also members of their superclass
STREAM              | STREAM              | fluid streams, information streams, etc
STRUCTURE           | STRUCTURE           | e.g. composition, assembly, arrangement, connection, feature

In cases where there is a choice, compare the semantics, in particular the examples given in the Template Specification.
It is also important to check whether all data required by a particular template are available at all, or can be made available.
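
The availability check can be sketched as a lookup of the roles a template requires against the values that can be fetched. The role names below are the ones that appear in the GENERIC CODE under Step # 7 of this topic; the dictionary TEMPLATE_ROLES and the function missing_roles are assumptions for illustration and cover only two of the templates.

```python
# Roles per template, as used in the GENERIC CODE of this topic.
# meta:valEffectiveDate is required by every template and is left out here.
TEMPLATE_ROLES = {
    "RelativeLocationOfIndividual": (
        "hasLocated",
        "hasLocator",
        "hasRelativeLocationType",
    ),
    "ClassificationOfIndividual": ("hasClassified", "hasClassifier"),
}


def missing_roles(template_name, available_values):
    """List the roles of the template for which no value is available."""
    return [
        role
        for role in TEMPLATE_ROLES[template_name]
        if not available_values.get(role)
    ]


available = {"hasLocated": "23-01", "hasLocator": "6000"}
print(missing_roles("RelativeLocationOfIndividual", available))
# → ['hasRelativeLocationType']
```

A non-empty result means that either another template has to be chosen or the missing data have to be made available first.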

Step # 6 - List which data values have to be fetched to declare the OOI(s) and the selected template types

These are Literals that are fetched from the source data elements by means of the ETL tool and entered in what is called the "Template signature".
It is also possible to enter the label of an RDL (Reference Data Library) item; its rdl:RDS number is then fixed and applies to all instances resulting from the actual mapping. This is in fact a case of specialization of a Template.
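
Fixing an RDL item for all instances of a mapping can be pictured as partially applying a template signature. The sketch below uses functools.partial from the Python standard library; the function relative_location_signature and its dictionary result are assumptions for illustration, not the signature format of any particular Mapper tool.

```python
from functools import partial


def relative_location_signature(has_located, has_locator, has_relative_location_type):
    """Collect the values for a RelativeLocationOfIndividual template signature."""
    return {
        "template": "RelativeLocationOfIndividual",
        "hasLocated": has_located,
        "hasLocator": has_locator,
        "hasRelativeLocationType": has_relative_location_type,
    }


# Specialization: the relative location type is fixed to rdl:RDS2229920 ('INSIDE'),
# so only the Area code and the Plant code remain to be fetched per source row.
area_inside_plant = partial(
    relative_location_signature,
    has_relative_location_type="rdl:RDS2229920",
)

signature = area_inside_plant(has_located="23-01", has_locator="6000")
print(signature["hasRelativeLocationType"])  # → rdl:RDS2229920
```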

EXAMPLE
AREA
    • 'Area code'
    AREA(space), that is the 'EssentialType' (see note below)
AREA and PLANT
    • 'Area code'
    • 'Plant code' from the applicable earlier declared PLANT
    rdl:RDS2229920 ('INSIDE')
AREA TYPING
    • 'Area code'
    rdl:RDS282689 (PLANT AREA)
AREA IDENTIFICATION
    • AREA-UUID
    • 'Area name'
    rdl:RDS2227020 (IDENTIFICATION BY PLANT AREA NAME)

NOTE - Generically, the declaration of the AREA states that it is a member of rdl:RDS7151497 (AREA(space)), because that is what it is and will always be. Its classification as a PLANT AREA may change later, but it will always remain an AREA(space). PLANT AREA can be further specialized to, for example, UTILITY AREA.
Any such specialization from AREA(space) is to be done with the template ClassificationOfIndividual.

Step # 7 - Using an ETL (Extract-Transform-Load) tool enter these values in the ISO 15926 Mapper tool

Declaration

The OOI is to be declared in terms of the ISO 15926-2 Upper Ontology and the 'EssentialType' of which it is a member (for Individual) or a subclass (for Class). Essential classes are instances of ClassOfFunctionalObject that can be found here. When no such ClassOfFunctionalObject exists use the highest applicable in the taxonomy. See here for further details.

It is important to determine whether it is an Individual or a Class. See here for an explanation. Pump model MN-948s is a Class; a member of that pump model class is an Individual.

When it so happens that the OOI has already been declared earlier, any ISO 15926 Mapper tool shall take care of avoiding a duplicate.
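
How a Mapper tool avoids duplicates is not specified in this topic. One possible approach, sketched here purely as an assumption, is a registry that hands out a UUID the first time a key is seen and returns the same UUID afterwards. The class DeclarationRegistry and the choice of keys are hypothetical.

```python
import uuid


class DeclarationRegistry:
    """Remembers which OOIs have been declared, keyed by a caller-chosen tuple."""

    def __init__(self):
        self._uuid_by_key = {}

    def declare(self, *key):
        """Return (uuid, is_new); the UUID is reused if the key was seen before."""
        if key in self._uuid_by_key:
            return self._uuid_by_key[key], False
        new_id = str(uuid.uuid4())
        self._uuid_by_key[key] = new_id
        return new_id, True


registry = DeclarationRegistry()
plant_id, plant_is_new = registry.declare("PLANT", "6000")
same_id, is_new_again = registry.declare("PLANT", "6000")

# The Area code is only unique within the Plant, so the Plant code is part of the key.
area_id, _ = registry.declare("AREA", "6000", "23-01")

print(plant_id == same_id)  # → True
print(is_new_again)         # → False
```

Only when is_new is True does a declaration need to be generated; otherwise the returned UUID is simply referenced.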

ActualIndividual vs NonActualIndividual

All objects in the real world around us are instances of ActualIndividual; all objects that exist in the design world, for example pump P-101 on a P&ID, are instances of NonActualIndividual. There is only one ('Implementation') relationship between P-101 and an actual pump performing the service of P-101. See also here.

NOTE - A plant area, as used as an example in this topic, is either actual or non-actual. Both can, and normally will, exist. Don't try to relate P-101 to an actual plant or plant area, because that is semantic nonsense.

Templates

The selected template has been completely standardized, so when the Literals are imported in the ISO 15926 Mapper tool the code is generated.

EXAMPLE


AREA - GENERIC CODE
:AREA-UUID
      rdf:type dm:SpatialLocation, dm:WholeLifeIndividual, dm:ActualIndividual, rdl:RDS7151497 ;
      rdfs:label "Area Code" ;
      meta:valEffectiveDate "yyyy-mm-ddThh:mm:ss.sZ"^^xsd:dateTime .
AREA - EXAMPLE INSTANTIATION CODE
:617597b3-cca9-4ec9-8573-41199465c485
      rdf:type dm:SpatialLocation, dm:WholeLifeIndividual, dm:ActualIndividual, rdl:RDS7151497 ;
      rdfs:label "23-01" ;
      meta:valEffectiveDate "2021-09-01T00:00:00Z"^^xsd:dateTime .
PLANT (earlier declared) - GENERIC CODE
:PLANT-UUID
      rdf:type lci:InanimatePhysicalObject, dm:WholeLifeIndividual, dm:ActualIndividual, rdl:RDS7151797 ;
      rdfs:label "Plant Code" ;
      meta:valEffectiveDate "yyyy-mm-ddThh:mm:ss.sZ"^^xsd:dateTime .
PLANT (earlier declared) - EXAMPLE INSTANTIATION CODE
:d5a14b57-6dc5-4b3f-931a-ed22f99e7d8d
      rdf:type lci:InanimatePhysicalObject, dm:WholeLifeIndividual, dm:ActualIndividual, rdl:RDS7151797 ;
      rdfs:label "6000" ;
      meta:valEffectiveDate "2021-08-14T11:46:00Z"^^xsd:dateTime .
AREA and PLANT - GENERIC CODE
:UUID_IN-LOCTN-200
      rdf:type tpl:RelativeLocationOfIndividual ;
      rdfs:label "[EssentialType] individual [hasLocated] is located [hasRelativeLocationType] [EssentialType] individual [hasLocator]"@en ;
      tpl:hasLocated "ID"^^dm:PossibleIndividual ;  # AREA-UUID
      tpl:hasLocator "ID"^^dm:PossibleIndividual ;  # PLANT-UUID
      tpl:hasRelativeLocationType "ID"^^dm:ClassOfRelativeLocation ;
      meta:valEffectiveDate "yyyy-mm-ddThh:mm:ss.sZ"^^xsd:dateTime .
AREA and PLANT - EXAMPLE INSTANTIATION CODE
:1bd53674-23e6-4621-b128-6e3d136780c1
      rdf:type tpl:RelativeLocationOfIndividual ;
      rdfs:label "[AREA(space)] individual [23-01] is located [INSIDE] [PLANT] individual [6000]"@en ;
      tpl:hasLocated :617597b3-cca9-4ec9-8573-41199465c485 ; # 23-01
      tpl:hasLocator :d5a14b57-6dc5-4b3f-931a-ed22f99e7d8d ; # 6000
      tpl:hasRelativeLocationType rdl:RDS2229920 ; # INSIDE
      meta:valEffectiveDate "2021-09-01T00:00:00Z"^^xsd:dateTime .
AREA TYPING - GENERIC CODE
:UUID_IN-CLSIF-100
      rdf:type tpl:ClassificationOfIndividual ;
      rdfs:label "[EssentialType] individual [hasClassified] is classified with [EssentialType] class [hasClassifier]"@en ;
      tpl:hasClassified "ID"^^dm:PossibleIndividual ;
      tpl:hasClassifier "ID"^^dm:ClassOfIndividual ;
      meta:valEffectiveDate "yyyy-mm-ddThh:mm:ss.sZ"^^xsd:dateTime .
AREA TYPING - EXAMPLE INSTANTIATION CODE
:4cef7d77-aa8e-49b8-a962-e607494fd38f
      rdf:type tpl:ClassificationOfIndividual ;
      rdfs:label "[AREA(space)] individual [23-01] is classified with [AREA(space)] class [PLANT AREA]"@en ;
      tpl:hasClassified :617597b3-cca9-4ec9-8573-41199465c485 ; # 23-01
      tpl:hasClassifier rdl:RDS282689 ; # PLANT AREA
      meta:valEffectiveDate "2021-09-01T00:00:00Z"^^xsd:dateTime .
AREA IDENTIFICATION - GENERIC CODE
:UUID_IN-DESCR-100
      rdf:type tpl:ClassifiedDescriptionOfIndividual ;
      rdfs:label "[EssentialType] individual [hasDescribed] is described with a [hasDescriptionType] [valDescriptor]"@en ;
      tpl:hasDescribed "ID"^^dm:PossibleIndividual ;
      tpl:valDescriptor ""^^xsd:string ;
      tpl:hasDescriptionType "ID"^^dm:ClassOfClassOfDescription ;
      meta:valEffectiveDate "yyyy-mm-ddThh:mm:ss.sZ"^^xsd:dateTime .
AREA IDENTIFICATION - EXAMPLE INSTANTIATION CODE
:d9403c9d-a477-4474-aa3b-b9883983fbc8
      rdf:type tpl:ClassifiedDescriptionOfIndividual ;
      rdfs:label "[AREA(space)] individual [23-01] is described with a [IDENTIFICATION BY PLANT AREA NAME] [Plant Area 23 - Pump Area 01]"@en ;
      tpl:hasDescribed :617597b3-cca9-4ec9-8573-41199465c485 ; # 23-01
      tpl:valDescriptor "Plant Area 23 - Pump Area 01" ;
      tpl:hasDescriptionType rdl:RDS2227020 ; # IDENTIFICATION BY PLANT AREA NAME
      meta:valEffectiveDate "2021-09-01T00:00:00Z"^^xsd:dateTime .

NOTES
1) For the creation of ETL scripts it is necessary to specialize the above GENERIC CODE, in order to make it specific for the semantics of the source data, with:
  • tpl:hasRelativeLocationType "ID"^^dm:ClassOfRelativeLocation >>> tpl:hasRelativeLocationType rdl:RDS2229920
  • tpl:hasClassifier "ID"^^dm:ClassOfIndividual >>> tpl:hasClassifier rdl:RDS282689
  • tpl:hasDescriptionType "ID"^^dm:ClassOfClassOfDescription >>> tpl:hasDescriptionType rdl:RDS2227020
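
A specialized template of this kind can be turned into Turtle text with plain string formatting. The sketch below does so for the AREA TYPING case, with the classifier fixed to rdl:RDS282689 (PLANT AREA) as in the example instantiation code above. The function names turtle_string and classification_of_individual are assumptions for illustration; an actual ISO 15926 Mapper tool may generate its code quite differently, and prefix declarations are left out.

```python
def turtle_string(text):
    """Quote a Python string as a Turtle literal, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def classification_of_individual(template_uuid, classified_uuid, classified_code,
                                 effective_date, classifier="rdl:RDS282689",
                                 classifier_label="PLANT AREA",
                                 essential_type="AREA(space)"):
    """Render one ClassificationOfIndividual instance as Turtle text."""
    label = (
        f"[{essential_type}] individual [{classified_code}] is classified "
        f"with [{essential_type}] class [{classifier_label}]"
    )
    lines = [
        f":{template_uuid}",
        "      rdf:type tpl:ClassificationOfIndividual ;",
        f"      rdfs:label {turtle_string(label)}@en ;",
        f"      tpl:hasClassified :{classified_uuid} ; # {classified_code}",
        f"      tpl:hasClassifier {classifier} ; # {classifier_label}",
        f'      meta:valEffectiveDate "{effective_date}"^^xsd:dateTime .',
    ]
    return "\n".join(lines)


turtle = classification_of_individual(
    template_uuid="4cef7d77-aa8e-49b8-a962-e607494fd38f",
    classified_uuid="617597b3-cca9-4ec9-8573-41199465c485",
    classified_code="23-01",
    effective_date="2021-09-01T00:00:00Z",
)
print(turtle)
```

Per source row only the UUIDs, the 'Area code' and the effective date vary; everything else is fixed by the specialization.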

2) The notation "ID"^^dm:PossibleIndividual means that a URI with an identifier of an instance of dm:PossibleIndividual is expected. If anything else is supplied, the template cannot be parsed. Likewise, ""^^xsd:string means that any literal between "" shall be of the type 'string', as defined in the W3C Recommendation XML Schema Part 2: Datatypes Second Edition.

3) All objects get a UUID (Universally Unique Identifier). This is handled by an ISO 15926 Mapper tool.