AI-aided Mapping

latest update 2023-02-02  


Advanced NLP AI technology, in particular techniques for assessing the semantic equivalence of two sentences in a natural language, might help a mapping person find the right mapping between their source data and:

The process to follow is shown in the diagram below:

Explanation (from left to right)

Map source data elements to RDF predicates

The starting point is always one data element, or a set of two or (seldom) more, from a source application. These may be contained in a table, as shown, or be the prompts of a spreadsheet, or something else. From this we create an RDF predicate (property). This alone makes it possible to store the data as triples in a triple store and to manage them with Semantic Web technologies.
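As a rough illustration of this first step, the sketch below turns one row of a source table into triples, using each column name as a predicate. The namespace `http://example.org/src#`, the column names and the subject identifier are hypothetical and serve only as placeholders; every value is written as a plain string literal for simplicity.

```python
# Hypothetical namespace for the source application; not defined in the text.
SRC = "http://example.org/src#"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


def row_to_triples(subject_id, row):
    """Turn one table row (column name -> value) into N-Triples lines.

    Each column name becomes a predicate in the SRC namespace.
    """
    triples = []
    for column, value in row.items():
        if value is None:
            continue  # an empty cell yields no triple
        # Escape backslashes and double quotes inside the literal.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        triples.append(
            f'<{SRC}{subject_id}> <{SRC}{column}> "{escaped}"^^<{XSD_STRING}> .'
        )
    return triples


# Illustrative row; the column names and values are made up.
row = {"tagNumber": "P-101", "designPressure": "16.5", "remark": None}
for line in row_to_triples("asset-0001", row):
    print(line)
```

The resulting lines could be loaded into any triple store that accepts N-Triples. At this stage the predicates carry no agreed meaning yet; that comes with the definitions discussed next.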

Define the predicates

A person who is intimately familiar with the source application should write a crystal-clear definition in plain English, or in another language that AI tools can handle. It helps the mapping process enormously when the objects and concepts used in the definition are typed with ISO 15926-2 entity types, including whether each is a Class or an Individual.

Use NLP AI for finding a match

A set of generic predicates, each with a mapping to ISO 15926-7/8 Templates, is being set up and will be freely available on the internet.

Each such generic predicate also has a crystal-clear definition in ISO 15926 terms.
There are AI tools on the market, improving at a rapid pace, that can determine the degree of semantic equivalence between two sentences. When a source predicate, with its definition, is given as input, the AI tool can scan the definitions of all the generic predicates and come up with a list of possible matches, each with an equivalence rating. The mapping person then has to select the best match according to his or her own judgement.
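The matching step can be sketched as a ranking of generic predicate definitions against one source definition. The sketch below is only a stand-in: it scores word overlap (Jaccard similarity), which is a lexical measure and not a semantic one, so a real setup would replace `similarity` with a call to an NLP model. The two generic predicate names come from the examples on this page; their definitions here are invented for illustration.

```python
import re


def tokens(text):
    """Lower-case the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Jaccard overlap of word sets; a crude stand-in for a semantic score."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def rank_matches(source_definition, generic_definitions, top_n=3):
    """Return the top_n (predicate, score) pairs, best match first."""
    scored = [
        (similarity(source_definition, definition), name)
        for name, definition in generic_definitions.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(name, round(score, 2)) for score, name in scored[:top_n]]


# Definitions below are made up for this sketch, not taken from any standard.
generic = {
    "pred:hasPrice-currency": "the price of an item expressed in a given currency",
    "pred:documentPublisher": "the organization that published or issued a document",
}
source = "the company that issued the purchase order document"
for name, score in rank_matches(source, generic):
    print(name, score)
```

The list is only a suggestion; the final choice stays with the mapping person.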

Sometimes combine two or more predicates for mapping

In the topic referred to above we find:

cfihos:equipmentPrice + cfihos:currencyCode (cfihos:00000012.10000170 + cfihos:00000012.10000193)
    TRIPLE: cfihos:Asset cfihos:equipmentPrice xsd:decimal. 
    SUB-PROPERTY OF: pred:hasPrice-currency
    MAPS TO TEMPLATE: tpl:IndividualHasMonetaryValue # a specialization thereof with fixed subclass of 'COST' (rdl:RDS7945027) and 'CURRENCY' (rdl:RDS2229240)
where the RDS number for the applicable cost type must be taken from the specializations of COST, and the one for the currency from those of CURRENCY. CFIHOS could then define its own predicates as subPropertyOf pred:hasPrice-currency, such as cfihos:hasFOBCost-USD.
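The combination described above can be sketched as a small function that merges the price and currency fields of one source record into a single statement. The naming pattern `cfihos:has<CostType>Cost-<Currency>` is an assumption generalized from the single example cfihos:hasFOBCost-USD, and the record layout and asset identifier are hypothetical.

```python
from decimal import Decimal


def combine_price_and_currency(asset_id, record, cost_type="FOB"):
    """Merge equipmentPrice and currencyCode into one (s, p, o) statement.

    The predicate name follows an assumed pattern modelled on
    cfihos:hasFOBCost-USD; it is not prescribed by CFIHOS.
    """
    price = Decimal(str(record["equipmentPrice"]))
    currency = record["currencyCode"].strip().upper()
    predicate = f"cfihos:has{cost_type}Cost-{currency}"
    return (asset_id, predicate, price)


# Illustrative record; the values are made up.
record = {"equipmentPrice": "1250.00", "currencyCode": "usd"}
print(combine_price_and_currency("cfihos:asset-0001", record))
```

Because the currency is carried in the predicate, the object stays a plain xsd:decimal value, as in the triple shown above.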

Map to one or, sometimes, more templates

The generic predicates refer to the applicable template(s). Sometimes a combination of two templates gives the required representation. An example of that can also be found in that topic:

cfihos:purchaseOrderIssuerCompanyName (cfihos:00000012.10000180)
    TRIPLE: cfihos:PurchaseOrder cfihos:purchaseOrderIssuerCompanyName cfihos:Company . # replace labels of subject and object with their (UU)ID
    SUB-PROPERTY OF: pred:documentPublisher
    MAPS TO TEMPLATE: tpl:DocumentPublication + tpl:EndedParticipationOfIndividualInActivity # a specialization thereof with fixed 'PERFORMER' (rdl:RDS222365)

This can be seen in the listing below:

It is possible to create a union of both templates:

Creating these unions of templates is nice, but probably unmanageable at a standard level. Time will tell.

The more specialized a predicate is, such as the (historic) predicate 'normalOperatingOutletSteamPressureAuxiliaryDriver', the more templates may have to be used.
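The relations given in the two examples above (source predicates, the generic predicate they are sub-properties of, and the template or templates that predicate maps to) can be held in simple lookup tables. The sketch below uses only the names that appear in those examples; the table layout and the function name `templates_for` are illustrative choices.

```python
# One or more source predicates -> the generic predicate they specialize.
SUB_PROPERTY_OF = {
    ("cfihos:equipmentPrice", "cfihos:currencyCode"): "pred:hasPrice-currency",
    ("cfihos:purchaseOrderIssuerCompanyName",): "pred:documentPublisher",
}

# Generic predicate -> the template(s) it maps to.
PREDICATE_TEMPLATES = {
    "pred:hasPrice-currency": ["tpl:IndividualHasMonetaryValue"],
    "pred:documentPublisher": [
        "tpl:DocumentPublication",
        "tpl:EndedParticipationOfIndividualInActivity",
    ],
}


def templates_for(*source_predicates):
    """Look up the templates for one source predicate or a combination."""
    generic = SUB_PROPERTY_OF[tuple(source_predicates)]
    return PREDICATE_TEMPLATES[generic]


print(templates_for("cfihos:purchaseOrderIssuerCompanyName"))
print(templates_for("cfihos:equipmentPrice", "cfihos:currencyCode"))
```

A list of templates per generic predicate leaves room for the more specialized predicates, which may need several templates.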

AI tools

An interesting article about tools can be found here.
A recent development, not mentioned, can be found here.

This is in the early phase of discovery.