Ontologies used in the Studio - Cloud - 7.3

Talend Studio User Guide

Version: Cloud, 7.3
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development
Last publication date: 2024-02-13
Available in:

  • Big Data Platform
  • Cloud API Services Platform
  • Cloud Big Data Platform
  • Cloud Data Fabric
  • Cloud Data Management Platform
  • Data Fabric
  • Data Management Platform
  • Data Services Platform
  • MDM Platform
  • Real-Time Big Data Platform

What is an ontology?

An ontology is a description of the concepts, their attributes, and the relationships that can exist for data spread across multiple columns. For example, customer is a concept, and date of birth and name are attributes of that concept. An ontology lists concepts, attributes, and synonyms of the attributes.
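As a rough illustration of the structure described above, the following Python sketch models one concept with its attributes and their synonyms. The class names, the find_attribute helper, and the synonym values are hypothetical and chosen only for this example; they do not reflect how the Studio stores its ontology. The sketch assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    """An attribute of a concept, with alternative names (synonyms)."""
    name: str
    synonyms: list[str] = field(default_factory=list)


@dataclass
class Concept:
    """A concept and the attributes that belong to it."""
    name: str
    attributes: list[Attribute] = field(default_factory=list)

    def find_attribute(self, column_name):
        """Return the attribute whose name or synonym equals column_name."""
        wanted = column_name.strip().lower()
        for attribute in self.attributes:
            names = [attribute.name, *attribute.synonyms]
            if wanted in (n.lower() for n in names):
                return attribute
        return None


# Hypothetical synonyms, for illustration only.
customer = Concept(
    "customer",
    [
        Attribute("date of birth", ["birth date", "DOB"]),
        Attribute("name", ["full name"]),
    ],
)
```

With this model, a column headed DOB would resolve to the date of birth attribute of the customer concept through its synonym.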

What is an ontology used for in the Studio?

Using the ontology repository stored on the log server with the Studio enables knowledge sharing: you can reuse indicators and patterns that have already been analyzed and found to best suit the type of data you analyze.

Talend Studio analyzes column content using a set of methods (regex, data dictionary, and keyword dictionary) and then decides which category the data falls into. For example, for data like:

  • user@talend.com, Talend Studio analyzes it against a regex and finds it to be an EMAILADDRESS,
  • John, Talend Studio analyzes it against the data dictionary and finds it to be a FIRSTNAME,
  • 43 Chester Road, Talend Studio analyzes the tokens in the data string against keywords in the dictionary and, because Road matches a keyword, finds the value to be an ADDRESSLINE.
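The three methods above can be sketched in a few lines of Python. This is a minimal illustration, not the Studio's implementation: the email pattern, the dictionary contents, the categorize function, and the order in which the methods are tried are all assumptions made for the example.

```python
import re

# Hypothetical, tiny stand-ins for the resources the Studio relies on.
EMAIL_REGEX = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")
FIRST_NAMES = {"john", "mary"}                    # data dictionary
ADDRESS_KEYWORDS = {"road", "street", "avenue"}   # keyword dictionary


def categorize(value):
    """Return a category name for value, or None if nothing matches."""
    text = value.strip()
    # 1. Regex: the whole value must match the pattern.
    if EMAIL_REGEX.fullmatch(text):
        return "EMAILADDRESS"
    # 2. Data dictionary: the whole value must be a known entry.
    if text.lower() in FIRST_NAMES:
        return "FIRSTNAME"
    # 3. Keyword dictionary: one matching token is enough.
    if any(token.lower() in ADDRESS_KEYWORDS for token in text.split()):
        return "ADDRESSLINE"
    return None


for sample in ["user@talend.com", "John", "43 Chester Road"]:
    print(sample, "->", categorize(sample))
```

The difference between the two dictionary methods is the unit of comparison: the data dictionary is matched against the entire value, while the keyword dictionary is matched against individual tokens, which is why 43 Chester Road is categorized on the strength of the single token Road.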

What ontologies are used in the Studio?

An ontology has been built on the log server by merging two business standards, UBL and OAGI:

  • Universal Business Language (UBL): An OASIS effort to create a synthesis of existing XML business document libraries into one universal business language.
  • Open Applications Group (OAGI): OAGI defines a common content model and common messages for communication between business applications.

The final outcome of the merge is 412 concepts that apply to several domains, including customer, company, geography, product, and finance.