Enriching the semantic types libraries - Cloud

Talend Cloud Data Inventory User Guide

Version: Cloud
Language: English
Product: Talend Cloud
Module: Talend Data Inventory
Last publication date: 2024-02-28

When you add a new dataset and open its sample, the application automatically suggests one of the supported semantic types for each field or column.

If the suggested semantic type is not the one you want, you can change it manually by clicking the menu icon in the column header in grid view. For more information, see Changing the semantic type of a column.

This menu lists the semantic types that are available by default in the Talend Cloud applications. For more information, see Predefined semantic types. You can go further by creating your own semantic types, and by updating or deleting existing ones, so that Talend Cloud speaks your business language.

Note: You can upload up to 10 MB of content to Talend Dictionary Service per tenant.

You modify semantic types directly in the Talend Cloud Data Inventory, Talend Cloud Pipeline Designer, Talend Cloud Data Stewardship, or Talend Cloud Data Preparation interface, from the Semantic types tab of the left menu.

All the changes are stored using Talend Dictionary Service and are propagated across the different Talend Cloud applications.

The availability of Talend Dictionary Service depends on the license you have.

In Talend Dictionary Service, the semantic types are divided into three main categories:

  • The DICT type, based on an open or closed list of values.
  • The REGEX type, which compares your data against a preselected regular expression.
  • The COMPOUND type, under which you can group several existing types.
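
To make the three categories more concrete, here is a minimal conceptual sketch in Python. It is not the Talend Dictionary Service implementation or API: the class names, the example types (COLOR, HEX_COLOR, ANY_COLOR), and the matching rules are all hypothetical. In particular, the case-insensitive comparison for DICT values and the "match any child" rule for COMPOUND types are assumptions made for illustration, and the open versus closed list distinction is not modeled.

```python
import re
from dataclasses import dataclass
from typing import Protocol, Sequence


class SemanticType(Protocol):
    """Anything that can say whether a value belongs to the type."""
    name: str

    def matches(self, value: str) -> bool: ...


@dataclass
class DictType:
    """DICT-style type: a value matches if it is in a list of known values."""
    name: str
    values: frozenset

    def matches(self, value: str) -> bool:
        # Assumption: comparison ignores case and surrounding whitespace.
        return value.strip().lower() in self.values


@dataclass
class RegexType:
    """REGEX-style type: a value matches if it fits a regular expression."""
    name: str
    pattern: str

    def matches(self, value: str) -> bool:
        return re.fullmatch(self.pattern, value.strip()) is not None


@dataclass
class CompoundType:
    """COMPOUND-style type: groups several existing types."""
    name: str
    children: Sequence[SemanticType]

    def matches(self, value: str) -> bool:
        # Assumption: a value matches if any grouped type matches it.
        return any(child.matches(value) for child in self.children)


# Hypothetical example types, not predefined Talend semantic types.
color = DictType("COLOR", frozenset({"red", "green", "blue"}))
hex_color = RegexType("HEX_COLOR", r"#[0-9A-Fa-f]{6}")
any_color = CompoundType("ANY_COLOR", [color, hex_color])

print(any_color.matches("Green"))    # True
print(any_color.matches("#1A2B3C"))  # True
print(any_color.matches("purple"))   # False
```

In this sketch, the compound type reuses the two existing types rather than redefining their rules, which mirrors the idea of grouping several types under one name.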

To enable the interaction between Talend Dictionary Service and the compatible Talend Cloud applications, the following prerequisites must be met:

  • You have a Platform license.
  • Your Talend Cloud user has the Semantic types manager role of the Dictionary service application assigned in Talend Management Console, in addition to any of the Talend Cloud Data Inventory, Talend Cloud Pipeline Designer, Talend Cloud Data Stewardship, or Talend Cloud Data Preparation roles.
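
The way these prerequisites combine can be read as a single condition: the license, the Semantic types manager role, and at least one application role are all required. The following Python sketch only models that logic. The function name, its parameters, and the application labels are hypothetical and do not correspond to identifiers in Talend Management Console or to any Talend API.

```python
# Hypothetical labels for the compatible applications (illustration only).
COMPATIBLE_APPLICATIONS = {
    "Data Inventory",
    "Pipeline Designer",
    "Data Stewardship",
    "Data Preparation",
}


def can_manage_semantic_types(
    has_platform_license: bool,
    has_semantic_types_manager_role: bool,
    applications_with_a_role: set,
) -> bool:
    """Return True only when every prerequisite is satisfied.

    applications_with_a_role: the applications in which the user holds
    at least one role (hypothetical representation).
    """
    has_application_role = bool(applications_with_a_role & COMPATIBLE_APPLICATIONS)
    return (
        has_platform_license
        and has_semantic_types_manager_role
        and has_application_role
    )


# A user with the license, the manager role, and a Data Preparation role.
print(can_manage_semantic_types(True, True, {"Data Preparation"}))  # True
# The manager role alone is not enough without an application role.
print(can_manage_semantic_types(True, True, set()))                 # False
```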