Changing the semantic type of a column - Cloud

Talend Cloud Data Inventory User Guide

Version: Cloud
Language: English
Product: Talend Cloud
Module: Talend Data Inventory
Content: Administration and Monitoring > Managing connections; Data Governance; Data Quality and Preparation > Enriching data; Data Quality and Preparation > Identifying data; Data Quality and Preparation > Managing datasets
Last publication date: 2024-02-28

When you add a dataset, the application automatically suggests one of the supported semantic types for each column.

The semantic type corresponds to the category of the data (names, emails, phone numbers, etc.). If the semantic type applied to a column is not the one you want, you can manually change it to one of the predefined types, based on your own knowledge of the data.

Take the example of a dataset containing client data, including the job titles of your customers. The header of the job column shows that the data has only been recognized as Text (string). You are going to change the semantic type of the column so that it reflects the data more accurately.

Figure: A column named 'job' with a Text semantic type.

Note: You can also modify semantic types from the Data model panel of a dataset hierarchical view.

Procedure

  1. Click the header of the job column.
  2. In the Type section of the right panel, click the pen icon next to the current semantic type.
  3. To change the type, do one of the following:
    • In the Find a semantic type field, start typing the name of the type that you consider appropriate.

      As you type, auto-completion suggests a list of available types for your data.

    • Select one of the suggestions, based on how closely its matching percentage fits your column.
    Figure: 'Edit type' window with a search field and semantic type suggestions.
    Note: To change the semantic type of a column in a preparation, click the menu icon in the column header, then click This column is of type to open the semantic type menu.
  4. In this example, click the Job Title type in the suggestions.
    According to the statistics, this semantic type matches the values contained in the column most closely.
  5. Click Apply 1 change.
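
The matching percentage shown next to each suggestion in step 3 can be thought of as the share of column values that fit a candidate type. The following Python sketch only illustrates that idea; it is not how the application computes its statistics. The names `SEMANTIC_TYPES`, `match_percentage`, and `rank_suggestions`, as well as the sample job titles and the simplified email pattern, are hypothetical.

```python
import re

# Hypothetical reference data: a tiny dictionary of job titles and a
# deliberately simplified email pattern. Real semantic types are far richer.
JOB_TITLES = {"engineer", "accountant", "teacher", "nurse"}
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Each hypothetical semantic type maps to a function that accepts or
# rejects a single value.
SEMANTIC_TYPES = {
    "Job Title": lambda v: v.strip().lower() in JOB_TITLES,
    "Email": lambda v: EMAIL_RE.match(v.strip()) is not None,
}


def match_percentage(values, validator):
    """Return the percentage of non-empty values accepted by the validator."""
    non_empty = [v for v in values if v and v.strip()]
    if not non_empty:
        return 0.0
    matches = sum(1 for v in non_empty if validator(v))
    return 100.0 * matches / len(non_empty)


def rank_suggestions(values, semantic_types=SEMANTIC_TYPES):
    """Return (type name, percentage) pairs, best match first."""
    scores = [
        (name, match_percentage(values, check))
        for name, check in semantic_types.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Sample values for the job column of the example dataset.
job_column = ["Engineer", "Teacher", "nurse", "Astronaut"]
suggestions = rank_suggestions(job_column)
```

In this sketch, three of the four sample values appear in the hypothetical job title dictionary, so Job Title is ranked first with a score of 75.0, ahead of Email at 0.0.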

Results

The column type is updated to Job Title, as you can see in the header of the job column.

Each time the semantic type of a column is modified, the dataset quality is recalculated.
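
To see why a type change affects quality, consider a simplified model in which each value is counted as valid, invalid, or empty with respect to the column's semantic type. The Python sketch below assumes that model; it is not the application's actual quality computation, and `VALIDATORS`, `column_quality`, and the sample job titles are hypothetical.

```python
# Hypothetical validators: a plain Text type accepts any non-empty value,
# while Job Title only accepts values found in a small sample dictionary.
KNOWN_JOB_TITLES = {"engineer", "teacher", "nurse"}
VALIDATORS = {
    "Text": lambda v: True,
    "Job Title": lambda v: v.strip().lower() in KNOWN_JOB_TITLES,
}


def column_quality(values, semantic_type):
    """Count valid, invalid, and empty values for the given semantic type."""
    is_valid = VALIDATORS[semantic_type]
    counts = {"valid": 0, "invalid": 0, "empty": 0}
    for value in values:
        if value is None or not value.strip():
            counts["empty"] += 1
        elif is_valid(value):
            counts["valid"] += 1
        else:
            counts["invalid"] += 1
    return counts


job_column = ["Engineer", "Teacher", "", "Astronaut"]

# The same values are evaluated twice: before and after the type change.
quality_as_text = column_quality(job_column, "Text")
quality_as_job_title = column_quality(job_column, "Job Title")
```

Under these assumptions, every non-empty value is valid while the column is typed as Text, but "Astronaut" becomes invalid once the column is typed as Job Title, because it is missing from the sample dictionary. The counts change even though the data does not, which is why quality has to be computed again after the type is modified.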