Removing a semantic type through the user interface - 2.3

Talend Data Preparation User Guide

author
Talend Documentation Team
EnrichVersion
6.5
2.3
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Real-Time Big Data Platform
task
Data Quality and Preparation > Cleansing data
EnrichPlatform
Talend Data Preparation

You can delete a semantic type in Talend Dictionary Service to remove it from the list of recognized data types in Talend Data Preparation.

The variety of semantic types that are present by default in Talend Data Preparation may not apply to your business context. For example, a five-digit number can be interpreted as an American ZIP code, but also as a French or German postal code, since they share the same format.

Let's say that you work for an American company and only deal with data coming from American clients, including ZIP codes. You would prefer to keep only the American ZIP code in the list of recognized semantic types.

In this example, the ZIP column of the dataset can be matched with at least four semantic types, including US Postal Code, FR Postal Code, FR Insee Code, and DE Postal Code.

Using Talend Dictionary Service, you will remove the other semantic types that match the five-digit format and leave only US Postal Code. The change is then applied immediately in Talend Data Preparation, and from then on, ZIP codes are validated only against the US Postal Code semantic type.
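
The ambiguity described above can be illustrated with a minimal sketch. It is not how Talend Dictionary Service validates data: the single "exactly five digits" pattern, the `semantic_types` dictionary, and the `candidate_types` helper are all simplifying assumptions made for illustration only.

```python
import re

# Assumption: every type is reduced to "exactly five digits".
# The real semantic types use their own validation rules, which are not shown here.
FIVE_DIGITS = re.compile(r"\d{5}")

# Hypothetical registry of semantic types that share the five-digit format.
semantic_types = {
    "US Postal Code": FIVE_DIGITS,
    "FR Postal Code": FIVE_DIGITS,
    "FR Insee Code": FIVE_DIGITS,
    "DE Postal Code": FIVE_DIGITS,
}


def candidate_types(value, types):
    """Return the names of all types whose pattern matches the whole value."""
    return [name for name, pattern in types.items() if pattern.fullmatch(value)]


# Before any deletion, a ZIP code matches every five-digit type.
before = candidate_types("90210", semantic_types)

# Simulate deleting the three non-US types from the registry.
remaining = {
    name: pattern
    for name, pattern in semantic_types.items()
    if name == "US Postal Code"
}

# After the deletion, only one candidate is left.
after = candidate_types("90210", remaining)

print(before)
print(after)
```

With all four types registered, the value is ambiguous. Once the non-US types are removed, only US Postal Code remains as a candidate, which is the effect the procedure below achieves through the user interface.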

Procedure

  1. From the left panel of the Talend Data Preparation homepage, open the Semantic Types view.
  2. In the list of existing semantic types, look for FR Postal Code.
  3. To delete it, point your mouse over the semantic type and click the garbage bin icon that is displayed on the right.
  4. Repeat the last two steps to delete the FR Insee Code and the DE Postal Code.

Results

You have deleted the other semantic types compatible with five-digit numbers. From now on, when you add new datasets, only US Postal Code will be proposed as the semantic type for columns containing ZIP codes.

If you remove a semantic type that is used in one or more datasets, the relevant columns will switch back to the text category.