Updating an existing semantic type - 7.1

Talend Data Stewardship User Guide

Version
7.1
Language
English (United States)
Product
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Data Stewardship

You can edit an existing semantic type in Talend Dictionary Service to change how your data is validated in Talend Data Stewardship.

Predefined semantic types in Talend Data Stewardship are based on standard values, but you may need to tailor them to match your own data. Some data that you would expect to fall under a predefined category may be considered invalid.

Take the example of a dataset containing a list of customers, with their email addresses, dates of birth, and the countries they live in. All the entries for United States of America are considered invalid, although they should not be, since this is the official name of the country.

The problem here is that United States of America is not one of the expected values for the Country semantic type in Talend Data Stewardship. The valid entry in this case would be United States.

To avoid this problem in the future, update the Country semantic type in Talend Dictionary Service and add United States of America to the list of valid entries. The change is automatically available in Talend Data Stewardship.
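To illustrate why the entries are flagged, a dictionary-based semantic type can be thought of as a list of accepted values. The sketch below is a simplified Python model and not Talend's implementation: the names `is_valid` and `country_values`, the sample values, and the exact-match rule (after trimming whitespace) are all assumptions made for the example.

```python
# Simplified, hypothetical model of a dictionary-based semantic type.
# This is not Talend code; the names and the matching rule are assumptions.

def is_valid(value, accepted_values):
    """Return True if the trimmed value exactly matches an accepted entry."""
    return value.strip() in accepted_values

# A small, illustrative subset of accepted values for a Country type.
country_values = ["United States", "France", "Germany"]

# Sample country entries from a customer dataset.
records = ["United States of America", "France", "United States"]

# Before the update, the official country name is rejected.
before = [is_valid(r, country_values) for r in records]

# Adding the missing value to the top of the list mirrors the update described above.
country_values.insert(0, "United States of America")

# After the update, every sample entry is accepted.
after = [is_valid(r, country_values) for r in records]

print(before)
print(after)
```

In this model, the first record is rejected before the update and accepted afterwards, which corresponds to the invalid entries disappearing from the quality bar.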

Procedure

  1. On the homepage, click SEMANTIC TYPES.
  2. Click the search icon in the top-right corner of the page and enter country to filter the list of semantic types.
  3. Click Country in the list.
  4. Click the icon next to Values and enter United States of America in the field that appears.
  5. Click to add the new value to the top of the list of valid entries for the Country semantic type.
  6. Click SAVE AND PUBLISH to send the semantic type to the Talend Dictionary Service server and make it available to be used by the system.
    Clicking SAVE AS DRAFT stores the new type on the server without propagating it to the system. The type cannot be used until it is published. This option is useful when, for example, you have new semantic types to deploy as part of a new project: you can create the semantic types and save them as drafts ahead of time, then publish them only on the day of the go-live.
  7. Go back to Talend Data Stewardship and refresh or reopen the task list containing the customers' countries.
    The change in the semantic type is now available in Talend Data Stewardship, and the quality bar shows that there are no longer any invalid values.

    In addition, an entry is added to the history of each task to show the changes made to the semantic type.
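The difference between SAVE AS DRAFT and SAVE AND PUBLISH in step 6 can be pictured with a small Python model. This is only a sketch under stated assumptions: `SemanticTypeStore` and its methods are hypothetical names invented for the example, not part of any Talend API, and the real server behavior may differ in detail.

```python
# Hypothetical model of draft versus published semantic types (not a Talend API).
from dataclasses import dataclass, field


@dataclass
class SemanticTypeStore:
    drafts: dict = field(default_factory=dict)     # stored on the server, not yet usable
    published: dict = field(default_factory=dict)  # available to validation

    def save_as_draft(self, name, values):
        # Keep the pending version without touching the published one.
        self.drafts[name] = list(values)

    def save_and_publish(self, name, values):
        # Publishing replaces the live version and discards any pending draft.
        self.drafts.pop(name, None)
        self.published[name] = list(values)

    def is_valid(self, name, value):
        # Only published values take part in validation.
        return value in self.published.get(name, [])


store = SemanticTypeStore()
store.save_and_publish("Country", ["United States"])

# A draft containing the new value does not change validation results.
store.save_as_draft("Country", ["United States of America", "United States"])
draft_result = store.is_valid("Country", "United States of America")

# Once published, the new value is accepted.
store.save_and_publish("Country", ["United States of America", "United States"])
published_result = store.is_valid("Country", "United States of America")

print(draft_result, published_result)
```

The model keeps drafts and published versions in separate maps so that preparing a change ahead of a go-live has no effect on validation until the change is published.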

Results

The Country semantic type has been manually updated to support the new value.

From now on, when you work with data that is matched with the Country semantic type, United States of America is considered a valid value.