You can edit an existing semantic type in Talend Dictionary Service to impact how your data is validated in Talend Data Stewardship.
Predefined semantic types in Talend Data Stewardship are based on standard values, but you may need to tailor them to match your own data. Some data that you would expect to fall under a predefined category may be considered invalid.
Take the example of a dataset containing a list of customers, with their email addresses, dates of birth, and the countries they live in. All the entries for United States of America are considered invalid, although this is the official name of the country.
The problem here is that United States of America is not one of the expected values for the country semantic type in Talend Data Stewardship. The valid entry in this case would be United States.
To avoid having this problem in the future, you will update the semantic type in Talend Dictionary Service, and add United States of America to the list of valid entries. The change will be automatically available in Talend Data Stewardship.
Open a command prompt window and use the cd command to go to the <Dictionary_Service_Path>/command-line folder.
To add the value United States of America to the list of valid countries, execute the following command according to your operating system:
Windows: category_manager.bat -a -name COUNTRY -value "United States of America"
Linux: ./category_manager.sh -a -name COUNTRY -value "United States of America"
Note that the command must be entered on a single line.
You are prompted for your Talend Administration Center credentials. The command is executed after you enter a valid login and password.
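The single-line requirement and the quoting of a multi-word value can be sketched in a small helper script. This is a minimal sketch and not part of the product: the function name build_add_command and the DICT_SCRIPT variable are hypothetical, and only the -a, -name, and -value options shown above are used. The helper prints the command instead of running it, so that you can review it before pasting it into the command prompt.

```shell
#!/bin/sh
# Hypothetical helper: builds the one-line "add value" command shown above.
# DICT_SCRIPT is an assumed variable; it defaults to the Linux script name.
DICT_SCRIPT="${DICT_SCRIPT:-./category_manager.sh}"

build_add_command() {
    # $1 = semantic type name (for example COUNTRY), $2 = value to add
    if [ "$#" -ne 2 ] || [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: build_add_command NAME VALUE" >&2
        return 1
    fi
    # Print everything on one line, with the value in double quotes so that
    # a multi-word value is passed as a single argument.
    printf '%s -a -name %s -value "%s"\n' "$DICT_SCRIPT" "$1" "$2"
}

build_add_command COUNTRY "United States of America"
```

The sketch does not escape double quotes inside the value itself, so a value that contains a double quote would need extra handling.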
To display the list of entries under the country semantic type, execute the following command according to your operating system:
Windows: category_manager.bat -e -name COUNTRY
Linux: ./category_manager.sh -e -name COUNTRY
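If you save the listing produced by the command above to a text file, checking for the new entry can be scripted rather than done by eye. The following is a minimal sketch under an assumption that this page does not confirm: that the listing contains one entry per line. The function name has_entry is hypothetical, and the sample data stands in for a saved listing.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given value appears as a whole line
# on standard input. Assumes one entry per line (not confirmed by the source).
has_entry() {
    # -F: treat the value as a fixed string, -x: match the whole line,
    # -q: print nothing and report the result through the exit status
    grep -Fxq -- "$1"
}

# Sample data standing in for a saved listing.
if printf '%s\n' "United States" "United States of America" |
    has_entry "United States of America"; then
    echo "entry found"
else
    echo "entry missing"
fi
```

Whole-line matching is used so that the shorter entry United States is not mistaken for United States of America, or the other way round.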
You can see that United States of America has been properly added at the bottom of the list of valid entries for the country semantic type.
Go back to Talend Data Stewardship and refresh the task list containing the customers' countries, or reopen it.
The country semantic type has been manually updated to support a new value.
From now on, when dealing with data that is matched with the country semantic type, United States of America will be considered a valid value.
For more information, use the category_manager.bat -h command for Windows, or ./category_manager.sh -h for Linux.