You can create a compound semantic type that references other semantic types published on Talend Dictionary Service, and add it to the list of recognized data types in the data models in Talend Data Stewardship.
You can combine any semantic types when creating a compound type. A compound semantic type can also reference other compound types, provided that all child types are already published.
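The publication rule above can be pictured with a minimal sketch. This is not the Talend Dictionary Service API: the `SemanticType` class, its fields, and the helper functions are hypothetical names used only to illustrate that a compound type may reference a child only once that child is published.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SemanticType:
    """Hypothetical stand-in for a semantic type (not a Talend class)."""
    name: str
    published: bool = False
    children: List["SemanticType"] = field(default_factory=list)


def unpublished_children(compound: SemanticType) -> List[str]:
    """Return the names of direct children that are not yet published."""
    return [child.name for child in compound.children if not child.published]


def can_reference_children(compound: SemanticType) -> bool:
    """A compound type is acceptable only if every child is published."""
    return not unpublished_children(compound)


# Illustrative child types; the names are assumptions for this sketch.
us_zip = SemanticType("us_zip", published=True)
uk_postcode = SemanticType("uk_postcode", published=False)  # still a draft
zip_codes = SemanticType("Zip_codes", children=[us_zip, uk_postcode])

print(can_reference_children(zip_codes))  # False
print(unpublished_children(zip_codes))    # ['uk_postcode']
```

In this sketch, publishing `uk_postcode` (setting `published=True`) is what makes the compound type acceptable.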
Let's say that you have a file which holds information about customers from the US, the UK, Germany and France. You need to validate the different zip codes against a compound semantic type that you create. As soon as a value matches one of the child types, it is considered valid and is not evaluated against the other referenced types.
When defining the data model in Talend Data Stewardship, you can set the semantic type for the column containing the zip codes to this new compound type, Zip_codes in this example.
Before you begin
In the homepage, click
- Enter a name and a description for the new semantic type.
- Select the semantic type from the Type list.
Keep the Use for validation switch activated.
This compound type is used to determine which values are considered valid or invalid when it is applied to a given column. The result of this validation process is shown in the quality bar of each column in your datasets.
In this example, if you deactivate the switch, the compound type is used only for data discovery, and no value is considered invalid.
- From the Children types list, select the semantic types you want to group in this compound type.
Click SAVE AND PUBLISH to send the semantic type to the Talend Dictionary Service server and make it available to the system.
Clicking SAVE AS DRAFT stores the new type on the server without propagating it to the system. The new type is not usable until it is published. As a use case for this option, let's say that you have new semantic types to deploy as part of a new project. You can prepare the work by creating the semantic types and saving them as drafts before the project goes live, and then publish them only on the go-live day.
Go back to Talend Data Stewardship and create the data model for the customer data.
The new semantic category Zip_codes is now available in the list of semantic types, and you can set it for the column containing the zip codes.
When you load the customer data to Talend Data Stewardship, data is matched and validated against the Zip_codes compound type you created. Each value is evaluated against the first child type. If it matches, it is not evaluated against the other referenced types. Otherwise, it is evaluated against the next child type, and so on.
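The first-match evaluation described here can be sketched as follows, using the zip code example. This is only an illustration of the logic, not how Talend implements it: the child type names, the simplified regular expressions, and the `use_for_validation` parameter are assumptions made for the sketch.

```python
import re
from typing import Optional

# Hypothetical child types with simplified patterns (assumptions, not the
# definitions shipped with Talend Dictionary Service). Order matters.
CHILD_TYPES = [
    ("us_zip", re.compile(r"\d{5}(-\d{4})?")),
    ("uk_postcode", re.compile(r"[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}")),
    ("de_postal_code", re.compile(r"\d{5}")),
    ("fr_postal_code", re.compile(r"\d{5}")),
]


def first_matching_child(value: str) -> Optional[str]:
    """Return the first child type the value matches, or None."""
    candidate = value.strip()
    for name, pattern in CHILD_TYPES:
        if pattern.fullmatch(candidate):
            return name  # stop here: later child types are not evaluated
    return None


def is_valid(value: str, use_for_validation: bool = True) -> bool:
    """With validation switched off, no value is reported as invalid."""
    if not use_for_validation:
        return True
    return first_matching_child(value) is not None


print(first_matching_child("SW1A 1AA"))  # uk_postcode
print(first_matching_child("75001"))     # us_zip
print(is_valid("ABC"))                   # False
```

Note that `75001` is reported under the first child type it matches, even though later patterns in this sketch would also accept it. The value is valid either way, which is all the compound type needs to establish.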