This example walks you through several functions of Talend Cloud Data Preparation.
About this task
To correct the country names, use the fuzzy matching function.
- Select the column: delivery_country.
- In the right panel, select Column and start typing fuzzy matching.
- Select the function Standardize value (fuzzy matching).
- Set the Match threshold to Default (> 80%).
- Click Submit. The step is added to the preparation steps in the left panel and the country names are corrected. For example, United Staates is replaced by United States.
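To illustrate the idea behind threshold-based fuzzy matching, here is a minimal Python sketch. It is not the algorithm used by Standardize value (fuzzy matching); it relies on the standard-library difflib module, and the reference list and the function name standardize_value are hypothetical.

```python
import difflib

# Hypothetical list of valid values; the product relies on its own reference data.
REFERENCE_COUNTRIES = ["United States", "Canada", "France", "Germany"]

def standardize_value(value, reference=REFERENCE_COUNTRIES, threshold=0.8):
    """Return the closest reference value, or the input unchanged if none is close enough."""
    # difflib keeps candidates whose similarity ratio is at least `threshold`,
    # which only approximates the "> 80%" setting described above.
    matches = difflib.get_close_matches(value, reference, n=1, cutoff=threshold)
    return matches[0] if matches else value

print(standardize_value("United Staates"))
```

Values that do not reach the threshold are returned unchanged in this sketch, so a high threshold trades fewer corrections for fewer wrong replacements.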
To convert the country codes, use a conversion function. The delivery_country column is still selected.
- In the right panel, select Column and start typing convert.
- Select the function Convert country names and codes.
- Set From to ISO country code and To to English country name.
- Click Submit. The country codes are converted to country names. For example, CA is replaced by Canada.
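Conceptually, converting a code to a name is a dictionary lookup. The Python sketch below assumes a small hand-written excerpt of ISO 3166-1 alpha-2 codes; the mapping, the function name convert_country_code, and the choice to leave unrecognized values unchanged are assumptions made for illustration, not the behavior of the product function.

```python
# Hypothetical excerpt of ISO 3166-1 alpha-2 codes mapped to English names.
ISO_TO_NAME = {
    "CA": "Canada",
    "US": "United States",
    "FR": "France",
    "DE": "Germany",
}

def convert_country_code(value, mapping=ISO_TO_NAME):
    """Return the English country name for an ISO code, or the value unchanged."""
    # Normalize whitespace and case before the lookup; values that are not
    # known codes (for example, names that are already spelled out) are kept.
    return mapping.get(value.strip().upper(), value)

print(convert_country_code("CA"))
```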
To correct the TIN, use the lookup feature.
It lets you match the data from the current preparation with a reference dataset. For more information, see Dynamically using the data from another dataset. You need to associate matching columns.
- Select the column: customer_id. In this example, this column is the matching one.
- Click the lookup icon above the right panel.
- Click Select dataset.
- Select the reference dataset and click Select. You are returned to the Lookup panel, and the reference dataset is displayed below the preparation.
- In Current preparation and Lookup dataset, select customer_id.
- Select the column to add from the reference dataset. In this example, you want to correct the TIN, so select customer_tax_id.
- Click Submit. The step is added to the preparation steps in the left panel.
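The lookup described above behaves like a left join on the matching column. The following Python sketch shows that idea only; the sample rows, the function name lookup, and the new column name ref_customer_tax_id are hypothetical, and the product may name and fill the added column differently.

```python
# Hypothetical sample rows standing in for the preparation and the reference dataset.
preparation = [
    {"customer_id": "C001", "customer_tax_id": "12-34567"},
    {"customer_id": "C002", "customer_tax_id": ""},
]
reference = [
    {"customer_id": "C001", "customer_tax_id": "12-3456789"},
    {"customer_id": "C002", "customer_tax_id": "98-7654321"},
]

def lookup(rows, reference_rows, key, column, new_name):
    """Left join: add the reference `column` to each row under `new_name`, matched on `key`."""
    index = {ref[key]: ref[column] for ref in reference_rows}
    # Rows without a match receive an empty value (an assumption of this sketch).
    return [{**row, new_name: index.get(row[key], "")} for row in rows]

result = lookup(preparation, reference, "customer_id", "customer_tax_id", "ref_customer_tax_id")
for row in result:
    print(row)
```

Keeping the reference value in a separate column, rather than overwriting the original, makes it easy to compare both values before deciding which one to keep.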