Configuring the process of normalizing rows - 7.3

Standardization

Version
7.3
Language
English
Product
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Studio
Last publication date
2024-02-21

About this task

To configure tStandardizeRow to normalize the incoming rows, proceed as follows:

Procedure

  1. Double-click tStandardizeRow to open its Component view.
  2. In the Column to parse field, select SKU_Description_Size_Weight. This is the only column in the incoming schema.
  3. Under the Conversion rules table, click the plus button eight times to add eight rows to the table.
  4. To complete these rows, enter the rules you identified when analyzing the raw data at the beginning of this scenario.
    The two Size rules are executed in top-down order. In this example, this order lets the component match the three-number sizes first and then the two-number sizes. If you reverse the order, the component matches the first two numbers of every size first and then treats the last number of each three-number size as unmatched.
  5. Click the Generate parser code in routines button.
  6. In the Advanced settings view, keep the default selections in the Output format area.
    The Max edits for fuzzy match option is set to 1 by default.
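
The effect of rule order described in step 4 can be illustrated outside Talend Studio. The following is a minimal sketch in Python, not the parser code that tStandardizeRow generates: the two regular expressions are hypothetical stand-ins for the two Size rules, the `parse` helper is invented for this illustration, and the sample value `10x20x30` is made up rather than taken from the scenario data.

```python
import re

# Hypothetical stand-ins for the two Size rules. The real rules are entered
# in the Conversion rules table and use the component's own rule syntax.
SIZE_3 = ("Size", re.compile(r"\d+x\d+x\d+"))  # sizes with three numbers
SIZE_2 = ("Size", re.compile(r"\d+x\d+"))      # sizes with two numbers


def parse(text, rules):
    """Scan text left to right; at each position try the rules top-down."""
    matched, unmatched = [], []
    pos = 0
    while pos < len(text):
        for name, pattern in rules:
            m = pattern.match(text, pos)
            if m:
                matched.append((name, m.group()))
                pos = m.end()
                break
        else:
            # No rule matched at this position: keep the character as unmatched.
            unmatched.append(text[pos])
            pos += 1
    return matched, "".join(unmatched).strip()


sample = "10x20x30"  # made-up value for illustration only

# Three-number rule first: the whole size is matched.
good = parse(sample, [SIZE_3, SIZE_2])
print(good)  # ([('Size', '10x20x30')], '')

# Reversed order: the two-number rule consumes "10x20" and "x30" is left over.
bad = parse(sample, [SIZE_2, SIZE_3])
print(bad)   # ([('Size', '10x20')], 'x30')
```

With the more specific rule listed first, nothing is left unmatched. With the order reversed, the shorter pattern matches first and the trailing number is left over, which is the behavior the procedure warns about.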