Configuring the components - 7.1

Standardization

author
Talend Documentation Team

Procedure

  1. Double-click the first tFileInputDelimited component to open its Basic settings view and set the parameters of the main input flow, including the path and name of the file to read and the number of header rows to skip.
    In this example, the main input file provides a list of people's names and US state names. The following extract shows the file content:
    name;state
    Andrew Kennedy;Mississippi
    Benjamin Carter;Louisiana
    Benjamin Monroe;West Virginia
    Bill Harrison;Tennessee
    Calvin Grant;Virginia
    Chester Harrison;Rhode Island
    Chester Hoover;Kansas
    Chester Kennedy;Maryland
    Chester Polk;Indiana
    Dwight Nixon;Nevada
    Dwight Roosevelt;Mississippi
    Franklin Grant;Nebraska
  2. Click the [...] button next to Edit schema to open the Schema dialog box and set the input schema.
    According to the structure of the main input file, the input schema should contain two columns: name and state.
    When done, click OK to close the dialog box and propagate the changes to the next component.
  3. Define the properties of the second tFileInputDelimited component similarly.
    In this example, the reference input file provides a list of states and their two-letter codes. Accordingly, the reference input schema should have two columns: state and code.
  4. Double-click the tReplaceList component to open its Basic settings view and define the replacement operation.
  5. From the Lookup search column list, select the column to be searched. In this example, the search is carried out on the state column.
  6. From the Lookup replacement column list, select the column that contains the replacement values. In this example, select code, which holds the two-letter state codes.
  7. In the Column options table, select the Replace check box for the state column to replace the state names with their corresponding codes.
  8. In the tLogRow component, select the Table check box for better readability of the output.
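
The effect of this configuration can be approximated outside Talend Studio. The following Python sketch is an illustration only, not the code the Job generates: it reads a few rows of the main input extract shown in step 1, builds a lookup from a reference flow, and replaces each state name with its code. The reference rows, the function names read_delimited and replace_from_lookup, and the whole-value matching rule are assumptions made for this example; the actual matching behavior of tReplaceList may differ.

```python
import csv
import io

# A few rows from the main input extract in step 1
# (semicolon-delimited, one header row).
MAIN_INPUT = """name;state
Andrew Kennedy;Mississippi
Benjamin Carter;Louisiana
Benjamin Monroe;West Virginia
Bill Harrison;Tennessee
"""

# Hypothetical extract of the reference file: the source only states that it
# has two columns, state and code. These rows are assumed for illustration.
REFERENCE_INPUT = """state;code
Mississippi;MS
Louisiana;LA
West Virginia;WV
Tennessee;TN
"""


def read_delimited(text, delimiter=";"):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def replace_from_lookup(rows, lookup_rows, search_col, replacement_col, target_col):
    """Replace target_col values that match a lookup search value.

    Assumption: a value is replaced only when it matches a lookup entry
    exactly; values without a match are left unchanged.
    """
    lookup = {r[search_col]: r[replacement_col] for r in lookup_rows}
    return [
        {**row, target_col: lookup.get(row[target_col], row[target_col])}
        for row in rows
    ]


result = replace_from_lookup(
    read_delimited(MAIN_INPUT),
    read_delimited(REFERENCE_INPUT),
    search_col="state",        # Lookup search column
    replacement_col="code",    # Lookup replacement column
    target_col="state",        # column with the Replace check box selected
)

# Simple column-aligned printout, loosely comparable to a table-style log.
for row in result:
    print(f"{row['name']:<20}|{row['state']}")
```

Whole-value matching is used here so that "West Virginia" is not affected by the shorter "Virginia" entry that a fuller reference file would contain.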