Setting up the Job - 7.1

Text standardization

author
Talend Documentation Team
EnrichVersion
Cloud
7.1
EnrichProdName
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
EnrichPlatform
Talend Studio

Procedure

  1. In the Repository tree view, expand Metadata > DB Connections, where you have stored the main input schema, and drop the relevant item onto the design workspace.
    The Components dialog box opens with the corresponding component selected by default.
  2. Click OK to drop the tMysqlInput component onto the workspace.
    The input table used in this scenario is called translation. Among its columns is the translation column, which holds the English words to stem.
  3. Drop the following components from the Palette onto the design workspace: tNormalize, tFilterRow, tStem, tAggregateRow and tFileOutputExcel.
  4. Connect the components using Main links, except for the tFilterRow - tStem connection, which should use a Filter link.
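
The data flow built in the steps above can be pictured with a small Python sketch. It is only a conceptual illustration, not what Talend Studio generates: the sample rows, the comma separator, the filter condition, the count aggregation and every function name are assumptions made for this example, and toy_stem is a deliberately naive suffix stripper rather than the algorithm tStem uses. The Excel output step is replaced by a plain print.

```python
from collections import Counter

# Hypothetical sample rows standing in for the "translation" table.
ROWS = [
    {"id": 1, "translation": "walking,walked"},
    {"id": 2, "translation": "walks, jumping"},
    {"id": 3, "translation": ""},
]


def normalize(rows, column, sep=","):
    """tNormalize-like step: emit one row per separated value (separator assumed)."""
    for row in rows:
        for value in row[column].split(sep):
            yield {**row, column: value.strip()}


def keep_non_empty(rows, column):
    """tFilterRow-like step: pass only rows that match the (assumed) condition."""
    return (row for row in rows if row[column])


def toy_stem(word):
    """Naive suffix stripper for illustration; NOT the algorithm tStem uses."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def stem_rows(rows, column):
    """tStem-like step: add a 'stem' column next to the original word."""
    for row in rows:
        yield {**row, "stem": toy_stem(row[column].lower())}


def count_by_stem(rows):
    """tAggregateRow-like step: count rows per stem (aggregation assumed)."""
    return Counter(row["stem"] for row in rows)


def run_pipeline(rows):
    """Chain the steps in the same order as the components in the Job."""
    normalized = normalize(rows, "translation")
    filtered = keep_non_empty(normalized, "translation")
    stemmed = stem_rows(filtered, "translation")
    return count_by_stem(stemmed)


if __name__ == "__main__":
    # Stand-in for the tFileOutputExcel step: print instead of writing a file.
    for stem, count in sorted(run_pipeline(ROWS).items()):
        print(stem, count)
```

Each function plays the role of one component, and the generator chaining mirrors how rows travel along the links from one component to the next.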