Configuring the data output - Cloud - 8.0

Text standardization

Version: Cloud 8.0
Language: English
Product: Talend Big Data Platform, Talend Data Fabric, Talend Data Management Platform, Talend Data Services Platform, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio

Procedure

  1. Double-click tAggregateRow to display its Basic settings view and define the component properties.
  2. Click the [...] button next to Edit schema to open a dialog box. Here you can define the output flow.
  3. In the output flow panel on the right of the dialog box, click the plus button to add as many columns as you need.
    In this example, we want to have two output columns, the translation column and a new output column called count.
    When done, click OK to close the dialog box and proceed to the next step.
  4. In the Group by area of the tAggregateRow Basic settings view, click the plus button to add as many lines as needed. Here you can define the group-by values.
    • Click in the Output column line and select the output column that will hold the aggregated data, the translation column in this example.

    • Click in the Input column position line and select the input column from which you want to collect the values to be aggregated, the translation column in this example.

  5. In the Operations area, click the plus button to add lines for the columns that will hold the aggregated data. Here you can define the calculation values.
    • Click in the Output column line and select the destination column from the list, the count column in this example.

    • Click in the Function column line and select any of the listed operations.

      In this example, we want to count the stems so that each distinct stem is listed only once in the output column.

    • Click in the Input column position line and select the input column from which you want to collect the values to be aggregated, the id_key column in this example.

  6. Double-click tFileOutputExcel to display its Basic settings view and define the component properties.
  7. Set the destination file path and define the settings of the file according to your needs.
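
The aggregation configured in steps 4 and 5 can be sketched outside Talend Studio to make the expected result easier to picture. The following is a minimal Python illustration, not the code Talend Studio generates. It assumes that a plain count function is selected in the Operations area, that rows are grouped on the translation column, and that the non-null id_key values are counted into the count column. The sample rows and the aggregate_count helper are hypothetical.

```python
# Hypothetical input rows. The column names id_key and translation come from
# the procedure; the values are made up for illustration.
rows = [
    {"id_key": 1, "translation": "run"},
    {"id_key": 2, "translation": "run"},
    {"id_key": 3, "translation": "walk"},
    {"id_key": 4, "translation": "run"},
]


def aggregate_count(rows, group_by="translation", counted="id_key"):
    """Group rows on one column and count the non-null values of another.

    This mirrors a group-by on `translation` with a count operation on
    `id_key` (an assumption about the function chosen in the Operations area).
    """
    counts = {}
    for row in rows:
        key = row[group_by]
        # Register the group even if the counted value turns out to be null.
        counts.setdefault(key, 0)
        if row[counted] is not None:
            counts[key] += 1
    # One output row per distinct group value, in order of first appearance.
    return [{group_by: key, "count": n} for key, n in counts.items()]


result = aggregate_count(rows)
for line in result:
    print(line["translation"], line["count"])
```

Each distinct translation value appears once in the result, together with the number of input rows that carried it. The row order shown here follows first appearance in the input, which is a property of this sketch and may differ from the order written to the Excel file.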