Scenario: Using a pivot column to aggregate data - 6.1

Talend Components Reference Guide

The following scenario describes a Job that aggregates data from a delimited input file, using a defined pivot column.

Dropping and linking components

  1. Drop the following components from the Palette onto the design workspace: tFileInputDelimited and tPivotToColumnsDelimited.

  2. Link the two components using a Row > Main connection.

Configuring the components

Set the input component

  1. Double-click the tFileInputDelimited component to open its Basic settings view.

  2. In the File Name field, browse to the input file.

    The input file contains three columns: ID, Question, and the corresponding Answer.

  3. Define the Row and Field separators; in this example, a carriage return and a semicolon, respectively.

  4. As the file contains a header line, define the Header field accordingly.

  5. Set the schema describing the three columns: ID, Questions, Answers.
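
For readers who want to see what these input settings amount to outside the Studio, the following is a minimal Python sketch, not the code Talend generates. It assumes a semicolon field separator, one header line to skip, and the three-column schema set above. The sample content and the helper name read_delimited are hypothetical, because the scenario does not show the actual input file.

```python
import csv
import io

# Hypothetical sample content: the real input file is not shown in this scenario.
SAMPLE = "ID;Questions;Answers\n1;Q1;yes\n1;Q2;no\n2;Q1;no\n"

# Schema as set in the input component: three columns.
SCHEMA = ["ID", "Questions", "Answers"]


def read_delimited(text, field_sep=";", header=1):
    """Parse delimited text, skip `header` leading lines, map fields to SCHEMA."""
    reader = csv.reader(io.StringIO(text), delimiter=field_sep)
    data_rows = list(reader)[header:]  # drop the header line(s)
    return [dict(zip(SCHEMA, row)) for row in data_rows if row]


rows = read_delimited(SAMPLE)
for row in rows:
    print(row)
```

Each parsed row is a dictionary keyed by schema column name, which roughly corresponds to what flows over the Row > Main connection to the next component.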

Set the output component

  1. Double-click the tPivotToColumnsDelimited component to open its Basic settings view.

  2. In the Pivot column field, select the pivot column from the input schema. This is often the column with the most duplicates (pivot aggregation values).

  3. In the Aggregation column field, select the column from the input schema that should be aggregated.

  4. In the Aggregation function field, select the function to be used when duplicates are found.

  5. In the Group by table, add an Input column to be used for grouping the aggregated values.

  6. In the File Name field, browse to the output file path. Then, in the Row separator and Field separator fields, set the separators for the aggregated output rows and data.
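
The effect of these output settings can be sketched in plain Python. This is an illustration only, not the component's implementation. It assumes Questions as the pivot column, Answers as the aggregation column, ID as the group-by column, and a "first value wins" aggregation function. The sample rows and the helper names pivot_to_columns and to_delimited are hypothetical.

```python
from collections import defaultdict

# Hypothetical input rows; ID 1 / Q1 appears twice to show the aggregation function at work.
rows = [
    {"ID": "1", "Questions": "Q1", "Answers": "yes"},
    {"ID": "1", "Questions": "Q1", "Answers": "maybe"},
    {"ID": "1", "Questions": "Q2", "Answers": "no"},
    {"ID": "2", "Questions": "Q1", "Answers": "no"},
]


def pivot_to_columns(rows, pivot_col, agg_col, group_by, agg_func):
    """Turn each distinct pivot value into an output column, one row per group key."""
    pivot_values = []  # distinct pivot values, in first-seen order
    cells = defaultdict(lambda: defaultdict(list))
    for row in rows:
        if row[pivot_col] not in pivot_values:
            pivot_values.append(row[pivot_col])
        cells[row[group_by]][row[pivot_col]].append(row[agg_col])
    header = [group_by] + pivot_values
    table = []
    for key, by_pivot in cells.items():
        line = [key]
        for value in pivot_values:
            # Apply the aggregation function to duplicates; leave missing cells empty.
            line.append(agg_func(by_pivot[value]) if value in by_pivot else "")
        table.append(line)
    return header, table


def to_delimited(header, table, field_sep=";", row_sep="\n"):
    """Serialize the pivoted table using the chosen field and row separators."""
    lines = [header] + table
    return row_sep.join(field_sep.join(str(v) for v in line) for line in lines) + row_sep


header, table = pivot_to_columns(
    rows, "Questions", "Answers", "ID", agg_func=lambda values: values[0]
)
output = to_delimited(header, table)
print(output)
```

Swapping agg_func for len or max shows how the choice of aggregation function changes the cell that receives duplicate values, while the group-by column still determines how many output rows there are.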

Saving and executing the Job

  1. Press Ctrl+S to save your Job.

  2. Press F6 or click Run on the Run tab to execute the Job.

    The output file shows the newly aggregated data.