Aggregating values and sorting data - 6.3

Talend Components Reference Guide

This example shows how to use Talend components to aggregate students' comprehensive scores and then sort the aggregated scores by student name.

Creating a Job for aggregating and sorting data

Create a Job that aggregates the students' comprehensive scores using the tAggregateRow component, sorts the aggregated data using the tSortRow component, and finally displays the aggregated and sorted data on the console.

  1. Create a new Job and add a tFixedFlowInput component, a tAggregateRow component, a tSortRow component, and a tLogRow component by typing their names in the design workspace or dropping them from the Palette.

  2. Link the tFixedFlowInput component to the tAggregateRow component using a Row > Main connection.

  3. Do the same to link the tAggregateRow component to the tSortRow component, and the tSortRow component to the tLogRow component.

Configuring the Job for aggregating and sorting data

Configure the Job to aggregate the students' comprehensive scores using the tAggregateRow component and then sort the aggregated data using the tSortRow component.

  1. Double-click the tFixedFlowInput component to open its Basic settings view.

  2. Click the button next to Edit schema to open the schema dialog box and define the schema by adding two columns, name of String type and score of Double type. When done, click OK to save the changes and close the schema dialog box.

  3. In the Mode area, select Use Inline Content (delimited file) and in the Content field displayed, enter the following input data:

    Peter;92
    James;93
    Thomas;91
    Peter;94
    James;96
    Thomas;95
    Peter;96
    James;92
    Thomas;98
    Peter;95
    James;96
    Thomas;93
    Peter;98
    James;97
    Thomas;95

  4. Double-click the tAggregateRow component to open its Basic settings view.

  5. Click the button next to Edit schema to open the schema dialog box and define the schema by adding five columns, name of String type, and sum, average, max, and min of Double type.

    When done, click OK to save the changes and close the schema dialog box.

  6. Add one row in the Group by table by clicking the button below it, and select name from both the Output column and Input column position column fields to group the input data by the name column.

  7. Add four rows in the Operations table and define the operations to be carried out. In this example, the operations are sum, average, max, and min. Then select score from all four Input column position column fields to aggregate the input data based on it.

  8. Double-click the tSortRow component to open its Basic settings view.

  9. Add one row in the Criteria table and specify the column based on which the sort operation is performed. In this example, it is the name column. Then select alpha from the sort num or alpha? column field and asc from the Order asc or desc? column field to sort the aggregated data in ascending alphabetical order.

  10. Double-click the tLogRow component to open its Basic settings view, and then select Table (print values in cells of a table) in the Mode area for better readability of the result.
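
To check the expected result outside the Studio, the configuration above can be approximated in plain Java. This is only a sketch of the same logic, not the code that Talend generates: the class name AggregateAndSortSketch and the aggregate helper are hypothetical, java.util.DoubleSummaryStatistics stands in for the four operations of the tAggregateRow component, and a TreeMap stands in for the ascending alphabetical sort of the tSortRow component.

```java
import java.util.DoubleSummaryStatistics;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the Job; not generated by Talend Studio.
public class AggregateAndSortSketch {

    // Same inline content as entered in the tFixedFlowInput component.
    static final String INPUT = String.join("\n",
            "Peter;92", "James;93", "Thomas;91",
            "Peter;94", "James;96", "Thomas;95",
            "Peter;96", "James;92", "Thomas;98",
            "Peter;95", "James;96", "Thomas;93",
            "Peter;98", "James;97", "Thomas;95");

    // Groups the rows by name and collects sum, average, max, and min of score.
    // The TreeMap keeps its keys in ascending String order, which plays the
    // role of the alpha/asc criterion on the name column.
    static TreeMap<String, DoubleSummaryStatistics> aggregate(String content) {
        TreeMap<String, DoubleSummaryStatistics> byName = new TreeMap<>();
        for (String line : content.split("\n")) {
            String[] fields = line.trim().split(";");
            String name = fields[0];
            double score = Double.parseDouble(fields[1]);
            byName.computeIfAbsent(name, k -> new DoubleSummaryStatistics())
                  .accept(score);
        }
        return byName;
    }

    public static void main(String[] args) {
        System.out.println("name|sum|average|max|min");
        for (Map.Entry<String, DoubleSummaryStatistics> e : aggregate(INPUT).entrySet()) {
            DoubleSummaryStatistics s = e.getValue();
            System.out.println(e.getKey() + "|" + s.getSum() + "|" + s.getAverage()
                    + "|" + s.getMax() + "|" + s.getMin());
        }
    }
}
```

Run on the sample input, the sketch prints one line per student in the order James, Peter, Thomas, with sums of 474.0, 475.0, and 472.0 respectively. The layout of these lines is an assumption of the sketch and does not reproduce the table that the tLogRow component prints.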

Executing the Job to aggregate and sort data

After setting up the Job and configuring its components for aggregating and sorting data, execute the Job and verify the execution result.

  1. Press Ctrl + S to save the Job.

  2. Press F6 to execute the Job.

    The console displays the students' comprehensive scores, aggregated by name and sorted in ascending alphabetical order of the student names.