Scenario: Counting the occurrences of different ages - 6.1

Talend Components Reference Guide

This scenario counts how many different ages there are within a group of 12 customers. In this scenario, the customer data is generated at random.

This Job uses five components:

  • tRowGenerator: it generates 12 rows of customer data containing IDs, names and ages of the 12 customers.

  • tSortRow: it sorts the 12 rows according to the age data.

  • tMemorizeRows: it temporarily memorizes a specific number of incoming data rows at any given time and indexes the memorized data rows.

  • tJavaFlex: it compares the age values of the data memorized by the preceding component, counts the occurrences of different ages and displays these ages in the Run view.

  • tJava: it displays the number of occurrences of different ages.

To replicate this scenario, proceed as follows:

Dropping and linking the components

  1. Drop tRowGenerator, tSortRow, tMemorizeRows, tJavaFlex and tJava on the design workspace.

  2. Connect tRowGenerator to tSortRow using the Row > Main link.

  3. Do the same to link together tSortRow, tMemorizeRows and tJavaFlex using the Row > Main link.

  4. Connect tRowGenerator to tJava using the Trigger > OnSubjobOk link.

Configuring the components

Configuring the tRowGenerator component

  1. Double-click the tRowGenerator component to open its editor.

  2. In this editor, click the plus button three times to add three columns and name them as: id, name, age.

  3. In the Type column, select Integer for id and age.

  4. In the Length column, enter 50 for name.

  5. In the Functions column, select random for id and age, then select getFirstName for name.

  6. In the Number of Rows for RowGenerator field, enter 12.

  7. In the Column column, click age to open its corresponding Function parameters view in the lower part of this editor.

    In the Value column of the Function parameters view, type in the minimum age and maximum age that will be generated for the 12 customers. In this example, they are 10 and 25.
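For readers who want to see what this configuration amounts to, the following is a minimal sketch in plain Java of the kind of rows the generator produces. It is not the code Talend generates: the class CustomerRowSketch, the method generate and the sample first names are hypothetical, java.util.Random stands in for the random and getFirstName functions, and both age bounds are assumed to be inclusive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for the rows produced by tRowGenerator; not Talend code.
class CustomerRowSketch {

    // One generated row with the three columns defined in the schema.
    static final class Customer {
        final int id;
        final String name;
        final int age;

        Customer(int id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }
    }

    // Builds the requested number of rows. Assumption: minAge and maxAge are both inclusive.
    static List<Customer> generate(int rows, int minAge, int maxAge, Random random) {
        String[] firstNames = {"Ada", "Ben", "Cleo", "Dan"}; // illustrative values only
        List<Customer> result = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            int id = random.nextInt(1000);
            String name = firstNames[random.nextInt(firstNames.length)];
            int age = minAge + random.nextInt(maxAge - minAge + 1);
            result.add(new Customer(id, name, age));
        }
        return result;
    }

    public static void main(String[] args) {
        // 12 customers aged between 10 and 25, as configured in this scenario.
        for (Customer c : generate(12, 10, 25, new Random())) {
            System.out.println(c.id + "|" + c.name + "|" + c.age);
        }
    }
}
```

Because the ages are drawn at random from a range of 16 values, duplicates among 12 customers are likely, which is what makes the counting step meaningful.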

Configuring the tSortRow component

  1. Double-click tSortRow to open its Component view.

  2. In the Criteria table, click the plus button to add one row.

  3. In the column named Schema column, select the data column to base the sorting operation on. In this example, select age, because the ages are what will be compared and counted.

  4. In the Sort num or alpha column, select the type of the sorting operation. In this example, select num (numerical), because age is an Integer column.

  5. In the Order asc or desc column, select desc as the sorting order for this scenario.
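The sort matters for the rest of the Job: once the rows are ordered by age, equal ages sit next to each other, so comparing each row with the previous one is enough to detect a new age. A minimal sketch of a numerical, descending sort in plain Java follows; the class SortByAgeSketch and the method sortDescending are hypothetical names, and the sketch sorts bare age values rather than full rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of a numerical, descending sort on the age column.
class SortByAgeSketch {

    // Returns a sorted copy and leaves the input list untouched.
    static List<Integer> sortDescending(List<Integer> ages) {
        List<Integer> sorted = new ArrayList<>(ages);
        // Integer's natural order is numerical; reverseOrder() turns it into descending.
        sorted.sort(Collections.reverseOrder());
        return sorted;
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(12, 25, 10, 25, 18);
        System.out.println(sortDescending(ages)); // prints [25, 25, 18, 12, 10]
    }
}
```

An ascending order would group equal ages just as well; the scenario simply uses desc.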

Configuring the tMemorizeRows component

  1. Double-click tMemorizeRows to open its Component view.

  2. In the Row count to memorize field, type in the maximum number of rows to be memorized at any given time. Because each comparison involves the ages of two customers, enter 2. The component then memorizes at most two rows at any given moment, and always indexes the newly incoming row as 0 and the previously incoming row as 1.

  3. In the Memorize column of the Columns to memorize table, select the check box(es) to determine the column(s) to be memorized. In this example, select the check box corresponding to age.
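The indexing behavior described above can be pictured as a small sliding window. The following is a minimal sketch, not the code generated by the component: the class MemorizeRowsSketch and its methods are hypothetical, and it is assumed that a slot holds null until a row has reached it.

```java
// Hypothetical sliding window that mimics the indexing described for tMemorizeRows.
class MemorizeRowsSketch {

    private final Integer[] memory;

    MemorizeRowsSketch(int rowCountToMemorize) {
        // Assumption: every slot starts out empty (null).
        memory = new Integer[rowCountToMemorize];
    }

    // Shifts older values one index up and stores the new value at index 0.
    void memorize(Integer value) {
        for (int i = memory.length - 1; i > 0; i--) {
            memory[i] = memory[i - 1];
        }
        memory[0] = value;
    }

    // Index 0 is the newest row, index 1 the row before it.
    Integer get(int index) {
        return memory[index];
    }

    public static void main(String[] args) {
        MemorizeRowsSketch ages = new MemorizeRowsSketch(2);
        ages.memorize(25);
        System.out.println(ages.get(0) + " / " + ages.get(1)); // prints 25 / null
        ages.memorize(24);
        System.out.println(ages.get(0) + " / " + ages.get(1)); // prints 24 / 25
    }
}
```

With a window of two, the oldest value is dropped as soon as a third row arrives, which is all that a row-to-row comparison needs.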

Configuring the tJavaFlex and tJava components

  1. Double-click tJavaFlex to open its Component view.

  2. In the Start code area, enter the Java code that will be called during the initialization phase. In this example, type in int count=0; in order to declare a variable count and assign the value 0 to it.

  3. In the Main code area, enter the Java code to be applied to each row in the data flow. In this scenario, type in:

    if (!age_tMemorizeRows_1[0].equals(age_tMemorizeRows_1[1])) {
        count++;
    }
    System.out.println(age_tMemorizeRows_1[0]);

    For each row, this code compares the two ages memorized by tMemorizeRows and increments count every time the two ages differ. It then displays the age that tMemorizeRows has indexed as 0.

  4. In the End code area, enter the Java code that will be called during the closing phase. In this example, type in globalMap.put("count", count); to store the count in globalMap, where other components can read it.

  5. Double-click tJava to open its Component view.

  6. In the Code area, type in the code System.out.println("Different ages: "+globalMap.get("count")); to retrieve the count and display it.
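To see how the Start, Main and End code work together, the following sketch replays the same logic outside Talend on a fixed, already sorted list of ages. It is an illustration under stated assumptions, not the generated Job code: the class CountDistinctAgesSketch and the method countDifferentAges are hypothetical, a plain HashMap stands in for globalMap, a local two-slot array stands in for age_tMemorizeRows_1, and index 1 is assumed to be null while the first row is processed. Under that assumption the first row is counted too, so the final count is the number of different ages rather than the number of changes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical replay of the tJavaFlex and tJava code on a fixed list of sorted ages.
class CountDistinctAgesSketch {

    static int countDifferentAges(List<Integer> sortedAges, Map<String, Object> globalMap) {
        Integer[] age = new Integer[2]; // stand-in for age_tMemorizeRows_1
        int count = 0;                  // Start code

        for (Integer current : sortedAges) {
            // What tMemorizeRows does before each row reaches tJavaFlex.
            age[1] = age[0];
            age[0] = current;

            // Main code: equals(null) is false, so the very first row is counted.
            if (!age[0].equals(age[1])) {
                count++;
            }
            System.out.println(age[0]);
        }

        globalMap.put("count", count);  // End code
        return count;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        List<Integer> ages = Arrays.asList(25, 25, 24, 22, 22, 22, 19, 18, 15, 13, 12, 10);
        countDifferentAges(ages, globalMap);
        // Same statement as in tJava; prints "Different ages: 9" for this sample list.
        System.out.println("Different ages: " + globalMap.get("count"));
    }
}
```

The sample list is made up for the sketch; the comparison only yields a correct count of different ages when the input is sorted, which is why tSortRow precedes tMemorizeRows in the Job.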

Saving and executing the Job

  1. Press Ctrl+S to save your Job.

  2. Press F6, or click Run on the Run console to execute the Job.

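In this example run, the console shows that there are 10 different ages within the group of 12 customers. Because the customer data is generated at random, the figure can differ from one execution to the next.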