Aggregating the scores - 6.5

Talend Job Script Reference Guide

Use the procedure below to add and configure a tAggregateRow component to aggregate the scores.

Procedure

  1. Next to the tUnite component, add a new component, tAggregateRow.
    addComponent {
    	setComponentDefinition {
    		TYPE: "tAggregateRow",
    		NAME: "tAggregateRow_1",
    		POSITION: 512, 192
    	}
    }
  2. After the setComponentDefinition {} function of tAggregateRow, define the component properties using the setSettings {} function.

    In this example, the tAggregateRow component, labelled aggregate, groups the rows by subject and applies the sum, avg, max, and min functions to the score column to get the total, average, highest, and lowest score of each subject.

    	setSettings {
    		GROUPBYS {
    			OUTPUT_COLUMN : "subject",
    			INPUT_COLUMN : "subject"
    		},
    		OPERATIONS {
    			OUTPUT_COLUMN : "sum",
    			FUNCTION : "sum",
    			INPUT_COLUMN : "score",
    			OUTPUT_COLUMN : "average",
    			FUNCTION : "avg",
    			INPUT_COLUMN : "score",
    			OUTPUT_COLUMN : "max",
    			FUNCTION : "max",
    			INPUT_COLUMN : "score",
    			OUTPUT_COLUMN : "min",
    			FUNCTION : "min",
    			INPUT_COLUMN : "score"
    		},
    		LABEL : "aggregate"
    	}
  3. After the setSettings {} function, add the addSchema {} function to define the output structure of the component.

    In this example, the tAggregateRow component will output five columns:

    • subject, type String
    • sum, for the total score of each subject, type Double
    • average, for the average score of each subject, type Double
    • max, for the highest score of each subject, type Double
    • min, for the lowest score of each subject, type Double
    	addSchema {
    		NAME: "tAggregateRow_1",
    		CONNECTOR: "FLOW"
    		addColumn {
    			NAME: "subject",
    			TYPE: "id_String"
    		}
    		addColumn {
    			NAME: "sum",
    			TYPE: "id_Double",
    			PRECISION: 2
    		}
    		addColumn {
    			NAME: "average",
    			TYPE: "id_Double",
    			PRECISION: 2
    		}
    		addColumn {
    			NAME: "max",
    			TYPE: "id_Double",
    			PRECISION: 2
    		}
    		addColumn {
    			NAME: "min",
    			TYPE: "id_Double",
    			PRECISION: 2
    		}
    	}
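
Taken together, the three steps above would yield an addComponent {} block along the following lines. This is only a sketch assembled from the snippets in this procedure: it assumes that setSettings {} and addSchema {} sit inside addComponent {} directly after setComponentDefinition {}, and it leaves out the connections between tAggregateRow and the other components of the Job.

    addComponent {
    	setComponentDefinition {
    		TYPE: "tAggregateRow",
    		NAME: "tAggregateRow_1",
    		POSITION: 512, 192
    	}
    	setSettings {
    		GROUPBYS {
    			OUTPUT_COLUMN : "subject",
    			INPUT_COLUMN : "subject"
    		},
    		OPERATIONS {
    			OUTPUT_COLUMN : "sum",
    			FUNCTION : "sum",
    			INPUT_COLUMN : "score",
    			OUTPUT_COLUMN : "average",
    			FUNCTION : "avg",
    			INPUT_COLUMN : "score",
    			OUTPUT_COLUMN : "max",
    			FUNCTION : "max",
    			INPUT_COLUMN : "score",
    			OUTPUT_COLUMN : "min",
    			FUNCTION : "min",
    			INPUT_COLUMN : "score"
    		},
    		LABEL : "aggregate"
    	}
    	addSchema {
    		NAME: "tAggregateRow_1",
    		CONNECTOR: "FLOW"
    		addColumn {
    			NAME: "subject",
    			TYPE: "id_String"
    		}
    		addColumn {
    			NAME: "sum",
    			TYPE: "id_Double",
    			PRECISION: 2
    		}
    		addColumn {
    			NAME: "average",
    			TYPE: "id_Double",
    			PRECISION: 2
    		}
    		addColumn {
    			NAME: "max",
    			TYPE: "id_Double",
    			PRECISION: 2
    		}
    		addColumn {
    			NAME: "min",
    			TYPE: "id_Double",
    			PRECISION: 2
    		}
    	}
    }

Each OUTPUT_COLUMN value in the OPERATIONS block matches the NAME of one addColumn {} entry in the schema, so the four aggregated values and the subject group key each map to one of the five output columns.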