Scenario 2: Clearing the memory before loading data to it in case an iterator exists in the same subjob - 6.1

Talend Components Reference Guide

This scenario shows how the Append option of tHashOutput affects the cached data when an iterator exists in the same subjob as tHashOutput. Deselecting Append makes tHashOutput clear the memory before loading data to it, which helps remove repetitive or unwanted data.

To build the Job, do the following:

Dropping and linking the components

  1. Drag and drop the following components from the Palette to the workspace: tLoop, tFixedFlowInput, tHashOutput, tHashInput and tLogRow.

  2. Connect tLoop to tFixedFlowInput using a Row > Iterate link.

  3. Connect tFixedFlowInput to tHashOutput using a Row > Main link.

  4. Connect tHashInput to tLogRow using a Row > Main link.

  5. Connect tLoop to tHashInput using a Trigger > OnSubjobOk link.

Configuring the components

Configuring data input and hash cache
  1. Double-click the tLoop component to display its Basic settings view.

  2. Select For as the loop type. Type in 1, 2 and 1 in the From, To and Step fields respectively. Keep the Values are increasing check box selected.

  3. Double-click the tFixedFlowInput component to display its Basic settings view.

  4. Select Built-In from the Schema drop-down list.

    Note

    You can select Repository from the Schema drop-down list to fill in the relevant fields automatically if the relevant metadata has been stored in the Repository. For more information about Metadata, see the Talend Studio User Guide.

  5. Click Edit schema to define the data structure of the input flow. In this case, the input has one column: Name.

  6. Click OK to close the dialog box.

  7. In the Number of rows field, enter the number of rows to output, for example 1.

  8. Select the Use Single Table check box. In the Values table, assign a value to the Name field, e.g. Marx.

  9. Double-click tHashOutput to display its Basic settings view.

  10. Select Built-In from the Schema drop-down list and click Sync columns to retrieve the schema from the previous component. Select Keep all from the Keys management drop-down list and deselect the Append check box.
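
The input settings above can be pictured as a plain loop. The following is a minimal Java sketch, not the code that Talend Studio generates: the class name LoopSketch and the helper generateRows are hypothetical, and the sketch assumes that each iteration of tLoop triggers tFixedFlowInput once, emitting the single row defined in the Values table.

```java
import java.util.ArrayList;
import java.util.List;

public class LoopSketch {

    // Hypothetical helper: collects the rows emitted by a For loop with the
    // given From, To and Step values, assuming increasing values and one
    // fixed row (the Name value) per iteration.
    static List<String> generateRows(int from, int to, int step, String name) {
        List<String> rows = new ArrayList<>();
        for (int i = from; i <= to; i += step) {
            rows.add(name); // one row per iteration, as with Number of rows = 1
        }
        return rows;
    }

    public static void main(String[] args) {
        // From = 1, To = 2, Step = 1 and Name = "Marx", as in the scenario
        List<String> rows = generateRows(1, 2, 1, "Marx");
        System.out.println(rows.size() + " rows generated: " + rows);
    }
}
```

With the values used in this scenario, the loop runs twice, so two identical rows reach tHashOutput over the course of the Job.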

Configuring data retrieval from hash cache and data output
  1. Double-click tHashInput to display its Basic settings view.

  2. Select Built-In from the Schema drop-down list. Click Edit schema to define the data structure, which is the same as that of tHashOutput.

  3. Select tHashOutput_2 from the Component list drop-down list.

  4. Double-click tLogRow to display its Basic settings view.

  5. Select Built-In from the Schema drop-down list and click Sync columns to retrieve the schema from the previous component. In the Mode area, select Table (print values in cells of a table).

Saving and executing the Job

  1. Press Ctrl+S to save the Job.

  2. Press F6, or click Run on the Run tab to execute the Job.

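    The console shows that only one row was output, although tFixedFlowInput generated two rows (one per loop iteration). Because the Append check box is deselected, tHashOutput clears the memory before loading data to it, so only the row from the last iteration remains in the cache.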
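
To make the effect of the Append check box concrete, here is a minimal Java simulation of the Job. It is a sketch under stated assumptions rather than Talend's implementation: the class HashCacheSketch and its methods write, read and runJob are hypothetical names, and the in-memory cache is modeled as a simple list that is either cleared or kept before each load.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class HashCacheSketch {

    // Simplified stand-in for the memory shared by tHashOutput and tHashInput
    private final List<String> cache = new ArrayList<>();

    // Models one load by tHashOutput: when append is false, the cache is
    // cleared before the new rows are stored.
    void write(List<String> rows, boolean append) {
        if (!append) {
            cache.clear();
        }
        cache.addAll(rows);
    }

    // Models tHashInput reading the cache once the first subjob has finished
    List<String> read() {
        return new ArrayList<>(cache);
    }

    // Hypothetical driver: two iterations, each loading the single row "Marx"
    static List<String> runJob(boolean append) {
        HashCacheSketch hash = new HashCacheSketch();
        for (int i = 1; i <= 2; i++) {
            hash.write(Collections.singletonList("Marx"), append);
        }
        return hash.read();
    }

    public static void main(String[] args) {
        System.out.println("Append deselected: " + runJob(false)); // [Marx]
        System.out.println("Append selected:   " + runJob(true));  // [Marx, Marx]
    }
}
```

In this model, deselecting Append leaves a single row in the cache, which matches the result of this scenario, while selecting it would keep the rows loaded by both iterations.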