Scenario 1: Splitting one row into two rows - 6.1

Talend Components Reference Guide

This scenario describes a three-component Job that splits one row of data, containing information about two companies, into two rows.

  1. Drop the components required for this use case, tFixedFlowInput, tSplitRow and tLogRow, from the Palette onto the design workspace.

  2. Connect them together using Row > Main connections.

  3. Double-click tFixedFlowInput to open its Basic settings view.

  4. Select Use Inline Content(delimited file) in the Mode area.

  5. Fill the Content area with the following data:

    Talend;LA;California;537;5thAvenue;IT;Lionbridge;Memphis;Tennessee;537;Lincoln Road;IT Service;

  6. Click Edit schema to open a dialog box to edit the schema for the input data.

  7. Click the plus button to add twelve lines for the input columns: Company, City, State, CountryCode, Street, Industry, Company2, City2, State2, CountryCode2, Street2 and Industry2.

  8. Click OK to close the dialog box.

  9. Double-click tSplitRow to open its Basic settings view.

  10. Click Edit schema to set the schema for the output data.

  11. Click the plus button beneath the tSplitRow_1(Output) table to add four lines for the output columns: Company, CountryCode, Address and Industry.

  12. Click OK to close the dialog box. An empty table with the column names defined in the preceding step appears in the Columns mapping area.

  13. Click the plus button beneath the empty table in the Columns mapping area to add two lines for the output rows.

  14. Fill the table in the Columns mapping area, column by column, with the following values:

    Company: row1.Company, row1.Company2;

    CountryCode: row1.CountryCode, row1.CountryCode2;

    Address: row1.Street+","+row1.City+","+row1.State, row1.Street2+","+row1.City2+","+row1.State2;

    Industry: row1.Industry, row1.Industry2;

    Note

    The value in the Address column, for example row1.Street+","+row1.City+","+row1.State, builds a full address by concatenating the values of the Street, City and State columns, separated by commas. The "row1" used in each value refers to the input row from tFixedFlowInput.

  15. Double-click tLogRow to open its Basic settings view.

  16. Click Sync columns to retrieve the schema defined in the preceding component.

  17. Select Table in the Mode area.

  18. Save the Job and press F6 to run it.

The single input row is split into two rows, one per company, each with the same four columns: Company, CountryCode, Address and Industry.
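
For readers who want to see the mapping logic outside the Studio, the following is a minimal plain-Java sketch of what the Job does with the sample row. It is not the code Talend generates. The class name SplitRowSketch and the method splitRow are hypothetical, and the field order is assumed to match the twelve-column input schema defined in step 7.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the tSplitRow mapping; not Talend-generated code.
final class SplitRowSketch {

    // Splits one 12-field, semicolon-delimited row into two 4-column rows:
    // Company, CountryCode, Address, Industry.
    static List<String[]> splitRow(String line) {
        // String.split drops trailing empty strings, so the final ';' is harmless.
        String[] f = line.split(";");
        if (f.length != 12) {
            throw new IllegalArgumentException("Expected 12 fields, got " + f.length);
        }
        List<String[]> rows = new ArrayList<>();
        // Input order per company: Company, City, State, CountryCode, Street, Industry.
        for (int base = 0; base < 12; base += 6) {
            String company = f[base];
            String city = f[base + 1];
            String state = f[base + 2];
            String countryCode = f[base + 3];
            String street = f[base + 4];
            String industry = f[base + 5];
            // Same expression as the Address mapping: Street+","+City+","+State.
            String address = street + "," + city + "," + state;
            rows.add(new String[] {company, countryCode, address, industry});
        }
        return rows;
    }

    public static void main(String[] args) {
        String input = "Talend;LA;California;537;5thAvenue;IT;"
                + "Lionbridge;Memphis;Tennessee;537;Lincoln Road;IT Service;";
        for (String[] row : splitRow(input)) {
            System.out.println(String.join("|", row));
        }
        // Prints:
        // Talend|537|5thAvenue,LA,California|IT
        // Lionbridge|537|Lincoln Road,Memphis,Tennessee|IT Service
    }
}
```

The loop over the two six-field halves plays the role of the two lines added in the Columns mapping area: each line reads from the same input row but picks a different set of source columns.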