Scenario 2: Using Talend Data Integration metadata - 6.1

Talend Components Reference Guide


This scenario creates a three-component Job that reads data from an input file, transforms it using a map that you create in the Mapping perspective, and writes the transformed data to a new file. It works with Talend Data Integration metadata.

Copying an editable version of the example files

  1. In the Mapping perspective, in the Data Mapper view, expand the Hierarchical Mapper node and the Other Projects folder, right-click Examples and then select Copy in the contextual menu.

  2. In the Data Mapper view, right-click at the root of the Hierarchical Mapper node, and then select Paste in the contextual menu.

    This copies an editable version of all the read-only example files to your local workspace.

Adding and linking the components

  1. In the Integration perspective, create a new Standard Job and call it di_to_di.

  2. Click the point in the design workspace where you want to add the first component, start typing tFileInputDelimited, and then click the component name when it appears in the suggestion list to select it.

  3. Do the same to add a tHMap component and a tFileOutputXML component.

  4. Connect the tFileInputDelimited component to the tHMap component using a Row > Main link, then connect the tHMap component to the tFileOutputXML component using a Row > Main link.

Defining the properties of tFileInputDelimited

  1. Select the tFileInputDelimited component to define its properties.

  2. In the Basic settings tab, click the [...] button next to the File name/Stream field, and then browse to the location on your file system where the input CSV file is stored, or enter the path manually between double quotes. For this example, use <PATH_TO_WORKSPACE>/<PROJECT_NAME>/Sample Data/CSV/PurchaseOrderPayPal/PayPalPO.csv.

  3. Select the CSV options check box.

  4. Change the Field Separator to a comma, between double quotes (",").

  5. Change the value of Header to 1.

  6. Click the [...] button next to Edit schema to define the schema.

  7. Add three columns and rename them txn_id, payment_date and first_name, and then click OK. These names correspond to the first three columns in the input file, which is sufficient for the purposes of this example.

  8. Leave all the other parameters unchanged.
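
As a rough illustration of what these settings amount to, the Python sketch below reads comma-separated text, skips one header line, and keeps the first three fields of each row under the schema's column names. It is not the code that Talend Studio generates. The sample rows are invented for illustration, and matching fields to schema columns by position is an assumption.

```python
import csv
import io

# Hypothetical sample standing in for PayPalPO.csv; the real file has more columns.
SAMPLE = (
    "txn_id,payment_date,first_name,last_name\n"
    "TX001,2015-01-02,Alice,Smith\n"
    "TX002,2015-01-03,Bob,Jones\n"
)

SCHEMA = ["txn_id", "payment_date", "first_name"]  # columns defined in Edit schema
HEADER = 1  # number of leading lines to skip


def read_delimited(text, schema=SCHEMA, header=HEADER, separator=","):
    """Return one dict per data row, keeping only the fields covered by the schema."""
    reader = csv.reader(io.StringIO(text), delimiter=separator)
    rows = list(reader)[header:]
    # Assumption: fields are matched to schema columns by position, not by name.
    return [dict(zip(schema, row)) for row in rows]


rows = read_delimited(SAMPLE)
```

Because the schema lists only three columns, the fourth field of each sample row is dropped in this sketch.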

Defining the properties of tFileOutputXML

  1. Select the tFileOutputXML component to define its properties.

  2. In the Basic settings tab, click the [...] button next to the File Name field, and then browse to the location on your file system where the output file will be stored, or enter the path manually between double quotes.

  3. Click the [...] button next to Edit schema to define the schema.

  4. Add three columns to the input schema on the left, rename them id, date and name, copy them to the output schema on the right, and then click OK.

  5. Leave the other elements unchanged.
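
To give an idea of the kind of file this schema leads to, the Python sketch below serializes rows with the id, date and name columns as XML, one element per row and one child element per column. The root and row element names (root and row) and the sample values are assumptions made for illustration; check the component's own settings for the names it actually uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows matching the id, date and name output schema.
ROWS = [
    {"id": "TX001", "date": "2015-01-02", "name": "Alice"},
    {"id": "TX002", "date": "2015-01-03", "name": "Bob"},
]


def rows_to_xml(rows, root_tag="root", row_tag="row"):
    """Serialize rows as one element per row, with one child element per column."""
    root = ET.Element(root_tag)
    for row in rows:
        row_el = ET.SubElement(root, row_tag)
        for column, value in row.items():
            ET.SubElement(row_el, column).text = value
    return ET.tostring(root, encoding="unicode")


xml_text = rows_to_xml(ROWS)
```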

Defining the properties of tHMap

  1. Select the tHMap component to define its properties.

  2. Click the [...] button next to the Open Map Editor field to create a new map based on the input and output of tHMap.

  3. In the tHMap Structure Generate/Select dialog box that opens, select Generate hierarchical mapper structure based on the schema and then click Next to generate the input structure.

  4. Do the same for the output structure.

  5. In the Map editor that opens, drag the txn_id element of Input (map) to the id element of Output (map). Do the same to map payment_date to date and first_name to name, and then save your changes.
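
The map drawn in step 5 is a plain one-to-one rename of three elements. A minimal Python sketch of the same mapping, using an invented input record, could look like this; it illustrates the logic only and is not how tHMap executes a map.

```python
# Source element -> target element, as drawn in the map editor.
MAPPING = {"txn_id": "id", "payment_date": "date", "first_name": "name"}


def apply_map(record, mapping=MAPPING):
    """Copy each mapped source value to its target name, unchanged."""
    return {target: record[source] for source, target in mapping.items()}


# Hypothetical input record for illustration.
mapped = apply_map(
    {"txn_id": "TX001", "payment_date": "2015-01-02", "first_name": "Alice"}
)
```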

Saving and executing the Job

  1. Press Ctrl+S to save your Job.

  2. In the Run tab, click Run to execute the Job.

  3. Browse to the location on your file system where the output file is stored, and check that an XML file has been created containing the data from the three mapped columns of the input CSV file.
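
If you prefer to check the result with a script rather than by eye, a sketch along the following lines compares the first three fields of each CSV data row with the id, date and name elements in the XML output. The file names, the sample content and the row element name are hypothetical; adapt them to your own input and output files.

```python
import csv
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def output_matches_input(csv_path, xml_path, row_tag="row"):
    """Compare the first three CSV fields of each data row with id, date and name in the XML."""
    with open(csv_path, newline="") as handle:
        csv_rows = [row[:3] for row in list(csv.reader(handle))[1:]]  # skip the header line
    xml_rows = [
        [el.findtext("id"), el.findtext("date"), el.findtext("name")]
        for el in ET.parse(xml_path).getroot().iter(row_tag)
    ]
    return csv_rows == xml_rows


# Hypothetical files, created here only so the sketch runs on its own.
workdir = Path(tempfile.mkdtemp())
csv_file = workdir / "in.csv"
xml_file = workdir / "out.xml"
csv_file.write_text("txn_id,payment_date,first_name\nTX001,2015-01-02,Alice\n")
xml_file.write_text(
    "<root><row><id>TX001</id><date>2015-01-02</date><name>Alice</name></row></root>"
)

ok = output_matches_input(csv_file, xml_file)
```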