Scenario 3: Transforming from a Data Integration schema to a complex content schema - 6.3

Talend Components Reference Guide

The following scenario creates a three-component Job that generates random data with an input component, transforms this data using a map whose output structure was created beforehand in the Mapping perspective, and writes the transformed data to a JSON file. It works with Talend Data Integration metadata for the input and Talend Data Mapper metadata for the output.

Creating a new structure in the Mapping perspective

  1. In the Mapping perspective, in the Data Mapper view, expand the Hierarchical Mapper node, right-click Structures and then select New > Structure.

  2. In the [New Structure] dialog box that opens, select Create a new structure where you manually enter elements, and then click Next.

  3. Name your structure JSON_structure, and then click Next.

  4. In the [Select Representation] dialog box, select JSON from the list of available representations, and then click Next.

  5. Select Don't select a sample document for now, and then click Finish.

Entering the elements for your new structure

  1. In the Mapping perspective, in the Data Mapper view, expand the Hierarchical Mapper node and the Structures node, and then open the JSON_structure structure you created earlier.

  2. In JSON_structure, right-click, select New element, and name the new element Root.

  3. Follow the same steps to create a new element called people under the Root element, a person element under people, and four new elements under the person element: firstname, lastname, address and city.

  4. For the person element, change the Occurs Max value to -1 (unlimited).

  5. Press Ctrl + S to save your changes.
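The hierarchy entered above can be pictured as nested types. The following is only a sketch: the Java classes and the sample values are made up for illustration and are not generated or used by Talend Data Mapper. The list on people stands for the person element with Occurs Max set to -1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes mirroring the elements of JSON_structure.
final class Person {
    final String firstname;
    final String lastname;
    final String address;
    final String city;

    Person(String firstname, String lastname, String address, String city) {
        this.firstname = firstname;
        this.lastname = lastname;
        this.address = address;
        this.city = city;
    }
}

final class People {
    // Occurs Max = -1 (unlimited): any number of person entries is allowed.
    final List<Person> person = new ArrayList<>();
}

final class Root {
    final People people = new People();
}

class StructureSketch {
    // Builds a Root holding two made-up person entries.
    static Root sample() {
        Root root = new Root();
        root.people.person.add(new Person("Ada", "Lovelace", "12 Example St", "Springfield"));
        root.people.person.add(new Person("Alan", "Turing", "34 Sample Ave", "Riverside"));
        return root;
    }

    public static void main(String[] args) {
        Root root = sample();
        System.out.println(root.people.person.size() + " person elements under Root/people");
    }
}
```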

Adding and linking the components

  1. In the Integration perspective, in the Repository, right-click Job Designs, and then click Create Standard Job to create a Job named di_to_json. Add a Purpose and Description if you wish, and then click Finish.

  2. Click the point in the design workspace where you want to add the first component, start typing tRowGenerator, and then click the component name when it appears in the suggestion list.

  3. Do the same to add a tHMap component, and a tFileOutputRaw component as well.

  4. Connect the tRowGenerator component to the tHMap component using a Row > Main link, then connect the tHMap component to the tFileOutputRaw component using a Row > Main link. When you are asked if you want to get the schema of the target component, click Yes.

Defining the properties of tRowGenerator

  1. Select the tRowGenerator component to define its properties.

  2. In the Basic settings tab, click the [...] button next to RowGenerator Editor to define the rows to be generated.

  3. In the dialog box that opens, click the [+] button four times to add four new columns to the schema, and name them firstname, lastname, address and city.

  4. For each of the columns you just added, click in the Functions column and select the function shown in the table below from the list of available functions. Click OK when you are done.

    Column       Function
    firstname    TalendDataGenerator.getFirstName
    lastname     TalendDataGenerator.getLastName
    address      TalendDataGenerator.getUsStreet
    city         TalendDataGenerator.getUsCity
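To give a feel for what the generator produces, here is a minimal stand-alone sketch of random four-column rows. It does not call the TalendDataGenerator routines, which are only available inside a Talend Job. The class name, the value lists and the seed parameter are all assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class RowGeneratorSketch {
    // Hypothetical sample values; the real routines draw from their own lists.
    static final String[] FIRST_NAMES = {"Ada", "Alan", "Grace", "Edsger"};
    static final String[] LAST_NAMES = {"Lovelace", "Turing", "Hopper", "Dijkstra"};
    static final String[] STREETS = {"12 Example St", "34 Sample Ave", "56 Test Rd"};
    static final String[] CITIES = {"Springfield", "Riverside", "Fairview"};

    // Picks one value at random from the given list.
    static String pick(String[] values, Random random) {
        return values[random.nextInt(values.length)];
    }

    // Returns count rows in the column order firstname, lastname, address, city.
    static List<String[]> generate(int count, long seed) {
        Random random = new Random(seed);
        List<String[]> rows = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            rows.add(new String[] {
                pick(FIRST_NAMES, random),
                pick(LAST_NAMES, random),
                pick(STREETS, random),
                pick(CITIES, random)
            });
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : generate(3, 42L)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

A fixed seed is used here only so that repeated runs print the same rows.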

Defining the properties of tFileOutputRaw

  1. Select the tFileOutputRaw component to define its properties.

  2. In the Basic settings tab, click the [...] button and browse to the location on your file system where the output file is to be stored, or enter the path manually between double quotes. Name the output file output.json.

    Leave the other parameters unchanged.

Defining the properties of tHMap

  1. Select the tHMap component to define its properties.

  2. Click the [...] button next to the Open Map Editor field to create a new map.

  3. In the [tHMap Structure Generate/Select] dialog box that opens, select Generate hierarchical mapper structure based on the schema for the input structure, and then click Next and then Finish. This means that Talend Data Mapper will automatically generate a structure for you, based on the schema of the input component (tRowGenerator in this case).

  4. For the output structure, select Select an existing hierarchical mapper structure, and then click Next.

  5. Select the JSON_structure structure that you created earlier, and then click Next and then Finish.

  6. In the Map editor that opens, drag row from Input (Map) to person in Output (JSON) to map each of the input elements to its corresponding output element.

  7. Double-click SimpleLoop in the Loop tab and, in the properties box that opens, select the Stream Input check box and then click OK.

  8. Press Ctrl+S to save your changes to the map.
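Conceptually, the map turns each flat input row into one person element nested under people. The sketch below imitates that outside Talend, writing one row at a time in the spirit of a streamed loop. It is not how tHMap is implemented. The class name is hypothetical, and the exact JSON layout is an assumption: here Root is treated as the unnamed top-level object, which this scenario does not confirm.

```java
import java.io.PrintWriter;
import java.util.Arrays;
import java.util.List;

class MapSketch {
    // Output element names, in the same order as the input columns.
    static final String[] FIELDS = {"firstname", "lastname", "address", "city"};

    // Wraps a value in double quotes and escapes what JSON strings require.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // Writes each row as one person object as soon as it is read.
    // Assumed layout: {"people":{"person":[ ... ]}}
    static void write(Iterable<String[]> rows, PrintWriter out) {
        out.print("{\"people\":{\"person\":[");
        boolean first = true;
        for (String[] row : rows) {
            if (!first) {
                out.print(",");
            }
            first = false;
            out.print("{");
            for (int i = 0; i < FIELDS.length; i++) {
                if (i > 0) {
                    out.print(",");
                }
                out.print(quote(FIELDS[i]) + ":" + quote(row[i]));
            }
            out.print("}");
        }
        out.print("]}}");
    }

    public static void main(String[] args) {
        // Made-up rows standing in for the tRowGenerator output.
        List<String[]> rows = Arrays.asList(
            new String[] {"Ada", "Lovelace", "12 Example St", "Springfield"},
            new String[] {"Alan", "Turing", "34 Sample Ave", "Riverside"});
        PrintWriter out = new PrintWriter(System.out);
        write(rows, out);
        out.println();
        out.flush();
    }
}
```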

Saving and executing the Job

  1. Press Ctrl+S to save your Job.

  2. In the Run tab, click Run to execute the Job.

  3. Browse to the location on your file system where the output file is stored to check that a JSON file showing the expected data has been successfully created.
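If you prefer to check the result from the command line rather than by opening the file, a small stand-alone program along these lines could be used. It is a shallow check only (the file exists, is not blank, and starts like a JSON object or array), not a JSON validation. The class name is hypothetical, and the default file name assumes the output.json name entered in tFileOutputRaw with the program run from the same directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class OutputCheckSketch {
    // Returns true when the file exists, is not blank, and its first
    // non-whitespace character opens a JSON object or array.
    static boolean looksLikeJson(Path file) {
        if (!Files.isRegularFile(file)) {
            return false;
        }
        try {
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
            return !text.isEmpty() && (text.charAt(0) == '{' || text.charAt(0) == '[');
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Pass the path configured in tFileOutputRaw as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "output.json");
        System.out.println(file + (looksLikeJson(file) ? " looks like JSON" : " is missing or does not look like JSON"));
    }
}
```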