Creating the Big Data Batch Job - 7.3

Talend Data Mapper User Guide

Version: 7.3
Language: English
Product: Talend Big Data Platform, Talend Data Fabric, Talend Data Management Platform, Talend Data Services Platform, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development > Designing Jobs
Last publication date: 2023-01-05

Create a Job with a tHMapInput and two output components to convert a JSON file to two CSV files.

About this task

This example uses a local file as input, but you can also create an HDFS connection. For more information, see HDFS components.

Procedure

  1. In the Integration perspective, right-click the Job Designs node and click Create Big Data Batch Job.
  2. Enter a name, purpose and description for your Job, then click Finish.
  3. Add the following components to your design workspace:
    • A tHMapInput
    • A tFileOutputDelimited
  4. Click the tHMapInput and go to the Components tab to configure the component:
    1. If you are working with local files, clear the Define a storage configuration component check box.
    2. In the Input field, enter the path to your input file.

      Example

      "c:/users/jsmith/documents/courses.json"
  5. Double-click the tFileOutputDelimited and configure it:
    1. Clear the Define a storage configuration component check box.
    2. Click the ... button next to Edit schema and create two columns named id and title.
    3. In the Folder field, enter the path to the folder where the output files should be created.

      Example

      "c:/users/jsmith/documents/modules"
  6. Right-click the tFileOutputDelimited component and click Copy, then paste it on your design workspace to create a second one with the same configuration.
  7. Double-click tFileOutputDelimited_2 and change the value of the Folder field.

    Example

    "c:/users/jsmith/documents/sections"
  8. Link the tHMapInput to the two tFileOutputDelimited components with Row > Main connections named modules and sections, and click Yes when asked if you want to get the schema from the target component.
    Your Job now contains the tHMapInput linked to the two tFileOutputDelimited components.
  9. Double-click the tHMapInput and follow the wizard to generate the map.
    1. Select the structure created in Creating the input structure for your Big Data Batch Job and click Next.
    2. Select Start/End with.
      In this example, the following regular expression is automatically added to the Start with field: \{\s*(\'course\'|\"course\"). The sketch after this procedure shows how this pattern detects records.
    3. Optional: Click the ... button to add your sample input file and click Run to check the records found.
      In this case, you should have three records.

    4. Click Finish.
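
If you want to see outside of Talend Studio how this Start with pattern detects records, the following plain Java sketch applies the same regular expression. The three-record sample string is a hypothetical stand-in for the content of the input file, used for illustration only.

  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  public class RecordStartCheck {
      public static void main(String[] args) {
          // Same pattern that the wizard adds to the Start with field,
          // written as a Java string literal.
          Pattern recordStart = Pattern.compile("\\{\\s*('course'|\"course\")");

          // Hypothetical sample mimicking a file with three course records.
          String sample = "{ \"course\": { \"title\": \"...\" } }"
                  + "{ \"course\": { \"title\": \"...\" } }"
                  + "{ \"course\": { \"title\": \"...\" } }";

          // Count how many record starts the pattern finds.
          Matcher matcher = recordStart.matcher(sample);
          int records = 0;
          while (matcher.find()) {
              records++;
          }
          System.out.println("Records found: " + records); // prints 3 for this sample
      }
  }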

Results

The map is generated. It uses the input structure created previously, and its output structure is generated from the schema defined in the tFileOutputDelimited components. You can now map the elements.
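
The Spark code that Talend Studio generates for this Job is not reproduced here. Purely as a Talend-independent sketch of what the finished Job does once the elements are mapped, the plain Java program below reads the input file, cuts it into records with the same start-of-record pattern, and writes one id and title value per record to delimited files in the modules and sections folders. The output file names, the semicolon separator, and the extractId and extractTitle placeholders are assumptions for illustration only; in the real Job, the map you build in Talend Data Mapper decides which elements feed each output.

  import java.io.IOException;
  import java.nio.charset.StandardCharsets;
  import java.nio.file.Files;
  import java.nio.file.Path;
  import java.util.ArrayList;
  import java.util.List;
  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  public class CoursesJobSketch {

      // Same start-of-record pattern as in the Start with field of the wizard.
      private static final Pattern RECORD_START =
              Pattern.compile("\\{\\s*('course'|\"course\")");

      public static void main(String[] args) throws IOException {
          String input = Files.readString(
                  Path.of("c:/users/jsmith/documents/courses.json"), StandardCharsets.UTF_8);

          // Cut the payload into records at each match of the start-of-record pattern.
          List<String> records = new ArrayList<>();
          Matcher m = RECORD_START.matcher(input);
          int start = -1;
          while (m.find()) {
              if (start >= 0) {
                  records.add(input.substring(start, m.start()));
              }
              start = m.start();
          }
          if (start >= 0) {
              records.add(input.substring(start));
          }

          // One id;title row per record for each output, mirroring the two
          // tFileOutputDelimited schemas. The real values come from the map
          // built in Talend Data Mapper; these helpers are placeholders.
          List<String> moduleRows = new ArrayList<>();
          List<String> sectionRows = new ArrayList<>();
          for (String record : records) {
              moduleRows.add(extractId(record) + ";" + extractTitle(record));
              sectionRows.add(extractId(record) + ";" + extractTitle(record));
          }

          Files.createDirectories(Path.of("c:/users/jsmith/documents/modules"));
          Files.createDirectories(Path.of("c:/users/jsmith/documents/sections"));
          Files.write(Path.of("c:/users/jsmith/documents/modules/part-00000.csv"), moduleRows);
          Files.write(Path.of("c:/users/jsmith/documents/sections/part-00000.csv"), sectionRows);
      }

      private static String extractId(String record) { return "<id>"; }       // placeholder
      private static String extractTitle(String record) { return "<title>"; } // placeholder
  }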