How to create a MapReduce Job - 6.2

Talend Real-time Big Data Platform Studio User Guide


To create your first MapReduce Job, start from the Job Designs node of the Repository tree view in the Integration perspective.

  1. Right-click the Job Designs node and select Create Big Data Batch Job from the contextual menu.

    The [New Big Data Batch Job] wizard appears.

  2. From the Framework drop-down list, select MapReduce.

  3. In the Name, Purpose and Description fields, enter the descriptive information for the Job. Only the Job name is mandatory.

    Once the mandatory information is entered, the Finish button is activated.

  4. If you need to change the Job version, click the M button (major version) or the m button (minor version) next to the Version field.

    If you need to change the Job status, select it from the drop-down list of the Status field.

    If you need to change the information in the read-only fields, select File > Edit Project properties from the menu bar and make the desired changes in the [Project Settings] dialog box.

  5. Click Finish to validate the changes and close the wizard.

    An empty Job then opens in the workspace of the Studio, and the components available for MapReduce appear in the Palette.

    Depending on the license you are using, your Studio may look different from the image shown for this step.
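
The effect of the M and m buttons in step 4 can be pictured with a small Java sketch. It is only an illustration: the JobVersion class and its methods are hypothetical names, not part of any Talend API, and the sketch assumes that a major bump resets the minor part to 0.

```java
// Hypothetical model of a two-part Job version; not Talend's own API.
final class JobVersion {
    final int major;
    final int minor;

    JobVersion(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    // Parses a "major.minor" string such as "0.1".
    static JobVersion parse(String text) {
        String[] parts = text.split("\\.");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected major.minor, got: " + text);
        }
        return new JobVersion(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
    }

    // Models the M button. Assumption: the minor part resets to 0.
    JobVersion bumpMajor() {
        return new JobVersion(major + 1, 0);
    }

    // Models the m button: only the minor part increases.
    JobVersion bumpMinor() {
        return new JobVersion(major, minor + 1);
    }

    @Override
    public String toString() {
        return major + "." + minor;
    }

    public static void main(String[] args) {
        JobVersion start = JobVersion.parse("0.1");
        System.out.println(start.bumpMinor()); // prints 0.2
        System.out.println(start.bumpMajor()); // prints 1.0
    }
}
```

In the Studio itself, the Version field next to the buttons shows the resulting value, so check it there before clicking Finish.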

In the Repository tree view, the new MapReduce Job appears automatically under the MapReduce Jobs node under Job Designs. Jobs that are not of the MapReduce type are grouped under the Standard Jobs node, which appears alongside the MapReduce Jobs node.

Next, drop the components you need from the Palette onto the workspace, then link and configure them to design the MapReduce Job, the same way you do for a standard Job.

Note that when you design a MapReduce Job, only the MapReduce version of a component is available, and at least one transformation component such as tMap is required.
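
The two design constraints in the note above can be read as a simple check, sketched below in Java. This is a mental model only, not how the Studio enforces the rule: the MapReduceJobCheck and JobComponent classes are hypothetical, and the component names other than tMap are illustrative.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical check of the two constraints; not part of any Talend API.
final class MapReduceJobCheck {

    // Hypothetical description of a component dropped onto the workspace.
    static final class JobComponent {
        final String name;
        final boolean mapReduceVersion; // true for the MapReduce version of the component
        final boolean transformation;   // true for transformation components such as tMap

        JobComponent(String name, boolean mapReduceVersion, boolean transformation) {
            this.name = name;
            this.mapReduceVersion = mapReduceVersion;
            this.transformation = transformation;
        }
    }

    // True when every component is a MapReduce version and
    // at least one of them is a transformation component.
    static boolean isValidDesign(List<JobComponent> components) {
        boolean hasTransformation = false;
        for (JobComponent component : components) {
            if (!component.mapReduceVersion) {
                return false;
            }
            if (component.transformation) {
                hasTransformation = true;
            }
        }
        return hasTransformation;
    }

    public static void main(String[] args) {
        List<JobComponent> design = Arrays.asList(
                new JobComponent("tHDFSInput", true, false),  // illustrative input component
                new JobComponent("tMap", true, true),
                new JobComponent("tHDFSOutput", true, false)); // illustrative output component
        System.out.println(isValidDesign(design)); // prints true
    }
}
```

Removing the tMap entry from the list makes the check return false, which mirrors the requirement for at least one transformation component.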

For further information about how to design a standard Job, see Designing a Job.

For further information about the properties to set for each component available in Talend MapReduce Jobs, see Talend Components Reference Guide.

To create more MapReduce Jobs, repeat the operations explained in this section, but start from the MapReduce Jobs node.

You can also create these types of Jobs by writing their Job scripts in the Jobscript view and then generating the Jobs accordingly.