Setting checkpoints in the MapReduce Jobs - 7.1

Talend Real-time Big Data Platform Studio User Guide

You can set checkpoints in a MapReduce Job so that, if execution fails, the Job can be restarted from the last checkpoint reached before the error instead of from the beginning. This feature is most useful for large Jobs with multiple execution steps.

The following image presents an example of a checkpoint set on a MapReduce Job in the Studio.

In this example, a checkpoint (shown as an icon on the link) is set between the two subJobs. If execution fails, you can use Talend Administration Center to restart the Job from this checkpoint.

Note that a checkpoint can be placed only on a Trigger link between the subJobs of your Job, and that the Job must be hosted in a remote project from Talend Administration Center.
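
As a rough conceptual illustration of the restart behaviour described above, the following Python sketch models a Job as an ordered list of subJobs with a labelled checkpoint on the link leaving one of them. It is not Talend's implementation or API: the names run_job, checkpoints and resume_from, and the sample subJobs, are hypothetical and exist only for this example.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical model: a subJob is a (name, action) pair.
SubJob = Tuple[str, Callable[[], None]]


def run_job(
    subjobs: List[SubJob],
    checkpoints: Dict[str, str],
    resume_from: Optional[str] = None,
) -> Tuple[List[str], Optional[str], Optional[Exception]]:
    """Run subJobs in order, optionally resuming after a checkpoint.

    checkpoints maps a subJob name to the label of the checkpoint set on
    the link leaving that subJob. Returns a tuple of (subJobs executed in
    this run, last checkpoint passed, error or None).
    """
    names = [name for name, _ in subjobs]
    start = 0
    last_checkpoint = resume_from
    if resume_from is not None:
        # Find the subJob whose outgoing link carries this checkpoint label.
        owners = [n for n, label in checkpoints.items() if label == resume_from]
        if not owners:
            raise ValueError(f"unknown checkpoint: {resume_from}")
        # Restart with the subJob that follows the checkpointed link.
        start = names.index(owners[0]) + 1

    executed: List[str] = []
    for name, action in subjobs[start:]:
        try:
            action()
        except Exception as exc:
            # Stop at the first failure and report the last checkpoint passed.
            return executed, last_checkpoint, exc
        executed.append(name)
        last_checkpoint = checkpoints.get(name, last_checkpoint)
    return executed, last_checkpoint, None


# Usage: the second subJob fails on its first attempt only.
attempts = {"count": 0}


def load() -> None:
    pass


def aggregate() -> None:
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise RuntimeError("simulated failure")


job = [("load", load), ("aggregate", aggregate)]
marks = {"load": "after_load"}  # checkpoint on the link leaving "load"

first = run_job(job, marks)                          # fails in "aggregate"
second = run_job(job, marks, resume_from=first[1])   # skips "load"
```

In this sketch the second run skips the subJob that completed before the checkpoint and executes only the subJob that failed, which mirrors the idea of restarting from the last checkpoint rather than from the beginning.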

To define a checkpoint in a Job containing subJobs, proceed as follows:


  1. Click the OnSubjobOk link between the subJobs where you want to set the checkpoint.
    The configuration tab of this link is displayed in the Component view.
  2. Click the Error recovery tab to open its view.
  3. Select the Recovery checkpoint check box and enter the metadata of this checkpoint in the Label and the Failure instructions fields.

    If the Recovery checkpoint check box is greyed out and cannot be selected, make sure that the Job you are using is properly hosted in a remote project.

    For further information about setting checkpoints in the Studio, see Recovering Job execution in case of failure.