Writing data in parallel - 7.1

Talend Big Data Studio User Guide

author: Talend Documentation Team
EnrichVersion: 7.1
EnrichProdName: Talend Big Data
task: Design and Development
EnrichPlatform: Talend Studio

Parallel data writing speeds up the execution of a Job by dividing the data flow into multiple fragments that can be written simultaneously.

About this task

Note that when parallel execution is enabled, it is not possible to use global variables to retrieve return values in a subJob.

The Advanced settings of all database output components include the Enable parallel execution option. When selected, it enables high-speed data processing by treating multiple data flows simultaneously.

When you select the Enable parallel execution check box, the Number of parallel executions field appears. In this field, enter N, the number of fragments into which the data being processed is divided, to obtain N parallel executions.

Data written as N fragments can execute up to N times faster than the same data written as a single fragment.
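The idea of dividing a flow into N fragments and writing them simultaneously can be illustrated with a minimal Java sketch. This is not the code that Talend Studio generates: the class name ParallelWriteSketch, the methods split and writeInParallel, and the round-robin distribution of rows are all assumptions made for illustration, and the database write is replaced by a simple counter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not Talend-generated code.
public class ParallelWriteSketch {

    /** Distributes rows over n fragments in round-robin order (an assumed strategy). */
    static <T> List<List<T>> split(List<T> rows, int n) {
        List<List<T>> fragments = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            fragments.add(new ArrayList<>());
        }
        for (int i = 0; i < rows.size(); i++) {
            fragments.get(i % n).add(rows.get(i));
        }
        return fragments;
    }

    /** Writes each fragment on its own thread and returns the total number of rows written. */
    static int writeInParallel(List<String> rows, int n) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        AtomicInteger written = new AtomicInteger();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (List<String> fragment : split(rows, n)) {
                futures.add(pool.submit(() -> {
                    for (String row : fragment) {
                        // Placeholder for a real database insert of this row.
                        written.incrementAndGet();
                    }
                }));
            }
            // Wait for every fragment to finish before reporting the total.
            for (Future<?> future : futures) {
                future.get();
            }
        } finally {
            pool.shutdown();
        }
        return written.get();
    }

    public static void main(String[] args) throws Exception {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add("row-" + i);
        }
        System.out.println("Rows written: " + writeInParallel(rows, 4));
    }
}
```

In this sketch the value passed as n plays the role of the Number of parallel executions field. A thread-safe counter is used because the fragments are written concurrently and share the total.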

You can also set the data flow parallelization parameters from the design workspace of the Integration perspective. To do that:

Procedure

  1. Right-click a DB output component on the design workspace and select Parallelize from the drop-down list to display a dialog box.
  2. Select the Enable parallel execution check box and enter the number of parallel executions in the corresponding field. Alternatively, press Ctrl + Space and select the appropriate context variable from the list.
  3. Click OK to validate the data flow parallelization parameters.
    The number of parallel executions appears next to the DB output component in the design workspace.