tParallelize - 6.1

Talend Components Reference Guide


tParallelize appears as a component on the design workspace, but it is used slightly differently from typical components.

The tParallelize component itself does not process data or data flows, but helps you to parallelize and synchronize the execution of numerous subjobs in your main Job.

tParallelize Properties

Component family

Orchestration

 

Function

tParallelize allows you to synchronize the execution of a subjob with the execution of other subjobs in your main Job.

Purpose

tParallelize helps you manage complex Job systems. It executes several subjobs simultaneously and synchronizes the execution of a subjob with that of the other subjobs in the main Job.

Basic settings

Wait For

end of first subjob: the subjob connected through the Synchronize link is executed as soon as the first of the parallelized subjobs has finished.

 

 

end of all subjobs: the subjob connected through the Synchronize link is executed only after all of the parallelized subjobs have finished.

 

Sleep Duration

Set the time interval in seconds between each check for subjob execution.
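The interplay between Wait For and Sleep Duration can be pictured as a polling loop. The following Java sketch is only an illustration under stated assumptions: it is not the code that Talend Studio generates, and the names WaitForSketch, WaitFor, waitFor, pause and demo are hypothetical. It also assumes that "end of first subjob" means "as soon as any one parallelized subjob has finished".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch only: not the code generated by Talend Studio.
final class WaitForSketch {

    // Hypothetical names mirroring the two Wait For options.
    enum WaitFor { END_OF_FIRST_SUBJOB, END_OF_ALL_SUBJOBS }

    // Sleeps for the given time and converts interruption into an unchecked exception.
    static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Checks the subjobs every sleepSeconds until the chosen condition holds.
    static void waitFor(List<Future<?>> subjobs, WaitFor mode, double sleepSeconds) {
        while (true) {
            long done = subjobs.stream().filter(Future::isDone).count();
            boolean ready = (mode == WaitFor.END_OF_FIRST_SUBJOB)
                    ? done >= 1
                    : done == subjobs.size();
            if (ready) {
                return;
            }
            pause((long) (sleepSeconds * 1000)); // Sleep Duration between two checks
        }
    }

    // Runs a fast and a slow stand-in subjob, then the synchronized step.
    static List<String> demo(WaitFor mode) {
        List<String> log = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<Future<?>> subjobs = new ArrayList<>();
        subjobs.add(pool.submit(() -> { pause(50); log.add("fast"); }));
        subjobs.add(pool.submit(() -> { pause(400); log.add("slow"); }));
        waitFor(subjobs, mode, 0.01);
        log.add("synchronized");
        pool.shutdown();
        // Let the remaining subjob finish so that the log is complete.
        for (Future<?> subjob : subjobs) {
            while (!subjob.isDone()) {
                pause(10);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo(WaitFor.END_OF_FIRST_SUBJOB));
        System.out.println(demo(WaitFor.END_OF_ALL_SUBJOBS));
    }
}
```

With these delays, the first call should log the synchronized step between the fast and the slow subjob, and the second call should log it last. A shorter Sleep Duration reacts sooner to a finished subjob at the cost of more frequent checks.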

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. If the component has a Die on error check box, this variable functions only when that check box is cleared.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to open the variable list and select the variable to use.

For further information about variables, see Talend Studio User Guide.
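As a rough illustration of how such a variable is read, the sketch below assumes the usual Talend convention that global variables live in a map named globalMap under a key of the form componentName_VARIABLE, for example tParallelize_1_ERROR_MESSAGE. The class ErrorMessageSketch, the helper errorMessageOf, the fallback text and the sample message are hypothetical, and a plain HashMap stands in for the map of a running Job.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: a HashMap stands in for the Job's globalMap.
final class ErrorMessageSketch {

    // Returns the ERROR_MESSAGE After variable of a component,
    // or a fallback text when no error has been recorded.
    static String errorMessageOf(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_ERROR_MESSAGE");
        return value == null ? "(no error recorded)" : (String) value;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();

        // Before any error, the variable is not set.
        System.out.println(errorMessageOf(globalMap, "tParallelize_1"));

        // Hypothetical message, stored here by hand for the demonstration.
        globalMap.put("tParallelize_1_ERROR_MESSAGE", "subjob failed");
        System.out.println(errorMessageOf(globalMap, "tParallelize_1"));
    }
}
```

Because ERROR_MESSAGE is an After variable, reading it makes sense only in a component or subjob that runs after tParallelize has finished.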

Usage

This component can be used as either a start or a middle component in a main Job built of several subjobs. It can be connected to preceding or following components using OnSubjobOk, Parallelize or Synchronize links. You can use as many tParallelize components as you want in your main Job.

Connections

Outgoing links (from this component to another):

Trigger: Synchronize; Parallelize.

Incoming links (from one component to this one):

Trigger: On Subjob Ok; On Subjob Error; Run if; On Component Ok; On Component Error.

For further information regarding connections, see Talend Studio User Guide.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

n/a

 

Scenario: Parallelizing and synchronizing subjob execution

The following simple scenario creates a five-component main Job that uses one tParallelize component with four tMsgBox single-component subjobs. The tMsgBox_1 component is the trigger subjob. The tParallelize_1 component executes tMsgBox_2 and tMsgBox_3 simultaneously, and then synchronizes tMsgBox_4 to be executed at the end of the simultaneous execution of the subjobs.

  • Drop four tMsgBox components from the Palette to the design workspace.

  • Define their dialog box display properties as desired.

For more information on defining tMsgBox properties, see tMsgBox.

  • Drop a tParallelize component onto the design workspace.

  • Connect the tMsgBox_1 component to tParallelize_1 using an OnSubjobOk link, available on the right-click menu. This link will trigger the next subjob(s) on the condition that the first subjob has completed without error.

  • Connect tParallelize_1 to tMsgBox_2 and tMsgBox_3 using a Parallelize link for each, available on the right-click menu. These links simply parallelize the execution of the two connected subjobs.

  • Connect tParallelize_1 to tMsgBox_4 using a Synchronize link to sequence the execution of this fourth subjob.

  • Select tMsgBox_4 and set its Basic settings parameters.

  • In the Basic settings panel of the tParallelize component, select either end of first subjob or end of all subjobs from the Wait For list. This sequences your fourth subjob to be executed at the end of the first subjob or at the end of all subjobs respectively.

  • In the Sleep Duration field, set the time interval in seconds between each check of a subjob execution.

  • Save your main Job.

  • Press F6 to run it.

The four message boxes are displayed according to the defined sequence.
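The execution order of this scenario can be sketched in plain Java. This is a minimal illustration rather than the code Talend Studio generates: the class ScenarioSketch and its run method are hypothetical, each message box is reduced to an entry in a list, and the Wait For option is assumed to be end of all subjobs.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative sketch only: list entries stand in for the message boxes.
final class ScenarioSketch {

    static List<String> run() {
        List<String> order = new CopyOnWriteArrayList<>();

        // tMsgBox_1 is the trigger subjob. With OnSubjobOk, the rest of the
        // Job starts only once this subjob has completed without error.
        order.add("tMsgBox_1");

        // Parallelize links: tMsgBox_2 and tMsgBox_3 are started together.
        CompletableFuture<Void> box2 =
                CompletableFuture.runAsync(() -> order.add("tMsgBox_2"));
        CompletableFuture<Void> box3 =
                CompletableFuture.runAsync(() -> order.add("tMsgBox_3"));

        // Synchronize link with "end of all subjobs": wait for both subjobs.
        CompletableFuture.allOf(box2, box3).join();
        order.add("tMsgBox_4");

        return order;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch tMsgBox_1 always comes first and tMsgBox_4 always comes last, while the relative order of tMsgBox_2 and tMsgBox_3 can vary from run to run. Replacing CompletableFuture.allOf with CompletableFuture.anyOf would approximate the end of first subjob option.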

This is a very simple scenario of what the tParallelize component can do. You can use it to parallelize and synchronize far more complex Jobs, in which each subjob of the main Job executes any task that Talend Studio can process.