The Parallelization tab - 7.1

Talend Big Data Studio User Guide

author: Talend Documentation Team
EnrichVersion: 7.1
EnrichProdName: Talend Big Data
task: Design and Development
EnrichPlatform: Talend Studio

The Parallelization tab is one of the settings tabs you can use to configure a Row connection.

You define the parallelization properties on your row connections according to the following table.

Field/Option

Description

Partition row

Select this option when you need to partition the input records into a specific number of threads.

Note: This option is not available for the last row connection of the flow.

Departition row

Select this option when you need to regroup the outputs of the processed parallel threads.

Note: This option is not available for the first row connection of the flow.

Repartition row

Select this option when you need to partition the input threads into a specific number of threads and regroup the outputs of the processed parallel threads.

Note: This option is not available for the first or the last row connection of the flow.

None

Default option. Select this option when you do not want to take any action on the input records.

Merge sort partitions

Select this check box to apply the merge sort algorithm, which ensures the consistency of the data.

This check box appears when you select the Departition row or Repartition row option.

Number of Child Threads

Type in the number of threads into which you want to split the input records.

This field appears when you select the Partition row or Departition row option.

Buffer Size

Type in the number of rows to cache for each of the threads generated.

This field does not appear if you select the None option.

Use a key hash for partitions

Select this check box to dispatch the input records in hash mode, which ensures that records meeting the same criteria are dispatched to the same thread. If the check box is cleared, the dispatch mode is round-robin.

This check box appears if you select the Partition row or Repartition row option.

In the Key Columns table that appears after you select the check box, set the columns on which you want to use the hash mode.
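
The difference between the two dispatch modes, and the effect of merge sorting when the threads are regrouped, can be illustrated with a small sketch. This is not the code that Talend Studio generates: the function names, the use of Python's built-in hash(), and the row layout are all assumptions made for illustration only. It shows round-robin dispatch, key-hash dispatch on chosen key columns, and a regrouping step that optionally merges partitions that are already sorted.

```python
import heapq


def partition_round_robin(rows, n_threads):
    """Hypothetical helper: deal rows out to n_threads buckets in turn."""
    buckets = [[] for _ in range(n_threads)]
    for i, row in enumerate(rows):
        buckets[i % n_threads].append(row)
    return buckets


def partition_by_key_hash(rows, n_threads, key_columns):
    """Hypothetical helper: rows sharing the same key values share a bucket.

    Python's hash() is an assumption here; the hash function actually used
    by the Studio is not stated in this document.
    """
    buckets = [[] for _ in range(n_threads)]
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        buckets[hash(key) % n_threads].append(row)
    return buckets


def departition(buckets, sort_key=None):
    """Hypothetical helper: regroup the outputs of all buckets.

    Without sort_key the buckets are simply concatenated. With sort_key,
    buckets that are each already sorted are merged into one sorted flow.
    """
    if sort_key is None:
        return [row for bucket in buckets for row in bucket]
    return list(heapq.merge(*buckets, key=sort_key))


rows = [
    {"id": 1, "country": "FR"},
    {"id": 2, "country": "US"},
    {"id": 3, "country": "FR"},
    {"id": 4, "country": "DE"},
]

round_robin = partition_round_robin(rows, 2)
hashed = partition_by_key_hash(rows, 2, ["country"])
merged = departition(hashed, sort_key=lambda row: row["id"])
```

In this sketch, round-robin dispatch balances the number of rows per bucket but can place the two "FR" rows in different buckets, whereas key-hash dispatch on the country column always keeps them together. Regrouping with a sort key restores the rows to id order, while plain concatenation returns them bucket by bucket.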