Configuring the departitioning step - 7.1

Talend Real-time Big Data Platform Studio User Guide

author: Talend Documentation Team
EnrichVersion: 7.1
EnrichProdName: Talend Real-Time Big Data Platform
task: Design and Development
EnrichPlatform: Talend Studio

Procedure

  1. Click the link representing the departitioning step to open its Component view and click the Parallelization tab.
    The Departition row option is automatically selected in the Type area. Selecting None disables parallelization for the data flow handled over this link. Depending on the link you are configuring, a Repartition row option may also be available in the Type area, which lets you repartition a data flow that has already been departitioned.
    In this Parallelization view, you need to define the following properties:
    • Buffer Size: the number of rows to be processed before the memory is freed.

    • Merge sort partitions: select this option to apply the merge sort algorithm when the partitions are merged, which ensures the consistency of the data.

  2. If required, change the value in the Buffer Size field to suit the available memory capacity. In this example, the default value is kept.
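
To make the two properties more concrete, the following is a minimal conceptual sketch in Python of what departitioning with a merge sort and a row buffer could look like. It is not the code that Talend Studio generates. The function name `departition`, the `buffer_size` parameter, and the sample data are hypothetical and chosen only for illustration. The sketch assumes that each partition is already sorted.

```python
import heapq
from typing import Iterable, Iterator, List


def departition(partitions: List[Iterable[int]], buffer_size: int) -> Iterator[List[int]]:
    """Merge already-sorted partitions into one sorted flow, emitted in buffers.

    Hypothetical illustration only; not Talend's implementation.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    buffer: List[int] = []
    # heapq.merge performs a lazy k-way merge of sorted inputs,
    # so the combined flow keeps the sort order of the partitions.
    for row in heapq.merge(*partitions):
        buffer.append(row)
        if len(buffer) == buffer_size:
            yield buffer
            buffer = []  # start a fresh buffer so the emitted rows can be freed
    if buffer:
        yield buffer  # emit the remaining rows


# Three sorted partitions merged into a single flow, four rows per buffer.
partitions = [[1, 4, 7], [2, 5, 8], [3, 6, 9, 10]]
batches = list(departition(partitions, buffer_size=4))
print(batches)  # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]
```

In this sketch, a smaller `buffer_size` keeps fewer rows in memory at a time, at the cost of emitting more batches.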