Streaming execution - 7.3

Talend Data Mapper User Guide

Version
7.3
Language
English
Product
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Studio
Content
Design and Development > Designing Jobs
Last publication date
2023-01-05

Streaming execution lets a transformation process input of any size. Without streaming execution, the entire input of the transformation is loaded into memory before the transformation is executed, which limits the amount of data that can be transformed to what fits in the available memory.

Streaming execution works by accumulating blocks of input data and then executing the transformation on each block separately. Because each block is transformed independently of the others, there are limitations on what may be specified in the transformation.
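The block-by-block approach described above can be illustrated with a minimal Python sketch. This is not Talend's implementation: the function name stream_in_blocks, its parameters, and the doubling transformation are hypothetical and only show how memory use stays bounded by the block size rather than by the input size.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def stream_in_blocks(
    elements: Iterable[T],
    transform: Callable[[List[T]], List[R]],
    block_count: int = 1000,
) -> Iterator[R]:
    """Apply `transform` to successive blocks of at most `block_count` elements.

    Hypothetical helper for illustration only; not part of any Talend API.
    """
    if block_count <= 0:
        raise ValueError("block_count must be a positive integer")
    iterator = iter(elements)
    while True:
        # Accumulate the next block; only this block is held in memory.
        block = list(islice(iterator, block_count))
        if not block:
            break
        # The transformation sees one block at a time, never the whole input.
        yield from transform(block)


# Usage: double 2,500 numbers and record the size of each block processed.
block_sizes: List[int] = []


def double_block(block: List[int]) -> List[int]:
    block_sizes.append(len(block))
    return [value * 2 for value in block]


results = list(stream_in_blocks(range(2500), double_block))
```

In this sketch the transformation never sees more than one block at a time, which is why an operation that needs the whole input cannot be expressed inside it.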

To make a transformation stream, select the Stream Input property on the SimpleLoop function. An XQuery is then executed once per block of looping elements. By default, the block count is 1000, so the XQuery is executed at every 1000th looping element. You can change this behavior by defining a context variable called transform.streaming.block.count and setting it to a positive numeric value.
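The effect of the block count can be worked through with a short Python sketch. Only the variable name transform.streaming.block.count and the default of 1000 come from this page; the dictionary used to stand in for the Job context, the helper names, and the assumption that a final partial block also triggers one execution are illustrative.

```python
DEFAULT_BLOCK_COUNT = 1000
BLOCK_COUNT_KEY = "transform.streaming.block.count"


def resolve_block_count(context: dict) -> int:
    """Return the block count from a context-like mapping, or the default.

    Hypothetical helper: a plain dict stands in for the Job context here.
    """
    raw = context.get(BLOCK_COUNT_KEY)
    if raw is None:
        return DEFAULT_BLOCK_COUNT
    value = int(raw)
    if value <= 0:
        raise ValueError(f"{BLOCK_COUNT_KEY} must be a positive number")
    return value


def executions_needed(element_count: int, block_count: int) -> int:
    """Number of per-block executions, assuming a trailing partial block counts."""
    # Ceiling division using integers only.
    return (element_count + block_count - 1) // block_count


default_runs = executions_needed(2500, resolve_block_count({}))
custom_runs = executions_needed(2500, resolve_block_count({BLOCK_COUNT_KEY: "250"}))
```

Under these assumptions, 2,500 looping elements need 3 executions with the default block count and 10 executions with a block count of 250. A smaller block count lowers peak memory use at the cost of more executions.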

Note the following when you select the Stream Input property:
  • If you select the Stream Input property on the SimpleLoop function, you cannot use sort keys, because sorting requires the complete input and therefore cannot be performed while streaming.
  • If you select the Stream Input property on the SimpleLoop function and you also select a distinct child element, the input is assumed to be already sorted by that child element, so that the distinct calculation can be done without further sorting.
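The reason the distinct calculation depends on sorted input can be shown with a minimal Python sketch. It is an illustration of the general technique, not Talend's implementation; the function distinct_sorted and the sample records are hypothetical. When equal values arrive next to each other, a streaming pass only has to compare each element with the previous one.

```python
from itertools import groupby
from operator import itemgetter
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def distinct_sorted(records: Iterable[T], key: Callable[[T], object]) -> Iterator[T]:
    """Yield the first record of each run of consecutive equal keys.

    Correct as a distinct operation only if `records` is already sorted by `key`.
    """
    for _, run in groupby(records, key=key):
        # Keep one record per run; the rest of the run is discarded.
        yield next(run)


by_city = itemgetter("city")

sorted_input = [{"city": "Lyon"}, {"city": "Lyon"}, {"city": "Paris"}]
unsorted_input = [{"city": "Lyon"}, {"city": "Paris"}, {"city": "Lyon"}]

from_sorted = [record["city"] for record in distinct_sorted(sorted_input, by_city)]
from_unsorted = [record["city"] for record in distinct_sorted(unsorted_input, by_city)]
```

With the sorted input the sketch returns each city once. With the unsorted input "Lyon" appears twice, because removing that duplicate would require remembering or sorting the whole input, which a streaming pass cannot do.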