Additional information about window duration, period and sessions - Cloud

Talend Cloud Pipeline Designer Processors Guide

author: Talend Documentation Team
EnrichVersion: Cloud
EnrichProdName: Talend Cloud
task: Design and Development > Designing Pipelines
EnrichPlatform: Talend Pipeline Designer

The Window processor allows you to partition streaming data into several types of time windows: fixed time windows, sliding windows and session windows.

The Window processor starts a new window at a regular interval (the period).

Each window keeps the records it receives in memory for a set length of time (the duration), then sends them to the output as a micro-batch.

Fixed time windows

Fixed time windows, also called tumbling windows or "window trains", are the simplest form of windows:
  • all windows have the same consistent duration and never overlap

  • only one window is stored in memory at a given time

  • each piece of data is captured in exactly one window

These windows are useful if you want to capture every piece of data, with no gaps between windows.

For example, with one-hour windows, all data with timestamp values from 00:00:00 to 01:00:00 belong to Window 1, data with timestamp values from 01:00:00 to 02:00:00 belong to Window 2, and so on.
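
The assignment of a timestamp to a fixed window can be sketched in a few lines of Python. This is an illustration only, not Pipeline Designer code: the function names are hypothetical, timestamps are expressed as seconds since 00:00:00, and windows are assumed to be half-open (a record stamped exactly 01:00:00 falls into Window 2), which the description above does not specify.

```python
def hms(text):
    """Convert an 'HH:MM:SS' string to seconds since midnight."""
    h, m, s = (int(part) for part in text.split(":"))
    return h * 3600 + m * 60 + s


def fixed_window_index(timestamp, duration):
    """Return the 1-based index of the fixed window containing timestamp.

    Assumption: windows are half-open, [start, end), and the first
    window starts at 00:00:00.
    """
    return timestamp // duration + 1


HOUR = 3600  # window duration in seconds
for t in ("00:00:00", "00:59:59", "01:00:00", "01:45:10"):
    print(t, "-> Window", fixed_window_index(hms(t), HOUR))
```

Because each timestamp maps to a single index, a record can never land in two windows, which matches the non-overlapping behavior of fixed time windows.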

Sliding windows

Sliding windows, also called sliding time windows, have the following characteristics:
  • multiple windows can overlap

  • several windows are stored in memory at the same time

  • elements in a data set can be captured in more than one window

These windows are useful for sampling purposes and for taking running averages of data.

For example, with a one-hour duration and a 30-minute period, all data with timestamp values from 00:00:00 to 01:00:00 belong to Window 1, data with timestamp values from 00:30:00 to 01:30:00 belong to Window 2, and so on. In this example, you can compute a running average of the past hour's worth of data, updated every 30 minutes.

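The period can also be longer than the duration. In that case, all data with timestamp values from 00:00:00 to 01:00:00 belong to Window 1, data with timestamp values from 01:30:00 to 02:30:00 belong to Window 2, and so on. Because a new window starts only every 90 minutes while each window lasts one hour, windows do not overlap, each piece of data is stored in at most one window, and data that arrives between two windows (here, from 01:00:00 to 01:30:00) is not captured.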
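
The following Python sketch shows how the duration and the period together determine which windows a record belongs to. It is an illustration under stated assumptions, not the processor's implementation: `sliding_windows` is a hypothetical helper, timestamps are seconds since 00:00:00, the first window starts at 00:00:00, and windows are treated as half-open.

```python
def sliding_windows(timestamp, duration, period):
    """Return the 1-based indexes of every window containing timestamp.

    Window k (0-based) is assumed to cover
    [k * period, k * period + duration), in seconds.
    """
    # Earliest window that has not yet closed at this timestamp.
    first = max(0, (timestamp - duration) // period + 1)
    # Latest window that has already started at this timestamp.
    last = timestamp // period
    return [k + 1 for k in range(first, last + 1)]


HOUR = 3600

# Period shorter than duration: windows overlap.
print(sliding_windows(45 * 60, HOUR, 30 * 60))   # record at 00:45:00

# Period longer than duration: gaps appear between windows.
print(sliding_windows(75 * 60, HOUR, 90 * 60))   # record at 01:15:00
print(sliding_windows(100 * 60, HOUR, 90 * 60))  # record at 01:40:00
```

With a 30-minute period, the record at 00:45:00 is returned for both Window 1 and Window 2. With a 90-minute period, the record at 01:15:00 falls between windows and the returned list is empty, while the record at 01:40:00 belongs only to Window 2.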

Session windows

Session windows group data that arrive within a given gap duration of one another:
  • high concentrations of data are grouped into separate windows

  • idle sections of the data stream are filtered out

  • data can be captured in disjoint windows of different sizes

These windows are useful for data that is irregularly distributed with respect to time. For example, a data stream representing user mouse activity may have long periods of idle time interspersed with high concentrations of clicks.

Here, the data that represents some activity is stored within a window that closes when it has not received any data for at least five minutes (the gap duration). Windows do not overlap and do not have a fixed start or end time.
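
Session grouping can be sketched as follows. This is a simplified, offline illustration in Python rather than the streaming logic of the processor: `session_windows` is a hypothetical helper, timestamps are seconds, and the sample click times are invented for the example.

```python
def session_windows(timestamps, gap):
    """Group timestamps into sessions separated by idle periods.

    A new session starts when at least `gap` seconds have passed
    since the previous record.
    """
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] < gap:
            sessions[-1].append(t)   # still within the current session
        else:
            sessions.append([t])     # idle period reached: open a new session
    return sessions


GAP = 5 * 60  # five-minute gap duration, in seconds
clicks = [0, 60, 200, 900, 950, 2000]  # made-up sample data
for number, session in enumerate(session_windows(clicks, GAP), start=1):
    print("Window", number, session)
```

The resulting windows have different lengths and record counts, and the idle stretches between them belong to no window.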