Configuring how frequently the Tweets are analyzed - 7.1

Kafka

author
Talend Documentation Team
EnrichVersion
7.1
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Open Studio for Big Data
Talend Real-Time Big Data Platform
EnrichPlatform
Talend Studio

Procedure

  1. Double-click tWindow to open its Component view.
    This component applies a Spark window to the input RDD so that, every 15 seconds, the Job analyzes the Tweets received during the last 20 seconds. Two consecutive windows therefore overlap by 5 seconds, which is one micro batch as defined in the Batch size field of the Spark configuration tab.
  2. In the Window duration field, enter 20000, meaning 20 seconds.
  3. Select the Define the slide duration check box and in the field that is displayed, enter 15000, meaning 15 seconds.
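
The window arithmetic behind these settings can be sketched in plain Python. This is only an illustration, not what tWindow runs internally: the names window_bounds and overlap_ms are hypothetical, the timeline is simplified to start at the first full window, and the check that both durations are multiples of the batch size reflects the usual Spark Streaming constraint.

```python
# Values from the procedure, in milliseconds.
BATCH_MS = 5_000    # Batch size set in the Spark configuration tab
WINDOW_MS = 20_000  # Window duration
SLIDE_MS = 15_000   # Slide duration


def window_bounds(window_ms, slide_ms, batch_ms, count):
    """Return (start_ms, end_ms) for `count` successive full windows.

    Hypothetical helper: time 0 is the start of the first full window.
    """
    # Spark Streaming expects both durations to be multiples of the batch interval.
    if window_ms % batch_ms or slide_ms % batch_ms:
        raise ValueError("window and slide durations must be multiples of the batch size")
    bounds = []
    for i in range(count):
        end = window_ms + i * slide_ms  # each window ends one slide later
        bounds.append((end - window_ms, end))
    return bounds


def overlap_ms(window_ms, slide_ms):
    """Time shared by two consecutive windows (0 if they do not overlap)."""
    return max(window_ms - slide_ms, 0)


bounds = window_bounds(WINDOW_MS, SLIDE_MS, BATCH_MS, 3)
for start, end in bounds:
    print(f"window covers {start // 1000}s to {end // 1000}s")

shared = overlap_ms(WINDOW_MS, SLIDE_MS)
print(f"overlap: {shared // 1000}s = {shared // BATCH_MS} micro batch(es)")
```

With the values above, each window spans four micro batches and shares its first micro batch with the previous window.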

Results

The configuration of the window is then displayed above the icon of tWindow in the Job you are designing.