How a Talend Storm Job works - 6.3

Talend Real-time Big Data Platform Studio User Guide

This documentation assumes that you have basic knowledge of the Apache Storm project. If you do not, see the Apache Storm documentation.

In the same way as you design a Talend MapReduce Job, you use the Storm-specific components to create a Storm Job (a topology, in Storm terms) and configure the connection to the Storm cluster to be used. At runtime, the Studio submits the Storm Job (topology) to the Nimbus server of that cluster, where the topology keeps running until you kill it, either directly in the Storm UI or through the Storm configuration defined in the Studio. While the topology is running in the cluster, you can monitor its execution status in the console of the Run view of the Job, provided that your Storm configuration allows it.
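The lifecycle described above (submit once, then run until explicitly killed) can be illustrated with a small, self-contained Java model. This is only a sketch: it does not use the Storm API, a plain thread stands in for the cluster, and the class `TopologyLifecycleSketch` with its methods `submit`, `kill`, and `isRunning` are hypothetical names chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of a topology lifecycle; NOT the Storm API.
class TopologyLifecycleSketch {
    private final AtomicBoolean killed = new AtomicBoolean(false);
    private final AtomicLong processed = new AtomicLong();
    private Thread worker;

    /** Stand-in for submitting the topology: starts work that never ends on its own. */
    void submit() {
        worker = new Thread(() -> {
            while (!killed.get()) {
                processed.incrementAndGet(); // pretend to process one message
                Thread.yield();
            }
        });
        worker.start();
    }

    /** Stand-in for killing the topology: signals the worker and waits for it to stop. */
    void kill() throws InterruptedException {
        killed.set(true);
        worker.join();
    }

    boolean isRunning() {
        return worker != null && worker.isAlive();
    }

    long processedCount() {
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        TopologyLifecycleSketch topology = new TopologyLifecycleSketch();
        topology.submit();
        Thread.sleep(50); // the "topology" keeps running during this time
        System.out.println("running: " + topology.isRunning());
        topology.kill();
        System.out.println("running after kill: " + topology.isRunning());
    }
}
```

The point of the model is that nothing inside the worker loop ends the run; only the explicit `kill` call does, which mirrors why a submitted topology stays in the cluster until you stop it.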

The topology you create in the Studio receives the messages to be processed through Apache Kafka, a general-purpose message broker. Kafka decouples the topology from the system that produces the messages, so the topology can work with any type of messaging system. This also means that you need to install and use Kafka alongside the Storm cluster to be used.

For further information about Kafka, see Apache's documentation about the Kafka messaging service.
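The decoupling that a message broker provides can be sketched in plain Java. The example below is a minimal illustration under stated assumptions, not Kafka or Storm code: an in-memory `BlockingQueue` stands in for the broker, and the class name `BrokerDecouplingSketch`, the `STOP` marker, and the uppercase "processing" step are all hypothetical. The producer and the consumer each know only the queue, never each other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical in-memory stand-in for a broker; NOT the Kafka API.
class BrokerDecouplingSketch {
    // Illustrative end-of-stream marker so the demo terminates;
    // a real topology has no such marker and runs until it is killed.
    private static final String STOP = "__STOP__";

    /** Producer side: writes to the queue without knowing who reads from it. */
    static void produce(BlockingQueue<String> broker, List<String> messages)
            throws InterruptedException {
        for (String message : messages) {
            broker.put(message);
        }
        broker.put(STOP);
    }

    /** Consumer side (the topology stand-in): reads from the queue without knowing who wrote. */
    static List<String> consume(BlockingQueue<String> broker) throws InterruptedException {
        List<String> results = new ArrayList<>();
        while (true) {
            String message = broker.take();
            if (STOP.equals(message)) {
                break;
            }
            results.add(message.toUpperCase(Locale.ROOT)); // placeholder processing step
        }
        return results;
    }

    /** Runs producer and consumer concurrently, connected only through the queue. */
    static List<String> run(List<String> messages) throws InterruptedException {
        BlockingQueue<String> broker = new LinkedBlockingQueue<>();
        Thread producer = new Thread(() -> {
            try {
                produce(broker, messages);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        List<String> results = consume(broker);
        producer.join();
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(Arrays.asList("a", "b"))); // prints [A, B]
    }
}
```

Because both sides depend only on the queue, either one could be replaced (for example, by a different producing system) without changing the other, which is the property the Kafka layer gives a Storm topology.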

You can run a Storm topology in either of the following modes:

  • Local: you use the embedded Storm libraries to run a topology within the Studio.

  • Remote: the Studio connects to a Storm cluster to run the topology.

The execution information of a Talend Storm Job (topology) is logged in the Storm UI of the Storm cluster being used, so you can consult the Storm UI web console for that information. The name of the topology is the one you enter in the Topology name field of the Storm configuration view of the Job.