Reading messages from a given Kafka topic - 7.1

Kafka

author
Talend Documentation Team
EnrichVersion
7.1
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Open Studio for Big Data
Talend Real-Time Big Data Platform
EnrichPlatform
Talend Studio

Procedure

  1. Double-click tKafkaInput to open its Component view.
  2. In the Broker list field, enter the locations of the brokers of the Kafka cluster to be used, separating them with commas (,). In this example, only one broker exists and its location is localhost:9092.
  3. From the Starting offset drop-down list, select the point from which the messages of a topic are consumed. In this scenario, select From latest, which starts consumption from the latest message that the same consumer group has consumed and whose offset has been committed.
  4. In the Topic name field, enter the name of the topic from which this Job consumes Twitter streams. In this scenario, the topic is twitter_live.
    This topic must exist in your Kafka system. For further information about how to create a Kafka topic, see the Apache Kafka documentation, or use the tKafkaCreateTopic component provided with the Studio. Note that tKafkaCreateTopic is not available in Spark Jobs.
  5. Select the Set number of records per second to read from each Kafka partition check box. This limits the size of each micro-batch sent for processing.
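
The settings above can be reasoned about with a small sketch. It is not Talend code and does not connect to Kafka. It only illustrates, under stated assumptions, how a comma-separated broker list breaks down into host and port pairs, and how a per-partition rate limit bounds the size of a micro-batch. The function names are hypothetical, and the rate (100 records per second), partition count (3) and batch interval (2 seconds) are illustrative values that do not come from this scenario. The calculation assumes the limit acts as a uniform per-second cap on every partition.

```python
from __future__ import annotations


def parse_broker_list(broker_list: str) -> list[tuple[str, int]]:
    """Split a comma-separated broker list into (host, port) pairs."""
    brokers = []
    for entry in broker_list.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma or stray whitespace
        host, _, port = entry.rpartition(":")
        if not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        brokers.append((host, int(port)))
    return brokers


def max_records_per_batch(rate_per_partition: int,
                          partitions: int,
                          batch_seconds: float) -> int:
    """Upper bound on records in one micro-batch under a per-partition cap.

    Assumes every partition is limited to the same number of records
    per second for the whole batch interval.
    """
    return int(rate_per_partition * partitions * batch_seconds)


# Broker list from this example: a single broker on localhost:9092.
print(parse_broker_list("localhost:9092"))

# Hypothetical figures: 100 records/s per partition, 3 partitions, 2 s batches.
print(max_records_per_batch(100, 3, 2))
```

With these illustrative figures, no micro-batch would contain more than 600 records, however many messages are waiting in the topic.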