Writing data to an Amazon Kinesis Stream - 6.5

Kinesis

author
Talend Documentation Team
EnrichVersion
6.5
EnrichProdName
Talend Data Fabric
Talend Real-Time Big Data Platform
EnrichPlatform
Talend Studio

Before you begin

This procedure assumes that you have an Amazon EMR cluster up and running, that you have created the corresponding cluster connection metadata in the Repository, and that you have created an Amazon Kinesis stream.

Procedure

  1. Create a Big Data Streaming Job using the Spark framework.
  2. Add a tRowGenerator component. In this example, it generates the data that will be written to Amazon Kinesis.
  3. The data must be serialized as byte arrays before being written to the Amazon Kinesis stream. Add a tWriteDelimitedFields component and connect it to the tRowGenerator component.
  4. In the tWriteDelimitedFields component, set the Output type to byte[].
  5. To write the data to your Kinesis stream, add a tKinesisOutput component and connect the tWriteDelimitedFields component to it.
  6. In the tKinesisOutput component, provide your Amazon credentials.
  7. To access your Kinesis stream, provide the Stream name and the corresponding endpoint URL.

    To get the right endpoint URL, refer to AWS Regions and Endpoints.

  8. Provide the number of shards, as specified when you created the Kinesis stream.
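
The serialization in steps 3 and 4 and the endpoint URL in step 7 can be illustrated outside the Studio with a minimal Java sketch. It is not the code that Talend generates: the class name DelimitedRowSketch, its two helper methods, the ";" separator, and the UTF-8 encoding are assumptions made for illustration. The endpoint pattern shown applies to the standard AWS regions only, so confirm the actual value in AWS Regions and Endpoints.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only (not part of Talend or the AWS SDK).
public class DelimitedRowSketch {

    // Joins the field values with the separator and encodes the resulting line
    // as UTF-8 bytes, which is the kind of byte[] payload a Kinesis record carries.
    public static byte[] serialize(List<String> fields, String separator) {
        return String.join(separator, fields).getBytes(StandardCharsets.UTF_8);
    }

    // Builds a Kinesis endpoint URL for a region in the standard AWS partition.
    // Assumption: other partitions (for example the China regions) use a different domain.
    public static String endpointFor(String region) {
        return "https://kinesis." + region + ".amazonaws.com";
    }

    public static void main(String[] args) {
        // One sample row, standing in for a row produced by tRowGenerator.
        byte[] payload = serialize(Arrays.asList("1", "Alice", "Paris"), ";");
        System.out.println(new String(payload, StandardCharsets.UTF_8));
        System.out.println(endpointFor("us-east-1"));
    }
}
```

In the Job itself, tWriteDelimitedFields performs the serialization and tKinesisOutput sends the resulting byte arrays, so no hand-written code is required. The sketch only shows the shape of the data that reaches the stream.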