Before you begin
This section assumes that you have an Amazon EMR cluster up and running, that you have created the corresponding cluster connection metadata in the repository, and that you have created an Amazon Kinesis stream.
Create a Big Data Streaming Job using the Spark framework.
- In this example, the data to be written to Amazon Kinesis is generated with a tRowGenerator component.
- The data must be serialized as byte arrays before being written to the Amazon Kinesis stream. Add a tWriteDelimitedFields component and connect it to the tRowGenerator component.
- In the component settings, set the Output type to byte.
- To write the data to your Kinesis stream, add a tKinesisOutput component and connect the tWriteDelimitedFields component to it.
- Provide your Amazon credentials.
- To access your Kinesis stream, provide the Stream name and the corresponding endpoint URL. To find the right endpoint URL, refer to AWS Regions and Endpoints.
- Provide the number of shards, as specified when you created the Kinesis stream.
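As a rough illustration of what the serialization step above does, the following sketch joins each record's fields with a delimiter and encodes the result as a byte array, which is the form Kinesis expects for record data. The field values and the `;` delimiter are illustrative assumptions, not values taken from the Job.

```python
def serialize_record(fields, delimiter=";"):
    """Join the record's fields with the delimiter and encode as bytes.

    This mimics what tWriteDelimitedFields does before the record is
    handed to tKinesisOutput: Kinesis record data must be a byte array.
    """
    return delimiter.join(str(f) for f in fields).encode("utf-8")

# Hypothetical record produced by tRowGenerator.
record = [1, "Alice", 42.5]
payload = serialize_record(record)
print(payload)  # b'1;Alice;42.5'
```

On the consuming side, the same delimiter is used to split the byte array back into fields, so the producer and consumer must agree on it.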