Converting the existing Spark Batch Job to a Spark Streaming Job.
Before you begin
You have launched your Talend Studio and opened the Integration perspective.
You have created the aggregate_movie_director_spark Spark Batch Job described in Joining movie and director information using an Apache Spark Batch Job and run it successfully.
- In the Repository tree view, expand the Job Designs node, the Big Data Batch node, the getting_started folder, and then the spark folder.
- Right-click the aggregate_movie_director_spark Job and select Duplicate from the contextual menu.
The Duplicate window is opened.
- In the Input new name field, name this duplicate aggregate_movie_director_spark_streaming.
- From the Job Type drop-down list, select Big Data Streaming.
From the Framework list, select Spark Streaming and click OK to validate the changes.
The aggregate_movie_director_spark_streaming Job is displayed under the Big Data Streaming node in the Repository.
- Right-click the Big Data Streaming node and select Create folder from the contextual menu.
- In the New Folder wizard, name the new folder streaming_movies and click Finish to create the folder.
- Drop the aggregate_movie_director_spark_streaming Job into this streaming_movies folder.
This new Spark Streaming Job is now ready for further editing.