Before you begin
You have launched Talend Studio and opened the Integration perspective.
You have created the aggregate_movie_director_mr MapReduce Job described in Joining movie and director information using a MapReduce Job and run it successfully.
- In the Repository tree view, expand the Job Designs node, the Big Data Batch node and then the getting_started folder and the mapreduce folder.
- Right-click the aggregate_movie_director_mr Job and select Duplicate from the contextual menu.
The Duplicate window opens.
- In the Input new name field, name the duplicate aggregate_movie_director_spark_batch.
- From the Framework list, select Spark and click OK to validate the changes.
The aggregate_movie_director_spark_batch Job is displayed in the mapreduce folder in the Repository.
- Right-click the getting_started folder and select Create folder from the contextual menu.
- In the New Folder wizard, name the new folder spark_batch and click Finish to create it.
- Drag the aggregate_movie_director_spark_batch Job from the mapreduce folder and drop it into the spark_batch folder.
This new Spark Batch Job is now ready for further editing.