A Talend Job for Apache Spark Batch gives you access to the Talend Spark components, which you use to visually design Apache Spark programs that read, transform, or write data.
Procedure
- In the Repository tree view, expand the Job Designs node, right-click the Big Data Batch node, and select Create folder from the contextual menu.
- In the New Folder wizard, name your Job folder getting_started and click Finish to create the folder.
- Right-click the getting_started folder and select Create folder again.
- In the New Folder wizard, name the new folder spark and click Finish to create the folder.
- Right-click the spark folder and select Create Big Data Batch Job.
- In the New Big Data Batch Job wizard, select Spark from the Framework drop-down list.
- Enter a name for this Spark Batch Job and fill in any other useful information.
  For example, enter aggregate_movie_director_spark in the Name field.
Results
The Spark Batch component Palette is now available in Talend Studio. You can start designing the Job using this Palette and the Metadata node in the Repository.