Define the Spark configuration in the Studio to import the Spark 2.0 related jar files.
- In Studio, open the Job you want to run with Spark 2.0.
- To open the Run view, double-click Run.
- Click the Spark configuration tab.
- Clear the Use local mode check box.
- From the Distribution drop-down list, select Custom - Unsupported. This lets you import Spark jar files that are not natively supported by your Hadoop distribution.
- From the Spark version drop-down list, select 2.0.
- To open the Import Custom Definition wizard, next to the Distribution list, click the ellipsis (...).
- Select the Import from existing version radio button and choose your distribution. Ensure that the Spark check box is selected.
- Click OK, and in the pop-up dialog box, click Yes. The [Custom Hadoop Version Definition] wizard opens.
- In the jar list, remove all entries except the one for talend-mapred-lib.jar. If you run your Job on Windows, also keep winutils-hadoop-2.6.0.exe.
- To open the [Select Libraries] wizard, click the plus symbol (+), then select External libraries.
- To access and select the Spark jar file you downloaded from your cluster earlier, click Browse.... The jar files you select are then listed.
- After adding all the jar files, to validate the changes, click OK.
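Before importing, it can help to confirm that the jar files downloaded from the cluster really are Spark 2.0 builds. The following is a minimal sketch, not part of the Studio: it assumes the jars follow the common Spark naming pattern (for example spark-core_2.11-2.0.0.jar, where 2.11 is the Scala version and 2.0.0 the Spark version). The function names `spark_version`, `split_by_version`, and `check_folder` are hypothetical helpers introduced for illustration, and a vendor distribution may name its jars differently.

```python
import re
from pathlib import Path

# Assumption: Spark jars are named spark-<module>_<scala version>-<spark version>...jar,
# e.g. spark-core_2.11-2.0.0.jar. Vendor builds may append a suffix after the version.
SPARK_JAR = re.compile(
    r"^spark-[a-z0-9-]+_(?P<scala>\d+\.\d+)-(?P<spark>\d+\.\d+\.\d+)"
)


def spark_version(jar_name):
    """Return the Spark version embedded in a jar file name, or None if not recognized."""
    if not jar_name.endswith(".jar"):
        return None
    match = SPARK_JAR.match(jar_name)
    return match.group("spark") if match else None


def split_by_version(jar_names, wanted_prefix="2.0."):
    """Split recognized Spark jars into those matching the wanted version and the rest.

    Names that do not look like Spark jars are ignored.
    """
    matching, other = [], []
    for name in jar_names:
        version = spark_version(name)
        if version is None:
            continue
        if version.startswith(wanted_prefix):
            matching.append(name)
        else:
            other.append(name)
    return sorted(matching), sorted(other)


def check_folder(folder, wanted_prefix="2.0."):
    """Apply split_by_version to the *.jar files found directly in a local folder."""
    names = [path.name for path in Path(folder).glob("*.jar")]
    return split_by_version(names, wanted_prefix)
```

Calling `check_folder` on the folder that holds the downloaded jars returns two lists: the jars whose names indicate Spark 2.0.x, and any Spark jars from another version that probably should not be imported. The check relies only on file names, so treat it as a quick sanity check rather than proof of compatibility.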