Switching between modes, distributions or environments with Spark Universal - Cloud - 8.0

Talend Data Fabric Studio User Guide

The Spark Universal mechanism lets you quickly switch between Spark modes, distributions, or environments by changing the Hadoop configuration JAR file while keeping the same Job configuration. You can switch the following:

  • Spark mode: you can switch between the Local and Yarn cluster modes, so that you can test your Job on your local machine before sending it to a cluster.
  • Distribution: you can switch between the big data distributions available for a given Spark version.
  • Environment: you can switch between your development, integration, and production environments.

About this task

This procedure starts from a Job that you have been running in Local mode.

Procedure

  1. To send your Job to a cluster, select Yarn cluster from the Spark Mode drop-down list in the Spark configuration view of your Job.
  2. Specify the path to the Hadoop configuration JAR file that provides the connection parameters of the development cluster you want to use.
  3. To change either the environment or the distribution, specify the path to another Hadoop configuration JAR file.
    Note: If you have set up the connection parameters in the Repository as explained in Centralizing a Hadoop connection, you can also change the environment or the distribution by selecting Repository from the Property type drop-down list, and then selecting the cluster.
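
Before pointing a Job at a different Hadoop configuration JAR file, it can help to check which cluster that file actually targets. The following Python sketch is not part of Talend Studio. It assumes that the JAR is a regular ZIP archive bundling the cluster's Hadoop *-site.xml files (for example yarn-site.xml) in the standard Hadoop property format; the procedure above does not state this, so verify it against your own file. The function names list_site_files and read_property are hypothetical helpers introduced for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET


def list_site_files(jar_path):
    """Return the sorted names of the *-site.xml entries in a configuration JAR."""
    # A JAR file is a ZIP archive, so the standard zipfile module can read it.
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(n for n in jar.namelist() if n.endswith("-site.xml"))


def read_property(jar_path, site_file, name):
    """Return the value of one Hadoop property, or None if it is not defined."""
    with zipfile.ZipFile(jar_path) as jar:
        root = ET.fromstring(jar.read(site_file))
    # Hadoop site files hold <property><name>..</name><value>..</value></property>
    # elements under a <configuration> root (assumed layout of the bundled files).
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None
```

For example, calling read_property with the path to your development JAR, "yarn-site.xml", and "yarn.resourcemanager.address" would return the ResourceManager address stored in that file, provided the file is present in the archive. Comparing this value across the JAR files of two environments is a quick way to confirm that each one points to the intended cluster.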