The Spark Universal mechanism allows you to quickly switch between different Spark modes, distributions, or environments by changing the Hadoop configuration JAR file while keeping the same Job configuration. The switch operation can be performed on:
- Spark mode: you can switch between Local and Yarn cluster modes, for example to test your Job on your local machine before sending it to a cluster.
- Distribution: you can switch between the different big data distributions available for a given Spark version.
- Environment: you can switch between your development, integration or production environment.
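The same principle of keeping the Job untouched while switching only the execution mode can be sketched outside the Studio with plain `spark-submit`, where just the `--master` option and the Hadoop client configuration change. This is an illustrative sketch: the application JAR, class name, and paths below are hypothetical.

```shell
# Test locally first: run the application JAR in Local mode.
spark-submit --master "local[*]" --class com.example.MyJob myjob.jar

# Then send the same, unchanged JAR to a cluster: only the master and
# the Hadoop/YARN client configuration differ. HADOOP_CONF_DIR points
# at the directory holding the cluster's *-site.xml files.
HADOOP_CONF_DIR=/etc/hadoop/conf \
spark-submit --master yarn --deploy-mode cluster \
  --class com.example.MyJob myjob.jar
```

Because the Job artifact is identical in both runs, promoting it from local testing to the cluster is purely a configuration change.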
About this task
- To send your Job to a cluster, select Yarn cluster from the Spark Mode drop-down list in the Spark configuration view of your Job.
- Specify the path to the Hadoop configuration JAR file that provides the connection parameters of the development cluster you want to use.
To change either the environment or the distribution, specify the path to a different Hadoop configuration JAR file.
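A Hadoop configuration JAR typically packages the cluster's client-side `*-site.xml` files (such as `core-site.xml`, `hdfs-site.xml`, and `yarn-site.xml`), so switching environments amounts to pointing at a JAR whose files name a different cluster. As an illustration, a `yarn-site.xml` inside the JAR for a hypothetical development cluster might contain:

```xml
<!-- yarn-site.xml from the development cluster's configuration JAR.
     Host names are hypothetical examples; the production JAR would
     name a different ResourceManager. -->
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>dev-master.example.com:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>dev-master.example.com:8030</value>
  </property>
</configuration>
```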
Note: If you have set up the connection parameters in the Repository as explained in Centralizing a Hadoop connection, you can also change the environment or the distribution by selecting Repository from the Property type drop-down list, and then selecting the cluster.