Switching between modes, distributions, or environments with Spark Universal - Cloud - 8.0

Talend Studio User Guide

Version: Cloud 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development
Last publication date: 2024-02-29
Available in: Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform

The Spark Universal mechanism lets you switch between Spark modes, distributions, or environments by changing only the Hadoop configuration JAR file, while the rest of the Job configuration stays the same. You can switch the following:

  • Spark mode: you can switch between the Local and Yarn cluster modes, so that you can test your Job on your local machine before sending it to a cluster.
  • Distribution: you can switch between the different big data distributions available for a given Spark version.
  • Environment: you can switch between your development, integration, or production environment.
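
The idea behind the switch can be pictured as a change to a single field while everything else stays fixed. The following Python sketch is illustrative only and is not a Talend API: the SparkJobSettings class, the switch_environment helper, and the JAR paths are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

@dataclass(frozen=True)
class SparkJobSettings:
    """Hypothetical stand-in for the Spark configuration of a Job."""
    spark_mode: str                      # "Local" or "Yarn cluster"
    hadoop_conf_jar: Optional[str]       # path to the Hadoop configuration JAR
    tuning: Tuple[Tuple[str, str], ...]  # the rest of the Job configuration

# Hypothetical JAR locations, one per environment.
CONF_JARS = {
    "development": "/opt/hadoop-conf/dev-conf.jar",
    "integration": "/opt/hadoop-conf/int-conf.jar",
    "production": "/opt/hadoop-conf/prod-conf.jar",
}

def switch_environment(settings: SparkJobSettings, environment: str) -> SparkJobSettings:
    """Return a copy that targets another cluster; only the mode and JAR path change."""
    return replace(
        settings,
        spark_mode="Yarn cluster",
        hadoop_conf_jar=CONF_JARS[environment],
    )

# Start from a Job tested in Local mode, then point it at two clusters in turn.
local_job = SparkJobSettings("Local", None, (("spark.executor.memory", "2g"),))
dev_job = switch_environment(local_job, "development")
prod_job = switch_environment(dev_job, "production")
```

In this sketch, the tuning values of local_job, dev_job, and prod_job are identical; only the mode and the JAR path differ, which mirrors the behavior described above.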

About this task

This procedure starts from a Job that you currently run in Local mode.

Procedure

  1. To send your Job to a cluster, select Yarn cluster from the Spark Mode drop-down list in the Spark configuration view of your Job.
  2. Specify the path to the Hadoop configuration JAR file that provides the connection parameters of the development cluster you want to use.
  3. To change either the environment or the distribution, specify the path to another Hadoop configuration JAR file.
    Note: If you have set up the connection parameters in the Repository, as explained in Centralizing a Hadoop connection, you can also change the environment or the distribution by selecting Repository from the Property type drop-down list and then selecting the cluster.
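
Before pointing a Job at a different JAR file, it can help to confirm which cluster that file describes. The following Python sketch assumes that the Hadoop configuration JAR is a standard archive containing Hadoop *-site.xml files (for example yarn-site.xml or core-site.xml) in the usual configuration/property/name/value layout. The read_hadoop_conf function name is hypothetical, and the actual contents of your JAR file may differ.

```python
import sys
import zipfile
import xml.etree.ElementTree as ET

def read_hadoop_conf(jar_path):
    """Collect name/value pairs from every *-site.xml entry in the JAR."""
    properties = {}
    with zipfile.ZipFile(jar_path) as jar:
        for entry in jar.namelist():
            # Assumption: connection parameters live in *-site.xml files.
            if not entry.endswith("-site.xml"):
                continue
            root = ET.fromstring(jar.read(entry))
            for prop in root.iter("property"):
                name = prop.findtext("name")
                if name:
                    value = prop.findtext("value") or ""
                    properties[name.strip()] = value.strip()
    return properties

if __name__ == "__main__" and len(sys.argv) == 2:
    # Print every property found in the JAR given on the command line.
    for key, value in sorted(read_hadoop_conf(sys.argv[1]).items()):
        print(f"{key} = {value}")
```

Run against two JAR files, a script like this makes it easy to compare a property such as yarn.resourcemanager.address and verify that each file targets the intended environment.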