Defining Spark Universal connection details in the Spark configuration view - Cloud - 8.0

Talend Data Fabric Studio User Guide

Version: Cloud 8.0
Language: English (United States)
Product: Talend Data Fabric
Module: Talend Studio
Content: Design and Development

Complete the Spark Universal connection configuration in the Spark configuration tab of the Run view of your Job. This configuration is effective on a per-Job basis.

The information in this section applies only to users who have subscribed to Talend Data Fabric or to any Talend product with Big Data; it is not applicable to Talend Open Studio for Big Data users.

Talend Studio allows you to run your Spark Jobs on a Spark Universal distribution in any of the following modes and environments:
Cloudera Data Engineering: Studio submits Jobs to the Cloudera Data Engineering service and collects the execution information of your Job from it.

For more information, see Defining Cloudera Data Engineering connection parameters with Spark Universal.

Databricks: Studio submits Jobs to Databricks and collects the execution information of your Job from it. The Spark driver runs on either a transient or an interactive Databricks cluster, on GCP (technical preview), AWS, or Azure.

For more information, see Defining Databricks connection parameters with Spark Universal.
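
For context, the transient-cluster submission that Studio performs can be approximated with the Databricks Jobs API (the legacy 2.0 runs/submit endpoint). The sketch below is illustrative only and is not the code Studio generates; the workspace URL, token variable, node type, Spark version, jar path, and class name are placeholder assumptions.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SubmitToDatabricks {
    public static void main(String[] args) throws Exception {
        // Placeholders: replace with your workspace URL and a personal access token.
        String workspace = "https://my-workspace.cloud.databricks.com";
        String token = System.getenv("DATABRICKS_TOKEN");
        // A one-time run on a transient cluster: the cluster is created for the
        // run and terminated afterwards, as described for this mode above.
        String body = """
            {
              "run_name": "spark-job-example",
              "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2
              },
              "libraries": [ { "jar": "dbfs:/jars/my-spark-job.jar" } ],
              "spark_jar_task": { "main_class_name": "org.example.MySparkJob" }
            }""";
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(workspace + "/api/2.0/jobs/runs/submit"))
            .header("Authorization", "Bearer " + token)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        // On success the response carries a run_id, which can then be polled
        // to collect execution information.
        System.out.println(response.body());
    }
}
```

To target an interactive cluster instead, the request would reference an existing cluster's identifier (existing_cluster_id in the same API) rather than defining a new_cluster.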

Dataproc: Studio submits Jobs to Dataproc and collects the execution information of your Job from it.

For more information, see Defining Dataproc connection parameters with Spark Universal.
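
As a rough picture of what a Dataproc submission involves, the following sketch uses the Google Cloud Dataproc Java client (google-cloud-dataproc) to submit a Spark job to an existing cluster. It is not Studio's generated code, and the project, region, cluster, bucket, and class names are placeholder assumptions.

```java
import com.google.cloud.dataproc.v1.Job;
import com.google.cloud.dataproc.v1.JobControllerClient;
import com.google.cloud.dataproc.v1.JobControllerSettings;
import com.google.cloud.dataproc.v1.JobPlacement;
import com.google.cloud.dataproc.v1.SparkJob;

public class SubmitToDataproc {
    public static void main(String[] args) throws Exception {
        String region = "us-central1"; // placeholder region
        // The client must point at the regional Dataproc endpoint.
        JobControllerSettings settings = JobControllerSettings.newBuilder()
            .setEndpoint(region + "-dataproc.googleapis.com:443")
            .build();
        try (JobControllerClient client = JobControllerClient.create(settings)) {
            Job job = Job.newBuilder()
                // Target an existing Dataproc cluster (placeholder name).
                .setPlacement(JobPlacement.newBuilder().setClusterName("my-cluster"))
                .setSparkJob(SparkJob.newBuilder()
                    .setMainClass("org.example.MySparkJob")
                    .addJarFileUris("gs://my-bucket/my-spark-job.jar"))
                .build();
            Job submitted = client.submitJob("my-project", region, job);
            // The returned job reference is what gets polled for execution information.
            System.out.println("Submitted: " + submitted.getReference().getJobId());
        }
    }
}
```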

Kubernetes: Studio submits Jobs to Kubernetes and collects the execution information of your Job from it. The Spark driver runs on the cluster managed by Kubernetes and can run independently of your Studio.

For more information, see Defining Kubernetes connection parameters with Spark Universal.
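
The behavior described for this mode corresponds to Apache Spark's standard Kubernetes properties. The sketch below only assembles such a configuration so you can see the moving parts; the API server address, namespace, container image, and service account are placeholder assumptions, and in practice these values come from the fields you fill in the Spark configuration tab.

```java
import org.apache.spark.SparkConf;

public class KubernetesConfSketch {
    public static void main(String[] args) {
        // A k8s:// master URL points Spark at the Kubernetes API server; the
        // driver then runs in a pod on the cluster, independent of Studio.
        SparkConf conf = new SparkConf()
            .setAppName("spark-on-k8s-sketch")
            .setMaster("k8s://https://kubernetes.example.com:6443")
            .set("spark.kubernetes.namespace", "spark-jobs")
            .set("spark.kubernetes.container.image", "registry.example.com/spark:3.4.1")
            .set("spark.kubernetes.authenticate.driver.serviceAccountName", "spark");
        // In a real deployment these properties are handed to the Spark
        // launcher rather than consumed here; printing them keeps the sketch
        // self-contained.
        for (scala.Tuple2<String, String> kv : conf.getAll()) {
            System.out.println(kv._1() + " = " + kv._2());
        }
    }
}
```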

Local: Studio builds the Spark environment within itself at runtime and runs the Job locally inside the Studio. In this mode, each processor of the local machine is used as a Spark worker to perform the computations.

For more information, see Defining Local connection parameters with Spark Universal.
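
Local mode corresponds to Spark's local[*] master URL, which allocates one worker thread per available processor core, matching the description above. A minimal, self-contained smoke test (the application name and row count are arbitrary):

```java
import org.apache.spark.sql.SparkSession;

public class LocalModeSmokeTest {
    public static void main(String[] args) {
        // local[*] runs the driver and executors inside this JVM, with one
        // worker thread per available processor core.
        SparkSession spark = SparkSession.builder()
            .appName("local-mode-smoke-test")
            .master("local[*]")
            .getOrCreate();
        // A trivial computation to confirm the embedded cluster works.
        long count = spark.range(1, 1_000_000).count();
        System.out.println("count = " + count); // prints 999999
        spark.stop();
    }
}
```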

Standalone: Studio connects to a Spark-enabled cluster and runs the Job from that cluster.

For more information, see Defining Standalone connection parameters with Spark Universal.
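
A standalone cluster is addressed through a spark:// master URL, with 7077 as the default master port. A minimal sketch, assuming a placeholder host name:

```java
import org.apache.spark.sql.SparkSession;

public class StandaloneModeSketch {
    public static void main(String[] args) {
        // spark://host:port addresses the master of a Spark standalone
        // cluster; executors are launched on the cluster's workers.
        SparkSession spark = SparkSession.builder()
            .appName("standalone-mode-sketch")
            .master("spark://spark-master.example.com:7077")
            .getOrCreate();
        System.out.println("Connected to Spark " + spark.version());
        spark.stop();
    }
}
```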

Yarn cluster: Studio submits Jobs to Yarn and collects the execution information of your Job from Yarn and the ApplicationMaster. The Spark driver runs on the cluster and can run independently of your Studio.

For more information, see Defining Yarn cluster connection parameters with Spark Universal.
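
Yarn cluster mode corresponds to Spark's standard spark.master=yarn with spark.submit.deployMode=cluster properties, which host the driver in the ApplicationMaster on the cluster. The sketch below only assembles those properties; in practice cluster deploy mode is driven through a submission tool such as spark-submit, with HADOOP_CONF_DIR pointing at the cluster's configuration files.

```java
import org.apache.spark.SparkConf;

public class YarnClusterConfSketch {
    public static void main(String[] args) {
        // With these properties the driver is hosted by the Yarn
        // ApplicationMaster on the cluster, so the Job keeps running even if
        // the submitting machine disconnects.
        SparkConf conf = new SparkConf()
            .setAppName("yarn-cluster-sketch")
            .setMaster("yarn")
            .set("spark.submit.deployMode", "cluster");
        for (scala.Tuple2<String, String> kv : conf.getAll()) {
            System.out.println(kv._1() + " = " + kv._2());
        }
    }
}
```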