
Defining Spark Universal connection details in the Spark configuration view

Complete the Spark Universal connection configuration in the Spark configuration tab of the Run view of your Job. This configuration is effective on a per-Job basis.

Talend Studio allows you to run your Spark Jobs on a Spark Universal distribution in any of the following modes and environments:
Cloudera Data Engineering: Talend Studio submits Jobs and collects the execution information of your Job from the Cloudera Data Engineering service.

For more information, see Defining Cloudera Data Engineering connection parameters with Spark Universal.

Databricks: Talend Studio submits Jobs and collects the execution information of your Job from Databricks. The Spark driver runs either on a Databricks job cluster or on a Databricks all-purpose cluster, on GCP, AWS, or Azure.

For more information, see Defining Databricks connection parameters with Spark Universal.

Dataproc: Talend Studio submits Jobs and collects the execution information of your Job from Dataproc.

For more information, see Defining Dataproc connection parameters with Spark Universal.

Kubernetes: Talend Studio submits Jobs and collects the execution information of your Job from Kubernetes. The Spark driver runs on the cluster managed by Kubernetes and can run independently from Talend Studio.

For more information, see Defining Kubernetes connection parameters with Spark Universal.

Local: Talend Studio builds the Spark environment within itself at runtime and runs the Job locally in Talend Studio. In this mode, each processor of the local machine is used as a Spark worker to perform the computations.

For more information, see Defining Local connection parameters with Spark Universal.
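For comparison, the same local execution model in plain Spark is selected with the local master URL: `local[*]` asks Spark to start one worker thread per available processor core inside a single JVM. A minimal sketch of an equivalent standalone spark-submit invocation (the application name and file path are illustrative placeholders, not code generated by Talend Studio):

```shell
# Run a Spark application entirely inside the local JVM.
# local[*] starts one worker thread per available core;
# local[2] would limit execution to two threads.
# The name and script path below are placeholders.
spark-submit \
  --master "local[*]" \
  --name my_local_job \
  path/to/my_job.py
```

This is a configuration sketch only; in Talend Studio the equivalent settings are made in the Spark configuration tab rather than on the command line.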

Spark-submit scripts: Talend Studio submits Jobs and collects the execution information of your Job from Yarn and the ApplicationMaster of your cluster, typically an HPE Data Fabric cluster. The Spark driver runs on the cluster and can run independently from Talend Studio.

For more information, see Defining Spark-submit scripts connection parameters with Spark Universal.
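This mode relies on the cluster's own spark-submit tooling. A hedged sketch of the kind of command involved, assuming a Yarn-managed cluster (the class name, application name, and jar path are illustrative placeholders):

```shell
# Submit in cluster deploy mode: the Spark driver starts inside
# the Yarn ApplicationMaster on the cluster, so the Job keeps
# running even if the submitting machine disconnects.
# Class, name, and jar path below are placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MyJob \
  --name my_cluster_job \
  path/to/my_job.jar
```

With `--deploy-mode client` instead, the driver would run on the submitting machine, which is why cluster deploy mode is what allows the Job to run independently from Talend Studio.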

Standalone: Talend Studio connects to a Spark-enabled cluster and runs the Job on this cluster.

For more information, see Defining Standalone connection parameters with Spark Universal.

Synapse: Talend Studio submits Jobs and collects the execution information of your Job from Azure Synapse Analytics.

For more information, see Defining the Azure Synapse Analytics connection parameters with Spark Universal.

Yarn cluster: Talend Studio submits Jobs and collects the execution information of your Job from Yarn and the ApplicationMaster. The Spark driver runs on the cluster and can run independently from Talend Studio.

For more information, see Defining Yarn cluster connection parameters with Spark Universal.
