Run profiles - Cloud

Talend Cloud Pipeline Designer User Guide

Talend Cloud Pipeline Designer lets you select a predefined, preconfigured execution environment, the Remote Engine Gen2, that an administrator has previously created in Talend Cloud Management Console.

For more information on how to create the engine and its run profiles in Talend Cloud Management Console, see the Talend Cloud Management Console User Guide. The run profiles available in the web application are listed below.

Note: To use Spark and Hadoop with Talend Cloud Pipeline Designer, you must have a Talend Cloud subscription with Big Data.

Remote Engine Gen2 run profiles

Description

Standard/Local Spark

Default run profile: the Apache Spark runner runs locally on your machine. This profile is intended for development and testing.

Big Data/Spark on Yarn

The Apache Spark runner runs in cluster mode on EMR 5.x (Hadoop 2.7 YARN).

Databricks

This run profile allows you to run Spark pipelines on a Databricks cluster. For instructions on how to set up a Databricks cluster to be used with Talend Cloud Pipeline Designer, read this procedure.

Note: The first execution of a pipeline on the cluster takes longer than subsequent ones because dependencies are deployed to the Databricks File System (DBFS). To upload these dependencies to DBFS manually and significantly reduce the duration of the first execution, follow this procedure.

Advanced

A customizable execution profile in which you can enter your runner JSON properties.
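
Because the Advanced profile takes its properties as JSON, it can help to check that the text is well-formed before entering it. The following is a minimal sketch in Python, not part of the product: the helper name check_runner_properties and the property keys in the example are hypothetical placeholders, and the assumption that the properties form a single JSON object is ours. Refer to the Talend documentation for the properties the engine actually accepts.

```python
import json


def check_runner_properties(text):
    """Parse text as JSON and confirm the top level is a JSON object.

    Hypothetical helper for illustration; the requirement that runner
    properties form one JSON object is an assumption, not a documented rule.
    """
    props = json.loads(text)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(props, dict):
        raise ValueError("runner properties must be a JSON object")
    return props


if __name__ == "__main__":
    # Placeholder keys for illustration only; they are not documented
    # Talend runner properties.
    example = '{"example.property.one": "value", "example.property.two": "2"}'
    props = check_runner_properties(example)
    # Pretty-print the validated properties before pasting them into the profile.
    print(json.dumps(props, indent=2, sort_keys=True))
```

A malformed string raises json.JSONDecodeError, whose message gives the line and column of the problem, which is usually easier to act on than an error reported at execution time.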