The diagram is divided into two main parts: the Talend Cloud infrastructure and the customer's local network or Virtual Private Cloud (VPC).
Cloud infrastructure
- In Talend Cloud Management Console, you can administer roles, users, projects, engines, and licenses. Talend Cloud Management Console is also where you define the Remote Engine Gen2 and its corresponding run profiles, in which you can customize the resources allocated to executions.
- The Dataset service provides the unified dataset list within Talend Cloud. Talend Cloud Data Inventory is the central place where you access and maintain your dataset collection. Talend Cloud Data Preparation and Talend Cloud Pipeline Designer are the two other applications that benefit from the common dataset inventory; they allow you to cleanse or transform your data.
- The Cloud Engine for Design and its corresponding run profile come embedded by default in Talend Cloud Management Console to help users get started quickly with the apps, but installing the secure Remote Engine Gen2 is recommended for advanced data processing. These engines are used to run artifacts, tasks, preparations, and pipelines in the cloud, as well as to create connections and fetch data samples.
Customer's Virtual Private Cloud
- The local Spark engine (default)
- A Spark on Yarn cluster
- Serverless clusters (local Spark engine, Spark-on-Yarn cluster, etc.)