Talend Cloud Data Preparation architecture - Cloud

Talend Cloud Data Preparation User Guide

Talend Documentation Team

This architecture diagram identifies the functional blocks of Talend Cloud Data Preparation.

The diagram is divided into two main parts: the local network and the cloud infrastructure.

Local network

The local network includes a web browser, Talend Studio, and a Remote Engine Gen2.

From your web browser, you can access the Talend Cloud Data Preparation application.

From Talend Studio, you can:
  • Publish data integration Jobs to Talend Cloud Management Console as Tasks, make them available to web users, and run them in the cloud.
  • Use the Talend Cloud Data Preparation features through the tDatasetInput, tDatasetOutput, and tDataprepRun components. You can create datasets from various databases and export them to Talend Cloud Data Preparation, or use a preparation directly in a data integration Job or Spark Job.

The Remote Engine Gen2 is used to run preparations and objects from the other Talend Cloud applications on premises, as well as to create connections and fetch data samples.

Cloud infrastructure

The cloud infrastructure includes Talend Cloud Data Preparation, which relies on the Dataset service, and the Cloud Engine for Design.
  • The Dataset service provides the unified dataset list within Talend Cloud.
  • In Talend Cloud Management Console, you can administer roles, users, projects, and licenses. You can create new users for the cloud applications and assign them to custom groups. You can then define roles and assign them to your users. Talend Cloud Management Console is also used to import your license files and create projects to collaborate on in Talend Studio. In addition, you can enable data and file transfer, data integration, and access to shared data sources for web users. You can, for example, import and use preconfigured sample Tasks, or design Tasks that automate the exchange and synchronization of data between applications.
  • In Talend Cloud Data Preparation, you can import your data from local files or other sources, and cleanse or enrich it by creating new preparations.
  • In Talend Dictionary Service, you can add, remove, or modify the semantic categories that are applied to each column in your data when opened in Talend Cloud Data Preparation.

The Cloud Engine for Design is used to run artifacts, tasks, and preparations in the cloud, as well as to create connections and fetch data samples.
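
To summarize how the two engines divide the work, the following Python sketch models them as plain data. It is only an illustration of the description above, not a Talend API: the class names, field names, and helper functions are all hypothetical, and the workload labels are taken from the wording on this page.

```python
from dataclasses import dataclass
from enum import Enum


class Zone(Enum):
    """The two parts of the architecture diagram."""
    LOCAL_NETWORK = "local network"
    CLOUD = "cloud infrastructure"


@dataclass(frozen=True)
class Engine:
    """Hypothetical record describing one engine (not a Talend class)."""
    name: str
    zone: Zone
    runs: frozenset


# Both engines also create connections and fetch data samples.
SHARED_DUTIES = frozenset({"create connections", "fetch data samples"})

ENGINES = (
    Engine(
        name="Remote Engine Gen2",
        zone=Zone.LOCAL_NETWORK,
        runs=frozenset({"preparations", "objects from other Talend Cloud applications"}),
    ),
    Engine(
        name="Cloud Engine for Design",
        zone=Zone.CLOUD,
        runs=frozenset({"artifacts", "tasks", "preparations"}),
    ),
)


def engines_for(workload: str) -> list:
    """Return the names of the engines that can run the given workload, sorted."""
    return sorted(e.name for e in ENGINES if workload in e.runs)


def engine_in(zone: Zone) -> Engine:
    """Return the engine located in the given part of the architecture."""
    return next(e for e in ENGINES if e.zone is zone)


if __name__ == "__main__":
    print(engines_for("preparations"))
    print(engine_in(Zone.LOCAL_NETWORK).name)
```

In this model, a preparation can run on either engine, so the choice comes down to where the data should be processed: on premises with the Remote Engine Gen2, or in the cloud with the Cloud Engine for Design.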