Talend Cloud Data Preparation architecture - Cloud

Talend Cloud Data Preparation User Guide

Version: Cloud
Language: English (United States)
Product: Talend Cloud
Module: Talend Data Preparation
Content: Administration and Monitoring > Managing connections; Data Quality and Preparation > Cleansing data; Data Quality and Preparation > Managing datasets

This architecture diagram identifies the functional blocks of Talend Cloud Data Preparation.

The diagram is divided into two main parts: the local network and the cloud infrastructure.

Local network

The local network includes a web browser, Talend Studio, a Remote Engine Gen 1, and a Remote Engine Gen2.

From your web browser, you can access the Talend Cloud Data Preparation application.

From Talend Studio, you can use the Talend Cloud Data Preparation features through the tDatasetInput, tDatasetOutput, and tDataprepRun components. You can create datasets from various databases and export them to Talend Cloud Data Preparation, or use a preparation directly in a data integration Job or Spark Job.

The Remote Engine Gen 1 is used to run the Jobs that use the Data Preparation components, and to run artifacts and tasks on premises.

The Remote Engine Gen2 is used to run objects from the Talend Cloud applications, as well as to create connections and fetch data samples.

Cloud infrastructure

The cloud infrastructure includes Talend Cloud Data Preparation, which relies on the Dataset service, and the Cloud Engine for Design.
  • The Dataset service provides the unified dataset list for Talend Cloud Data Preparation, Talend Cloud Data Inventory, and Talend Cloud Pipeline Designer.
  • In Talend Cloud Management Console, you can administer roles, users, projects, and licenses. You can create users for the cloud applications and assign them to custom groups, then define roles and assign them to those users. Talend Cloud Management Console is also used to import your license files and to create projects to collaborate on in Talend Studio. In addition, you can enable data and file transfer, data integration, and access to shared data sources for web users. For example, you can import and use preconfigured sample Tasks, or design Tasks that automate the exchange and synchronization of data between applications.
  • In Talend Cloud Data Preparation, you can import your data from local files or other sources, and cleanse or enrich it by creating new preparations.
  • In Talend Dictionary Service, you can add, remove, or modify the semantic categories that are applied to each column of your data when it is opened in Talend Cloud Data Preparation.

The Cloud Engine for Design is used to run artifacts, tasks, and preparations in the cloud, as well as to create connections and fetch data samples.
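
The division of work between the three engines described on this page can be summarized as a small lookup table. The following Python sketch is only an illustration of that description, not a Talend API: the `Engine` class, the `ENGINES` tuple, the `engines_for` helper, and the capability strings are all hypothetical names chosen for this example, and the capabilities listed are limited to those stated in the text above.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Engine:
    """Hypothetical record describing one engine from the architecture diagram."""
    name: str
    location: str  # "local network" or "cloud infrastructure"
    capabilities: FrozenSet[str]


# Capabilities are taken from the descriptions on this page only;
# the wording of each string is an assumption made for this sketch.
ENGINES = (
    Engine(
        "Remote Engine Gen 1",
        "local network",
        frozenset({
            "run Jobs that use Data Preparation components",
            "run artifacts",
            "run tasks",
        }),
    ),
    Engine(
        "Remote Engine Gen2",
        "local network",
        frozenset({
            "run objects from Talend Cloud applications",
            "create connections",
            "fetch data samples",
        }),
    ),
    Engine(
        "Cloud Engine for Design",
        "cloud infrastructure",
        frozenset({
            "run artifacts",
            "run tasks",
            "run preparations",
            "create connections",
            "fetch data samples",
        }),
    ),
)


def engines_for(capability: str, location: Optional[str] = None) -> List[str]:
    """Return the names of engines offering a capability, optionally filtered by location."""
    return [
        engine.name
        for engine in ENGINES
        if capability in engine.capabilities
        and (location is None or engine.location == location)
    ]


if __name__ == "__main__":
    print(engines_for("fetch data samples"))
    print(engines_for("run tasks", location="local network"))
```

For instance, `engines_for("fetch data samples")` lists both the Remote Engine Gen2 and the Cloud Engine for Design, which reflects that data samples can be fetched either on the local network or in the cloud.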