Talend Cloud Data Inventory architecture - Cloud

Talend Cloud Data Inventory User Guide

Version: Cloud
Language: English
Product: Talend Cloud
Module: Talend Data Inventory
Content:
  • Administration and Monitoring > Managing connections
  • Data Governance
  • Data Quality and Preparation > Enriching data
  • Data Quality and Preparation > Identifying data
  • Data Quality and Preparation > Managing datasets
Last publication date: 2024-02-28

This architecture diagram identifies the functional blocks of Talend Cloud Data Inventory.

Figure: Talend Cloud Data Inventory architecture.

This diagram is divided into two main parts: the customer-controlled environment, which sits in your local network, and the Talend Cloud environment, which is the cloud infrastructure.

Customer-controlled environment

The customer-controlled environment includes a web browser, which you use to access and manage your data assets in Talend Cloud Data Inventory, and a Remote Engine Gen2, which runs objects from the other Talend Cloud applications, creates connections, fetches data samples, and enables Data APIs. This environment can also include third-party applications that consume the Data APIs created from datasets.
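As a rough illustration of how a third-party application in your environment might address a Data API, the following Python sketch builds a query URL. It assumes that the Data API is exposed as an OData-style REST endpoint. The base URL, the `customers` entity set, the `country` field, and the `build_data_api_url` helper are hypothetical names used only for this example, and the sketch does not send any request.

```python
from urllib.parse import quote, urlencode


def build_data_api_url(base_url, entity_set, top=None, filter_expr=None):
    """Build a query URL for a Data API (hypothetical helper).

    Assumption: the API accepts the standard OData query options
    $top and $filter. Check the actual API definition before use.
    """
    params = {}
    if top is not None:
        params["$top"] = top
    if filter_expr is not None:
        params["$filter"] = filter_expr

    url = f"{base_url.rstrip('/')}/{quote(entity_set)}"
    if params:
        # Keep "$" readable in option names; percent-encode everything else.
        url += "?" + urlencode(params, safe="$", quote_via=quote)
    return url


# Hypothetical endpoint and field names, for illustration only.
url = build_data_api_url(
    "https://example.invalid/apis/customers",
    "customers",
    top=10,
    filter_expr="country eq 'FR'",
)
print(url)
```

The helper only assembles the URL. Authentication and the HTTP call itself depend on how the Data API is configured and are deliberately left out.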

Talend Cloud environment

The cloud environment includes the cloud applications, some of which rely on the Dataset service, and the Cloud Engine for Design.

  • The Dataset service provides the unified dataset list within Talend Cloud.
  • Talend Cloud Data Inventory is the central place where you access and maintain your dataset collection. You can quickly search your data, assess its quality, and rate, document, or share it with other data consumers.
  • Talend Cloud Data Preparation and Talend Cloud Pipeline Designer are the two other applications that benefit from the common dataset inventory. They allow you to cleanse or transform your data.
  • In Talend Management Console, you can administer roles, users, projects, and licenses. You can create new users for the cloud applications and assign them to custom groups. You can then define roles and assign them to your users.
  • The Cloud Engine for Design is used to run artifacts, tasks, preparations, and pipelines in the cloud, create connections, fetch data samples, and enable Data APIs.
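
The functional blocks described above can be summarized as a small data model. The following Python sketch is only an illustration of the architecture: the `Component` class, the environment labels, and the helper functions are hypothetical and do not correspond to any Talend API. The capabilities listed are the ones stated on this page.

```python
from dataclasses import dataclass

# Capabilities shared by both engines, as described on this page.
SHARED_ENGINE_TASKS = frozenset(
    {"create connections", "fetch data samples", "enable Data APIs"}
)


@dataclass(frozen=True)
class Component:
    name: str
    environment: str  # "customer" or "cloud" (illustrative labels)
    capabilities: frozenset = frozenset()


COMPONENTS = [
    Component("Web browser", "customer"),
    Component(
        "Remote Engine Gen2",
        "customer",
        SHARED_ENGINE_TASKS | {"run objects from other Talend Cloud applications"},
    ),
    Component("Third-party applications", "customer", frozenset({"consume Data APIs"})),
    Component("Dataset service", "cloud", frozenset({"provide the unified dataset list"})),
    Component("Talend Cloud Data Inventory", "cloud"),
    Component("Talend Cloud Data Preparation", "cloud"),
    Component("Talend Cloud Pipeline Designer", "cloud"),
    Component("Talend Management Console", "cloud"),
    Component(
        "Cloud Engine for Design",
        "cloud",
        SHARED_ENGINE_TASKS | {"run artifacts, tasks, preparations, and pipelines"},
    ),
]


def components_in(environment):
    """Return the names of the components in one environment, sorted."""
    return sorted(c.name for c in COMPONENTS if c.environment == environment)


def components_that_can(capability):
    """Return the names of the components offering a capability, sorted."""
    return sorted(c.name for c in COMPONENTS if capability in c.capabilities)


print(components_in("customer"))
print(components_that_can("fetch data samples"))
```

Modeled this way, the overlap between the two engines is explicit: both can create connections, fetch data samples, and enable Data APIs, and they differ in where they run and in what they execute.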