Talend Data Integration functional architecture

Talend Data Integration Getting Started Guide

author: Talend Documentation Team
EnrichVersion: 6.5
EnrichProdName: Talend Data Integration
task: Installation and Upgrade; Design and Development

The Talend Data Integration functional architecture is an architectural model that identifies the functions of Talend Data Integration, the interactions between them, and the corresponding IT needs. The overall architecture is described by isolating specific functionalities into functional blocks.

The following chart illustrates the main architectural functional blocks.

The different types of functional blocks are:

  • The Clients block includes one or more Talend Studio instances and Web browsers, which can run on the same machine or on different machines.

    From the Studio, you can carry out data integration processes regardless of the level of data volumes and process complexity. Talend Studio allows you to work on any project for which you have authorization.

    From the Web browser, you connect to:

    • the remote Talend Administration Center, through a secured HTTP connection,

    • the Talend Data Preparation Web application, where you import your data from local files or other sources, and cleanse or enrich it by creating new preparations on this data,

    • the Talend Data Stewardship Web application, where campaign owners and data stewards manage campaigns and tasks,

    • optionally, the Dictionary Service server, to add, remove or edit the semantic types used on data in the Web applications.

  • The Server block includes:

    • a web-based application server, Talend Administration Center, which enables the management and administration of all projects:
      • administration metadata (for example, user accounts, access rights and project authorizations) is stored in the Administration database,
      • project items (for example, Jobs, Business Models and Routines) are stored in the SVN or Git server.
    • the servers used by the Talend Web applications, namely Talend Data Preparation, Talend Data Stewardship and Talend Dictionary Service, and the Identity Access Management server, which enables Single Sign-On between those applications.

  • The Repositories block includes the SVN or Git server and the Nexus repository.

    The SVN or Git server centralizes all project items, such as Jobs and Business Models, that are shared between end users. It is accessible from Talend Studio, to develop project items, and from Talend Administration Center, to publish, deploy and monitor them.

    The Nexus repository is used to store:

    • software updates available for download,

    • Jobs that are published from Talend Studio and are ready to be deployed and executed.

  • The Talend Execution Servers block includes one or more execution servers, deployed inside your information system. Talend Jobs are deployed to the Job servers through the Job Conductor of Talend Administration Center, to be executed at a scheduled time or date, or in response to an event.

  • The Databases block includes the Administration, Audit and Monitoring databases. The Administration database is used to manage user accounts, access rights, project authorizations, and so on. The Audit database is used to evaluate different aspects of the Jobs implemented in projects developed in Talend Studio, with the aim of providing solid quantitative and qualitative factors for process-oriented decision support.
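
As a reading aid, the blocks listed above can be written down as a small lookup structure. The following Python sketch is purely illustrative: it is not part of any Talend product or API, and the names Block, ARCHITECTURE, STORAGE, block_of and storage_block are hypothetical names chosen for this example. It only restates which component belongs to which block and where each kind of data is stored, as described in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """One functional block and the components this section lists for it."""
    name: str
    components: tuple


# Hypothetical model of the functional blocks described in this section.
ARCHITECTURE = (
    Block("Clients", ("Talend Studio", "Web browser")),
    Block("Server", (
        "Talend Administration Center",
        "Talend Data Preparation",
        "Talend Data Stewardship",
        "Talend Dictionary Service",
        "Identity Access Management",
    )),
    Block("Repositories", ("SVN or Git server", "Nexus repository")),
    Block("Talend Execution Servers", ("Job server",)),
    Block("Databases", (
        "Administration database",
        "Audit database",
        "Monitoring database",
    )),
)

# Where each kind of data is kept, according to the descriptions above.
STORAGE = {
    "administration metadata": "Administration database",
    "project items": "SVN or Git server",
    "software updates": "Nexus repository",
    "published Jobs": "Nexus repository",
}


def block_of(component):
    """Return the name of the functional block that contains a component."""
    for block in ARCHITECTURE:
        if component in block.components:
            return block.name
    raise KeyError(component)


def storage_block(kind):
    """Return the functional block that stores a given kind of data."""
    return block_of(STORAGE[kind])
```

For instance, storage_block("project items") resolves to the Repositories block, while storage_block("administration metadata") resolves to the Databases block, which mirrors the split between project items and administration metadata described for Talend Administration Center.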