What is hybrid for Talend Cloud - Cloud

Talend Cloud Hybrid Installation Guide for Linux


Talend Cloud lets you install and host Talend Data Stewardship and Talend Data Preparation applications on-premises. This setup allows you to store sensitive data behind your firewall, while still managing your users and the rest of your platform from Talend Cloud.

Hybrid architecture

Applications that are available for hybrid setup are:

  • Talend Data Stewardship from version 7.2.
  • Talend Data Preparation from version 7.2.

You can enable hybrid individually for each of these applications. Although installed on-premises, they are still bound to your Talend Cloud license. Platform administration, such as user and account management, remains in Talend Cloud for all applications.

All communication between hybrid Talend Data Stewardship or Talend Data Preparation and Talend Cloud is initiated from the hybrid applications, and never from Talend Cloud to the hybrid applications. Consequently, you do not need to allow inbound communication from the Internet to your hybrid applications.
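Because the hybrid applications always initiate the connection, the firewall rule that matters is outbound access from the hybrid host to Talend Cloud. The following is a minimal sketch of how that outbound reachability could be checked from the hybrid host. The function name, the default port 443, and the `connect` parameter (which exists only so the check can be exercised without a network) are assumptions made for illustration, not part of any Talend tooling.

```python
import socket
from typing import Callable


def can_open_outbound(host: str, port: int = 443, timeout: float = 5.0,
                      connect: Callable = socket.create_connection) -> bool:
    """Return True if an outbound TCP connection to host:port can be opened.

    Port 443 is an assumption (HTTPS); adjust it to your environment.
    """
    try:
        # The connection is opened FROM this machine TO the remote host,
        # mirroring the direction used by the hybrid applications.
        conn = connect((host, port), timeout)
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
    conn.close()
    return True
```

To use it, call `can_open_outbound("<your Talend Cloud host>")` from the machine that hosts the hybrid application, replacing the placeholder with the endpoint for your region. A result of `False` suggests that an outbound firewall or proxy rule is blocking the connection. No inbound rule needs to be tested.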

Note that even with applications located on different domains, logging in to Talend Cloud and switching between the apps works seamlessly.

As you are installing on-premises versions of the applications, refer respectively to the Talend Data Preparation User Guide and Talend Data Stewardship User Guide to learn more about the capabilities they offer.

How are users managed?

User management for hybrid applications is centralized in Talend Cloud Management Console, just as for any Talend Cloud application. With the appropriate roles and permissions for Talend Data Preparation and Talend Data Stewardship, users will be able to access the configured hybrid application.

For more information about user management in Talend Cloud, refer to the Talend Cloud Management Console documentation.

How can I define user preferences?

You can manage your profile preferences such as language preferences for hybrid applications directly in Talend Cloud under Profile preferences. Any change to profile preferences will be applied after reconnecting to Talend Cloud.

Can I migrate my current deployment to hybrid?

Hybrid is available from Talend Cloud Platform level licenses.

Hybrid setup is available for new Talend Cloud deployments as well as for migrating existing on-premises Talend Data Stewardship and Talend Data Preparation deployments.

Migrating an existing deployment to hybrid is different from a classic on-premises migration and requires performing specific steps described in How to migrate on-premises Data Preparation and Data Stewardship to hybrid mode on Talend Cloud.

For fresh installations, refer to How to enable hybrid for Talend Cloud from scratch.

An alternative installation method via RPM and Ansible is also available for the applications mentioned earlier in this document. Refer to https://github.com/Talend/ansible-talend-platform for more information.

Will my hybrid apps be automatically upgraded with Talend Cloud?

No. Setting up hybrid for these applications requires you to install and update them, as well as their direct dependencies, manually, according to the on-premises product release planning and lifecycle. Automatic Talend Cloud updates will not be available for the hybrid applications. Direct dependencies include Kafka, MongoDB, Tomcat and Talend Dictionary Service if you have the license for it.

Refer to the Support Statements page to learn about the supported versions for hybrid.

Are there functional limitations for hybrid applications compared to their Talend Cloud counterparts?

Some limitations apply when you set up hybrid.

For Talend Data Preparation:

  • Changes made in the profile preferences from Talend Cloud Management Console, such as selecting the interface language, require logging in again for the change to take effect.
  • The German language is not supported in the hybrid mode of Talend Data Preparation.

In addition, with a hybrid setup, you will not be able to benefit from several new features and improvements in the Talend Data Preparation experience brought by the Talend Cloud common inventory of datasets:

  • Concept of reusable connections. To create a remote dataset, stored in Salesforce or Amazon S3 for example, you would usually use the Add dataset button, select the platform, and enter your connection information each time. In the Cloud version, you can set up this connection information only once, save it as a reusable Connection, and reuse it to create new datasets any time. These connections to your datastores are listed in the new Connections tab.
  • Extended native connectivity. A whole new range of connection types is now available natively in the application. Create preparations on datasets from databases, file systems, distributed systems, platforms and more.
  • Direct upload for local files. In the Datasets page, a new Drop a file or browse button is available, allowing you to quickly and easily import your local files. You can either drag and drop your file on the datasets page, or browse using the explorer. A form then opens where you can set some configuration for the dataset, or just Auto-detect the parameters.
  • New indicators in the dataset list. When opening your list of datasets, you would benefit from new columns, containing new indicators.
    • First of all, a quality bar detailing the distribution of empty, valid, and invalid records across the dataset. Point your mouse over each color to access the exact percentage and number of records.
    • In addition, a new feature in the application allows you to apply a rating score on the dataset based on its quality and other personal criteria. The rating score that you can see in the dataset list is an average of the scores applied by all the users who have access to the dataset.
    • Finally, the trust score, represented by the shields icon, gives you at a glance an overall score of the quality and completeness of your dataset. It aggregates several indicators such as the quality itself, or the presence of a rating score or certification.
  • More flexible sharing. The new sharing dialog allows you to assign a role to other users when sharing connections, datasets, or preparation folders with them. The Viewer, Editor, or Owner roles all come with different levels of permissions on the actions that can be performed on shared objects. To assign a specific role to a collaborator, open the sharing dialog, select the user or group you want to share your object with, and click Add as.... The role you have assigned someone can be updated anytime, and you can even remove yourself from the list of contributors on a specific shared object.
  • An important change regarding the preparation creation process in the Cloud version is that it is no longer possible to add a preparation based on a dataset imported on the fly. When using the Add preparation button, you will only be able to create a preparation based on one of your existing datasets. However, another way to easily create a preparation has been introduced. Directly from your list of datasets, point your mouse over a dataset, and select the Talend Cloud Data Preparation icon. Click Add to start cleansing your data right away.
  • Dataset provenance and destination. In addition to its role as preparation creation shortcut, the Talend Cloud Data Preparation button that appears when pointing your mouse over a dataset has another useful purpose. When you click this icon for a given dataset, you will be able to see all the preparations that have been created from it, along with their creator, giving you more insight on how your data is used.
  • Make line as header. This function will not be available from the functions panel of your preparations anymore. Instead, you can select which row to use as header for your dataset in the dataset properties at import time.
  • Excel files with multiple worksheets. When uploading an Excel file that contains multiple sheets, only the first one will be imported by default, but you can choose which sheet to import in the dataset creation form. However, the Auto-detect feature is not supported for such files.

For Talend Data Stewardship:

  • You cannot set up email notifications for Talend Data Stewardship from Talend Cloud.
  • Changes made in the profile preferences from Talend Cloud Management Console, such as selecting the interface language, require logging in again for the change to take effect.
  • The data quality rules are not available in the hybrid mode of Talend Data Stewardship.

As you are installing on-premises versions of the applications, refer respectively to the Talend Data Preparation User Guide and Talend Data Stewardship User Guide to learn more about the capabilities they offer.

Can I set up multiple instances of a hybrid application?

Yes. For example, you can have dev, staging, and production instances of a hybrid app. However, the configuration allows you to set up seamless navigation between Talend Cloud and each hybrid application for one instance only, for example the production instance. Users will have to access the other instances by typing their URL.

Even with multiple instances of an app, user management is still centralized in Talend Cloud Management Console.
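
The one-instance constraint described above can be pictured with a small inventory model. This is only an illustrative sketch: the `HybridInstance` class, its `linked` flag, and the example URLs are hypothetical names invented here, and they do not correspond to any actual Talend configuration file or API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class HybridInstance:
    name: str             # for example "dev", "staging", or "production"
    url: str              # hypothetical URL where the instance is hosted
    linked: bool = False  # True for the one instance with seamless navigation


def linked_instance(instances: List[HybridInstance]) -> Optional[HybridInstance]:
    """Return the instance linked to Talend Cloud, or None if there is none.

    Raises ValueError if more than one instance is marked as linked,
    since seamless navigation can target one instance only.
    """
    linked = [i for i in instances if i.linked]
    if len(linked) > 1:
        names = ", ".join(i.name for i in linked)
        raise ValueError(f"only one instance can be linked, got: {names}")
    return linked[0] if linked else None


def direct_urls(instances: List[HybridInstance]) -> Dict[str, str]:
    """Map each non-linked instance name to the URL users must type directly."""
    return {i.name: i.url for i in instances if not i.linked}
```

With a production instance marked as linked and a dev instance left unlinked, `linked_instance` returns the production entry and `direct_urls` lists the dev URL that users need to enter themselves.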

Can Talend Data Preparation and Talend Data Stewardship share the same Talend Dictionary Service?

Yes, as long as Talend Data Preparation and Talend Data Stewardship are both in hybrid mode.

If you have only one of them set to hybrid, then the hybrid application will need a specific Talend Dictionary Service installation.