Adding the latest Big Data Platform dynamically (Dynamic Distributions) - Cloud - 7.3

Talend Studio User Guide

Version: Cloud 7.3
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development
Last publication date: 2024-02-13
Available in: Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform

If Talend Studio does not support the Big Data Platform distribution you want to use, follow the procedure below to add that distribution yourself and make it available in Studio.

In the current Studio version, you can use this procedure to add Cloudera and Hortonworks distributions only. This procedure uses Cloudera Data Platform to demonstrate how to add a dynamic distribution to Studio.

This dynamic support feature lets you use a Cloudera or Hortonworks version that had not yet been released when your Talend Studio was released, by adding that version yourself.

The dynamic distributions added this way are generally minor versions of a Talend-certified major release of your distribution. Talend relies on the distribution vendors' compatibility statements to ensure that Studio is compatible with these minor versions and, on that basis, provides official support for use cases on these minor versions as well as on the Talend-certified versions. For more information about the Talend-certified distribution versions and Talend's general support policy for certified and compatible versions, see the Talend Installation Guide.
  • In the version list of a distribution, some versions are labeled Builtin. Talend added these versions through the Dynamic distribution mechanism and delivered them with Studio at release time. They are certified by Talend and are therefore officially supported and ready to use.
Note: For the Cloudera distribution, Talend recommends that you use the CDP 7.x built-in distributions rather than a CDP dynamic distribution.

Procedure

  1. In the Integration perspective, click File > Edit Project properties to open the Project Settings dialog box.
  2. Click General > Dynamic distribution settings to open its view.
  3. From the Distribution drop-down list, select Cloudera Data Platform.
  4. Set up your local Nexus repository to store the dynamic distribution jar files to be downloaded.
    While not mandatory, this step allows other users or other Studio instances to download the same jar files much faster.
    1. Set up a proxy on your local Nexus repository and link this proxy to the dedicated Talend proxy: https://talend-update.talend.com/nexus/content/groups/dynamicdistribution/.
      The credentials to be used to connect to this Talend proxy are:
      • Username: studio-dl-client
      • Password: studio-dl-client

      When you create your local proxy, define credentials specific to this local proxy. For more information on how to create a Nexus proxy, see Proxy settings in the official Nexus documentation.

    2. Click General > Artifact Proxy Setting to open its view, then select the Override default setup check box to activate the Repository field.
    3. In the Repository field, enter the URL of your local proxy and the credentials defined for this proxy.
    4. Click Check Connection to verify that the connection works.
  5. Go back to the Dynamic distribution settings view and click Dynamic distribution setup.
  6. Select the Create new dynamic configuration radio button and click Refresh to display, in the Version drop-down list, the Cloudera versions available in the connected Cloudera repository.
  7. Select the Cloudera version for which you want to generate the configuration to be used by Studio.
  8. Click Finish.

    Studio starts retrieving the configuration files for this distribution from the Cloudera repository. This retrieval may take a while.

    Once the retrieval is complete, the [Dynamic distribution setup] wizard closes automatically and returns you to the Dynamic distribution settings view. The newly generated dynamic distribution for the version you selected is displayed in the Version list.

  9. Repeat these operations to add more versions if needed. Otherwise, click Apply and Close to close the Project Settings dialog box.
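
Before entering proxy details in Studio, it can help to confirm that the Talend proxy named in step 4 answers with the documented credentials. The following Python sketch builds an authenticated request for that URL using only the standard library. The helper names build_request and check_proxy are illustrative and are not part of Talend Studio or Nexus, and the sketch assumes the repository accepts HTTP Basic authentication, which this guide does not state.

```python
import base64
import urllib.error
import urllib.request

# URL and credentials as given in step 4 of the procedure.
TALEND_PROXY_URL = "https://talend-update.talend.com/nexus/content/groups/dynamicdistribution/"
USERNAME = "studio-dl-client"
PASSWORD = "studio-dl-client"


def build_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Return a GET request carrying an HTTP Basic Authorization header.

    Assumption: the repository accepts Basic authentication.
    """
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def check_proxy(url: str, username: str, password: str, timeout: float = 10.0) -> bool:
    """Return True if the repository answers the authenticated request with HTTP 200."""
    request = build_request(url, username, password)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers HTTP errors (401, 404, ...), DNS failures, and timeouts.
        return False
```

Calling check_proxy(TALEND_PROXY_URL, USERNAME, PASSWORD) returns True only when the server replies with HTTP 200. To test your local proxy instead, pass its URL and the credentials you defined for it.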

Results

You can then use this new version the same way as the built-in distributions provided with Talend Studio, which means you can:
  • Set up the connection to this dynamic distribution in the Repository and reuse this connection in Talend Jobs.

  • Use this dynamic distribution directly in your Jobs. If you build your Job to generate executable files in a zip archive and need to run them on Windows, use the .ps1 script instead of the .bat script.
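
The Windows guidance above can be captured in a small launcher helper. The Python sketch below chooses the launch script for a built Job by operating system and selects the .ps1 script on Windows. The <JobName>_run.<ext> file naming and the launcher_command helper are assumptions made for illustration, so check the contents of your own build archive before relying on them.

```python
import platform
from pathlib import Path
from typing import List, Optional


def launcher_command(job_dir: str, job_name: str, system: Optional[str] = None) -> List[str]:
    """Build the command line that starts a built Job.

    Assumption: the unzipped archive contains <job_name>_run.ps1 and
    <job_name>_run.sh next to each other in job_dir.
    """
    system = system or platform.system()
    base = Path(job_dir)
    if system == "Windows":
        # Use the PowerShell script rather than the .bat script.
        script = base / f"{job_name}_run.ps1"
        return ["powershell", "-ExecutionPolicy", "Bypass", "-File", str(script)]
    # Other systems: run the shell script.
    script = base / f"{job_name}_run.sh"
    return ["sh", str(script)]
```

The returned list can be passed to subprocess.run(cmd, check=True). The -ExecutionPolicy Bypass option applies only to that PowerShell invocation; drop it if your environment's policy already allows the script to run.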

Although you can usually export a Job with its dependencies such as a connection defined in the Repository, the connection to a dynamic distribution cannot be exported the same way. If you need to export such a connection, see Export or import the configuration of a dynamic Big Data platform distribution.