What's new in R2022-09 - Cloud

Talend Cloud Release Notes


Big Data: new features

Feature

Description

Support for Spark Universal 3.3.x in local mode

You can now run your Spark Batch and Streaming Jobs using Spark Universal with Spark 3.3.x in Local mode. You can configure it either in the Spark Configuration view of your Spark Jobs or in the Hadoop Cluster Connection metadata wizard.

Talend Data Mapper Big Data components and Talend Data Quality components were not compatible with Spark 3.3.x at the time of this release, so Spark Universal 3.2.x remains the default version for this patch. Spark 3.3.x is supported from version R2022-10 onwards.

Support for Databricks runtime 10.x and onwards with GCP on Spark Universal 3.2.x

You can now run your Spark Batch and Streaming Jobs on transient and interactive Databricks clusters on Google Cloud Platform (GCP) using Spark Universal with Spark 3.2.x. You can configure it either in the Spark Configuration view of your Spark Jobs or in the Hadoop Cluster Connection metadata wizard.

When you select this mode, Talend Studio is compatible with Databricks runtime versions 10.x and onwards.

Support for context group to switch from transient to interactive clusters on Databricks in Spark Jobs

You can now switch from transient to interactive clusters on Databricks using context groups. When you configure the connection to the Databricks cluster, you can add specific context groups that are used when you run your Spark Jobs. You can configure this either in the Spark Configuration view of your Spark Jobs or in the Hadoop Cluster Connection metadata wizard; the metadata wizard is the recommended option.

Enhancement of tFileInputDelimited to support dynamic schema in Spark Batch Jobs

You can now add a dynamic column to the schema of tFileInputDelimited in your Spark Batch Jobs. A dynamic schema is not fixed at design time, so you do not have to redesign your Spark Job when the schema changes later.

Support for MongoDB v4+ for Spark Batch 3.1 and onwards

Talend Studio now supports MongoDB v4+ with Spark 3.1 and onwards for the following components in your Spark Batch Jobs using Dataset:
  • tMongoDBInput
  • tMongoDBOutput
  • tMongoDBConfiguration

Data Integration: new features

Feature

Description

SSO login to Talend Studio via Talend Cloud

If you are working with Talend Cloud, you can now log in to Talend Studio via Talend Cloud, either with SSO or in the regular way. The login screen provides the following two options:
  • Log in with Talend Cloud: allows you to log in to Talend Studio via Talend Cloud.

    Note that you can set the default region for Talend Cloud by adding the -Dtalend.tmc.datacenter=<region> parameter to the Talend Studio .ini file, where <region> is the abbreviation of the region as it appears in the Talend Cloud URL.

  • Other login mode: allows you to log in to Talend Studio in the traditional mode. Use this option if you are not working with Talend Cloud.
Note: Logging in to Talend Studio via Talend Cloud is currently unsupported for Linux on ARM64.

For more information, see Launching Talend Studio.
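As a minimal sketch of the default-region setting described above, the parameter could be added to the Talend Studio .ini file as shown here. This assumes the usual Eclipse launcher layout, in which JVM options are listed one per line after the -vmargs line; the other lines of your .ini file are omitted, and <region> is left as a placeholder to replace with the abbreviation from your own Talend Cloud URL.

```text
-vmargs
-Dtalend.tmc.datacenter=<region>
```

Restart Talend Studio after editing the file so that the new parameter is read.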

Support of basic authentication to access update repositories

Talend Studio now supports basic authentication for update repositories, based on the Eclipse secure storage. For more information, see Basic authentication for update repositories in Talend Studio.

New option to clean up obsolete libraries upon update installation

You can now configure Talend Studio to automatically clean up obsolete libraries upon update installation by:
  • setting the -Dtalend.studio.m2.clean parameter to true in the Talend Studio .ini file, or
  • selecting the Clean up libraries check box in the Preferences > Talend > Update settings view, or
  • selecting the Clean up libraries check box in the update installation wizard.

This helps save disk space and reduce noise triggered by security tools scanning for vulnerabilities in libraries. Note that:

  • The Clean up libraries check box is available only after you install the 8.0 R2022-09 Studio monthly update. To automatically clean up obsolete libraries upon installing the 8.0 R2022-09 Studio monthly update itself, add the -Dtalend.studio.m2.clean parameter in the .ini file and restart Talend Studio before installing the update.
  • After cleaning up obsolete libraries, you may need to redownload all the previously installed third-party libraries.

For more information, see Updating Talend Studio.
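The .ini option from the list above might look like the following fragment. It is only a sketch: it assumes that JVM options follow the -vmargs line, as in other Eclipse-based launchers, and it omits every other line of the file.

```text
-vmargs
-Dtalend.studio.m2.clean=true
```

This is the only one of the three options that can be set before the 8.0 R2022-09 monthly update is installed, because the two check boxes are added by that update.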

Email used as Git commit author when working with Talend Cloud

If you are working on a project managed by Talend Cloud Management Console, your email address, rather than your Talend Cloud Management Console login name, is now used as the Git author and committer when you commit changes to Git in Talend Studio.

New components to connect to Google Bigtable to store or retrieve data

Talend Studio now provides the following new components to connect to Google Bigtable to store or retrieve data:
  • tBigtableInput
  • tBigtableOutput
  • tBigtableClose
  • tBigtableConnection

New option in Oracle components and metadata wizard to add globalization support

Oracle components and the metadata wizard now provide the Support NLS option, which enables globalization support for Oracle 18 and higher versions.

New component to write to Workday clients

This release provides the tWorkdayOutput component, which allows you to write data to a Workday client.

Support of Microsoft Exchange authentication in tPOP

The tPOP component now provides a Microsoft Exchange authentication mode, which allows you to fetch messages using Microsoft Exchange authentication.

ID-based pagination for the tDataStewardshipTaskInput component

The tDataStewardshipTaskInput component now provides an ID-based pagination option. Activate this option to improve performance when fetching and sorting Talend Cloud Data Stewardship tasks.

Extra log attributes via MDC when using Log4j 2

When using Apache Log4j 2 in your Jobs, the Log4j 2 MDC (Mapped Diagnostic Context) is now populated with key-value pairs. When log messages from multiple threads are written to a single file, these extra log attributes help you identify the thread that generated each message. For more information about Log4j 2 MDC, see Log4j 2 API - Thread Context.

To include the extra log attributes when writing logs, you need to configure the Log4j 2 template in Talend Studio. For more information, see Activating and configuring Log4j.
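As an illustration of what such a template change could look like, the fragment below uses the standard Log4j 2 PatternLayout conversion %X, which prints every key-value pair currently held in the MDC. The rest of the pattern is a hypothetical example, not the template shipped with Talend Studio, and the keys that Talend populates are not listed here, which is why the sketch prints the whole map rather than naming individual keys.

```xml
<!-- Hypothetical pattern: %X prints all MDC (Thread Context) entries as {key=value, ...} -->
<PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss} [%t] %-5p %c %X - %m%n"/>
```

To print a single attribute instead of the whole map, Log4j 2 also accepts %X{key}, where key is the name of the MDC entry.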

Data Mapper: new features

Feature

Description

Flat to hierarchical map

A new type of map allows you to create a hierarchical structure from a flat structure and map them. You can edit the output structure to create arrays, records and group your data based on a specific element. Once the output structure is defined, a map is created and the input is automatically mapped to the output.

Data Quality: new features

Feature

Description

Support of Spark 3.2 in local mode and on Databricks

You can now run DQ components using Apache Spark 3.2 in local mode and on Databricks, with a few exceptions:
  • tMatchIndex and tMatchIndexPredict: In local mode, they still work for Apache Spark 2.4 only. They do not work on Databricks.
  • tDataEncrypt and tDataDecrypt: They do not work on Databricks with Apache Spark 3.1 and greater.
  • tKMeansStrModel and tPredictCluster: The Streaming components do not work on Databricks with Apache Spark 3.2.

Continuous Integration: new features

Feature

Description

Talend CI Builder upgraded to version 8.0.9

Talend CI Builder is upgraded from version 8.0.8 to version 8.0.9.

Use Talend CI Builder 8.0.9 in your CI commands or pipeline scripts from this monthly version onwards until a new version of Talend CI Builder is released.

Support of basic authentication for update repositories

If basic authentication is enabled in Talend Studio, you can now use these parameters in your CI commands to safely access the Talend update repositories:
  • -Declipse.keyring and -Declipse.password (Eclipse secure storage credentials)
  • -Dtalend.studio.p2.base.user and -Dtalend.studio.p2.base.pwd (Talend Studio base repository credentials)
  • -Dtalend.studio.p2.update.user and -Dtalend.studio.p2.update.pwd (Talend Studio update repository credentials)

For more information, see Basic authentication for update repositories in Talend Studio and CI builder-related Maven parameter.
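The parameters listed above are passed to Maven as ordinary -D options. The command below is only a sketch of how they might be combined: the goals, the angle-bracket placeholders, and the assumption that the two Eclipse secure storage parameters take file paths are illustrative and not taken from these release notes, and the other parameters that a real Talend CI command needs are omitted.

```shell
# Hypothetical CI command: replace each <placeholder> with your own value
mvn clean package \
  -Declipse.keyring=<keyring-file> \
  -Declipse.password=<password-file> \
  -Dtalend.studio.p2.base.user=<base-repository-user> \
  -Dtalend.studio.p2.base.pwd=<base-repository-password> \
  -Dtalend.studio.p2.update.user=<update-repository-user> \
  -Dtalend.studio.p2.update.pwd=<update-repository-password>
```

Check the referenced documentation for the exact value each parameter expects before using this in a pipeline script.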