What's new in R2021-03 - Cloud

Talend Cloud Release Notes


Big Data: new features

Support of Databricks 7.3 LTS with Spark 3.0 components
You can now run Spark Batch and Spark Streaming Jobs on the Databricks 7.3 LTS distribution, on both AWS and Azure, for interactive and transient clusters, with Spark 3.0. The following components are supported:
  • tAvroInput and tAvroOutput
  • tAzureFSConfiguration
  • tFileInputDelimited and tFileOutputDelimited
  • tFileInputJSON and tFileOutputJSON
  • tFileInputParquet and tFileOutputParquet
  • tFileInputXML and tFileOutputXML
  • tFixedFlowInput
  • tLogRow
  • tS3Configuration

This feature is not in technical preview anymore.

Support of CDP Public Cloud Data Hub on AWS
You can now configure a CDP Public Cloud Data Hub instance on AWS in Cloudera Management Console to run your Job on a remote JobServer in Talend Studio. This allows you to configure your cluster by directly choosing a Data Hub cluster definition that matches your needs (for example, Data Engineering for AWS or Data Discovery and Exploration for AWS). You then only have to import the configuration files from that cluster into Talend Studio to launch your Job.

This feature allows you to take advantage of the elasticity of a cloud cluster directly in CDP Public Cloud.

Support of Service Account and OAuth2 Access Token authentication for the Google Cloud Platform distribution in Spark Batch Jobs
You can now authenticate to Google Cloud Platform with either a Service Account or an OAuth2 Access Token in Spark Batch Jobs using Dataproc 1.4. These authentication methods are available in the Spark configuration view of your Spark Batch Job.
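As a rough illustration of the difference between the two methods, the sketch below builds the kind of Spark/Hadoop properties used by the open-source Google Cloud Storage connector; the exact keys Talend Studio generates for Dataproc may differ, and the helper name is made up.

```python
# Illustration only: these property names come from the open-source Google
# Cloud Storage Hadoop connector; the keys Talend Studio actually generates
# for Dataproc may differ.

def gcp_spark_auth_conf(method, credential):
    """Return Spark properties for one of the two authentication methods."""
    if method == "service_account":
        # Service Account: point the connector at a JSON key file.
        return {
            "spark.hadoop.google.cloud.auth.service.account.enable": "true",
            "spark.hadoop.google.cloud.auth.service.account.json.keyfile": credential,
        }
    if method == "oauth2_token":
        # OAuth2 Access Token: a token provider class is configured instead
        # of a key file (the class name passed in is a placeholder).
        return {
            "spark.hadoop.google.cloud.auth.service.account.enable": "false",
            "spark.hadoop.google.cloud.auth.access.token.provider.impl": credential,
        }
    raise ValueError(f"unknown authentication method: {method}")

conf = gcp_spark_auth_conf("service_account", "/secrets/sa-key.json")
```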
Update of the tCollectAndCheck component in Spark Jobs
You can now directly check input data with tCollectAndCheck in Spark Jobs. The following input types are supported:
  • Text
  • Parquet
  • MySQL
  • Hive
  • Delta
  • Snowflake
  • Redshift
  • JDBC
  • HBase
For Spark Batch Jobs, the component is now connected as follows:

In Spark Batch Jobs, the component checks that the number of rows is exactly as expected and that the row values are correct.

For Spark Streaming Jobs, the component checks the data at the end of the Job execution, after a timeout, as follows:

The component checks that the values are correct. In Spark Streaming Jobs, the input can be empty and the rows do not have to be unique.
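The two checking modes can be sketched in plain Python (an analogy only; tCollectAndCheck itself is configured in the Studio, and the helper names below are made up):

```python
from collections import Counter

def check_batch(actual_rows, expected_rows):
    # Spark Batch semantics: exact row count and exact values,
    # regardless of row order.
    return Counter(actual_rows) == Counter(expected_rows)

def check_streaming(actual_rows, expected_rows):
    # Spark Streaming semantics: every received value must be expected,
    # but the input may be empty and rows need not be unique.
    return set(actual_rows) <= set(expected_rows)

assert check_batch([("a", 1), ("b", 2)], [("b", 2), ("a", 1)])
assert not check_batch([("a", 1)], [("a", 1), ("a", 1)])   # count must match
assert check_streaming([], [("a", 1)])                     # empty input accepted
assert check_streaming([("a", 1), ("a", 1)], [("a", 1)])   # duplicates accepted
```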

Data Integration: new features


Enhancement of code dependency management
Talend Studio now allows you to create custom routine jars, package multiple user routines into a custom routine jar, and set up custom routine jar dependencies on Jobs and Joblets.

By setting up custom routine jar dependencies on Jobs and Joblets, the code dependencies of Jobs and Joblets become more explicit, which helps reduce dependency conflicts.

Note: By default, user routines migrated from any previous version of Talend Studio are all saved under the new Code > Global Routines node.
tELTOracleMap enhancement
The ELT Map editor of the tELTOracleMap component now provides a new Property Settings dialog box, which contains two options:
  • Delimited identifiers: with this check box selected, double quotes are added around all output column names to support delimited identifiers.
  • Automatic alias: with this check box selected, if a schema column has a different name from its database column, an alias is automatically created in the SQL query for that column.
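A minimal sketch of how these two options affect the generated SQL (the helper below is hypothetical and only mimics the behavior described above, not the actual code tELTOracleMap generates):

```python
def build_select(columns, table):
    """columns: list of (schema_name, db_name) pairs.

    Mimics 'Delimited identifiers' (double quotes around column names) and
    'Automatic alias' (an alias whenever schema and DB names differ)."""
    parts = []
    for schema_name, db_name in columns:
        expr = f'"{db_name}"'
        if schema_name != db_name:
            expr += f' AS "{schema_name}"'   # automatic alias
        parts.append(expr)
    return f'SELECT {", ".join(parts)} FROM {table}'

sql = build_select([("id", "id"), ("customerName", "CUST_NAME")], "ORDERS")
```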
Enhancement in Git conflict resolution for tELTMap
Talend Studio now supports comparing conflicts for tELTMap in the Job compare editor.
Google Drive components: new Read Timeout option
The Read Timeout option has been added to the Advanced settings view. This option is available for tGoogleDriveConnection, tGoogleDriveCopy, tGoogleDriveCreate, tGoogleDriveDelete, tGoogleDriveGet, tGoogleDriveList, and tGoogleDrivePut.
Support of the MongoDB 4.4 API for MongoDB and CosmosDB components
You can now connect MongoDB and CosmosDB components to MongoDB 4.4.
Performance enhancement for MongoDB and CosmosDB components in Standard Jobs
MongoDB and CosmosDB components now provide the following options in Standard Jobs:
  • For input components, you can now define the set of fields to be returned from the database with the Specify fields to return option.
  • For output components, you can now delete all the documents in the target collection with the Delete all documents option.
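The effect of the Specify fields to return option can be sketched with plain dictionaries (server-side this corresponds to a MongoDB projection, such as `{"name": 1}` in a `find` call; the helper below is hypothetical and only illustrates the semantics):

```python
def project(documents, fields):
    # Mimic 'Specify fields to return': keep only the listed fields of each
    # document. Returning fewer fields reduces the data transferred, which
    # is where the performance gain comes from.
    return [{k: d[k] for k in fields if k in d} for d in documents]

docs = [{"_id": 1, "name": "Ada", "city": "Paris"},
        {"_id": 2, "name": "Bob"}]

projected = project(docs, ["name"])
# 'Delete all documents' corresponds to removing every document from the
# target collection before writing (delete_many({}) in MongoDB terms).
```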

New SingleStore components

The following three new SingleStore components are now available. They provide better performance when loading data into database tables.

  • tSingleStoreBulkExec
  • tSingleStoreOutputBulk
  • tSingleStoreOutputBulkExec
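The performance gain comes from the usual bulk-load pattern: stage all rows into one delimited file, then load the whole file with a single statement instead of issuing one INSERT per row. A rough Python analogy (helper names and paths are made up; SingleStore is MySQL-wire-compatible and supports a LOAD DATA statement, but the exact SQL the components generate may differ):

```python
import csv
import io

def stage_bulk_file(rows):
    # tSingleStoreOutputBulk analogue: stage all rows into one delimited file.
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()

def bulk_exec_sql(path, table):
    # tSingleStoreBulkExec analogue: a single LOAD DATA statement loads the
    # whole staged file in one round trip.
    return (f"LOAD DATA LOCAL INFILE '{path}' "
            f"INTO TABLE {table} FIELDS TERMINATED BY ','")

data = stage_bulk_file([(1, "Ada"), (2, "Bob")])
sql = bulk_exec_sql("/tmp/customers.csv", "customers")
```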

New option for granting AWS-predefined permissions to S3 resources

A new option, Canned Access Control, is now provided in the following components for you to grant AWS-predefined permissions to S3 resources:
  • tS3BucketCreate
  • tS3Copy
  • tS3Put
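Canned ACLs are fixed permission sets defined by AWS (for example `private` or `public-read`). The sketch below validates a canned ACL name and builds the parameters a boto3-style `put_object` call would receive; it is an illustration of the concept, not what the components execute internally.

```python
# The canned ACL names are defined by Amazon S3.
CANNED_ACLS = {
    "private", "public-read", "public-read-write", "authenticated-read",
    "aws-exec-read", "bucket-owner-read", "bucket-owner-full-control",
}

def put_object_params(bucket, key, acl):
    # Build the keyword arguments a boto3-style s3.put_object call would
    # take; the Canned Access Control option maps to the ACL value.
    if acl not in CANNED_ACLS:
        raise ValueError(f"not a canned ACL: {acl}")
    return {"Bucket": bucket, "Key": key, "ACL": acl}

params = put_object_params("my-bucket", "report.csv", "public-read")
```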

Data Mapper: new features


EDIFACT importer
You can now import UN/EDIFACT specifications as ZIP files to create structures. Talend Data Mapper supports specifications starting from release D.96A.
XPath functions in the Distinct Child Element property
In the SimpleLoop function, if you select Element XPath in the Distinct Option field, you can now use an XPath function to define distinct values.
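As a rough illustration of deriving distinct loop values from an XPath-selected element (the XML and element names here are made up, and Python's ElementTree supports only a limited XPath subset, without XPath functions):

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<orders>
  <order><country>FR</country></order>
  <order><country>US</country></order>
  <order><country>FR</country></order>
</orders>
""")

# Select the child element with an XPath expression, then keep each value
# only once -- the kind of distinct-value set a SimpleLoop iterates over.
distinct = sorted({e.text for e in doc.findall("./order/country")})
```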

ESB: new features


Enhancement of code dependency management
Talend Studio now allows you to create custom Bean jars and custom routine jars, bundle multiple Beans or user routines into a custom Bean jar or routine jar, and set up custom Bean jar or routine jar dependencies on Routes and Routelets.

By setting up custom Bean jar or routine jar dependencies on Routes and Routelets, the code dependencies become more explicit, which helps reduce dependency conflicts.