What's new in R2021-03 - 7.3

Talend Big Data products Release Notes

Version
7.3
Language
English (United States)
Product
Talend Big Data
Talend Big Data Platform
Talend Open Studio for Big Data
Talend Real-Time Big Data Platform
Content
Installation and Upgrade
Release Notes

Big Data: new features

Feature

Description

Product

Support of Databricks 7.3 LTS with Spark 3.0 components
You can now run Spark Batch and Spark Streaming Jobs with Spark 3.0 on the Databricks 7.3 LTS distribution, on both AWS and Azure, for interactive and transient clusters. The following components are supported:
  • tAvroInput and tAvroOutput
  • tAzureFSConfiguration
  • tFileInputDelimited and tFileOutputDelimited
  • tFileInputJSON and tFileOutputJSON
  • tFileInputParquet and tFileOutputParquet
  • tFileInputXML and tFileOutputXML
  • tFixedFlowInput
  • tLogRow
  • tS3Configuration

This feature is no longer in technical preview.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Support of CDP Public Cloud Data Hub on AWS
You can now configure a CDP Public Cloud Data Hub instance on AWS in Cloudera Management Console to run your Job on a remote JobServer in Talend Studio. This lets you configure your cluster by directly choosing the Data Hub cluster definition that fits your needs (for example, Data Engineering for AWS or Data Discovery and Exploration for AWS). You then only have to import the configuration files from that cluster into Talend Studio to launch your Job.

This feature allows you to take advantage of the elasticity of a cloud cluster directly in CDP Public Cloud.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Support of Service Account and OAuth2 Access Token authentication for Google Cloud Platform distribution in Spark Batch Jobs
You can now authenticate to Google Cloud Platform with either a Service Account or an OAuth2 Access Token in Spark Batch Jobs that use Dataproc 1.4. These authentication methods are available in the Spark configuration view of your Spark Batch Job.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Update of tCollectAndCheck component in Spark Jobs
You can now directly check data input with tCollectAndCheck in Spark Jobs. The following types of input are supported:
  • Text
  • Parquet
  • MySQL
  • Hive
  • Delta
  • Snowflake
  • Redshift
  • JDBC
  • HBase
For Spark Batch Jobs, the component is now connected as follows:

In Spark Batch Jobs, the component checks that the number of rows is exactly the expected one and that the row values are correct.

For Spark Streaming Jobs, the component checks the data at the end of the Job execution, after a timeout, as follows:

In Spark Streaming Jobs, the component checks that the values are correct. The input is allowed to be null and the rows are allowed to be non-unique.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform
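
The difference between the two tCollectAndCheck modes described above can be pictured with a small sketch. It is not Talend code: the class `CollectAndCheckSketch` and both method names are hypothetical, rows are modelled as plain strings, and two points are assumptions rather than documented behavior, namely that the batch check ignores row order and that "null" input in streaming means no rows were collected.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

// Hypothetical helper, not part of Talend. It only illustrates the two check modes.
class CollectAndCheckSketch {

    // Batch-style check: exactly the expected number of rows and the same values.
    // Assumption: row order is ignored, so both lists are sorted before comparing.
    static boolean batchCheck(List<String> expected, List<String> actual) {
        if (expected.size() != actual.size()) {
            return false;
        }
        List<String> sortedExpected = new ArrayList<>(expected);
        List<String> sortedActual = new ArrayList<>(actual);
        Collections.sort(sortedExpected);
        Collections.sort(sortedActual);
        return sortedExpected.equals(sortedActual);
    }

    // Streaming-style check: every collected row must be one of the expected values.
    // A missing or empty input and duplicate rows are both accepted.
    static boolean streamingCheck(List<String> expected, List<String> actual) {
        if (actual == null) {
            return true;
        }
        return new HashSet<>(expected).containsAll(actual);
    }

    public static void main(String[] args) {
        List<String> expected = List.of("1;alice", "2;bob");
        System.out.println(batchCheck(expected, List.of("2;bob", "1;alice")));
        System.out.println(batchCheck(expected, List.of("1;alice")));
        System.out.println(streamingCheck(expected, List.of("1;alice", "1;alice")));
        System.out.println(streamingCheck(expected, List.of("3;carol")));
    }
}
```

In this sketch a duplicated row fails the batch check because the row count no longer matches, while the streaming check accepts it as long as the value itself is expected.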

Data Integration: new features

Feature

Description

Product

Enhancement of code dependency management
Talend Studio now allows you to create custom routine jars, package multiple user routines into a custom routine jar, and set up custom routine jar dependencies on Jobs and Joblets.

Setting up custom routine jar dependencies on Jobs and Joblets makes their code dependencies more explicit, which can help reduce dependency conflicts.

Note: By default, user routines migrated from any previous version of Talend Studio are all saved under the new Code > Global Routines node.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

tELTOracleMap enhancement
The ELT Map editor of the tELTOracleMap component now provides a new Property Settings dialog box, which contains two options:
  • Delimited identifiers: With this check box selected, double quotes will be added for all output column names to support delimited identifiers.
  • Automatic alias: With this check box selected, if a schema column has a different name than its database column, an alias will be automatically created in the SQL query for that column.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform
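
To make the two tELTOracleMap options above more concrete, the following sketch builds one item of a SELECT list. It is an illustration only, not the SQL generation used by tELTOracleMap: the class `EltSelectSketch`, its methods, and the column names are hypothetical, and the exact places where the component adds quotes or aliases are assumptions.

```java
// Hypothetical illustration only; this is not the code that tELTOracleMap generates.
class EltSelectSketch {

    // Wraps an identifier in double quotes, as a delimited identifier.
    static String quote(String identifier) {
        return "\"" + identifier + "\"";
    }

    // Builds one SELECT-list item from a database column name and a schema column name.
    // delimitedIdentifiers: quote the names (assumed to mirror "Delimited identifiers").
    // automaticAlias: add an alias when the two names differ (assumed to mirror "Automatic alias").
    static String selectItem(String dbColumn, String schemaColumn,
                             boolean delimitedIdentifiers, boolean automaticAlias) {
        String source = delimitedIdentifiers ? quote(dbColumn) : dbColumn;
        boolean needsAlias = automaticAlias && !dbColumn.equals(schemaColumn);
        if (!needsAlias) {
            return source;
        }
        String alias = delimitedIdentifiers ? quote(schemaColumn) : schemaColumn;
        return source + " AS " + alias;
    }

    public static void main(String[] args) {
        // Names differ and both options are on: quoted column with a quoted alias.
        System.out.println(selectItem("CUST_ID", "customerId", true, true));
        // Names are identical: no alias is needed.
        System.out.println(selectItem("CUST_ID", "CUST_ID", true, true));
        // Both options off: the database column name is used as is.
        System.out.println(selectItem("CUST_ID", "customerId", false, false));
    }
}
```

Quoting matters in Oracle because unquoted identifiers are folded to uppercase, so a mixed-case name such as the hypothetical `customerId` is only preserved when it is delimited.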

Enhancement in Git conflict resolution for tELTMap
Talend Studio now supports comparing conflicts for tELTMap in the Job compare editor.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Google Drive components: new Read Timeout option
The Read Timeout option has been added to the Advanced settings view.
This option is available for tGoogleDriveConnection, tGoogleDriveCopy, tGoogleDriveCreate, tGoogleDriveDelete, tGoogleDriveGet, tGoogleDriveList and tGoogleDrivePut.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Support of MongoDB 4.4 API for MongoDB and CosmosDB components
You can now connect MongoDB and CosmosDB components to MongoDB version 4.4.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Performance enhancement for MongoDB and CosmosDB components in Standard Jobs
MongoDB and CosmosDB components now provide the following options in Standard Jobs:
  • For input components, you can now define the set of fields to be returned from the documents in the database with the Specify fields to return option.
  • For output components, when you want to perform an action on data, you can now delete all the documents in the collection to be used with the Delete all documents option.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform
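
Returning only selected fields, as the Specify fields to return option above describes, is what MongoDB calls a projection. The sketch below imitates that effect on an in-memory document without using the MongoDB driver. The class `FieldProjectionSketch`, the `project` method, and the sample field names are hypothetical, and how the Talend option is translated into a query is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a field projection; it does not call MongoDB.
class FieldProjectionSketch {

    // Returns a copy of the document that keeps only the requested fields.
    // Fields that are absent from the document are skipped.
    static Map<String, Object> project(Map<String, Object> document, List<String> fields) {
        Map<String, Object> projected = new LinkedHashMap<>();
        for (String field : fields) {
            if (document.containsKey(field)) {
                projected.put(field, document.get(field));
            }
        }
        return projected;
    }

    public static void main(String[] args) {
        Map<String, Object> document = new LinkedHashMap<>();
        document.put("name", "Ada");
        document.put("email", "ada@example.com");
        document.put("notes", "a long free-text field that the Job does not need");

        // Only the two requested fields remain; "notes" is not carried along.
        System.out.println(project(document, List.of("name", "email")));
    }
}
```

The performance benefit comes from the smaller documents that have to be transferred and parsed; in a real MongoDB query the server applies the projection before sending results.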

New SingleStore components

The following three new SingleStore components are now available. They provide better performance for loading data to database tables.

  • tSingleStoreBulkExec
  • tSingleStoreOutputBulk
  • tSingleStoreOutputBulkExec

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

New option for granting AWS-predefined permissions to S3 resources

A new option, Canned Access Control, is now provided in the following components for you to grant AWS-predefined permissions to S3 resources:
  • tS3BucketCreate
  • tS3Copy
  • tS3Put

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Data Mapper: new features

Feature

Description

Product

EDIFACT importer
You can now import UN/EDIFACT specifications as ZIP files to create structures. Talend Data Mapper supports specifications starting from the release D.96A.

Talend Big Data Platform

Talend Real-Time Big Data Platform

XPath functions in Distinct Child Element property
In the SimpleLoop function, if you select the Element XPath in the Distinct Option field, you can now use an XPath function to define distinct values.

Talend Big Data Platform

Talend Real-Time Big Data Platform

ESB: new features

Feature

Description

Product

Enhancement of code dependency management
Talend Studio now allows you to create custom Bean jars and custom routine jars, bundle multiple Beans or user routines in a custom Bean jar or routine jar, and set up custom Bean jar or routine jar dependencies on Routes and Routelets.
Setting up custom Bean jar or routine jar dependencies on Routes and Routelets makes their code dependencies more explicit, which can help reduce dependency conflicts.

Talend Real-Time Big Data Platform