What's new in R2020-05 - 7.3

Talend Big Data products Release Notes

Author: Talend Documentation Team
Version: 7.3
Products: Talend Big Data, Talend Big Data Platform, Talend Open Studio for Big Data, Talend Real-Time Big Data Platform
Task: Installation and Upgrade, Release Notes

The R2020-05 Studio monthly release contains the following new features.

Big Data

Feature

Description

Product

Support for EMR 5.29

You can run Talend Jobs on the Amazon EMR distribution version 5.29.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Upsert existing Delta Lake tables with new data

When you configure how to save the dataset in tDeltaLakeOutput, select Merge to upsert an existing Delta Lake table with new data from a data flow or from another Delta Lake table. New fields are available to configure which columns to merge and how to perform the merge.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform
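
To make the upsert semantics of the Merge option in tDeltaLakeOutput concrete, the following minimal Python sketch models a merge on plain dictionaries. It is neither the tDeltaLakeOutput implementation nor the Delta Lake API: the function name `upsert`, the key column `id`, and the sample rows are hypothetical and serve only to illustrate "update when matched, insert when not matched".

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def upsert(target: List[Row], updates: Iterable[Row], key: str) -> List[Row]:
    """Merge `updates` into `target` on the `key` column (illustrative model only)."""
    # Index the existing rows by their key value, copying each row.
    merged = {row[key]: dict(row) for row in target}
    for row in updates:
        if row[key] in merged:
            merged[row[key]].update(row)  # matched: overwrite the listed columns
        else:
            merged[row[key]] = dict(row)  # not matched: insert as a new row
    return list(merged.values())


# Hypothetical sample data.
existing = [{"id": 1, "city": "Paris"}, {"id": 2, "city": "Bonn"}]
incoming = [{"id": 2, "city": "Berlin"}, {"id": 3, "city": "Rome"}]
result = upsert(existing, incoming, key="id")
print(result)
```

In this model, row 2 is updated and row 3 is appended, while row 1 is left untouched.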

Check data consistency with EMR clusters

When using tS3Configuration, select the Use EMRFS consistent view option to use the EMR File System (EMRFS) consistent view. This option allows EMR clusters to check for list and read-after-write consistency for Amazon S3 objects that are written by or synced with EMRFS.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform
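
For orientation on the EMRFS consistent view option, the sketch below builds the cluster-side configuration that, as far as the Amazon EMR documentation describes it, turns on consistent view: the `emrfs-site` classification with the `fs.s3.consistent` property. How tS3Configuration applies the option internally is not stated in these notes, so treat this as an assumed equivalent rather than what the component generates.

```python
import json

# EMR configuration classification for EMRFS consistent view.
# Assumption: this cluster-level setting corresponds to the
# "Use EMRFS consistent view" check box in tS3Configuration.
emrfs_consistent_view = [
    {
        "Classification": "emrfs-site",
        "Properties": {"fs.s3.consistent": "true"},
    }
]

print(json.dumps(emrfs_consistent_view, indent=2))
```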

Spark catalog configuration in tHiveConfiguration

You must now specify a Spark implementation with the Spark catalog property when you configure tHiveConfiguration. The value to select depends on whether the Hive metastore is external to your cluster. Setting this property prevents errors at runtime. The property is available in Spark Batch Jobs only.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Support for Oracle 19c

Oracle 19c is now supported by the following Big Data components.
Spark Batch:
  • tOracleConfiguration
  • tOracleInput
  • tOracleOutput
Spark Streaming:
  • tOracleConfiguration
  • tOracleLookupInput
  • tOracleOutput

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Advanced Assume Role configuration in DynamoDB components

When you enable the Assume Role option in the tDynamoDBInput and tDynamoDBOutput components, you can now configure the following properties in the Advanced settings view to fine-tune your configuration:
  • Signing region (mandatory)
  • External Id
  • Serial number
  • Token code
  • Tags
  • IAM Policy ARNs
  • Policy

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform
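
The advanced Assume Role properties listed for the DynamoDB components correspond closely to parameters of the AWS STS AssumeRole operation. The sketch below assembles such a request as a plain dictionary without calling AWS. The helper name `build_assume_role_request`, the ARN, and all sample values are hypothetical, and the mapping from Talend fields to STS parameters is an assumption based on the matching names.

```python
from typing import Any, Dict, List, Optional


def build_assume_role_request(
    role_arn: str,
    session_name: str,
    external_id: Optional[str] = None,
    serial_number: Optional[str] = None,
    token_code: Optional[str] = None,
    tags: Optional[Dict[str, str]] = None,
    policy_arns: Optional[List[str]] = None,
    policy: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for STS AssumeRole, skipping unset options."""
    request: Dict[str, Any] = {"RoleArn": role_arn, "RoleSessionName": session_name}
    optional = {
        "ExternalId": external_id,      # External Id
        "SerialNumber": serial_number,  # Serial number (MFA device)
        "TokenCode": token_code,        # Token code (MFA value)
        "Policy": policy,               # Policy (inline session policy, JSON text)
    }
    request.update({k: v for k, v in optional.items() if v is not None})
    if tags:
        request["Tags"] = [{"Key": k, "Value": v} for k, v in tags.items()]
    if policy_arns:
        request["PolicyArns"] = [{"arn": arn} for arn in policy_arns]
    return request


# Placeholder values for illustration only.
request = build_assume_role_request(
    role_arn="arn:aws:iam::123456789012:role/example-role",
    session_name="talend-job",
    external_id="example-external-id",
    tags={"team": "data"},
)
print(request)
```

With boto3 installed, a dictionary like this could be passed as `boto3.client("sts", region_name=signing_region).assume_role(**request)`, where `signing_region` plays the part of the mandatory Signing region property. That call is not executed here.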

Access data from a secondary index

When you retrieve data from a table with the tDynamoDBInput component, you can specify a secondary index in the component configuration to improve the performance of queries and scans.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform
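
As an illustration of what specifying a secondary index in tDynamoDBInput means at the DynamoDB API level, the sketch below builds the parameters of a Query request, where the index is selected with the `IndexName` parameter. The helper `build_query` and the table, index, and attribute names are hypothetical. The code only constructs the request and does not contact DynamoDB.

```python
from typing import Any, Dict, Optional


def build_query(
    table: str,
    key_name: str,
    key_value: str,
    index_name: Optional[str] = None,
) -> Dict[str, Any]:
    """Build DynamoDB Query parameters for an equality match on a string key."""
    query: Dict[str, Any] = {
        "TableName": table,
        "KeyConditionExpression": "#k = :v",
        "ExpressionAttributeNames": {"#k": key_name},
        "ExpressionAttributeValues": {":v": {"S": key_value}},
    }
    if index_name is not None:
        # Query the secondary index instead of the base table.
        query["IndexName"] = index_name
    return query


# Hypothetical table with a secondary index keyed on "customer_id".
query = build_query("orders", "customer_id", "C-42", index_name="customer_id-index")
print(query)
```

The index lets the request use `customer_id` as its key even when the base table is keyed on a different attribute, which avoids a full table scan.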

Data Integration

Feature

Description

Product

Remote TAC connection improvement

A user who authenticates through LDAP is now prompted for new login credentials in Talend Studio if the AD password has been changed.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Title bar improvement

After you install a patch, the Talend Studio title bar shows the patch version information.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

AWS SDK driver upgrade

The AWS SDK driver for the Redshift SSO connection in Talend Studio metadata has been upgraded.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Context propagation enhancement

Context propagation across reference projects has been enhanced for Data Integration. Any context variable update in the reference project can now be automatically synchronized to the main project.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

tSQLDWH components renamed

The tSQLDWH components were renamed as follows:
  • tSQLDWHBulkExec renamed as tAzureSynapseBulkExec
  • tSQLDWHClose renamed as tAzureSynapseClose
  • tSQLDWHCommit renamed as tAzureSynapseCommit
  • tSQLDWHConnection renamed as tAzureSynapseConnection
  • tSQLDWHInput renamed as tAzureSynapseInput
  • tSQLDWHOutput renamed as tAzureSynapseOutput
  • tSQLDWHRollback renamed as tAzureSynapseRollback
  • tSQLDWHRow renamed as tAzureSynapseRow

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform
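
If you keep scripts or documentation that mention the old tSQLDWH component names, a lookup table such as the one below can help translate them. This is only a sketch built from the rename list in these notes. The helper `new_component_name` is hypothetical, and nothing here describes how Talend Studio migrates existing Jobs.

```python
# Old component name -> new component name, as listed in the release notes.
RENAMED_COMPONENTS = {
    "tSQLDWHBulkExec": "tAzureSynapseBulkExec",
    "tSQLDWHClose": "tAzureSynapseClose",
    "tSQLDWHCommit": "tAzureSynapseCommit",
    "tSQLDWHConnection": "tAzureSynapseConnection",
    "tSQLDWHInput": "tAzureSynapseInput",
    "tSQLDWHOutput": "tAzureSynapseOutput",
    "tSQLDWHRollback": "tAzureSynapseRollback",
    "tSQLDWHRow": "tAzureSynapseRow",
}


def new_component_name(name: str) -> str:
    """Return the current name of a component; unknown names pass through."""
    return RENAMED_COMPONENTS.get(name, name)


print(new_component_name("tSQLDWHInput"))
print(new_component_name("tMSSqlInput"))
```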

Support for Azure Data Lake Storage Gen2

The Azure Synapse components now support Azure Data Lake Storage Gen2. The tAzureSynapseBulkExec component provides the Data Lake Storage Gen2 option in the Azure Storage drop-down list in the Basic settings view and the Secure transfer required option in the Advanced settings view. The existing Data Lake Store option in the Azure Storage drop-down list is renamed to Data Lake Storage Gen1.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

tELTTeradataMap: relationship operators updated

The ELT Teradata Map Editor now uses the operators =, <=, <, >=, >, and <>. The corresponding previous operators (EQ, LE, LT, GE, GT, and NE) are deprecated.
[Figures: the operator list in the ELT Teradata Map Editor before and after the change]

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform
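
The correspondence between the deprecated tELTTeradataMap operators and their replacements can be written as a small lookup. The sketch below applies it to a condition string. The helper `modernize_operators` and the sample conditions are hypothetical, and the regular expression is deliberately naive: it would also rewrite an operator keyword that appears inside a quoted string literal.

```python
import re

# Deprecated operator -> replacement, in the order given in the release notes.
OPERATOR_MAP = {
    "EQ": "=",
    "LE": "<=",
    "LT": "<",
    "GE": ">=",
    "GT": ">",
    "NE": "<>",
}

# Match the deprecated operators only as whole, upper-case words.
_OPERATOR_PATTERN = re.compile(r"\b(" + "|".join(OPERATOR_MAP) + r")\b")


def modernize_operators(condition: str) -> str:
    """Replace deprecated keyword operators with their symbolic equivalents."""
    return _OPERATOR_PATTERN.sub(lambda m: OPERATOR_MAP[m.group(1)], condition)


print(modernize_operators("orders.customer_id EQ customers.id"))
print(modernize_operators("orders.amount GE 100"))
```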

Support for Azure Active Directory authentication

You can now use Azure Active Directory authentication when establishing connections with the following components:
  • tAzureSynapseBulkExec, tAzureSynapseConnection, tAzureSynapseInput, tAzureSynapseOutput, tAzureSynapseRow
  • tELTMSSqlMap
  • tMSSqlBulkExec, tMSSqlConnection, tMSSqlInput, tMSSqlOutput, tMSSqlOutputBulkExec, tMSSqlRow, tMSSqlSCD, tMSSqlSP
  • tCreateTable

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

tAzureSynapseBulkExec: support for COPY statement for loading data

The tAzureSynapseBulkExec component now supports the COPY statement for loading data. The following changes were made to the component.

In the Basic settings view:
  • Load method drop-down list (new);
  • Azure storage drop-down list (updated);
  • Authentication method drop-down list (new);
  • SAS token field (new);
  • Endpoint suffix field (new);
  • External paths option (new).
In the Advanced settings view:
  • File type drop-down list (new);
  • Specify map to source table fields option (new);
  • First row field (new);
  • Field quote field (new);
  • Field terminator field (new);
  • Row terminator field (new);
  • Date format drop-down list (new);
  • Encoding drop-down list (new);
  • Identity insert option (new);
  • Max errors field (new);
  • Compressed by drop-down list (updated).

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform
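
Many of the new tAzureSynapseBulkExec fields have a counterpart among the options of the Azure Synapse `COPY INTO` statement. The sketch below assembles such a statement as text to show how the pieces fit together. The helper `build_copy_statement`, the table name, the storage URL, and every option value are hypothetical, the field-to-option mapping is an assumption based on matching names, and the SQL that the component actually generates may differ. The helper does no escaping, so it is not suitable for untrusted input.

```python
from typing import Dict


def build_copy_statement(table: str, external_path: str, options: Dict[str, str]) -> str:
    """Assemble a COPY INTO statement from already-formatted option values."""
    with_clause = ",\n    ".join(f"{name} = {value}" for name, value in options.items())
    return f"COPY INTO {table}\nFROM '{external_path}'\nWITH (\n    {with_clause}\n)"


statement = build_copy_statement(
    table="dbo.sales",  # placeholder target table
    external_path="https://myaccount.blob.core.windows.net/mycontainer/sales/",
    options={
        "FILE_TYPE": "'CSV'",          # File type
        "CREDENTIAL": "(IDENTITY = 'Shared Access Signature', SECRET = '<sas-token>')",
        "FIELDQUOTE": "'\"'",          # Field quote
        "FIELDTERMINATOR": "','",      # Field terminator
        "ROWTERMINATOR": "'0x0A'",     # Row terminator
        "FIRSTROW": "2",               # First row (skip one header line)
        "DATEFORMAT": "'ymd'",         # Date format
        "ENCODING": "'UTF8'",          # Encoding
        "IDENTITY_INSERT": "'OFF'",    # Identity insert
        "MAXERRORS": "0",              # Max errors
        "COMPRESSION": "'Gzip'",       # Compressed by
    },
)
print(statement)
```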

Data Quality

Feature

Description

Product

Components

All Data Quality components can now run on Databricks on Azure and AWS, except tMatchIndex and tMatchIndexPredict.

These two components do not support Elasticsearch authentication, so they cannot run on Databricks.

Talend Big Data Platform

Talend Real-Time Big Data Platform

ESB

Feature

Description

Product

REST Services

Context variables are now fully supported in REST service provider and consumer endpoints in Data Services and Routes.

Talend Real-Time Big Data Platform

Microservices

Microservices can now provide metrics to Prometheus.

Talend Real-Time Big Data Platform