Big Data
Feature | Description |
---|---|
Support for EMR 5.29 | You can run Talend Jobs with the Amazon EMR distribution in version 5.29. |
Upsert existing Delta Lake tables with new data | When you configure how to save the dataset in tDeltaLakeOutput, select Merge to upsert an existing Delta Lake table with new data from a data flow or from another Delta Lake table. New fields are available to configure which columns to merge and how to perform this merge. |
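The Merge (upsert) behavior can be illustrated with a minimal in-memory sketch in plain Python (illustrative only, not the component or the Delta Lake API): rows whose merge key matches an existing row are updated, and rows with a new key are inserted.

```python
def upsert(table, updates, key):
    """Delta Lake MERGE-style upsert: update rows whose `key` matches
    an existing row, insert rows whose key is new."""
    by_key = {row[key]: row for row in table}  # index existing rows by merge key
    for row in updates:
        if row[key] in by_key:
            by_key[row[key]].update(row)  # matched -> update the merged columns
        else:
            table.append(row)             # not matched -> insert as a new row
            by_key[row[key]] = row
    return table

existing = [{"id": 1, "qty": 5}, {"id": 2, "qty": 3}]
incoming = [{"id": 2, "qty": 7}, {"id": 3, "qty": 1}]
upsert(existing, incoming, "id")  # id 2 updated, id 3 inserted
```

The new fields in tDeltaLakeOutput control exactly these two choices: which columns define a match and what to do with matched versus unmatched rows.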
Check data consistency with EMR clusters | When using tS3Configuration, enable the Use EMRFS consistent view option to use the EMR File System (EMRFS) consistent view. This option allows EMR clusters to check for list and read-after-write consistency for Amazon S3 objects that are written by or synced with EMRFS. |
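Under the hood, the consistent view corresponds to the EMRFS property `fs.s3.consistent`. On the EMR side it can also be enabled through a configuration classification; a minimal sketch (assuming the standard emrfs-site classification, not generated by the component):

```json
[
  {
    "Classification": "emrfs-site",
    "Properties": {
      "fs.s3.consistent": "true"
    }
  }
]
```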
Spark catalog configuration in tHiveConfiguration | You must indicate a Spark implementation with the Spark catalog property in the configuration of tHiveConfiguration. The value to select depends on whether the Hive metastore is external to your cluster or not. This configuration prevents errors at runtime. This property is available in Spark Batch Jobs only. |
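In plain Spark, the choice surfaced by this property typically corresponds to the `spark.sql.catalogImplementation` setting (an assumption here; the Studio property name differs). A sketch with a hypothetical job jar:

```
# Hive metastore external to the cluster:
spark-submit --conf spark.sql.catalogImplementation=hive my_job.jar

# No external Hive metastore:
spark-submit --conf spark.sql.catalogImplementation=in-memory my_job.jar
```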
Support for Oracle 19c | Oracle 19c is now supported by the following Big Data components. Spark Batch: Spark Streaming: |
Advanced Assume Role configuration in DynamoDB components | When you enable the Assume Role option in the tDynamoDBInput and tDynamoDBOutput components, you can now configure the following properties from the Advanced settings view to fine-tune your configuration: |
Access data from a secondary index | When you retrieve data from a table with the tDynamoDBInput component, you can specify a secondary index in the component configuration to improve the performance of queries and scans. |
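Why a secondary index helps can be sketched in plain Python (illustrative only, not the DynamoDB API): without an index, filtering on a non-key attribute means scanning every item, while an index keeps a precomputed mapping from the indexed attribute to the matching items.

```python
items = [{"id": i, "status": "open" if i % 2 else "done"} for i in range(10)]

# Without a secondary index: a scan touches every item.
open_by_scan = [item for item in items if item["status"] == "open"]

# With a secondary index on "status": a precomputed mapping lets a
# query touch only the matching items.
index = {}
for item in items:
    index.setdefault(item["status"], []).append(item)
open_by_index = index["open"]

assert open_by_index == open_by_scan  # same result, far less work per query
```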
Data Integration
Feature | Description |
---|---|
Remote TAC connection improvement | A user authenticating via LDAP is now prompted for new login credentials in Talend Studio when the Active Directory (AD) password has been changed. |
Title bar improvement | The Talend Studio title bar now shows the patch version information after a patch is installed. |
AWS SDK driver upgrade | The AWS SDK driver used for Redshift SSO connections in Talend Studio metadata has been upgraded. |
Context propagation enhancement | Context propagation over the reference project has been enhanced for Data Integration. Any context variable update in the reference project can now be automatically synchronized to the main project. |
Advanced Assume Role configuration | When you enable the Assume Role option, you can now configure the following properties from the Advanced settings view to fine-tune your configuration: This enhancement is available in the following components: |
tSQLDWH components renamed | The tSQLDWH components have been renamed. The following gives the details. |
Support for Azure Data Lake Storage Gen2 | The Azure Synapse components now support Azure Data Lake Storage Gen2. The tAzureSynapseBulkExec component provides the Data Lake Storage Gen2 option in the Azure Storage drop-down list in the Basic settings view and the Secure transfer required option in the Advanced settings view. The existing Data Lake Store option in the Azure Storage drop-down list has been renamed Data Lake Storage Gen1. |
tELTTeradataMap: relationship operator updated | The ELT Teradata Map Editor now uses the operators =, <=, <, >=, >, and <>. The corresponding previous operators EQ, LE, LT, GE, GT, and NE are deprecated. |
Support for Azure Active Directory authentication | You can now use Azure Active Directory authentication when establishing connections using the following components. |
tAzureSynapseBulkExec: support for COPY statement for loading data | The tAzureSynapseBulkExec component now supports the COPY statement for loading data, and the following changes were made to the component. In the Basic settings view: In the Advanced settings view: |
Data Quality
Feature | Description |
---|---|
Components | All Data Quality components can run on Databricks on Azure and AWS, except for tMatchIndex and tMatchIndexPredict. Because these components do not support Elasticsearch authentication, they cannot run on Databricks. |
Application Integration
Feature | Description |
---|---|
REST Services | Context variables are now fully supported in REST service provider and consumer endpoints in Data Services and Routes. |
Microservices | Microservices can now expose metrics to Prometheus. |
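For a Spring Boot-based microservice exposing metrics through Micrometer, the Prometheus endpoint is typically enabled with actuator configuration such as the following (a sketch using standard Spring Boot property names, not Talend-specific settings):

```
management.endpoints.web.exposure.include=prometheus
```

Prometheus can then scrape the metrics from the `/actuator/prometheus` endpoint of the running service.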