What's new in R2022-01

Big Data: new features

Feature

Description

Available in

Support of CDP Private Cloud Base and Public Cloud with Atlas
If you use CDP Private Cloud Base or CDP Public Cloud to run your Spark Jobs and Apache Atlas is installed in your Cloudera cluster, you can now use Atlas when you run your Job.

Atlas lets you trace the lineage of a given data flow to discover how the data was generated by a Spark Job.

All subscription-based Talend products with Big Data

Support of CDP Public Cloud authentication via Knox
Availability note: Beta
You can now authenticate using Knox when you use CDP Public Cloud 7.2.x with Hive to run your Spark Jobs.

As this is a beta feature, it is not suitable for production environments.

All subscription-based Talend products with Big Data

Hive Warehouse Connector supports SSL encryption and Kerberos authentication on CDP Public Cloud with Knox
You can now use SSL encryption and Kerberos authentication when you configure the connection to Hive Warehouse Connector with the tHiveWarehouseConfiguration component, if you use Knox to authenticate your Spark Batch Job with Cloudera CDP Public Cloud.
Enabling these two options gives your data better protection:
  • SSL protects your data by keeping sensitive information encrypted.
  • Kerberos provides secure authentication for access to your data.

All subscription-based Talend products with Big Data

Cloudera version renamed to reflect Talend Studio compatibility with CDP
In the Talend Studio user interface, in the Hadoop Configuration Import Wizard window and in the Spark Configuration view of your Spark Job, the Cloudera version has been renamed from Cloudera CDP 7.1.1.0-xxx to Cloudera CDP 7.x. This change reflects Talend Studio compatibility with both CDP Private Cloud Base (7.1.x) and CDP Public Cloud (7.2.x) distributions.

All subscription-based Talend products with Big Data

Kafka upgrade to version 2.4.x
Talend Studio now supports Kafka 2.4.x in all the Kafka components.

All subscription-based Talend products with Big Data

New After variables for tKafkaOutput in Standard Jobs
You can now use the following After variables for the tKafkaOutput component in Standard Jobs:
  • ERROR_MESSAGE: displays the error message generated by the component when an error occurs.
  • NB_LINE: displays the number of rows processed.
  • NB_ERRORS: displays the number of rows processed with errors.
  • NB_SUCCESS: displays the number of rows successfully processed.

These variables are useful for handling errors when the Die on error option in the Basic settings view of tKafkaOutput is disabled.

All subscription-based Talend products with Big Data
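
The After variables listed above can be used to decide what to do once tKafkaOutput has finished with Die on error disabled. The following is only a minimal sketch in Python, not Talend-generated code: it assumes the variables are available in a key/value map under keys of the form `<component name>_<VARIABLE>` (for example `tKafkaOutput_1_NB_ERRORS`), and both the `summarize_kafka_output` helper and the `global_map` dictionary are hypothetical names used for illustration.

```python
def summarize_kafka_output(global_map, component="tKafkaOutput_1"):
    """Summarize the After variables of one tKafkaOutput component.

    `global_map` is a hypothetical stand-in for the map in which the Job
    stores component variables; the key naming is an assumption.
    """
    nb_line = global_map.get(f"{component}_NB_LINE", 0)
    nb_success = global_map.get(f"{component}_NB_SUCCESS", 0)
    nb_errors = global_map.get(f"{component}_NB_ERRORS", 0)
    error_message = global_map.get(f"{component}_ERROR_MESSAGE")

    # With Die on error disabled the Job keeps running, so the counters
    # are the only signal that some rows were not written.
    status = "failed" if nb_errors or error_message else "ok"
    return {
        "status": status,
        "processed": nb_line,
        "succeeded": nb_success,
        "rejected": nb_errors,
        "message": error_message,
    }
```

A downstream step could, for instance, send an alert or stop the Job when the returned status is `failed`.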

tKafkaInput and tKafkaOutput improvements in Standard Jobs
With the improvements made to the tKafkaInput and tKafkaOutput components, you can now perform the following actions in your Standard Jobs:
  • Use Confluent Schema Registry
  • Use custom deserializers
  • Consume messages holding Avro data
  • Expose the Kafka objects, and store in Kafka objects that were created outside of the component
  • Expose and configure the message header
  • Expose the key, topic, and partition
  • Override any configuration parameter using the Advanced settings

All subscription-based Talend products with Big Data

tMongoDBInput: new options available in Standard Jobs
The tMongoDBInput component provides the following two new options:
  • Skip, which specifies the number of retrieved lines to skip.
  • Batch size, which specifies the maximum number of lines that can be retrieved in one batch.

All subscription-based Talend products with Big Data
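
The effect of the Skip and Batch size options of tMongoDBInput can be illustrated with a small sketch. It is written in plain Python over an in-memory sequence rather than a real MongoDB connection, and the function name `read_with_skip_and_batches` is hypothetical. It assumes that Skip drops the first rows of the result and that Batch size only caps how many rows are fetched per batch, without limiting the total.

```python
from itertools import islice


def read_with_skip_and_batches(documents, skip=0, batch_size=100):
    """Yield lists of at most `batch_size` documents after skipping `skip`."""
    if skip < 0 or batch_size < 1:
        raise ValueError("skip must be >= 0 and batch_size must be >= 1")

    # Drop the first `skip` documents, then hand out the rest batch by batch.
    remaining = islice(iter(documents), skip, None)
    while True:
        batch = list(islice(remaining, batch_size))
        if not batch:
            return
        yield batch
```

For example, with ten documents, a Skip of 3 and a Batch size of 4, the sketch yields one batch of four documents followed by one batch of three.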

Adding and specifying update operations for specific JSON nodes in Standard Jobs
This release allows you to add and specify update operations for specific JSON nodes with MongoDB 4.4.x and later versions. This feature applies to tMongoDBOutput and tCosmosDBOutput only when you select Set or Upsert with set from the Action on data property in the Basic settings view.

All subscription-based Talend products with Big Data
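
To show what updating a specific JSON node means, the sketch below mimics a set operation on a nested field addressed with a dotted path, in the way the MongoDB `$set` operator addresses embedded fields. It is plain Python over a dictionary, not the component's implementation, and the `set_json_node` helper is a hypothetical name. How the components expose the node path in their settings is not described here.

```python
import copy


def set_json_node(document, path, value):
    """Return a copy of `document` with the node at dotted `path` set to `value`.

    Missing intermediate nodes are created as empty objects. The sketch
    assumes every existing intermediate node is itself an object (dict).
    """
    updated = copy.deepcopy(document)
    node = updated
    keys = path.split(".")
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return updated
```

Only the targeted node changes; sibling nodes in the same document are left as they were, which is the difference from replacing the whole document.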

Data Integration: new features

Feature

Description

Available in

Connecting to Talend Cloud using only a token
You now only need to enter your token, instead of both login name and token, when connecting to Talend Cloud in Talend Studio.

All Talend Cloud products and Talend Data Fabric

Option to prevent artifacts with unpushed changes from being published to Talend Cloud
You can now configure Talend Studio to prevent an artifact from being published to Talend Cloud if it contains changes not pushed to the remote Git repository.

All Talend Cloud products and Talend Data Fabric

Salesforce components: support of the latest Endpoint version
The Salesforce components in this release support the latest Endpoint version, that is, Endpoint Version 52.

All Talend products

New components: tApacheKuduInput and tApacheKuduOutput
This release provides two new components, tApacheKuduInput and tApacheKuduOutput, which allow you to access an Apache Kudu cluster.

All subscription-based Talend products with Talend Studio

New component: tAzureAdlsGen2Connection
This release provides tAzureAdlsGen2Connection, which uses an Azure storage account to create a connection to an ADLS Gen2 file system that other Azure ADLS Gen2 components can reuse. In addition, the tAzureAdlsGen2Input and tAzureAdlsGen2Output components provide two new options: Use an existing connection and Timeout. The Use an existing connection option allows you to use a connection created by a tAzureAdlsGen2Connection component, and the Timeout option specifies the timeout period for creating a connection.

All subscription-based Talend products with Talend Studio

ELT Map components: a new option available

A new option, Dry run, is available. When this option is selected, the component only generates a query and passes it to the connected ELT output component, which stores the query in its QUERY variable. No action is performed on the database (such as connecting to the database or executing the query).

This feature applies to tELTMap, tELTMSSqlMap, and tELTOracleMap.

All subscription-based Talend products with Talend Studio

ELT output components: a new variable provided

A new AFTER variable, QUERY, is provided. This variable is populated by the ELT Map component that the current component is connected to, which allows the generated query to be used by other components.

This feature applies to tELTOutput, tELTMSSqlOutput, and tELTOracleOutput.

All subscription-based Talend products with Talend Studio

Data Mapper: new features

Feature

Description

Available in

JSONL support
The JSON reader automatically recognizes the JSONL format, and a new option in the JSON representation allows you to write JSONL.

All Talend Platform and Data Fabric products
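
For readers unfamiliar with the format, JSONL (JSON Lines) stores one complete JSON value per line instead of a single enclosing array. The sketch below reads and writes the format using only the Python standard library. It is unrelated to how the Data Mapper JSON reader is implemented, and the helper names `read_jsonl` and `write_jsonl` are hypothetical.

```python
import json


def read_jsonl(text):
    """Parse JSONL text into a list of records, ignoring blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def write_jsonl(records):
    """Serialize records as JSONL: one compact JSON value per line."""
    return "".join(json.dumps(record) + "\n" for record in records)
```

Because each line is independent, a JSONL file can be processed record by record without loading the whole document into memory.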

Support of the oneOf keyword in the JSON reader
The JSON reader now allows you to create expressions to select the correct alternative when a structure contains the oneOf keyword.

All Talend Platform and Data Fabric products

Application Integration: new features

Feature

Description

Available in

Enhancement of the project analysis report to warn about the risk of custom component dependencies

The project analysis report now warns about the risk of custom component dependencies.

All subscription-based Talend products with ESB

Synchronization of the region names in cAWSConnection with AWS

The region names in cAWSConnection are now synchronized with AWS.

All subscription-based Talend products with ESB
