What's new in R2022-01 - Cloud

Talend Cloud Release Notes


Big Data: new features



Support of CDP Private Cloud Base and Public Cloud with Atlas

If you use CDP Private Cloud Base or CDP Public Cloud to run your Spark Jobs and Apache Atlas is installed in your Cloudera cluster, you can now make use of Atlas when you run your Jobs.

Atlas allows you to trace the lineage of a given data flow to discover how the data was generated by a Spark Job.

Support of CDP Public Cloud authentication via Knox

You can now authenticate using Knox when you use CDP Public Cloud 7.2.x with Hive to run your Spark Jobs.

As this is a beta feature, it is not suitable for production environments.

Hive Warehouse Connector supports SSL encryption and Kerberos authentication on CDP Public Cloud with Knox

You can now use SSL encryption and Kerberos authentication when you configure the connection to the Hive Warehouse Connector with the tHiveWarehouseConfiguration component, provided your Spark Batch Job authenticates to Cloudera CDP Public Cloud using Knox.
Enabling these two options gives your data better protection:
  • SSL protects your data by keeping sensitive information encrypted
  • Kerberos provides secure authentication to your data
Cloudera version renamed to reflect Talend Studio compatibility with CDP

In the Talend Studio user interface, in the Hadoop Configuration Import Wizard window and in the Spark Configuration view of your Spark Job, the Cloudera version has been renamed from Cloudera CDP to Cloudera CDP 7.x. This change reflects Talend Studio compatibility with both the CDP Private Cloud Base (7.1.x) and CDP Public Cloud (7.2.x) distributions.
Kafka upgraded to version 2.4.x

Talend Studio now supports Kafka 2.4.x in all Kafka components.
New After variables for tKafkaOutput in Standard Jobs

You can now use the following After variables for the tKafkaOutput component in Standard Jobs:
  • ERROR_MESSAGE: displays the error message generated by the component when an error occurs.
  • NB_LINE: displays the number of rows processed.
  • NB_ERRORS: displays the number of rows processed with errors.
  • NB_SUCCESS: displays the number of rows successfully processed.

These variables are useful for handling errors when the Die on error option in the Basic settings view of tKafkaOutput is disabled.
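
In generated Talend Job code, After variables are conventionally read from the Job's globalMap once the component finishes, using keys of the form <componentName>_<VARIABLE>. A minimal stand-alone sketch of that pattern, using a plain HashMap to stand in for the Talend runtime's globalMap (the component instance name tKafkaOutput_1 and the counts are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class KafkaAfterVariables {
    // Reads an After variable from the Job's globalMap; in a real Job the
    // Talend runtime populates these keys once tKafkaOutput_1 finishes.
    static int readCount(Map<String, Object> globalMap, String key) {
        Object value = globalMap.get(key);
        return value == null ? 0 : (Integer) value;
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tKafkaOutput_1_NB_LINE", 100);
        globalMap.put("tKafkaOutput_1_NB_ERRORS", 3);
        globalMap.put("tKafkaOutput_1_NB_SUCCESS", 97);

        // With Die on error disabled, the Job can branch on the counts
        // instead of failing outright.
        int errors = readCount(globalMap, "tKafkaOutput_1_NB_ERRORS");
        if (errors > 0) {
            System.out.println(errors + " rows failed out of "
                    + readCount(globalMap, "tKafkaOutput_1_NB_LINE"));
        }
    }
}
```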

tKafkaInput and tKafkaOutput improvements in Standard Jobs

With the improvements made to the tKafkaInput and tKafkaOutput components, you can now perform the following actions in your Standard Jobs:
  • Use the Confluent Schema Registry
  • Use custom deserializers
  • Consume messages holding Avro data
  • Expose the Kafka objects and allow objects created outside the component to be stored in Kafka
  • Expose and configure the message header
  • Expose the key, topic, and partition
  • Override any configuration parameter using the Advanced settings
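
Outside of Talend Studio, consuming Avro messages through the Confluent Schema Registry typically comes down to a handful of consumer properties. A hedged sketch of that configuration (the broker address, group ID, and registry URL are placeholders, not values from this release):

```java
import java.util.Properties;

public class AvroConsumerConfig {
    // Builds the consumer properties an Avro + Schema Registry setup
    // typically needs; URLs and the group ID are placeholders.
    static Properties avroConsumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");
        props.put("group.id", "demo-group");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Confluent's Avro deserializer fetches the writer schema from
        // the registry using the schema ID embedded in each message.
        props.put("value.deserializer",
                "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("schema.registry.url", "http://schema-registry:8081");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(avroConsumerProps());
    }
}
```

In a Talend Standard Job these settings correspond to component options; overriding further parameters happens in the Advanced settings view rather than in code.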
tMongoDBInput: new options available in Standard Jobs
The tMongoDBInput component provides the following two new options:
  • Skip, which specifies the number of retrieved lines to skip.
  • Batch size, which specifies the maximum number of lines that can be retrieved in one batch.
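
Conceptually, Skip drops the first n documents of the result set, while Batch size only controls how many documents are fetched per round trip and does not change the result. A small stand-alone sketch of the Skip semantics over an in-memory list (the data is illustrative; the real component applies this server-side in MongoDB):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SkipDemo {
    // Returns the result set with the first `skip` documents dropped,
    // mirroring what tMongoDBInput's Skip option does on the server.
    static List<Integer> applySkip(List<Integer> docs, int skip) {
        return docs.stream().skip(skip).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> docs =
                IntStream.rangeClosed(1, 5).boxed().collect(Collectors.toList());
        System.out.println(applySkip(docs, 2)); // [3, 4, 5]
    }
}
```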
Adding and specifying update operations for specific JSON nodes in Standard Jobs

This release allows you to add and specify update operations for specific JSON nodes with MongoDB 4.4.x and later versions. This feature applies to tMongoDBOutput and tCosmosDBOutput only when you select Set or Upsert with set from the Action on data property in the Basic settings view.
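
In MongoDB terms, a Set (or Upsert with set) action conceptually issues a $set update document that touches only the listed JSON nodes, leaving the rest of the document intact. A minimal sketch of that document shape, built with plain Maps (the field names are illustrative, not from the component):

```java
import java.util.Map;

public class SetUpdateShape {
    // Builds the update document a "Set" action conceptually issues:
    // only the dotted paths listed under $set are modified.
    static Map<String, Object> setUpdate() {
        return Map.of("$set", Map.of(
                "address.city", "Paris",   // updates a nested JSON node
                "status", "active"));
    }

    public static void main(String[] args) {
        System.out.println(setUpdate());
    }
}
```

With Upsert with set, the same document is applied with the upsert flag, so a matching document is updated and a missing one is inserted.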

Data Integration: new features



Connecting to Talend Cloud using only a token

You now only need to enter your token, instead of both your login name and token, when connecting to Talend Cloud from Talend Studio.
Option to prevent artifacts with unpushed changes from being published to Talend Cloud

You can now configure Talend Studio to prevent an artifact from being published to Talend Cloud if it contains changes not pushed to the remote Git repository.
Salesforce components: support of the latest Endpoint version

The Salesforce components in this release support the latest Endpoint version, Endpoint Version 52.
New components available: tApacheKuduInput and tApacheKuduOutput

This release provides two new components, tApacheKuduInput and tApacheKuduOutput, which allow you to access an Apache Kudu cluster.
New component available: tAzureAdlsGen2Connection

This release provides the tAzureAdlsGen2Connection component, which creates connections to an ADLS Gen2 file system using an Azure storage account for other Azure ADLS Gen2 components. In addition, the tAzureAdlsGen2Input and tAzureAdlsGen2Output components provide two new options: Use an existing connection and Timeout. The Use an existing connection option allows you to reuse a connection created by a tAzureAdlsGen2Connection component, and the Timeout option specifies the timeout for creating a connection.

ELT Map components: a new option available

A new option, Dry run, is available. When this option is selected, the component just generates a query and passes it to the connected ELT output component, which adds the query to the QUERY variable, without performing any actions on the database (such as connecting to the database or executing the query).

This feature applies to tELTMap, tELTMSSqlMap, and tELTOracleMap.

ELT output components: a new variable provided

A new AFTER variable, QUERY, is provided. This variable is populated by the ELT Map component that the current component connects to, allowing the generated query to be used by other components.

This feature applies to tELTOutput, tELTMSSqlOutput, and tELTOracleOutput.
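
As with other Talend After variables, the generated SQL is conventionally read from the Job's globalMap under a <componentName>_QUERY key. A minimal sketch of that pattern, using a plain HashMap to stand in for the runtime's globalMap (the component name tELTOutput_1 and the SQL text are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class EltQueryAudit {
    // Reads the SQL the ELT Map component generated; in a real Job the
    // key is <componentName>_QUERY, e.g. tELTOutput_1_QUERY.
    static String generatedQuery(Map<String, Object> globalMap, String component) {
        return (String) globalMap.get(component + "_QUERY");
    }

    public static void main(String[] args) {
        // Stand-in for Talend's globalMap.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tELTOutput_1_QUERY",
                "INSERT INTO target (id, name) SELECT id, name FROM source");

        // Combined with the ELT Map component's Dry run option, the query
        // is generated but never executed, so it can be reviewed or logged.
        System.out.println("Generated SQL: "
                + generatedQuery(globalMap, "tELTOutput_1"));
    }
}
```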

Data Mapper: new features



JSONL support

The JSON reader automatically recognizes the JSONL format, and a new option in the JSON representation allows you to write JSONL.
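
JSONL (JSON Lines) stores one complete JSON document per line, which is what lets a reader split the input on newlines before parsing each record. A small stand-alone sketch of that splitting step (the sample records are illustrative; a real reader would hand each line to a JSON parser):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class JsonlSplit {
    // Splits JSONL input into individual records: one JSON document
    // per non-blank line.
    static List<String> records(String jsonl) {
        return Arrays.stream(jsonl.split("\n"))
                .filter(line -> !line.isBlank())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String jsonl = "{\"id\":1,\"name\":\"a\"}\n"
                     + "{\"id\":2,\"name\":\"b\"}\n";
        System.out.println(records(jsonl).size()); // 2
    }
}
```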
Support of the oneOf keyword in the JSON reader

The JSON reader has been improved and now allows you to create expressions to select the correct alternative when a structure contains the oneOf keyword.