What's new in R2020-06 - 7.3

Talend Data Fabric Release Notes

Author: Talend Documentation Team
EnrichVersion: 7.3
EnrichProdName: Talend Data Fabric
Task: Installation and Upgrade; Release Notes

The R2020-06 Studio monthly release contains the following new features.

Big Data: new features

Support for Cloudera Data Platform (CDP)

When you configure a connection to a Hadoop cluster, you can select Cloudera CDP 7.1. You can also add and use the dynamic distributions of CDP Private Cloud Base 7.x.

The CDP integration in Talend Studio includes a new dependency management system that improves the performance of your Jobs at runtime.

CDP supports the following elements:
  • Data Integration components:
    • HBase
    • HDFS
    • Hive
  • Spark Batch components:
    • Azure Blob Storage
    • HBase
    • HDFS
    • Hive
    • Kudu
  • Spark Streaming components:
    • Azure Blob Storage
    • HBase
    • HDFS
    • Hive
    • Kafka

Support of Microsoft HD Insight 4.0

You can now use the Microsoft HD Insight 4.0 distribution in Standard Jobs and in Spark Jobs that use Spark v2.3 and v2.4. This new support comes with several features:
  • Support of Azure Data Lake Storage (ADLS) Gen2: this storage option is available when you use Hive or HDFS, and when you configure a connection with tAzureFSConfiguration. You can also use ADLS Gen2 as primary storage when configuring a centralized connection to HD Insight in Metadata.
  • Support of TLS to secure connections to ADLS Gen2 and Azure Blob Storage.

Check the status of Jobs that run on HD Insight

To check whether a Job is still running, configure polling that retrieves the status of this Job. In the Spark Configuration tab in the Run view of the Job, in the Job status polling configuration section, specify the time between polls and the maximum number of retries.

Use Databricks pools

You can reduce the start and auto-scaling times of your Databricks cluster by using a pool. In the Spark Configuration tab in the Run view of your Job, select the Use pool check box and indicate the ID of the pool that you want to use. You must also select the Use transient cluster check box. For more information about Databricks pools, see Pools in the Databricks documentation.
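
The pool ID expected by the Use pool field comes from your Databricks workspace, not from Talend Studio. Purely as a minimal sketch, assuming a Java 11 runtime, a placeholder workspace URL, and a placeholder personal access token, the Databricks Instance Pools REST API can list the existing pools and their IDs:

  import java.net.URI;
  import java.net.http.HttpClient;
  import java.net.http.HttpRequest;
  import java.net.http.HttpResponse;

  // Minimal sketch: list the instance pools of a Databricks workspace.
  // The workspace URL and the access token are placeholders.
  public class ListDatabricksPools {
      public static void main(String[] args) throws Exception {
          String workspaceUrl = "https://<your-workspace>.azuredatabricks.net";
          String token = "<personal-access-token>";

          HttpRequest request = HttpRequest.newBuilder()
                  .uri(URI.create(workspaceUrl + "/api/2.0/instance-pools/list"))
                  .header("Authorization", "Bearer " + token)
                  .GET()
                  .build();

          HttpResponse<String> response = HttpClient.newHttpClient()
                  .send(request, HttpResponse.BodyHandlers.ofString());

          // Each pool in the JSON response carries an instance_pool_id field;
          // that value is what the Use pool field expects.
          System.out.println(response.body());
      }
  }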

Azure ADLS Gen2 components: Azure Active Directory authentication supported

The following Azure ADLS Gen2 components support Azure Active Directory (Azure AD) authentication.

  • tAzureAdlsGen2Input
  • tAzureAdlsGen2Output
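
These components handle the connection internally once you provide the Azure AD credentials. Purely as an illustration of the underlying service-principal flow, assuming the Azure SDK for Java (azure-identity and azure-storage-file-datalake) and placeholder tenant, application, secret, storage account, and file system names, an Azure AD authenticated connection to ADLS Gen2 could look like this:

  import com.azure.identity.ClientSecretCredential;
  import com.azure.identity.ClientSecretCredentialBuilder;
  import com.azure.storage.file.datalake.DataLakeServiceClient;
  import com.azure.storage.file.datalake.DataLakeServiceClientBuilder;

  // Sketch of Azure AD (service principal) authentication against ADLS Gen2.
  // All identifiers below are placeholders for your own Azure AD application.
  public class AdlsGen2AadSketch {
      public static void main(String[] args) {
          ClientSecretCredential credential = new ClientSecretCredentialBuilder()
                  .tenantId("<tenant-id>")
                  .clientId("<application-id>")
                  .clientSecret("<client-secret>")
                  .build();

          DataLakeServiceClient serviceClient = new DataLakeServiceClientBuilder()
                  .endpoint("https://<storage-account>.dfs.core.windows.net")
                  .credential(credential)
                  .buildClient();

          // List the paths of one file system (container) to verify the connection.
          serviceClient.getFileSystemClient("<file-system>")
                  .listPaths()
                  .forEach(path -> System.out.println(path.getName()));
      }
  }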

Data Integration: new features

Further enhancement of context propagation

Context propagation over the reference project has been further enhanced by improving conflict resolution for the Git/SVN technical files when merging branches.

Microsoft SQL Server metadata wizard update

The default Db Version for Microsoft SQL Server in the Talend Studio metadata wizard is now Microsoft.

Stitch connectors integration

You can now search for Stitch connectors on the design workspace and in the Palette in Talend Studio. The search result leads you to the Stitch web page for the connector you select.

tDataprepRun enhancement

The tDataprepRun component now supports the dynamic schema feature.

New components available

This release provides the following two new components.

  • tCosmosDBSQLAPIInput, which retrieves data from a Cosmos database collection through the SQL API.
  • tCosmosDBSQLAPIOutput, which inserts, updates, upserts, or deletes documents in a Cosmos database collection through the SQL API, based on the incoming flow from the preceding component.
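
Both components build and run the SQL API calls for you. As an independent sketch of what such a read looks like, assuming the Azure Cosmos DB Java SDK (azure-cosmos) and placeholder account, key, database, and container names:

  import com.azure.cosmos.CosmosClient;
  import com.azure.cosmos.CosmosClientBuilder;
  import com.azure.cosmos.models.CosmosQueryRequestOptions;
  import com.fasterxml.jackson.databind.JsonNode;

  // Sketch of a SQL API query against a Cosmos database collection (container).
  // The endpoint, key, database, and container names are placeholders.
  public class CosmosSqlApiSketch {
      public static void main(String[] args) {
          CosmosClient client = new CosmosClientBuilder()
                  .endpoint("https://<account>.documents.azure.com:443/")
                  .key("<account-key>")
                  .buildClient();

          client.getDatabase("<database>")
                .getContainer("<container>")
                .queryItems("SELECT c.id, c.status FROM c WHERE c.status = 'active'",
                            new CosmosQueryRequestOptions(), JsonNode.class)
                .forEach(item -> System.out.println(item));

          client.close();
      }
  }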

Snowflake components: external OAuth support provided

The following Snowflake components support external OAuth for data access.

  • tSnowflakeBulkExec
  • tSnowflakeConnection
  • tSnowflakeInput
  • tSnowflakeOutput
  • tSnowflakeOutputBulk
  • tSnowflakeOutputBulkExec
  • tSnowflakeRow
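
With external OAuth, the access token is issued by your own authorization server and then presented to Snowflake. As a sketch only, using a placeholder account identifier and a token assumed to have been obtained elsewhere, this is how such a token is passed to the Snowflake JDBC driver:

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.util.Properties;

  // Sketch of connecting to Snowflake with an externally obtained OAuth access token.
  // The account identifier, token, and object names are placeholders; the token is
  // issued by your external OAuth authorization server, not by this code.
  public class SnowflakeOAuthSketch {
      public static void main(String[] args) throws Exception {
          Properties props = new Properties();
          props.put("authenticator", "oauth");        // use OAuth instead of user/password
          props.put("token", "<oauth-access-token>"); // token from the external IdP
          props.put("warehouse", "<warehouse>");
          props.put("db", "<database>");
          props.put("schema", "<schema>");

          try (Connection conn = DriverManager.getConnection(
                  "jdbc:snowflake://<account>.snowflakecomputing.com", props)) {
              System.out.println("Connected: " + !conn.isClosed());
          }
      }
  }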

MS SQL Server connectors: the default JDBC provider changed to the official Microsoft driver

The default JDBC provider of the following components has changed to the official Microsoft driver.

  • tCreateTable
  • tELTMSSqlMap
  • tMSSqlBulkExec
  • tMSSqlConnection
  • tMSSqlInput
  • tMSSqlOutput
  • tMSSqlOutputBulkExec
  • tMSSqlRow
  • tMSSqlSCD
  • tMSSqlSP
  • tMSSqlCDC
  • tMSSqlInvalidRows
  • tMSSqlValidRows
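
This change mainly affects the driver class and the JDBC URL style used under the hood. As an illustration with placeholder host, database, and credentials, and assuming the alternative provider is the open-source jTDS driver, the two styles compare as follows:

  import java.sql.Connection;
  import java.sql.DriverManager;

  // Sketch contrasting the two JDBC URL styles for SQL Server.
  // Host, database, and credentials are placeholders.
  public class MsSqlDriverSketch {
      public static void main(String[] args) throws Exception {
          // Open-source jTDS driver:
          //   driver class: net.sourceforge.jtds.jdbc.Driver
          //   URL:          jdbc:jtds:sqlserver://<host>:1433/<database>

          // Official Microsoft driver (the new default):
          //   driver class: com.microsoft.sqlserver.jdbc.SQLServerDriver
          //   URL:          jdbc:sqlserver://<host>:1433;databaseName=<database>
          try (Connection conn = DriverManager.getConnection(
                  "jdbc:sqlserver://<host>:1433;databaseName=<database>",
                  "<user>", "<password>")) {
              System.out.println("Connected with the Microsoft driver: " + !conn.isClosed());
          }
      }
  }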

tJDBCInput: new option provided to prevent unexpected character conversion in dynamic column

The tJDBCInput component provides the Allow special character in dynamic table name option, which keeps special characters in input table column names as they are.