What's new in R2020-06 - 7.3

Talend Big Data products Release Notes

author: Talend Documentation Team
EnrichVersion: 7.3
EnrichProdName: Talend Big Data, Talend Big Data Platform, Talend Open Studio for Big Data, Talend Real-Time Big Data Platform
task: Installation and Upgrade, Release Notes

The R2020-06 Studio monthly release contains the following new features.

Big Data: new features

Feature

Description

Product

Support for Cloudera Data Platform (CDP)

When you configure a connection to a Hadoop cluster, you can select Cloudera CDP 7.1. You can also add and use the dynamic distributions of CDP Private Cloud Base 7.x.

The CDP integration in Talend Studio includes a new dependency management system that improves the performance of your Jobs at runtime.

CDP supports the following elements:
  • Data Integration components:
    • HBase
    • HDFS
    • Hive
  • Spark Batch components:
    • Azure Blob Storage
    • HBase
    • HDFS
    • Hive
    • Kudu
  • Spark Streaming components:
    • Azure Blob Storage
    • HBase
    • HDFS
    • Hive
    • Kafka

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Support of Microsoft HD Insight 4.0

You can now use the Microsoft HD Insight 4.0 distribution in Standard Jobs and in Spark Jobs that use Spark v2.3 and v2.4. This new support comes with several features:
  • Support of Azure Data Lake Storage (ADLS) Gen2: this storage option is available when you use Hive or HDFS, and when you configure a connection with tAzureFSConfiguration. You can also use ADLS Gen2 as the primary storage when configuring a centralized connection to HD Insight in Metadata.
  • Support of TLS to secure connections to ADLS Gen2 and Azure Blob Storage.
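
For orientation, Hadoop-compatible tools usually address ADLS Gen2 through the ABFS driver, where the `abfss` scheme selects TLS and `abfs` does not. The following is a minimal Java sketch of that URI shape only; it is not code generated by Talend Studio, and the container, account, and path values are hypothetical.

```java
import java.net.URI;

final class AdlsGen2UriSketch {

    /**
     * Builds an ABFS URI for an ADLS Gen2 path.
     * secure = true selects the TLS scheme "abfss"; false selects "abfs".
     */
    static URI adlsGen2Uri(String container, String account, String path, boolean secure) {
        String scheme = secure ? "abfss" : "abfs";
        // Make sure the path part starts with a single leading slash.
        String normalizedPath = path.startsWith("/") ? path : "/" + path;
        return URI.create(scheme + "://" + container + "@" + account
                + ".dfs.core.windows.net" + normalizedPath);
    }

    public static void main(String[] args) {
        // Hypothetical container and storage account names.
        System.out.println(adlsGen2Uri("mycontainer", "myaccount", "data/in", true));
    }
}
```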

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Check the status of Jobs that run on HD Insight

To check whether a Job is still running, configure polling that retrieves the status of the Job. In the Spark Configuration tab in the Run view of the Job, in the Job status polling configuration section, specify the time period between polls and the maximum number of retries.
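
The two settings can be pictured as a simple polling loop. The sketch below is one plausible reading, not the Studio implementation: it assumes that "maximum number of retries" caps the number of status checks, and the `Status` enum, the `pollUntilDone` helper, and the status source are all hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

final class JobStatusPollingSketch {

    enum Status { RUNNING, SUCCEEDED, FAILED }

    /**
     * Calls fetchStatus up to maxRetries times, waiting intervalMillis between
     * checks, and stops early once the Job is no longer RUNNING.
     * Returns the last status observed (RUNNING if no final status was seen).
     */
    static Status pollUntilDone(Supplier<Status> fetchStatus, long intervalMillis, int maxRetries) {
        Status last = Status.RUNNING;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            last = fetchStatus.get();
            if (last != Status.RUNNING) {
                return last;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up polling.
                Thread.currentThread().interrupt();
                return last;
            }
        }
        return last;
    }

    public static void main(String[] args) {
        // Simulated status source: two RUNNING answers, then SUCCEEDED.
        Iterator<Status> answers =
                Arrays.asList(Status.RUNNING, Status.RUNNING, Status.SUCCEEDED).iterator();
        System.out.println(pollUntilDone(answers::next, 100L, 5));
    }
}
```

A short interval detects completion sooner at the cost of more status requests. A low retry limit can stop polling before a long-running Job finishes.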

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Use Databricks pools

You can reduce the startup and auto-scaling times of your Databricks cluster by using a pool. In the Spark Configuration tab in the Run view of your Job, select the Use pool check box and enter the ID of the pool that you want to use. You must also select the Use transient cluster check box. For more information about Databricks pools, see Pools in the Databricks documentation.
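
As background, the Databricks REST API identifies a pool-backed cluster by an `instance_pool_id` field in the cluster specification. The sketch below builds such a specification as a plain map to show where the pool ID ends up. It is an illustration only: how Talend Studio submits the transient cluster is not described here, and the `transientClusterSpec` helper, the runtime version, and the pool ID are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DatabricksPoolSketch {

    /**
     * Builds a minimal cluster specification that draws its nodes from a pool.
     * The keys follow the field names of the Databricks Clusters API.
     */
    static Map<String, Object> transientClusterSpec(String sparkVersion, int numWorkers, String poolId) {
        if (poolId == null || poolId.trim().isEmpty()) {
            throw new IllegalArgumentException("A pool ID is required when using a pool");
        }
        Map<String, Object> spec = new LinkedHashMap<>();
        spec.put("spark_version", sparkVersion);
        spec.put("num_workers", numWorkers);
        spec.put("instance_pool_id", poolId);
        return spec;
    }

    public static void main(String[] args) {
        // Hypothetical runtime version and pool ID.
        System.out.println(transientClusterSpec("6.4.x-scala2.11", 2, "pool-1234"));
    }
}
```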

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Azure ADLS Gen2 components: Azure Active Directory authentication supported

The following Azure ADLS Gen2 components now support Azure Active Directory (AD) authentication:

  • tAzureAdlsGen2Input
  • tAzureAdlsGen2Output

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Data Integration: new features

Feature

Description

Product

Further enhancement of context propagation

Context propagation across reference projects has been further enhanced by improving conflict resolution for the Git/SVN technical files when merging branches.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Microsoft SQL Server metadata wizard update

The default Db Version for Microsoft SQL Server in the Talend Studio metadata wizard has changed to Microsoft.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Stitch connectors integration

You can now search for Stitch connectors on the design workspace and in the Palette in Talend Studio. Selecting a search result opens the Stitch web page for that connector.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

tDataprepRun enhancement

The tDataprepRun component now supports the dynamic schema feature.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

New components available

This release provides the following two new components:

  • tCosmosDBSQLAPIInput, which retrieves data from a Cosmos database collection through the SQL API.
  • tCosmosDBSQLAPIOutput, which inserts, updates, upserts, or deletes documents in a Cosmos database collection through the SQL API, based on the incoming flow from the preceding component.

Talend Open Studio for Big Data

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Snowflake components: external OAuth support provided

The following Snowflake components now support external OAuth for data access:

  • tSnowflakeBulkExec
  • tSnowflakeConnection
  • tSnowflakeInput
  • tSnowflakeOutput
  • tSnowflakeOutputBulk
  • tSnowflakeOutputBulkExec
  • tSnowflakeRow
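
For readers who want to see what OAuth access looks like at the driver level, the Snowflake JDBC driver accepts an access token through the `authenticator=oauth` and `token` connection properties. The sketch below only assembles those properties and does not open a connection. It is not the code the components generate, and the account name, role, and token value are hypothetical placeholders.

```java
import java.util.Properties;

final class SnowflakeOAuthSketch {

    /** Builds Snowflake JDBC connection properties for OAuth token authentication. */
    static Properties oauthProperties(String accessToken, String role) {
        Properties props = new Properties();
        props.setProperty("authenticator", "oauth");
        props.setProperty("token", accessToken);
        if (role != null) {
            props.setProperty("role", role);
        }
        return props;
    }

    /** Builds the JDBC URL for a given Snowflake account identifier. */
    static String jdbcUrl(String account) {
        return "jdbc:snowflake://" + account + ".snowflakecomputing.com/";
    }

    public static void main(String[] args) {
        // The token would come from the external OAuth provider; this value is a placeholder.
        Properties props = oauthProperties("<access-token>", "ANALYST");
        System.out.println(jdbcUrl("myaccount") + " " + props.getProperty("authenticator"));
        // With the Snowflake JDBC driver on the classpath, a connection could then be opened with:
        // java.sql.DriverManager.getConnection(jdbcUrl("myaccount"), props);
    }
}
```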

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

MS SQL Server connectors: the default JDBC provider changed to the official Microsoft driver

The default JDBC provider of the following components has changed to the official Microsoft driver:

  • tCreateTable
  • tELTMSSqlMap
  • tMSSqlBulkExec, tMSSqlConnection, tMSSqlInput, tMSSqlOutput, tMSSqlOutputBulkExec, tMSSqlRow, tMSSqlSCD, tMSSqlSP, tMSSqlCDC, tMSSqlInvalidRows, tMSSqlValidRows
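
Switching JDBC providers matters mainly because each driver has its own class name and URL syntax. The sketch below shows the URL form used by the official Microsoft driver next to the form used by the open-source jTDS driver, which is included purely for contrast. The host, port, and database values are hypothetical, and the helper names are not part of any Talend API.

```java
final class MsSqlJdbcUrlSketch {

    /** Driver class of the official Microsoft JDBC Driver for SQL Server. */
    static final String MICROSOFT_DRIVER_CLASS = "com.microsoft.sqlserver.jdbc.SQLServerDriver";

    /** URL syntax of the Microsoft driver: properties are separated by semicolons. */
    static String microsoftUrl(String host, int port, String database) {
        return "jdbc:sqlserver://" + host + ":" + port + ";databaseName=" + database;
    }

    /** URL syntax of the jTDS driver: the database is a path segment. */
    static String jtdsUrl(String host, int port, String database) {
        return "jdbc:jtds:sqlserver://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Hypothetical connection details.
        System.out.println(microsoftUrl("dbhost", 1433, "sales"));
        System.out.println(jtdsUrl("dbhost", 1433, "sales"));
    }
}
```

Connection strings or additional JDBC parameters written for one driver may need to be adapted to the other driver's syntax.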

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

tJDBCInput: new option provided to prevent unexpected character conversion in dynamic columns

The tJDBCInput component provides the Allow special character in dynamic table name option, which keeps special characters in input table column names unchanged.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform