Big Data: known issues and known limitations - Cloud - 8.0

Talend Release Notes

Version: Cloud, 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud API Services Platform, Talend Cloud Big Data, Talend Cloud Big Data Platform, Talend Cloud Data Fabric, Talend Cloud Data Integration, Talend Cloud Data Management Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Cloud API Designer, Talend Cloud API Tester, Talend Cloud Data Inventory, Talend Cloud Data Preparation, Talend Cloud Data Stewardship, Talend Cloud Pipeline Designer, Talend Data Preparation, Talend Data Stewardship, Talend Management Console, Talend Studio
Content: Installation and Upgrade, Release Notes
Last publication date: 2024-04-16

Limitation	Description	Available in
Hive	Hive is not supported in Spark Local mode.	Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform; all subscription-based Talend products with Big Data
Java 11	Java 11 is not supported in Standard Jobs or the Metadata Repository when they involve big data distributions, and it is not supported in Spark Jobs. This limitation exists because the big data distributions themselves do not support Java 11. To run Spark Jobs, or Standard Jobs or the Metadata Repository that involve big data distributions, install Java 8 on your computer. Then, in Talend Studio, set the path in Preferences > Talend > Java interpreter and select the JDK 8 location in Preferences > Java > Installed JREs.	Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform; all subscription-based Talend products with Big Data
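
Before pointing Talend Studio at a JDK, it can help to confirm which Java version a given installation actually reports. The following is a minimal sketch, not part of Talend Studio: the class name `JavaVersionCheck` and its methods are hypothetical, and it relies only on the standard `java.specification.version` and `java.home` system properties.

```java
// Hypothetical helper (not part of Talend Studio): reports whether the JVM
// that runs it is Java 8, based on the standard system properties.
final class JavaVersionCheck {

    /** Extracts the major version from a value such as "1.8", "11" or "17". */
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.trim().split("\\.");
        // Java 8 and earlier report "1.x"; Java 9 and later report the major number directly.
        if (parts[0].equals("1") && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return Integer.parseInt(parts[0]);
    }

    /** Returns true when the given specification version denotes Java 8. */
    static boolean isJava8(String specVersion) {
        return majorVersion(specVersion) == 8;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        System.out.println("java.specification.version = " + spec);
        System.out.println("java.home = " + System.getProperty("java.home"));
        if (!isJava8(spec)) {
            System.out.println("This JVM is not Java 8; big data Jobs may need a JDK 8 installation.");
        }
    }
}
```

Running this class with the `java` executable of the JDK you plan to select in Preferences > Java > Installed JREs prints that JDK's version and home directory, which you can compare against the path configured in Studio.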


Issue	Workaround	Available in
When you run Spark Jobs with Dataproc 2.x, Azure Synapse, or HD Insight 4.0 distributions, the following error can be returned: `java.lang.NoSuchMethodError: org.apache.log4j.helpers`.	Following the Log4j2 security issue (CVE-2021-44228), disable Log4j loggers when you run Spark Batch and Spark Streaming Jobs with Dataproc 2.x and later, Azure Synapse, or HD Insight 4.0 distributions. To avoid Job failures, clear the Activate log4j in components check box in the Log4j view under File > Edit Project Properties > Project Settings, or clear the log4jLevel check box in the Advanced settings view of your Spark Job.	Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform; all subscription-based Talend products with Big Data
When you run a Spark Batch Job with MapRDB components whose schema contains `Date` type columns, the following compile error appears: "The method toBytes(ByteBuffer) in the type Bytes is not applicable for the arguments (Date)".	`Date` type columns cannot be used in the schema when you run a Spark Batch Job with MapRDB components.	Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform; all subscription-based Talend products with Big Data
HBase does not work with a CDP 7.1.x cluster using Kerberos in YARN Client mode and returns the following error: `hbase.pb.AuthenticationService.GetAuthenticationTokenorg.apache.hadoop.hbase.HBaseIOException: com.google.protobuf.ServiceException: Error calling method hbase.pb.AuthenticationService.GetAuthenticationToken`.	If you want to use Kerberos with HBase on a CDP 7.1.x cluster, use YARN Cluster mode instead of YARN Client mode.	Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform; all subscription-based Talend products with Big Data
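
For the MapRDB `Date` issue listed above, the release notes only state that `Date` columns cannot be used. One possible way to carry the same information, offered here as an assumption rather than a documented Talend workaround, is to convert the value upstream into a `Long` (epoch milliseconds) or a `String` (ISO-8601) column. The sketch below uses only the Java standard library; the class `DateColumnWorkaround` and its method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.time.Instant;
import java.util.Date;

// Hypothetical helper (not a Talend or MapR API): converts a Date into types
// that have a straightforward byte representation.
final class DateColumnWorkaround {

    /** Epoch milliseconds, suitable for a Long column. */
    static long toEpochMillis(Date date) {
        return date.getTime();
    }

    /** ISO-8601 UTC text such as "1970-01-01T00:00:00Z", suitable for a String column. */
    static String toIsoString(Date date) {
        return Instant.ofEpochMilli(date.getTime()).toString();
    }

    /** Big-endian 8-byte encoding of a long, shown only to illustrate that a long serializes directly. */
    static byte[] longToBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    public static void main(String[] args) {
        Date now = new Date();
        System.out.println("epoch millis: " + toEpochMillis(now));
        System.out.println("ISO-8601:     " + toIsoString(now));
        System.out.println("byte length:  " + longToBytes(toEpochMillis(now)).length);
    }
}
```

Whether a `Long` or a `String` column is the better fit depends on how downstream consumers read the table; neither choice is confirmed by the release notes.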
