Big Data: known issues and known limitations - Cloud - 8.0

Talend Release Notes

Version: Cloud 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud API Services Platform, Talend Cloud Big Data, Talend Cloud Big Data Platform, Talend Cloud Data Fabric, Talend Cloud Data Integration, Talend Cloud Data Management Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Cloud API Designer, Talend Cloud API Tester, Talend Cloud Data Inventory, Talend Cloud Data Preparation, Talend Cloud Data Stewardship, Talend Cloud Pipeline Designer, Talend Data Preparation, Talend Data Stewardship, Talend Management Console, Talend Studio
Content: Installation and Upgrade, Release Notes
Last publication date: 2024-04-16

Limitation Description Available in

Hive
Hive is not supported in Spark Local mode.

Available in: Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform (all subscription-based Talend products with Big Data)

Java 11
  • Java 11 is not supported in Standard Jobs or in the Metadata Repository when they involve big data distributions.
  • Java 11 is not supported in Spark Jobs.

This limitation stems from the big data distributions, which constrain Java 11 support.

To run Spark Jobs, or Standard Jobs and Metadata Repository items that involve big data distributions, install Java 8 on your computer. Then, in Talend Studio, set the path in Preferences > Talend > Java interpreter and browse to the location of JDK 8 in Preferences > Java > Installed JREs.

Available in: Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform (all subscription-based Talend products with Big Data)
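
For the Java 11 limitation above, it can help to confirm which Java version a given JVM actually reports before you point Talend Studio at it. The following is a minimal sketch, not part of Talend: the class name JavaVersionCheck and its methods are hypothetical, and it relies only on the standard java.specification.version and java.home system properties.

```java
// Hypothetical helper (not a Talend API): reports the major version of the
// JVM that runs it, so you can confirm a JDK 8 installation before selecting
// it in Preferences > Java > Installed JREs.
final class JavaVersionCheck {

    /** Returns the major Java version for a "java.specification.version" value. */
    public static int majorVersion(String specVersion) {
        // Java 8 and earlier report "1.x" (for example "1.8");
        // Java 9 and later report the major number directly ("9", "11", "17").
        String[] parts = specVersion.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    /** True when the given specification version denotes Java 8. */
    public static boolean isJava8(String specVersion) {
        return majorVersion(specVersion) == 8;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        String home = System.getProperty("java.home");
        System.out.println("Java major version: " + majorVersion(spec));
        System.out.println("Java home: " + home);
        if (!isJava8(spec)) {
            System.out.println("This JVM is not Java 8; Jobs that involve big data "
                    + "distributions need a JDK 8 installation.");
        }
    }
}
```

Run it with the java executable of the installation you intend to use; the printed Java home is the kind of location you would browse to in the Installed JREs preference.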

Issue Workaround Available in
Issue: When you run Spark Jobs with Dataproc 2.x, Azure Synapse, or HD Insight 4.0 distributions, the following error can be returned: java.lang.NoSuchMethodError: org.apache.log4j.helpers.

Workaround: Following the Log4j2 security issue (CVE-2021-44228), make sure to disable Log4j loggers when you run Spark Batch and Spark Streaming Jobs with Dataproc 2.x and onwards, Azure Synapse, or HD Insight 4.0 distributions.

To avoid Job failures, do one of the following:
  • Clear the Activate log4j in components check box in the Log4j view under File > Edit Project Properties > Project Settings.
  • Clear the log4jLevel check box in the Advanced settings view of your Spark Job.

Available in: Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform (all subscription-based Talend products with Big Data)

Issue: When you run a Spark Batch Job with MapRDB components whose schema contains Date type columns, the following compile error appears:

"The method toBytes(ByteBuffer) in the type Bytes is not applicable for the arguments (Date)".

Workaround: Date type columns cannot be used in the schema when you run a Spark Batch Job with MapRDB components.

Available in: Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform (all subscription-based Talend products with Big Data)
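
The MapRDB compile error above indicates that a Date value is being passed to a toBytes method that has no overload accepting Date. As an illustration only, the sketch below shows one way a date can be carried as a long (epoch milliseconds) and encoded to bytes. It assumes that the column could be declared as a Long in the schema instead of a Date, which the source does not confirm as a supported approach. The class DateBytesSketch and its methods are hypothetical and use only the JDK, not the Talend or MapR libraries.

```java
import java.nio.ByteBuffer;
import java.util.Date;

// Hypothetical illustration (not Talend-generated code): carry a date as
// epoch milliseconds so that it can be handled as a long value.
final class DateBytesSketch {

    /** Encodes a Date as 8 big-endian bytes holding its epoch milliseconds. */
    public static byte[] toBytes(Date date) {
        // ByteBuffer uses big-endian byte order by default.
        return ByteBuffer.allocate(Long.BYTES).putLong(date.getTime()).array();
    }

    /** Decodes 8 big-endian bytes back into a Date. */
    public static Date fromBytes(byte[] bytes) {
        return new Date(ByteBuffer.wrap(bytes).getLong());
    }

    public static void main(String[] args) {
        Date original = new Date();
        byte[] encoded = toBytes(original);
        Date decoded = fromBytes(encoded);
        System.out.println("Encoded length: " + encoded.length);
        System.out.println("Round trip preserved the value: " + original.equals(decoded));
    }
}
```

Storing epoch milliseconds keeps millisecond precision but drops any time zone or display format, so readers of the data need to know the column holds a timestamp.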

Issue: HBase does not work with a CDP 7.1.x cluster that uses Kerberos in YARN Client mode, and returns the following error:

hbase.pb.AuthenticationService.GetAuthenticationToken
org.apache.hadoop.hbase.HBaseIOException: com.google.protobuf.ServiceException: Error calling method hbase.pb.AuthenticationService.GetAuthenticationToken.

Workaround: If you want to use Kerberos with HBase on a CDP 7.1.x cluster, it is recommended to use YARN Cluster mode instead of YARN Client mode.

Available in: Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform (all subscription-based Talend products with Big Data)