R2020-05 (cumulative patch) - 7.3

Version
7.3
Language
English
Product
Talend Data Fabric
Module
Talend Studio
Last publication date
2020-05-29

R2020-05 (cumulative patch)

Info              Value
Patch name        Patch_20200529_R2020-05_v1-7.3.1
Release date      2020-05-29
Target version    20200219_1130-7.3.1
Product affected  Talend Studio

Introduction

This monthly release includes all previous generally available patches for Talend Studio 7.3.1. For more information about the new features and bug fixes included, see Talend Data Fabric Release Notes.

NOTE: For information on how to obtain this patch, reach out to your Support contact at Talend.

New features

This patch contains the following features:

  • TBD-9991 Support for EMR 5.29 Static Distro - DI / Spark batch / Spark streaming
  • TBD-10328 Merge operation for Delta Lake in tDeltaLakeOutput
  • TBD-10169 Upgrade Delta Lake connectivity
  • TBD-10265 Upgrade Snowflake connectivity
  • TBD-9643 Support Oracle 19c
  • TBD-10206 Upgrade httpclient to align with other components
  • TUP-16546 Prompt users for new password in Studio Connection when AD/LDAP credentials change in TAC
  • TUP-25708 Avoid GIT Branch conflicts due to Timestamp and ID
  • TUP-26186 Use the operators =, <=, <, >=, >, <> instead of EQ, LE, LT, GE, GT, NE in tELTTeradataMap
  • TUP-26284 Upgrade AWS SDK for driver in metadata
  • TUP-26956 Update Studio top bar to reflect monthly deliveries for better understanding
  • TUP-26288 Enhance context propagation over reference project
  • TDI-43878 Accessing DynamoDB GSI
  • TDI-43724 Extend AWS STS assume role field for Studio
  • TDI-43719 Rename tSQLDWHXXX component to tAzureSynapseXXX and support adls gen2
  • TDI-44179 Support new load method with COPY statement in tSynapseBulkExec
  • TESB-27806 Expose Talend Microservices Metrics in a monitoring system like Prometheus
  • TESB-24998 Interpret context variables used in provider and consumer endpoints in Data Services and Routes
  • TMC-19740 Delete "update corresponding task" check box from Studio publish to Cloud dialog
  • TDI-39575 Support AD authentication for Azure SQL Data Warehouse with tMSSQL connectors
  • TDQ-17784 DQ components (Spark Batch) support Databricks on Azure & AWS
  • TDQ-18049 tMatchModel: Feature importance report can be saved on Databricks (Azure/AWS) and HDInsight (Azure)

Fixed issues

This patch contains the following fixes:

  • TBD-10616 Compiler error when a Spark Streaming Job uses a tMap linked to tFixedFlowInput, tHiveInput
  • TBD-10615 Compiler error when a Spark Streaming Job uses tFileInputDelimited
  • TBD-10613 [BUG] tDeltaLakeOutput - SQL Merge option and time-travel corrections
  • TBD-10606 [CDH 6.1 Spark Batch] Hive job fails with "Unable to instantiate SparkSession with Hive support because Hive classes are not found" error
  • TBD-9956 Sqoop issue with parquet/avro format with HDP 2.6
  • TBD-10095 Sqoop issue with parquet/avro format with EMR 5.15(2.8.3)
  • TBD-10113 [BUG] Streaming, no output/NPE on tFileStreamInputJSON
  • TBD-10363 Data lost after being read by the tFileInputDelimited component
  • TBD-10374 Cannot connect to Hive against CDH 5.16.2
  • TBD-10401 [7.3.1] Getting [ERROR]: org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator in a few Jobs after migration to Talend 7.3.1
  • TBD-10442 tFileOutputDelimited - wrong columns order
  • TBD-10443 CDH - Dynamic distribution - CDH 6.X not available
  • TBD-10449 NullPointerException when column is empty
  • TBD-10479 [BUG] Compile error with tHiveOutput
  • TBD-10572 Streaming - tExtractDelimitedFields - Compile error
  • TBD-9802 Compile error in spark job with HBase on HDP 3.1
  • TBD-10067 Incorrect Encoding in FileInputJSON
  • TBD-10153 Compiler error for sqoop components in DI job when using java 11
  • TBD-10159 (Spark Local 2.4) java.lang.NoSuchMethodError on Oracle components
  • TBD-10337 [Spark Streaming, CDH 6.1] Compile errors on tHiveOutput component using CDH 6.1 when all data types are used in schema
  • TBD-10340 [Spark Streaming, CDH 6.1] NoClassDefFoundError on tHiveOutput component
  • TBD-10370 "Hadoop configuration jar not found" when running a Spark Job with JobServer on Windows
  • TBD-10436 Hive component fails with java.lang.NoClassDefFoundError: org/apache/log4j/Logger
  • TBD-10454 [7.3.1] DI job using tHDFSExist works from Studio to RE but not from TMC to RE
  • TBD-10476 [CDH6.1] java.lang.NoSuchMethodError on job with Hive components
  • TBD-10505 Databricks on AWS, cannot output to DBFS when S3 configured
  • TBD-10089 Standardize encoding UI for I/O BD components
  • TBD-10470 Wrong logo for tKinesisInput component
  • TUP-27262 tAzureAdlsGen2Output component fails with class not found error
  • TUP-25961 org.talend.commons.exception.CommonExceptionHandler - java.util.ConcurrentModificationException
  • TUP-26639 Talend Cloud Studio issue: changes to context parameter names are not reflected on the Snowflake connection
  • TUP-26870 Incorrect sort when listing version of Jobs
  • TUP-26958 Test case: possible to remove input / output node
  • TUP-26961 When saving/deleting a test case, it might break the poms / CI
  • TUP-26990 Possible conflict in talend project
  • TUP-26994 Usage data collector: change the way Studio Unique Id is calculated
  • TUP-27000 Talend Salesforce Einstein connector Repository connection issue in Talend 7.3
  • TUP-27003 Should disable the Commit button of Uncommitted files found dialog when project is in MERGING state
  • TUP-27199 DB version of Sybase isn't hidden for other databases on tCreateTable
  • TDI-44066 Illegal argument exception in tSAPBapi name field
  • TDI-44089 Program Z_TLD_BI_INFOPROV not found when moving from 7.1.1 to 7.2.1
  • TDI-44159 Data viewer on tSybaseInput gets an error in context mode with a Sybase 16 Anywhere database
  • TDI-43619 Null Value Treated as in Subjob
  • TDI-44051 tJDBCInput Dynamic Schema does not preserve special characters
  • TDI-44130 UPDATE_OR_INSERT Mode in tDBOutput for MSSQL does not display SQL
  • TDI-43935 NString type for prepared statement in tMSSQLRow component
  • TDI-44185 Authentication Fields missing in tMongoDBBulkLoad
  • TDI-43822 tDataPrepRun Token expiration must be taken into account
  • TDI-44191 Job Deployed to MDM server and using tFileCopy fails with java.lang.ClassNotFoundException: org.talend.FileCopy
  • TDI-44122 Proxy settings not being picked up in tSalesforceBulkExec API v2
  • TDI-43605 tDBConnection to Snowflake exception "Not connected" when API call is made
  • TDI-43612 Problem with retrieving Snowflake tables from Studio
  • TDI-43629 Dynamic schema issue in tSnowflakeOutput Component
  • TDI-43682 Get error when retrieving record types in tNetsuiteInput/tNetsuiteOutput
  • TDI-43752 Snowflake component throws warnings
  • TDI-44114 Snowflake multi statement execution error
  • TDI-43903 7.3 Enhancements to tSnowflakeBulkexec broke existing functionality in the Copy (manual) command
  • TBD-10507 7.3.1 Simple Databricks Delta Lake Job is trying to do a broadcast hash join
  • TBD-10520 Files with commas get truncated by tFileInputDelimited & tFileInputFullRow Databricks / Spark 2.4
  • TDM-8036 The tHMap component giving error in 7.3.1 Studio
  • TDM-8001 Streaming with XML output creates extra tags
  • TDM-8013 Job working in 7.2.1 fails in 7.3.1 with class org.apache.commons.compress.archivers.zip.ZipFile$1 not implementing InputStreamStatistics
  • TDM-8028 Remove dependency on avro-mapred-1.7.7
  • TDM-8031 Studio keeps crashing after opening the tHMap component and running the Job
  • TDM-7667 Flattener: flattening structure that contains element with same label does not generate correct name
  • TDM-7871 Attributes are lost when importing avro schema
  • TDM-7925 Data type is changed to STRING for element with data type NONE after export to avro schema
  • TDM-7964 Problem parsing JSON arrays with heterogeneous elements
  • TDM-7998 Regression: test run will return error when using &/$ in variable name of get/set variable function
  • TESB-28369 CI Publish To Cloud Error when publishing demorest job to Cloud using CI on Studio
  • TESB-28130 Duplicate dependencies in POM.xml for routes lead to compile issues
  • TESB-28815 tRestClient when called in a DS Job using tRunJob fails to load in the runtime
  • TMC-20647 Publishing large Job to the Cloud always ends up with SocketTimeoutException even if it has been uploaded to Cloud
  • TDQ-18174 Remote DQ project: Fixed Git conflict with some system and technical files
  • TDQ-18322 Support the retrieval of the schema on Sybase SQL Anywhere
  • TDQ-18091 tSynonymOutput: Improved the error message on incorrect path parameter
  • TDQ-18383 Fixed the high vulnerability on Log4j dependency
  • TDQ-18444 Fixed CVE issues in Nimbus

Prerequisites

Consider the following requirements for your system:

  • Talend Studio 7.3.1 must be installed.

Installation

Installing the patch using Software update

  1. Log in to Talend Administration Center and go to Settings->Configuration->Software Update. Enter the correct values and save them, as described in the documentation: Configuring the Software Update repository in Talend Administration Center.
  2. Download the new patch from the Settings->Software Update page into the Nexus repository.
  3. Log in to Talend Studio in remote mode.
  4. Click the Update button displayed on the login window to install the patch.
  5. When the patch is installed, if the following cache folder has been created: {Talend-Studio}/configuration/.m2/repository/org/talend/libraries/dataquality-reconciliation-visualization, delete it.
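The cache cleanup in the last step can be scripted. The following is a minimal Python sketch, assuming the Studio installation directory is supplied by the caller. The function name remove_dq_cache is hypothetical and not part of any Talend tooling; only the relative folder path comes from the step above.

```python
import shutil
from pathlib import Path

# Relative path of the cache folder named in the installation steps.
DQ_CACHE_RELATIVE = Path(
    "configuration/.m2/repository/org/talend/libraries/"
    "dataquality-reconciliation-visualization"
)


def remove_dq_cache(studio_home: str) -> bool:
    """Delete the cache folder if it exists; return True if it was removed.

    studio_home is the Talend Studio installation directory
    ({Talend-Studio} in the steps above). Hypothetical helper.
    """
    cache_dir = Path(studio_home) / DQ_CACHE_RELATIVE
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)
        return True
    # Nothing to do: the folder was never created.
    return False
```

The same helper applies to the other installation methods, since each of them ends with the same cleanup step.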

Installing the patch manually using Talend Studio (for Cloud users)

  1. Create a folder named "patches" in your Studio installer directory and copy the patch .zip file to this folder.
  2. Restart your Studio.
  3. Click OK when prompted to install the patch, or restart CommandLine to install the patch automatically.
  4. When the patch is installed, if the following cache folder has been created: {Talend-Studio}/configuration/.m2/repository/org/talend/libraries/dataquality-reconciliation-visualization, delete it.
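Step 1 of the manual installation can be sketched in Python as follows. This is an illustration only: the function name stage_patch and the example paths are hypothetical, and the sketch assumes the patch has already been downloaded as a .zip file.

```python
import shutil
from pathlib import Path


def stage_patch(studio_home: str, patch_zip: str) -> Path:
    """Copy the patch .zip into <studio_home>/patches, creating the folder if needed.

    Returns the path of the copied file. Hypothetical helper, not Talend tooling.
    """
    source = Path(patch_zip)
    if source.suffix.lower() != ".zip":
        raise ValueError(f"expected a .zip patch file, got: {source.name}")
    patches_dir = Path(studio_home) / "patches"
    patches_dir.mkdir(exist_ok=True)
    # copy2 returns the destination path of the copied file.
    return Path(shutil.copy2(source, patches_dir))


# Example call with hypothetical paths:
# stage_patch("/opt/Talend-Studio", "/tmp/downloaded-patch.zip")
```

Studio still has to be restarted afterwards, as described in step 2.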

Installing the patch using CommandLine

Execute the following commands to install the patch:

  1. Talend-Studio-win-x86_64.exe -nosplash -application org.talend.commandline.CommandLine -consoleLog -data commandline-workspace startServer -p 8002 --talendDebug
  2. initRemote {tac_url} -ul {TAC login username} -up {TAC login password}
  3. checkAndUpdate -tu {TAC login username} -tup {TAC login password}
  4. When the patch is installed, if the following cache folder has been created: {Talend-Studio}/configuration/.m2/repository/org/talend/libraries/dataquality-reconciliation-visualization, delete it.
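To avoid typing mistakes when filling in the placeholders, the two CommandLine commands from steps 2 and 3 can be rendered from variables. The Python sketch below only builds the command strings shown above; the function name is hypothetical, and how the strings are passed to the running CommandLine server, and whether values containing spaces need quoting, is not covered here.

```python
from typing import List


def commandline_patch_commands(tac_url: str, username: str, password: str) -> List[str]:
    """Render the initRemote and checkAndUpdate commands from the steps above.

    Hypothetical helper: it performs plain string substitution and does not
    contact Talend Administration Center or CommandLine.
    """
    return [
        f"initRemote {tac_url} -ul {username} -up {password}",
        f"checkAndUpdate -tu {username} -tup {password}",
    ]


# Example with placeholder values:
for command in commandline_patch_commands(
    "http://tac.example.com:8080/tac", "user@example.com", "s3cret"
):
    print(command)
```

Because the rendered commands contain the password in clear text, avoid writing them to shared logs or files.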

After installing the patch, you need to stop CommandLine and clean the org.eclipse.osgi folder under the {Talend-Studio}/configuration directory, where {Talend-Studio} is the installation directory of your Talend Studio.

Installing the patch using Continuous Integration

To install the patch using the CI builder, use the -Dpatch.path option at build time. See Building and Deploying for details. When the patch is installed, if the following cache folder has been created: {Talend-Studio}/configuration/.m2/repository/org/talend/libraries/dataquality-reconciliation-visualization, delete it.
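As a rough illustration of passing the option at build time, the Python sketch below assembles a build command line that includes -Dpatch.path. Only the -Dpatch.path option comes from this page; the mvn executable name, the extra arguments, and the function name are assumptions, so check Building and Deploying for the actual build invocation.

```python
from typing import List, Sequence


def ci_build_args(patch_location: str, extra_args: Sequence[str] = ()) -> List[str]:
    """Build an argument list that passes -Dpatch.path to the CI build.

    Hypothetical helper. "mvn" and extra_args are assumptions; only the
    -Dpatch.path option is taken from the release notes.
    """
    if not patch_location:
        raise ValueError("patch_location must not be empty")
    return ["mvn", f"-Dpatch.path={patch_location}", *extra_args]


# Example with a hypothetical patch location and hypothetical extra arguments:
print(ci_build_args("/opt/patches/patch.zip", ["clean", "package"]))
```

Returning a list rather than a single string keeps each argument separate, which suits subprocess.run and avoids shell quoting issues with paths that contain spaces.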

Note: It is strongly recommended to resynchronize poms after installing the patch by clicking the Force full re-synchronize poms button in the Build -> Maven view in the Project Settings dialog box in Talend Studio.

Changes from previous monthly releases

New features

This patch contains the following features:

  • TBD-10198 Add database/table parameters on DeltaLake components
  • TDM-7679 Flattener: generate the flattening map (via New Map wizard)
  • TDM-7095 Get the index of the parent Loop
  • TDI-43218 Azure storage components provide connectivity through SAS Token or Account key. This feature adds connectivity through AD authentication.

Fixed issues

This patch contains the following fixes:

  • TUP-26809 Fix: A built Job could contain jars from test cases as well, while it should contain only jars from the Job
  • TUP-27077 NoClassDefFoundException when using "independent process to run subjob" and tAzureAdlsGen2Input
  • TUP-26596 Proxy of libraries not working when Studio has no internet access
  • TUP-26189 Metadata connection with a proxy may not select the proxy properly
  • TUP-26264 Query generated by tELTMap contains extra symbols when the component has several input mappings
  • TUP-26156 tCreateTable: changing "DBType" and "Property Type" does not work
  • TUP-26482 Studio is very slow to build the Job (performance issue)
  • TUP-26212 Sharing more than one car component to Nexus 3.19.1 fails to generate the right index file
  • TUP-26213 Compilation issue after migration to v7.2
  • TUP-26539 High Memory Consumption by Studio with GIT
  • TUP-26793 JDBC Redshift in context mode still asks for jars that are not required
  • TUP-26876 NoClassDefFoundError when running a Spark Job with JobServer
  • TDI-43810 fix MongoDB issue with option "Create empty element if needed"
  • TDM-7969 TDM adds unencrypted passwords to error message
  • TDM-7957 Studio commandLine error on Headless Linux following Git pull
  • TDM-7952 Nested distinct element causes performance issues
  • TDM-7932 Flatten: some value isn't mapped after flatten map
  • TDM-7931 Flatten: returns an error when creating a flattening map from a JSON structure
  • TDM-7929 CSV importer needs to clean unsupported headers in XQuery
  • TDM-7928 Move "Flattening Map" option at 2nd position in New Map Wizard
  • TDM-7924 Merged file path is wrong after a tHMapFile Job when "merge file path" is the same as "output folder"
  • TDM-7894 CSV Show Sample on an element displays only one record
  • TDM-7870 Importing an Avro schema fails for OpenAPI JSON
  • TDM-7867 Exclude Commons Collections 3.2.1
  • TDM-7865 Bad initial values and exception on the LoopIndex function dialog
  • TDM-7858 IsPresent updates are not persisted
  • TDM-7855 Exception thrown on drag/drop of looping element where dialog was expected
  • TDM-7840 Variable names in GetVariable/SetVariable produce unfriendly errors
  • TDM-7839 LoopIndex in a distinct NestedContext Simple Loop can not detect the index_of element
  • TDM-7836 Distinct function generates an invalid XQuery in the context of AgConcat
  • TDM-7835 Distinct function generates a NullPointerException in the context of root elements
  • TDM-7834 Distinct function generates a NullPointerException in case of a SimpleLoop having a NestedContext
  • TDM-7829 tHMapFile merge function does not check the folder name correctly
  • TDM-7826 tHMapFile merge option does not work with Hadoop 3.1
  • TDM-7789 CSV reader should use the optimization done for the CSV writer
  • TDM-7772 Don't include a _osdtTerminator column when importing a CSV
  • TDQ-18135 tDataEncrypt cannot generate a crypto file with context variable
  • TDQ-17954 tMatchIndex, tMatchIndexPredict: parameter "Nodes" and "Index" can not use context mode value
  • TDQ-18220 After migrating from 6.3 to 7.2 the tMatchGroup job has compiler errors
  • TBD-10355 Missing Dataset to RDD method call in a migrated spark job
  • TBD-10005 HDP3.1 Compile error tHiveOutput with partitioning: hiveContext_tHiveOutput_1 can not be resolved
  • TBD-10101 tFileOutputDelimited cannot keep the format of a date column
  • TBD-10167 Compiler error when running a Spark Job on Databricks with a tS3Configuration that uses a context value
  • TBD-10212 [BUG] Avro dependencies cause error in databricks
  • TBD-9674 tHiveOutput append action in spark big data batch job under HDP3.1
  • TBD-9872 byte[] type is written to file incorrectly
  • TBD-9986 User token is printed as plain text in the Job log when Databricks debug-level logging is on
  • TBD-10081 Fix wrong release maven url with -SNAPSHOT in artifact id
  • TBD-10108 Date conversion in tExtractJSONFields is not correct
  • TBD-10157 Advanced option is not considered when using tHDFSPut in Standard jobs
  • TBD-10158 Migration task can alter the pattern selected by user
  • TBD-10210 Cannot load component "tFileOutputJSON"
  • TBD-10217 org.apache.parquet cannot be resolved to a type
  • TBD-10297 slf4j logger does not contain error() method
  • TBD-10008 Wrong timestamp in tFileOutputDelimited component
  • TBD-10063 Errors on spark with Yarn Cluster with custom hadoop path on Remote Engines
  • TBD-10251 Fix issues with some characters for hadoop configuration jar
  • TUP-26576 Remove the warning about repository setup even if artifact repository is disabled in TMC
  • TUP-26751 Fix CI issues, due to invalid test case
  • TUP-26165 Fix possible missing jars when build job
  • TUP-26728 Missing spark dependencies when using test cases and fix spark compilation issues
  • TBD-10324 Fix compilation issue with RDD with tMap
  • TBD-10284 Fix issues with custom hadoop configuration with DI jobs and Spark jobs with Yarn-Cluster setup