Data Quality: new features - 7.1

Talend Data Integration products Release Notes

author
Talend Documentation Team
EnrichVersion
7.1
EnrichProdName
Talend Data Integration
Talend Data Management Platform
Talend Open Studio for Data Integration
Talend Open Studio for Data Quality
task
Installation and Upgrade

Feature

Description

Talend Open Studio for Data Quality

Talend Data Management Platform

Word-based patterns profiling in Talend Studio: In this more generic type of pattern profiling, the unit of analysis is the word rather than the individual character.

Word-based patterns make new data patterns easier to spot during data preparation, exploratory analysis and data discovery.
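The difference between character-level and word-level granularity can be illustrated with a minimal sketch. The function names and the pattern symbols used here ("A", "a", "9", "[Word]", "[WORD]", "[word]", "[number]", "[mixed]") are assumptions chosen for illustration only; they are not taken from Talend Studio and may differ from the symbols the Profiling perspective displays.

```python
import re


def char_pattern(text: str) -> str:
    """Character-based pattern: one symbol per character (illustrative symbols)."""
    out = []
    for ch in text:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A" if ch.isupper() else "a")
        else:
            out.append(ch)  # keep spaces and punctuation as they are
    return "".join(out)


def word_pattern(text: str) -> str:
    """Word-based pattern: one symbol per word (illustrative symbols)."""

    def classify(match: re.Match) -> str:
        token = match.group(0)
        if token.isdigit():
            return "[number]"
        if token.isalpha():
            if token.isupper():
                return "[WORD]"
            if token.islower():
                return "[word]"
            if token[0].isupper() and token[1:].islower():
                return "[Word]"
        return "[mixed]"

    # A "word" here is a run of letters or digits; underscores are excluded.
    return re.sub(r"[^\W_]+", classify, text)


print(char_pattern("John Smith 42"))  # Aaaa Aaaaa 99
print(char_pattern("Jane Doe 7"))     # Aaaa Aaa 9
print(word_pattern("John Smith 42"))  # [Word] [Word] [number]
print(word_pattern("Jane Doe 7"))     # [Word] [Word] [number]
```

In this sketch the two values produce different character-based patterns but the same word-based pattern, which is why the word-level view groups records more coarsely.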

Profiling Japanese data in Talend Studio: Japanese characters are supported in the Profiling perspective at a level similar to that provided for Latin characters. This enables data curation and data quality work on Japanese data.
Processing Japanese data in Talend Studio: New components, which work in the Apache Spark framework, have been introduced in Talend Studio:
  • tJapaneseNumberNormalize normalizes Japanese numbers (kansūji) to regular Arabic numbers.
  • tJapaneseTokenize splits Japanese text into tokens.
  • tJapaneseTransliterate converts Japanese text to kana and Latin scripts.
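As an illustration of what normalizing kansūji to Arabic numbers involves, here is a minimal sketch. It is not the implementation of tJapaneseNumberNormalize; the function name `kansuji_to_int` is hypothetical, and the sketch assumes the traditional unit-based notation (for example 三百二十一) with units up to 億. Positional notation such as 二〇一八 is not handled.

```python
DIGITS = {"〇": 0, "一": 1, "二": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
LARGE_UNITS = {"万": 10 ** 4, "億": 10 ** 8}


def kansuji_to_int(text: str) -> int:
    """Convert a unit-based kansuji numeral to an int (illustrative sketch)."""
    total = 0    # sum of completed groups closed by 万 or 億
    section = 0  # value accumulated inside the current group (below 10,000)
    digit = 0    # digit waiting for a unit, or the final ones digit
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A unit with no preceding digit counts once: 十 means 10.
            section += (digit or 1) * SMALL_UNITS[ch]
            digit = 0
        elif ch in LARGE_UNITS:
            total += (section + digit) * LARGE_UNITS[ch]
            section = 0
            digit = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + section + digit


print(kansuji_to_int("三百二十一"))  # 321
print(kansuji_to_int("一万二千"))    # 12000
```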
Data masking for Asian data in Talend Studio: The following functions in the tDataMasking component support Asian characters:
  • Generate from Pattern
  • Replace characters between two positions
  • Replace all
  • Replace all letters
  • Replace n first characters
  • Replace n last characters
Consistent data masking in Talend Studio: The Generate unique phone number function has been added to the tDataMasking component. It masks phone numbers for China, France, Germany, India, Japan, the UK and the US by generating valid, unique random phone numbers.
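One way to guarantee that distinct inputs yield distinct masked values is to apply a one-to-one mapping to the digits. The sketch below uses an affine map modulo a power of ten, a technique swapped in purely for illustration; it is not how the Generate unique phone number function is implemented. The function name, the constants and the `keep` parameter are hypothetical, and the sketch does not check that the result is a valid number for any country.

```python
MULTIPLIER = 7919  # odd and not a multiple of 5, so coprime to any power of 10
OFFSET = 104729    # arbitrary constant for the example


def mask_phone_digits(number: str, keep: int = 3) -> str:
    """Keep the first `keep` digits and remap the rest one-to-one."""
    if not (number.isascii() and number.isdigit()):
        raise ValueError("expected a string of ASCII digits")
    head, tail = number[:keep], number[keep:]
    if not tail:
        return number
    modulus = 10 ** len(tail)
    # Because MULTIPLIER is coprime to modulus, this map is a bijection
    # on 0..modulus-1: two different tails never collide.
    masked = (int(tail) * MULTIPLIER + OFFSET) % modulus
    return head + str(masked).zfill(len(tail))


originals = ["033" + f"{i:04d}" for i in range(10000)]
masked = {mask_phone_digits(n) for n in originals}
print(len(masked) == len(originals))  # True
```

The same input always produces the same output, which is what makes this kind of masking consistent across datasets.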
Get international phone number using the tGoogleAddressRow component: The tGoogleAddressRow component can now retrieve international phone numbers.
Audit user actions in Talend Dictionary Service: All user actions in Talend Dictionary Service can be audited, including login/logout, configuration updates and deployment. This helps ensure compliance with security rules and regulations.
Mass actions on semantic types in Talend Dictionary Service: You can now import, export, remove and publish multiple semantic types at once, which lets you promote a whole project from one environment to another in a single operation.
Internationalization: The Profiling perspective interface in Talend Studio has been translated into Chinese, increasing international reach.
Support for additional databases: Talend now supports additional databases for the data quality data mart, Talend DQ Portal and data quality components:
  • Microsoft SQL Server 2017
  • MySQL 8.0
  • PostgreSQL 10
Support for additional databases: Talend now supports additional databases for the Profiling perspective:
  • Denodo
  • Microsoft SQL Server 2017
  • MySQL 8.0
  • PostgreSQL 10
Spark 2.3 support: Talend supports Spark 2.3 (local mode) when running Jobs in Talend Studio with the following components:
  • tALSModel
  • tDataMasking
  • tDataShuffling
  • tJapaneseNumberNormalize
  • tJapaneseTokenize
  • tJapaneseTransliterate
  • tMatchIndex
  • tMatchIndexPredict
  • tMatchModel
  • tMatchPairing
  • tNaiveBayesModel
  • tPatternMasking
  • tPredict
  • tRecommend
  • tReservoirSampling
  • tRuleSurvivorship
  • tStandardizePhoneNumber
  • tSynonymSearch
  • tTransliterate
  • tVerifyEmail