Talend Open Studio for Data Quality
Talend Data Management Platform
|Survivorship rules per column||You can now run survivorship rules on a per-column basis, which gives you finer control over the master value you want to keep and extends the ways in which survivorship can be defined.|
|Introduction of a new component: tPatternMasking||tPatternMasking consistently masks data that follows a specific pattern. You can define custom masking patterns in a way that is closer to regular expressions, avoiding the need to manually create a dictionary of values.|
|Checkpointing interval in machine learning components (tALSModel, tMatchModel, tMatchPredict and tRandomForestModel)||You can now activate Spark checkpointing and configure the checkpointing interval in machine learning components. This breaks up long Resilient Distributed Dataset (RDD) lineage by saving the intermediate RDDs to a checkpointing directory at the configured interval. Checkpoints are useful when the lineage graphs are long and help avoid StackOverflow errors in high-iteration jobs.|
|Maximum Elasticsearch bulk size parameter in tMatchIndex and tMatchIndexPredict components||You can now define the maximum number of records for bulk operations in Elasticsearch.|
|Context node in the Repository tree view of Talend Studio||You can now create, edit, delete, import and export context items from the Profiling perspective.|
|Microsoft SQL Server database profiling||You can connect to a Microsoft SQL Server database using Windows authentication mode and perform data profiling on this database.|
|Microsoft SQL Server 2016 support||Talend DQ Portal supports Microsoft SQL Server 2016. You can use a Microsoft SQL Server 2016 database for the data quality data mart.|
|New DMG for Talend installation on macOS||A new DMG is provided for installing Talend Open Studio products on macOS. It prevents macOS Sierra from placing the downloaded files in quarantine.|
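
The tPatternMasking entry above describes consistent, pattern-based masking. The following Python sketch illustrates the general idea only; it is not Talend's implementation, and the pattern format, the `CLASSES` table, the `mask` function and the demo key are all hypothetical names chosen for this example. It assumes a pattern is a sequence of literal fields and fixed-length character-class fields, and it uses a keyed hash so that the same input always yields the same masked value.

```python
import hashlib
import hmac
import string

# Character classes a field may draw from (illustrative names, not Talend's).
CLASSES = {
    "digit": string.digits,
    "upper": string.ascii_uppercase,
    "lower": string.ascii_lowercase,
}


def mask(value, pattern, key):
    """Mask `value` field by field, keeping literals and the overall format.

    `pattern` is a list of fields: ("literal", text) or (class_name, length).
    The same value and key always produce the same masked output.
    """
    # Derive a deterministic byte stream from the key and the whole value.
    stream = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    out = []
    pos = 0   # current position in `value`
    used = 0  # number of stream bytes consumed so far
    for kind, arg in pattern:
        if kind == "literal":
            # Literal fields must match exactly and are copied unchanged.
            if not value.startswith(arg, pos):
                raise ValueError("value does not match the pattern")
            out.append(arg)
            pos += len(arg)
        else:
            alphabet = CLASSES[kind]
            chunk = value[pos:pos + arg]
            if len(chunk) != arg or any(c not in alphabet for c in chunk):
                raise ValueError("value does not match the pattern")
            # Replace each character with one from the same class.
            for _ in range(arg):
                byte = stream[used % len(stream)]
                out.append(alphabet[byte % len(alphabet)])
                used += 1
            pos += arg
    if pos != len(value):
        raise ValueError("value is longer than the pattern")
    return "".join(out)


# Hypothetical pattern: three digits, a hyphen, four digits.
PHONE = [("digit", 3), ("literal", "-"), ("digit", 4)]
masked = mask("555-0142", PHONE, b"demo-key")
print(masked)
```

Because the replacement characters are derived from a keyed hash of the input rather than looked up in a table, no dictionary of values has to be maintained, and repeated runs with the same key mask a given value identically.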