Data Catalog Application - 7.1

Talend Data Catalog Release Notes

author
Talend Documentation Team
EnrichVersion
7.1
EnrichProdName
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
task
Installation and Upgrade
EnrichPlatform
Talend Data Catalog

General changes

Feature Description
OpenJDK All Java code is now compiled with OpenJDK.
Upgraded third-party and open source software All third-party and open source software has been upgraded to the latest versions for better security and vulnerability protection.

New Data Cataloging capabilities

Feature Description
New Data Cataloging capabilities Talend Data Catalog includes the essential features of Data Cataloging, such as faceted search, data sampling and profiling, semantic discovery, social curation and data relationship discovery.

You can manage enterprise data architectures for both cloud-based data lakes and classic data warehouses.

You can import metadata from both modern data technologies, including XML, JSON, Avro, Parquet, ORC, and Hive tables, and classic ones, including relational tables and CSV files.

New user interface experience

The Talend Data Catalog user interfaces have been refreshed to have a modern and more distinct look.

Feature Description
Metadata Explorer Talend Data Catalog Standard edition is now fully implemented in Metadata Explorer. You can directly create models, import metadata, stitch models, and trace lineage.
Metadata Manager Metadata Manager is now only available in the Talend Data Catalog Advanced and Advanced Plus editions for repository management, including multi-version and configuration management.

Metadata Explorer

Feature Description
Improved editing capabilities All editing capabilities are now available in Metadata Explorer, including data mapping, enterprise architecture and administration features.
Improved browse and search capabilities with Apache Lucene indexing Talend Data Catalog now uses Apache Lucene as its search engine to improve performance.

The BROWSE menu now presents a hierarchical display at all levels of any data source.

You can now use the semantic search syntax and the faceted search capability for better search results.
New metadata reporting page With the new metadata reporting capabilities, both the search and browse features lead to the same reporting page.

You can select multiple categories and subsets by content before drilling down with the default and custom filters.

You can reuse a report by saving its URL as a favorite.

New Lists feature You can define and manage lists of metadata objects and share them with other users. Lists can contain any type of metadata and multiple types of content.
Improved architecture diagram The graphical navigation and display have been improved.
New administration capabilities You can now handle most administration tasks from the MANAGE menu.
Redesigned metadata element home pages The metadata element home pages have been redesigned with multiple tabs to offer quick access to all key information.
Improved metadata tagging with labels Metadata tagging with labels has been harmonized with the new list management experience, making it easier to add or remove objects anywhere.

Metadata documentation

Feature Description
Multi-line text documentation Multi-line text has been introduced for better formatting and layout, with support for URL links and embedded image attachments using a JIRA-like syntax.

Multi-line text is the default format for all descriptions and comments and is now available as a new type of custom attribute that can be applied to any metadata for documentation.

Rich text documentation Rich text documentation with WYSIWYG visual editing is the default format for glossary term documentation and is now available as a new type of custom attribute that can be applied to any metadata for documentation.
SQL text of SQL View The SQL text of SQL views, stored procedures and more is now better presented, with syntax coloring and optional reformatting.

This is not a new type of custom attribute; rather, any predefined attribute containing SQL is now better formatted.

Improved attachment capability Attachments have been enhanced and integrated in Metadata Explorer, with management (drag and drop), preview, and thumbnails. They can be embedded in text and multi-line text descriptions, comments and custom attributes.

Data modeling and documentation process

Feature Description
Improved documentation process with the new Semantic Flow tab Any Data Integration (DI) object, Business Intelligence (BI) object, report or data store can now be documented, including support for relational data models.

You can classify any object with a local semantic link to a glossary term.

You can document any object with a local business name and definition overwriting any semantic link (Classified, Mapped or Inferred).

Physical data models Physical data models are now only available in the Talend Data Catalog Advanced Plus edition for data store requirements and database design.

You can use physical data models to create data models from scratch, such as designing new Hive table requirements without pointing to a live database, rather than simply documenting existing data stores.

Harvesting of databases documented in Data Modeling tools You can import a data model as a separate model and automatically stitch it to its matching harvested database without using any semantic mapping model.

The semantic stitching is automatically maintained as both the database and its associated data model are independently re-imported on a regular basis.

The documentation (business name, descriptions, relationships, diagrams) of any harvested database table or column is automatically inherited from its associated data model.

Data sampling, profiling and security

Feature Description
New data security roles The Data Viewer role allows you to view data profiling and sampling information.

The Data Manager role allows you to run data profiling and sampling. You can also hide sensitive data.

New data security protection Data managers can set a Hide Data property at the column level. You can automatically hide sensitive data using the semantic types.
New Sample Data tab You can view sample data to better define the metadata.
New data profiling statistics The data profiling statistics are displayed in the Overview tab from the home page of any data store object.
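
As an illustration of hiding sensitive data by column property or by semantic type, here is a minimal sketch. It is not Talend Data Catalog's API: the function name, the semantic type names and the "***" placeholder are all assumptions made for this example.

```python
# Hypothetical illustration only; not Talend Data Catalog's API.
# The semantic type names below are assumptions.
SENSITIVE_TYPES = {"EMAIL", "SSN"}


def mask_sample(rows, column_types, hidden_columns=frozenset()):
    """Return sample rows with hidden or sensitive columns masked.

    rows           -- list of dicts mapping column name to sampled value
    column_types   -- dict mapping column name to a detected semantic type
    hidden_columns -- columns explicitly flagged as hidden (assumed to play
                      the role of a column-level Hide Data property)
    """
    def is_hidden(col):
        return col in hidden_columns or column_types.get(col) in SENSITIVE_TYPES

    return [
        {col: ("***" if is_hidden(col) else value) for col, value in row.items()}
        for row in rows
    ]


rows = [{"name": "Ada", "email": "ada@example.com", "city": "London"}]
column_types = {"email": "EMAIL", "city": "CITY"}
print(mask_sample(rows, column_types, hidden_columns={"name"}))
```

In this sketch, the "name" column is hidden because it was flagged explicitly, and the "email" column is hidden because of its semantic type.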

Semantic type discovery and management

Feature Description
Semantic discovery You can use semantic discovery to automatically detect the nature of data, using semantic types, patterns, lists and machine learning.
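
To give an idea of how pattern-based and list-based detection can work in general, here is a minimal sketch. It is not the product's implementation: the semantic type names, the regular expressions, the value list and the 0.8 threshold are assumptions, and the machine learning part is not covered.

```python
import re

# Hypothetical semantic types; names, patterns and lists are assumptions.
PATTERNS = {
    "EMAIL": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "US_ZIP": re.compile(r"^\d{5}(-\d{4})?$"),
}
VALUE_LISTS = {
    "COUNTRY_CODE": {"FR", "US", "DE", "GB"},
}


def detect_semantic_type(values, threshold=0.8):
    """Return the best-matching semantic type for a column sample, or None."""
    values = [v for v in values if v]  # ignore empty values
    if not values:
        return None
    scores = {}
    # Share of sampled values matching each pattern
    for name, pattern in PATTERNS.items():
        scores[name] = sum(1 for v in values if pattern.match(v)) / len(values)
    # Share of sampled values found in each reference list
    for name, allowed in VALUE_LISTS.items():
        scores[name] = sum(1 for v in values if v in allowed) / len(values)
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


print(detect_semantic_type(["a@b.com", "c@d.org", "x@y.io"]))
print(detect_semantic_type(["FR", "US", "DE"]))
```

A type is only reported when at least the threshold share of sampled values matches, so a column with a few stray values can still be classified.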

Relationship discovery and management

Feature Description
Relationship discovery Talend Data Catalog can automatically draw the links between datasets with relationship discovery. It can automatically detect:
  • inferred relationships using the surrounding data flow usage, such as joins in Data Integration and Business Intelligence activities.
  • new relationships using the relationship detection search feature based on metadata name, semantic definition and semantic type matching.
Relationship Management You can use social curation on the relationships.
Dynamic diagram You can visualize the relationships in dynamic diagrams from the Relationships tab.
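
The name-matching part of relationship detection can be pictured with the following minimal sketch. It is an assumption-laden illustration, not the product's algorithm: the function names, the normalization rules and the 0.85 similarity threshold are hypothetical, and semantic definition and semantic type matching are left out.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a column name and drop common separators."""
    return name.lower().replace("_", "").replace("-", "").replace(" ", "")


def candidate_relationships(left, right, min_score=0.85):
    """Return (left_column, right_column, score) for similarly named columns.

    `left` and `right` map a table name to its list of column names.
    """
    results = []
    for left_table, left_cols in left.items():
        for right_table, right_cols in right.items():
            for lc in left_cols:
                for rc in right_cols:
                    score = SequenceMatcher(
                        None, normalize(lc), normalize(rc)
                    ).ratio()
                    if score >= min_score:
                        results.append(
                            (f"{left_table}.{lc}", f"{right_table}.{rc}", round(score, 2))
                        )
    # Best candidates first
    return sorted(results, key=lambda r: -r[2])


orders = {"orders": ["order_id", "customer_id"]}
customers = {"customers": ["CustomerID", "name"]}
print(candidate_relationships(orders, customers))
```

Candidates found this way would still need review, which is where social curation on relationships comes in.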

Social curation

Feature Description
New social curation capabilities You can add endorsements, warnings and certifications on metadata elements, with impact on search ranking.

Semantic mapping

Feature Description
Improved semantic mapping The semantic mapping has been improved with two approaches:
  • top-down from business glossary term or data model entity/attribute,
  • bottom-up from Data Store tables/columns or report fields.

Semantic flow

Feature Description
Improved semantic flow analysis The semantic flow analysis now supports the documentation process by acting as an interactive dashboard for finding definitions that are:
  • Local, that is, either Imported (metadata harvesting) or locally Documented (edited description overwrite),
  • locally Classified within the model to an external glossary term,
  • directly Mapped via a semantic mapping model or direct stitching (such as between a database and its data model),
  • indirectly Inferred through complex data flow pass through and semantic flow (which can be graphically analyzed in the data flow diagram), or
  • Searched for by name in all glossaries.

Any of the Searched, Inferred or Mapped definitions can be quickly reused/promoted as a Classified or Mapped definition.

Related reports

Feature Description
Improved related reports Related reports are now available on any metadata object, such as files, tables or columns.

From a search result, you have direct access to a simple list of related reports in any Business Intelligence tool, across all semantic and data flows.

You can open these reports directly in their respective Business Intelligence tools.

Metadata stitching

Feature Description
Improved data connection and metadata stitching capabilities File format harvesting and stitching are now fully supported in Talend Data Catalog.

Connection pools (such as those from Data Integration and Business Intelligence servers) are now factorized to minimize the number and complexity of stitching connections.

Data Mapping specifications and design

Feature Description
New Data Mappings model The data mapping specifications and design have been fully redesigned and merged into a new Data Mappings model.

You can use data mappings for multiple purposes, including capturing data flow mapping requirements, and developing a full data mapping design that can be exported into SQL scripts or Data Integration/ETL tool jobs.

Improved data mapping capabilities The Data Mapping tool allows for the mapping of multiple source data stores into a target data store in multiple steps with bulk mappings and query mappings.

It also offers a new graphical mapping visualization and new expression syntax editors for designing joins, lookups, filters, and so on.

Active Data Governance

Feature Description
New export capabilities for physical data models or models Depending on your licence edition, you can export physical data models (PDM) or models harvested from Data Modeling tools or relational databases:
  • to any supported Data Modeling tools (such as Erwin),
  • to any supported Data Integration (DI/ETL) tools (such as Talend Data Integration) as source/target models,
  • to any supported Business Intelligence Design Tools (such as Tableau, SAP BusinessObjects or IBM Cognos).
New export capabilities for data mapping specifications and design You can export data mapping specifications and design to Talend Data Integration, depending on your licence edition.

Architecture, deployment and integration

Feature Description
Improved search engine The search engine has been redesigned and optimized with Apache Lucene, offering near real-time search and navigation and removing any dependency on the text search capabilities of the underlying database.
Upgraded embedded third-party software Embedded third-party tools have been upgraded to their latest versions, including Java 8, Apache Tomcat 9 and PostgreSQL 10, for security and performance improvements.
Improved Single Sign On (SSO) integration architecture The Single Sign On (SSO) integration architecture has been redesigned to simplify external authentication, with redirection handled by custom scripts in any language, such as Python.

Upcoming cumulative patches will add support for native cloud authentication, such as Amazon AWS and Microsoft Azure.

Support for OAuth 2.0 (Open Authorization) Talend Data Catalog now supports SSO authentication with OAuth 2.0.
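
For readers unfamiliar with OAuth 2.0, the sketch below shows how a generic authorization request URL is built for the authorization code grant. It does not describe how Talend Data Catalog is configured: the function name, the endpoint, the client identifier, the redirect URI and the scope are all hypothetical placeholders.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scope):
    """Build an OAuth 2.0 authorization request URL (authorization code grant).

    Returns the URL and the generated `state` value, which the caller should
    store and compare with the value received on the redirect.
    """
    state = secrets.token_urlsafe(16)  # random value to protect against CSRF
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state


# All values below are placeholders, not real endpoints or identifiers.
url, state = build_authorization_url(
    "https://idp.example.com/authorize",
    "catalog-client",
    "https://catalog.example.com/callback",
    "openid",
)
print(url)
```

After the user authenticates, the identity provider redirects the browser to the redirect URI with a code and the same state value; exchanging that code for a token is not shown here.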