Talend Data Catalog Application

Feature Description
New metamodel management
With the new metamodel management feature, you can define custom models and extend imported models for data management such as reference data, data quality, data security, data issue management, business rules and policies, business process modeling and improvements or regulation compliance.
The metamodeling capability is available in the Manage Metamodel page.
New custom business models
You can use the new metamodeling capability to define custom models, such as models related to reference data, business policy and rule management or data issue management.

Custom models are instantiations of a custom model type defined in the Manage Metamodel page. Talend Data Catalog provides an object modeling capability and a graphical editing capability of UML class diagrams for these custom models.

Talend Data Catalog also provides standard and system business models and model extensions. The Standard package now includes the Glossary model with new KPI and Acronym objects.

Once custom models are defined, you can use on them the same features that are available for imported models, including data entry, analysis and reporting using worksheets and dashboards.
  • You can use the new Hierarchies tab in the object page to explore the models hierarchically and facilitate the data entry and reporting.
  • You can enable and customize workflow and publication processes on custom models to control the changes made to the object classes from the Workflow tab.
  • You can import or export subsets of the metamodel for external bulk editing and reporting using metamodel packages. This allows you to develop and supply third-party extension packages and define actual connectors with the tools or applications behind the custom models, such as JIRA for a model related to data issue management. This feature is available from the Manage Metamodel page.
New extensions of imported technical models
You can use the new metamodeling capability to extend the data documentation of imported models.
Imported models are the models associated with an import bridge and populated through the harvesting process.
  • You can define new imported object types from the Manage Metamodel page. It allows you to group similar profile objects into collections of imported object types and apply custom attributes to these collections at the same time.
  • You can define, apply and reuse custom attributes for both imported and custom objects. You no longer need to redefine the scope of each custom attribute applying to similar imported objects.
  • You can define custom relationships from custom objects to imported objects, for example to enforce business rules and policies in your data assets.

    Custom relationships can also be set to be involved in the semantic flow. For example, you can see the term definition of a table column and the associated business rules in the Semantic Flow tab.

  • You can define a new Is Defined By relationship to implement the process of term documentation. It helps you to adapt the processes to the continuous changes in technology and architectures, such as reimporting the data documentation into a new implementation after a cloud migration.
New role-based access control with global and object roles
Talend Data Catalog now provides a new role-based access control.
  • The global roles determine the global responsibilities that users have on all catalog assets. You manage these roles from the new Manage Global Roles page.
  • The object roles determine the responsibilities that users have on specific catalog assets, such as glossaries or models. You manage these roles from the new Manage Object Roles page.
    You assign the object roles to objects from the new Responsibilities tab, which is available in the configuration manager, the repository manager and the object page.
  • The global and object roles provide a set of predefined capabilities that define the actions you can perform in the catalog, such as management or editing capabilities.
  • Predefined global and object roles are available and customizable to meet the specific needs of your organization. The predefined roles are no longer hard-coded.

    You can customize the predefined global and object roles and modify the capability assignment. You can also create new roles from scratch or based on existing ones.

  • When you associate users or groups of users with object or global roles, this association is referred to as a responsibility.
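
The relationship between roles, capabilities and responsibilities described above can be pictured with a minimal Python sketch. It is not Talend Data Catalog's implementation: the class names, the capability names ("edit", "manage"), the "Administrator" role and the users are all hypothetical, chosen only to illustrate that a global role applies to all assets while an object role applies to the object it is assigned on.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Role:
    name: str
    scope: str  # "global" or "object" (illustrative values)
    capabilities: FrozenSet[str]  # hypothetical capability names


@dataclass(frozen=True)
class Responsibility:
    """Association of a user or group with a role."""
    principal: str
    role: Role
    target: Optional[str] = None  # object name for object roles, None for global roles


def can(principal: str, capability: str, target: str,
        responsibilities: Iterable[Responsibility]) -> bool:
    """Return True if any responsibility grants the capability on the target."""
    for resp in responsibilities:
        if resp.principal != principal:
            continue
        if capability not in resp.role.capabilities:
            continue
        # A global role covers every asset; an object role covers only its target.
        if resp.role.scope == "global" or resp.target == target:
            return True
    return False


steward = Role("Steward", "object", frozenset({"edit"}))
administrator = Role("Administrator", "global", frozenset({"edit", "manage"}))
responsibilities = [
    Responsibility("alice", steward, target="Glossary A"),
    Responsibility("bob", administrator),
]
```

In this sketch, alice can edit only the glossary on which she holds the object role, whereas bob's global role applies to any asset.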
New data classification
Data classification helps you to detect, understand and classify the nature and purpose of the elements contained in the data sources imported in your catalog. Data classes now replace the semantic types.
You manage data classes from the new Manage Data Classes page.
There are several types of data classes:
  • The Data type has been improved and helps you to detect the nature of data using the data sampling and profiling capability based on enumerations, patterns and regular expressions.
    You can now set matching rules using the new matching threshold and the uniqueness threshold features.
    You can now use the new automatic semantic discovery based on machine learning for data patterns or enumerations, such as for automatically learning new code values. This process improves the suggestions for the auto tagging made by Talend Data Catalog.

    You can now use the server-side re-classification capability on demand after adding or updating data classes. You no longer need to rerun the data sampling and profiling operation to propagate the changes.

  • The new Metadata type detects classes by metadata attributes. It helps you to detect sensitive data that cannot be identified by the data sampling and profiling process. This feature is powered by the Metadata Query Language.
  • The Compound type has been improved and is based on multiple data classes.
Talend Data Catalog provides new PII data classes of type Data, Metadata and Compound to help you to identify and hide sensitive data easily.
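
As a rough illustration of how pattern-based detection with thresholds could work, here is a minimal Python sketch. It is an assumption-laden example, not the product's algorithm: the function name is hypothetical, and the meaning given to the two thresholds (the minimum share of sampled values that match the pattern, and the minimum share of distinct values) is a guess made for illustration only.

```python
import re
from typing import Iterable


def matches_data_class(values: Iterable[str], pattern: str,
                       matching_threshold: float,
                       uniqueness_threshold: float = 0.0) -> bool:
    """Decide whether sampled column values fit a regex-based data class.

    Assumed semantics (not confirmed by the release notes):
    - matching_threshold: minimum fraction of non-empty values matching the pattern
    - uniqueness_threshold: minimum fraction of non-empty values that are distinct
    """
    non_empty = [v for v in values if v]
    if not non_empty:
        return False
    regex = re.compile(pattern)
    match_ratio = sum(1 for v in non_empty if regex.fullmatch(v)) / len(non_empty)
    unique_ratio = len(set(non_empty)) / len(non_empty)
    return match_ratio >= matching_threshold and unique_ratio >= uniqueness_threshold


# Hypothetical sample: three of four values look like five-digit codes.
sample = ["12345", "90210", "abcde", "10001"]
five_digits = r"\d{5}"
```

With this sample, a matching threshold of 0.7 accepts the column and a threshold of 0.8 rejects it, since 75% of the values match.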
New sensitivity labels
The new sensitivity labels allow you to identify sensitive data.
You can see a new Sensitivity Label icon on the top right side of the object details pages.
You can manage and customize these labels from the new Manage Sensitivity Labels page.

You can apply these labels by tagging each object manually, using the bulk editing capability in worksheets, using the automatic detection of data classification or using inferred sensitivity labels.

New conditional labels
You can define new conditional labels based on the Metadata Query Language (MQL), such as a "Highly Commented" label based on objects with over five comments.
You can manage and create these labels from the new Manage Conditional Labels page.
You can see conditional labels in the Conditional Labels area from the overview tab of the object details pages.

Conditional labels can be displayed in search results, worksheets or data flow lineage diagrams.
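
The idea of a conditional label, a label that applies whenever a condition on the object holds, can be sketched in Python as follows. This is not MQL syntax and not the product's API: the rule table, the function name and the comment_count key are hypothetical stand-ins for the "Highly Commented" example above.

```python
from typing import Callable, Dict, List

# Each label is paired with a predicate over an object's metadata.
# The "comment_count" key is a hypothetical stand-in for the comment count.
RULES: Dict[str, Callable[[dict], bool]] = {
    "Highly Commented": lambda obj: obj.get("comment_count", 0) > 5,
}


def conditional_labels(obj: dict, rules: Dict[str, Callable[[dict], bool]]) -> List[str]:
    """Return the labels whose condition is satisfied by the object."""
    return [label for label, predicate in rules.items() if predicate(obj)]
```

An object with six comments receives the label, while one with exactly five does not, because the condition is "over five".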

New label management
You can now review or delete the label assignments from the new Manage Labels page.

By default, you see all labels in the repository from the All Labels view. You can also see the labels assigned to objects in the current configuration from the Configuration Labels view.

New object watcher and email notifications
The new watcher notifications allow you to inform the watchers of an object when certain events occur on this object.
You enable the email notifications for watchers at the server side from the Manage Email page.

You set the watcher editing and management capabilities via object roles from the Manage Object Roles page.

You set the notification frequency for watchers from the Manage Users page or from the User Profile UI.

You can see a new Watcher icon on the top right side of the object pages. The menu allows you to start or stop watching an object, see the count of watchers or manage the watchers on an object.
This feature is available for imported and custom models. It is available only at the model level, or at the sub-model level for a multi-model.
  • If you are a watcher of a technical model, you receive an email per model and per type of activity if there are changes after an import or any other changes such as a new certification.
  • If you are a watcher of a business model, data or semantic mapping or physical data model, you receive an email per model on any changes at any level.

You can receive an email with change summary statistics and a link to a model version comparator report.

You can also receive notification emails depending on your role and capability assignments for workflow transitions, configuration changes or server errors.

New capability to store credentials on cloud secret managers
You can now store the bridge credentials such as user, password or private key on a cloud secret manager from the new Manage Secret Vaults page.
Talend Data Catalog supports the following cloud secret managers:
  • Amazon AWS Secrets Manager
  • Microsoft Azure Key Vault
  • Google Secret Manager
Improved automation and productivity of data documentation
Data documentation helps you to define technical data in business terms that everyone can understand. There are now several categories of data documentation:
  • Business documentation provides a local documentation with a business name and description. You can use it as an alternative to the term documentation.
  • Term documentation (previously called term classification) allows you to document an imported object with one or more terms from a glossary. It now creates an Is Defined By relationship.
  • Mapped documentation allows you to document an imported object connected by a semantic mapping with one or more glossary terms or entities/attributes from a data model.
  • Inferred documentation provides data documentation on an imported object automatically generated from other objects involved in its data flow pass-through lineage and impact. This feature improves the automatic data documentation coverage on many data stores.
You can find new Business, Term, Mapped or Inferred Documentation wizards in the Overview tab of an object page. Talend Data Catalog can suggest business names based on the technical names by using the naming standards and supervised learning features. It can also suggest a business description from the inferred documentation.

You can create new KPI graphical widgets on the data documentation coverage by using the new Term Documentation and Inferred Documentation attributes available in the REST API, MQL, worksheets and dashboards.

New Updated Date sort capability
You can now sort results by Updated Date in the object explorer, worksheets or your searches.
Improved architecture of data profiling and sampling
The results of data sampling and profiling collected by the remote harvesting servers are now saved on the server side. You can now update the data sampling and profiling results automatically or on demand, for example after the creation of a data class, without having to go back to the remote harvesting servers.
Improvement of metadata reporting and presentations
You can now import and export default presentations between servers from the Manage Default Presentations page.

New graphical widgets are available for the presentations of object details pages.

Improvement of the REST API features
The following new features are available in the REST API:
  • new scope parameter for the MQL Query functions
  • new import and export features for metamodel packages in Repository
  • new import and export features for imported and custom models in Repository
  • new features to manage global and object roles in Roles
  • new Data Classes group to manage data classes and replace the semantic types
  • new Sensitivity Labels group to use sensitivity labels

For more information, click the See General Documentation link from the Talend Data Catalog REST API documentation page.

Improvement of Metadata Query Language (MQL)
You no longer need to use the special character syntax on attributes when using Metadata Query Language in your reports, dashboards or worksheets.

The reporting capability in worksheets and dashboards has been improved with new supported system objects related to data sampling, data profiling, data classification, object and global roles and workflow actions. You can refer to the New worksheet attributes entry below to see the new system attributes available for queries.

For more information, click the See General Documentation link from the Talend Data Catalog REST API documentation page.

New worksheet attributes
  • New lineage attributes are available, including Has Semantic Usage, Has Semantic Definition, Has Data Lineage and Has Data Impact, to detect unused objects. These attributes can also be used as filters.

  • New data classification attributes are available including Data Classifications, Data Classification Matched, Data Classification Rejected and Data Classification Approved.

  • New data documentation attributes are available including:
    • Term Documentation shows the list of terms (name and description) documenting the object.
    • Mapped Documentation shows the list of semantically mapped objects documenting the object.
    • Inferred Documentation shows the list of terms indirectly documenting the object through its pass-through data lineage / impact.
    • Documentation shows the summarized documentation of the object.

      The summarized documentation returns the first documentation found on the object, in the following priority order: Business Documentation > Term Documentation > Mapped Documentation > Inferred Documentation > Imported (Documentation) > Searched (Documentation).

  • New glossary attributes are available including Is Defined By to show a list of terms and Long Description.

  • New data profiling attributes are available including Data Profiling, Distinct, Duplicate, Empty, Valid, Invalid, Min, Max, Mean, Variance, Median, Lower Quantile, Upper Quantile, Avg Length, Min Length, Max Length and Inferred Data Types.

  • New attributes related to social curation are available including Certified By, Endorsed By, Commented By, Warned By.

    New Endorsement Count, Comment Count, Warning Count attributes have been added to the list of possible filters to produce worksheets or dashboards with popular objects.

  • New workflow attributes are available including Workflow State, Workflow Published and Workflow Deprecation Requested that now apply to any object of a model involved in a workflow process.

  • A new Stewards attribute is available in the Responsibilities tab.

  • The Parent Object Name and Parent Object Type attributes have been added.

  • Object roles can be used as columns or filters, such as expandedMembersOfRole('Steward') = ANY('Business Users') as a filter example or membersOfRole('Steward') as a search example.

  • Object relationships/children can be used as columns.
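
The priority order used by the summarized Documentation attribute in the list above can be illustrated with a short Python sketch. Only the ordering comes from these release notes; the function name, the dictionary representation and the sample texts are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Priority order as stated for the summarized Documentation attribute.
DOCUMENTATION_PRIORITY = [
    "Business Documentation",
    "Term Documentation",
    "Mapped Documentation",
    "Inferred Documentation",
    "Imported",
    "Searched",
]


def summarized_documentation(docs: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (category, text) for the first non-empty documentation by priority."""
    for category in DOCUMENTATION_PRIORITY:
        text = docs.get(category)
        if text:
            return category, text
    return None


# Hypothetical object with no business documentation but two other categories.
column_docs = {
    "Inferred Documentation": "Customer identifier inferred from upstream lineage",
    "Term Documentation": "Customer ID: unique identifier of a customer",
}
```

Here the term documentation wins over the inferred documentation because it ranks higher in the priority list, regardless of the order in which the entries are stored.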

Third-party and open source software
All third-party and open source software components have been upgraded to their latest versions for better security and vulnerability protection.
Security improvements
  • For security reasons, the metadata harvesting browse path is no longer defined as * by default, a setting that allowed browsing any drive, directory or file. Administrators must use the Setup UI or command line to define the scope of file browsing.
  • Tomcat web applications like MMDoc.war (REST API help) are no longer enabled by default for security reasons (Swagger unauthenticated sensitive endpoints). They have been moved from tomcat/webapps to tomcat/dev.

    If necessary, they can be enabled with the Setup UI or command line using Setup.bat -we mmdoc. This will create the context MMDoc.xml in tomcat/MetaIntegration/localhost to make the web application available and start it.

Improvement of the Export to CSV feature

Exported CSV files now contain a BOM (Byte Order Mark).
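
When reading such a file programmatically, the BOM should be stripped so that it does not end up in the first header name. The following Python sketch shows one way to do this, assuming the export is UTF-8 encoded (the release notes do not state the encoding); the function name and the sample columns are hypothetical.

```python
import csv
import io
from typing import List


def read_exported_csv(data: bytes) -> List[List[str]]:
    """Parse CSV bytes, dropping a leading UTF-8 BOM if one is present."""
    # "utf-8-sig" removes the BOM when it exists and behaves like UTF-8 otherwise.
    text = data.decode("utf-8-sig")
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical export content with a BOM in front of the header row.
exported = "\ufeffName,Type\r\nCUSTOMER,Table\r\n".encode("utf-8")
rows = read_exported_csv(exported)
```

Decoding with plain "utf-8" instead would leave the BOM character attached to the first column name.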
