Working with the Glossary model - 8.0

Talend Data Catalog User Guide

Version: 8.0
Language: English
Product: Talend Big Data Platform, Talend Data Fabric, Talend Data Management Platform, Talend Data Services Platform, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Data Catalog
Content: Data Governance
Last publication date: 2023-09-26

The glossary is a type of custom model based on a flexible metamodel.

Talend Data Catalog provides an extensible metamodel-based glossary to capture, define, maintain and implement an enterprise glossary of terminology, data definitions, code sets, domains, validation rules, etc.

You can also use the terminology to classify and associate objects with data classes and automatic data classification. You can define semantic mappings to describe how elements in a source model (more conceptual like the business glossary) define elements in a destination model (closer to an implementation or representation).

A glossary helps your enterprise to reach agreement between all stakeholders on their business assets (such as terms) and how they relate to data assets (such as database tables) and technology assets (such as ETL mappings). You can use the glossary to document logical and physical data entities and attributes collaboratively across IT, which involves tracing dependencies between business and technical assets.

Glossary object types

There are two types of objects in the glossary metamodel:
| Object type | Description | Impacted by workflow | Contained by |
|---|---|---|---|
| Term | Terminology used generally throughout the architecture | Yes | Glossary root; another term |
| Acronym | Acronym (short form) for a term, often used in implementations | Yes | Glossary root; another term |

Term hierarchy

A glossary is generally a flat collection of terms. Because you can have any number of glossaries, you can keep the terminology for each domain in its own glossary. You can also nest terms inside a term to create a hierarchy of terms.
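As an illustration only, the following minimal Python sketch models a glossary as a collection of terms in which a term can contain other terms. The `Term` and `Glossary` classes and the `paths` method are hypothetical names chosen for this example; they are not part of Talend Data Catalog or its API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Term:
    """Hypothetical stand-in for a glossary term that may contain other terms."""
    name: str
    children: List["Term"] = field(default_factory=list)

    def add(self, child: "Term") -> "Term":
        # Nest a term inside this term and return the nested term.
        self.children.append(child)
        return child


@dataclass
class Glossary:
    """Hypothetical stand-in for a glossary: a root that holds top-level terms."""
    name: str
    terms: List[Term] = field(default_factory=list)

    def paths(self) -> Iterator[str]:
        # Walk the hierarchy depth-first and yield one path per term.
        def walk(term: Term, prefix: str) -> Iterator[str]:
            path = f"{prefix}/{term.name}"
            yield path
            for child in term.children:
                yield from walk(child, path)

        for term in self.terms:
            yield from walk(term, self.name)


# One glossary per domain; "Customer" is an invented domain name.
customer = Glossary("Customer")
address = Term("Physical Address")
address.add(Term("Street Name"))
customer.terms.append(address)
customer.terms.append(Term("Account"))

term_paths = list(customer.paths())
```

A flat glossary is simply the case where no term has children, so every path has exactly two segments.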

Term association types and semantic lineage

Terms can be cross-linked in a wide variety of relationship types. These relationships can have an impact on both semantic usage and definition.
| Association type | Description | Effect on inferred name and description assignment | Effect on definition lookup | Effect on semantic usage |
|---|---|---|---|---|
| Has Synonym | Terms with nearly identical meaning | No | Yes | Yes |
| Has Acronym | Acronym for a term | No | - | - |
| See Also | Additional related terms | No | No | No |
| More General | Terms which have a more general context or are a more abstract concept | Yes | Yes | No |
| More Specific | Terms which have a more precise context or are a more specific concept | No | No | Yes |
| Contains | Terms which are considered to contribute to the complete concept, such as Name Contains First Name | No | No | No |
| Contained by | Term which is the parent container in which this term is defined, such as Street Name is Contained By Physical Address | No | No | No |
| References | - | No | No | No |
| Referenced by | - | No | No | No |
| Represents | The relationship between a domain type term and the terms which are expressions of that domain, such as Account Amount Available Represents Unified Dollar Amount | No | No | Yes |
| Represented by | The relationship between a term that is an expression of a domain type term, such as Unified Dollar Amount is Represented by Account Amount Available | No | Yes | No |

These relationships can be defined between terms in the same glossary model or across different glossaries. You can replace a separate semantic mapping between two glossaries with the More General and More Specific term associations and they will behave the same way as the semantic links in a semantic mapping.
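To make the association table easier to query, the sketch below encodes its three effect columns as a plain Python lookup. The values are copied from the table; the names `EFFECTS`, `COLUMNS`, and `associations_affecting` are hypothetical and exist only in this example, not in the product.

```python
from typing import Dict, List, Optional, Tuple

# Column order: (inferred name/description assignment, definition lookup, semantic usage).
# True = "Yes", False = "No", None = "-" (no value given in the table).
Effects = Tuple[Optional[bool], Optional[bool], Optional[bool]]

COLUMNS = ("inferred_name", "definition_lookup", "semantic_usage")

EFFECTS: Dict[str, Effects] = {
    "Has Synonym": (False, True, True),
    "Has Acronym": (False, None, None),
    "See Also": (False, False, False),
    "More General": (True, True, False),
    "More Specific": (False, False, True),
    "Contains": (False, False, False),
    "Contained by": (False, False, False),
    "References": (False, False, False),
    "Referenced by": (False, False, False),
    "Represents": (False, False, True),
    "Represented by": (False, True, False),
}


def associations_affecting(effect: str) -> List[str]:
    """Return the association types marked "Yes" for the given effect column."""
    index = COLUMNS.index(effect)
    return [name for name, flags in EFFECTS.items() if flags[index] is True]


lookup_types = associations_affecting("definition_lookup")
usage_types = associations_affecting("semantic_usage")
naming_types = associations_affecting("inferred_name")
```

Read this way, only More General influences inferred name and description assignment, while each inverse pair (More General and More Specific, Represents and Represented by) splits definition lookup and semantic usage between its two directions.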

Workflow

By default, the workflow and approval processes are disabled for a glossary.
  • Talend Data Catalog provides a flexible and complete set of possible workflow and publication processes.

    You can use the workflow process when you have a formal glossary development process that involves multiple users.

  • Without a workflow, changes made to the glossary are reflected immediately throughout the system.

    This mode can be useful when you do not want the complexity of a workflow process. It is also useful when you are first building and populating a glossary and related semantic mappings.

You must have both the Workflow Editor and the Metadata Editor object role assignments to edit terminology for a glossary under workflow.

When you enable the workflow feature, Talend Data Catalog creates a published version of the glossary. The published version is the one presented to most users. Its contents are not directly editable, regardless of permissions, and it shows only what has been published, not the current edits or workflow states. To change what is in the published glossary, edit the development version and then use the publish workflow step. The development version shows each glossary object in its current workflow status.

In terms of implementation, the published version of the glossary is associated with any configuration version.

You may associate an archived (historical) version of a glossary with a configuration, thereby making it the published version for the purposes of presentation.

The workflow process applies to all objects in a glossary. When the workflow is enabled, some restrictions apply to the ability to perform certain actions:

  • You cannot delete a term that contains published terms.
  • You cannot publish a term until its parent is published (when creating them together).
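The two restrictions above can be sketched as simple checks. This is an illustrative model, not the product's implementation: the `Term` class and the `can_delete` and `can_publish` functions are hypothetical, and the sketch assumes that "contains published terms" covers terms nested at any depth.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Term:
    """Hypothetical term with a parent link and a published flag."""
    name: str
    parent: Optional["Term"] = field(default=None, repr=False)
    published: bool = False
    children: List["Term"] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Register this term with its parent so the hierarchy can be walked.
        if self.parent is not None:
            self.parent.children.append(self)


def contains_published(term: Term) -> bool:
    # Assumption: a published term at any nesting depth counts.
    return any(c.published or contains_published(c) for c in term.children)


def can_delete(term: Term) -> bool:
    # Restriction 1: a term that contains published terms cannot be deleted.
    return not contains_published(term)


def can_publish(term: Term) -> bool:
    # Restriction 2: a term cannot be published until its parent is published.
    return term.parent is None or term.parent.published


address = Term("Physical Address")
street = Term("Street Name", parent=address)

blocked_before_parent = can_publish(street)   # parent not yet published
address.published = True
allowed_after_parent = can_publish(street)    # parent now published
street.published = True
parent_deletable = can_delete(address)        # contains a published term
```

In this model a term directly under the glossary root has no parent term, so it can always be published.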

Glossary models and other custom models

You can extend the glossary metamodel with additional business attribute, object, and relationship types as needed. You can also define associations with other custom models in the metamodel and then associate instances of these custom models, such as glossaries.