- Catalog
A catalog is an inventory of data assets, such as database tables, Data Integration Jobs or BI reports.
- Metadata
Metadata is structured information that describes a data resource, such as its name, type, location, author, creation date, size and relationships with other data objects.
- Metadata repository
The metadata repository stores metadata created or imported from data sources, project configurations and reports.
- Metadata harvesting
Metadata harvesting means collecting metadata from a data source by using Talend Cloud Data Catalog bridges. The metadata is imported into a model and stored in the metadata repository.
- Bridge
A bridge is a platform-dedicated connector. It uses a specific driver to connect to a source tool and collect its metadata.
You can import metadata from data stores, Data Integration tools, Business Intelligence tools and business applications.
Once created, models are linked together in a configuration to define the data flow in the information system.
- Configuration
A configuration is an environment or workspace where you connect models to each other to build a global schema of the enterprise information system.
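To illustrate the idea of linking models to describe a data flow, here is a minimal sketch that represents a configuration as a directed graph and lists everything downstream of a given model. It is not how Talend Cloud Data Catalog stores configurations: the model names, the `configuration` dictionary and the `downstream` function are all hypothetical.

```python
from collections import deque

# Hypothetical configuration: each key is a model, and its value lists
# the models that receive data from it. All names are made up.
configuration = {
    "crm_database": ["nightly_etl_job"],
    "nightly_etl_job": ["sales_warehouse"],
    "sales_warehouse": ["revenue_report"],
    "revenue_report": [],
}


def downstream(config, start):
    """Return the models reachable from `start`, in breadth-first order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        model = queue.popleft()
        for target in config.get(model, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


impacted = downstream(configuration, "crm_database")
```

With this kind of structure, asking which assets depend on a source model becomes a simple graph traversal.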
- Glossary
A glossary captures and defines the enterprise vocabulary to build a common language that everyone can understand.
- Data profiling
Data profiling is the process of examining the data from the data sources imported into your catalog and collecting statistics and information about this data.
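As a rough illustration of the kind of statistics a profiling pass can collect, the sketch below computes a few counts over one column of values. The `profile_column` function and the chosen statistics are assumptions made for this example, not the set of metrics that Talend Cloud Data Catalog actually computes.

```python
from collections import Counter


def profile_column(values):
    """Collect simple statistics for one column of values.

    Hypothetical helper: None and empty strings are treated as nulls.
    """
    non_null = [v for v in values if v is not None and v != ""]
    counts = Counter(non_null)
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
        "min_length": min((len(str(v)) for v in non_null), default=0),
        "max_length": max((len(str(v)) for v in non_null), default=0),
    }


# Made-up sample column with two missing values.
sample = ["FR", "US", "FR", None, "DE", ""]
stats = profile_column(sample)
```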
- Data sampling
Data sampling lets you preview the contents of database tables and data files imported into your catalog.
- Semantic type
A semantic type defines the structure or the possible values of elements.
During the data profiling and metadata harvesting processes, Talend Cloud Data Catalog compares the data values with the semantic types available in its dictionary. When there is a match, the semantic type is assigned automatically.
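The matching described above can be pictured with a small sketch. It assumes that a semantic type is defined either by a regular expression (structure) or by a list of allowed values, and that a type is assigned when a large enough share of the values match. The dictionary contents, the `detect_semantic_type` function and the 0.8 threshold are illustrative assumptions, not the product's actual dictionary or matching rule.

```python
import re

# Hypothetical dictionary: a type is defined by a pattern (structure)
# or by a set of allowed values.
SEMANTIC_TYPES = {
    "EMAIL": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "COUNTRY_CODE": {"FR", "US", "DE", "JP"},
}


def matches(definition, value):
    """Check one value against a set of values or a compiled pattern."""
    if isinstance(definition, set):
        return value in definition
    return definition.fullmatch(value) is not None


def detect_semantic_type(values, threshold=0.8):
    """Return the best-matching type name, or None below the threshold.

    The threshold is an assumption made for this sketch.
    """
    values = [v for v in values if v]  # ignore empty and missing values
    if not values:
        return None
    best_name, best_ratio = None, 0.0
    for name, definition in SEMANTIC_TYPES.items():
        ratio = sum(matches(definition, v) for v in values) / len(values)
        if ratio > best_ratio:
            best_name, best_ratio = name, ratio
    return best_name if best_ratio >= threshold else None


email_type = detect_semantic_type(["a@example.com", "b@example.org"])
```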
These definitions will help you understand the main concepts in Talend Cloud Data Catalog.