Metadata harvesting means collecting all metadata from a data source.
You harvest metadata by using Talend Data Catalog bridges.
A bridge is a connector dedicated to a platform. It uses a specific driver to connect to a data source system and collect its metadata.
The following table presents the types of data sources from which you can harvest metadata, depending on your edition.
Talend Data Catalog | Standard | Advanced | Advanced Plus |
---|---|---|---|
Harvesting from any supported data store technologies | |||
Harvesting from any supported Data Model tools | |||
Data Integration with DI, ETL and ELT tools | |||
Harvesting from Talend Data Integration, Talend MDM and Talend Data Preparation | |||
Harvesting from any supported Data Integration tools | |||
Data Integration with SQL Scripts and other codes | |||
Harvesting from HiveQL Scripting | |||
Harvesting from any supported SQL Scripting | |||
Business Intelligence (BI Reporting) | |||
Harvesting from Tableau or Qlik | |||
Harvesting from any supported Business Intelligence tools | |||
Harvesting from any supported Metadata Management tools (such as Apache Atlas or Cloudera Navigator) | |||
Business Applications | |||
Harvesting from Salesforce | |||
Harvesting from any supported Business Application tools (such as SAP Business Warehouse 4 HANA) | |||
For more information about the bridges, see Talend Data Catalog Bridges on Talend Help Center.
Before harvesting metadata
Before harvesting metadata, analyze where the metadata resides, which technologies are required to extract it, and which process to follow to ensure a proper extraction.
When harvesting metadata in a Talend Data Catalog project, you should follow a specific order:
- Identify source data stores, such as operational data stores.
- Identify data transformation processes, such as ETL or ELT.
- Identify business intelligence systems.
- Identify existing conceptual models.
- Configure a bridge and harvest metadata for each system.
You should also organize your metadata repository with labeled folders, for example one folder for each category of metadata.
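The planning steps above can be sketched as a small inventory script. This is only an illustration for preparing a harvest, not part of Talend Data Catalog: the `Category`, `System`, `harvest_plan` and `folder_layout` names, as well as the example system names, are hypothetical. It sorts an inventory of systems into the recommended identification order and groups them under one labeled folder per category.

```python
from dataclasses import dataclass
from enum import IntEnum


class Category(IntEnum):
    """Identification order from the steps above; lower values come first."""
    SOURCE_DATA_STORE = 1
    DATA_TRANSFORMATION = 2
    BUSINESS_INTELLIGENCE = 3
    CONCEPTUAL_MODEL = 4


@dataclass(frozen=True)
class System:
    name: str            # hypothetical system name, for illustration only
    category: Category


def harvest_plan(systems):
    """Return the systems sorted by category order, then by name."""
    return sorted(systems, key=lambda s: (s.category, s.name))


def folder_layout(systems):
    """Group system names under one labeled folder per category."""
    folders = {}
    for system in harvest_plan(systems):
        label = system.category.name.replace("_", " ").title()
        folders.setdefault(label, []).append(system.name)
    return folders


# Hypothetical inventory, listed in no particular order.
inventory = [
    System("Sales dashboards", Category.BUSINESS_INTELLIGENCE),
    System("Orders database", Category.SOURCE_DATA_STORE),
    System("Nightly load job", Category.DATA_TRANSFORMATION),
    System("Customer model", Category.CONCEPTUAL_MODEL),
]

for step, system in enumerate(harvest_plan(inventory), start=1):
    print(f"{step}. {system.name} ({system.category.name})")
```

Once the inventory is ordered, you would configure a bridge and harvest each system in turn, placing the result in the folder that `folder_layout` assigns to its category.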