
Metadata Harvesting General Principles

We use the following standard terminology. There are two types of models in the repository:

  • Imported Models are the models associated with an import bridge and populated through the model harvesting process. These models are often referred to as technical models; however, they are sometimes considered business models when imported from business applications or business intelligence (BI) tools.
  • Custom Models are instantiations of a custom model type in the metamodel and may be populated via the UI, bulk CSV import, or the REST API. They are commonly referred to as business models; however, they are sometimes considered technical models within the domains of reference data, business rules, etc.

When harvesting an imported model from source tools and formats, there are several considerations:

  • Ensuring that one has proper connectivity to the external format metadata source. This could be:
      • One or more files
      • An external tool application programming interface (API)
      • An external tool API based upon a client installation
  • Ensuring that one has full access to any auxiliary resources as needed. This depends upon the external format one is attempting to connect to, but general examples include:
      • Substitution parameter definition files, for tools where substitution variables may be defined in the source metadata and are required in order to parse it successfully
      • Connection information for data sources, such as database connection names
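
The considerations above can be sketched as a simple pre-harvest check. The following is a minimal illustration only: `HarvestPrerequisites` and `check_prerequisites` are hypothetical names invented for this example and are not part of Talend Data Catalog or its API. The sketch assumes the source metadata and auxiliary resources are plain files and that the known connection names are available as a set.

```python
import os
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Set


@dataclass
class HarvestPrerequisites:
    """Hypothetical description of what one import needs before it runs."""
    source_files: List[Path] = field(default_factory=list)      # metadata source files
    auxiliary_files: List[Path] = field(default_factory=list)   # e.g. substitution parameter files
    connection_names: List[str] = field(default_factory=list)   # e.g. database connection names


def _readable(path: Path) -> bool:
    """True if the path is an existing file the current user can read."""
    return path.is_file() and os.access(path, os.R_OK)


def check_prerequisites(prereq: HarvestPrerequisites,
                        known_connections: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means nothing was found missing."""
    problems = []
    for path in prereq.source_files:
        if not _readable(path):
            problems.append(f"source file not readable: {path}")
    for path in prereq.auxiliary_files:
        if not _readable(path):
            problems.append(f"auxiliary file not readable: {path}")
    for name in prereq.connection_names:
        if name not in known_connections:
            problems.append(f"unknown connection name: {name}")
    return problems
```

A check like this could be run on the application server before configuring an import, so that missing files or unresolved connection names are reported up front rather than surfacing as parse failures during harvesting.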

Many harvest actions will require pointing to files on the Talend Data Catalog application server. The drives available for browsing are controlled by the conf.properties file. More details may be found in the deployment guide.

Harvesting always captures the metadata of the source. In addition, for data stores (e.g. file systems, databases), harvesting may also include sampling and profiling the data contained within these sources. Sampling and profiling are optional and require greater access to the source systems.

Information note

All these requirements are documented in the bridge tool tips, which are available in the Help panel on the Import Setup tab. A harvested model may also be used as the basis for a Documentable Physical Data Model.
