Configuring a remote harvesting server using the Setup utility - 7.3

Talend Data Catalog Administration Guide

author
Talend Documentation Team
EnrichVersion
7.3
EnrichProdName
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
task
Administration and Monitoring
Data Governance
EnrichPlatform
Talend Data Catalog

Before you begin

You have downloaded and decompressed the TDC-x.y-YYYYMMDD.zip or TDC-x.y-YYYYMMDD.tbz2 file on your machine.

Procedure

  1. In the software home directory, execute the setup.bat or setup.sh file.
  2. Go to the Application Server tab.
  3. Select the Metadata Harvesting Server Only check box.
  4. Configure the following fields:
    Field Description
    Metadata Harvesting Browse Path

    Enter the local and mapped network drives that will be available for browsing in the Talend Data Catalog user interface during metadata harvesting.

    By default, the value is '*', which includes every drive on Windows or every directory from the root on Linux. Limit the access to a common shared data location and avoid system areas.

    The server must be able to access the metadata harvesting files and directories whenever an event such as a scheduled harvest occurs. When harvesting a model, the user interface presents a set of paths that can be browsed to select these files and directories.

    For Windows-based application servers running as a service, specify the physical drives by letter and the network locations by their complete paths, for example M_BROWSE_PATH=C:\, E:\, \\network-drive\shared\.

    When the server runs as a Windows service, the mapped drive names and paths may not be the same as what a user sees when logged in, so the '*' value does not expose all drives when selecting drives from the UI. It is not sufficient to enter a mapped drive ID such as N:\, because that drive mapping is generally not available to services. The same applies to the drives used by the backup and restore scripts.

    Data Directory

    Optionally, enter a new location for the data files, such as the log files and the metadata incremental harvesting cache used for very large Data Integration or Business Intelligence tools.

    By default, the data directory is the data subdirectory of the application server home directory. It is recommended to keep the program data separate from the program files, and this field lets you place the data in a separate area.

    Max memory

    Optionally, define the maximum memory used by Java (JRE) on the Talend Data Catalog application server.

    Port Number

    Optionally, set a custom port number to avoid conflicts with other web application servers.
  5. If the remote harvesting server is connected to a Talend Data Catalog server installed on the cloud, open the <TDC_HOME>\TalendDataCatalog\conf\agent.properties configuration file to perform additional customizations.
  6. Configure the following parameters:
    Parameter name Description
    M_SERVER_URL

    Enter the URL of the Talend Data Catalog server installed on the cloud.

    M_AGENT_NAME

    Enter a name for the remote harvesting server. It can be anything as long as it is unique, reasonably descriptive so that it can be identified in the UI, and hard to guess, since it also works as a shared secret string.

    You will use this shared secret when adding the remote harvesting server in the Talend Data Catalog UI.

  7. Save your changes.
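
The two parameters from step 6 can be prepared with a short script. The following is a minimal sketch, not part of the product: it assumes that agent.properties uses plain key=value lines (inferred from the .properties extension), the URL https://tdc.example.com/ is a placeholder, and the helper names make_agent_name and render_agent_properties are hypothetical. It builds an agent name from a readable label plus a random suffix so that the value is both descriptive and hard to guess.

```python
import secrets


def make_agent_name(label: str, random_bytes: int = 16) -> str:
    """Combine a readable label with a random suffix.

    The label keeps the name identifiable in the UI; the suffix makes it
    hard to guess. This naming scheme is an illustration only.
    """
    # token_hex returns two hexadecimal characters per random byte.
    return f"{label}-{secrets.token_hex(random_bytes)}"


def render_agent_properties(server_url: str, agent_name: str) -> str:
    """Return the two parameters as key=value lines (assumed file layout)."""
    return f"M_SERVER_URL={server_url}\nM_AGENT_NAME={agent_name}\n"


if __name__ == "__main__":
    # Placeholder values: replace with your own server URL and label.
    name = make_agent_name("harvest-dc1")
    print(render_agent_properties("https://tdc.example.com/", name))
```

Copy the printed lines into agent.properties by hand rather than overwriting the file, since the file may contain other settings, and keep the generated name available for when you add the remote harvesting server in the UI.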

Results

You are ready to add the remote harvesting server in Talend Data Catalog.
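
The browse path guidance in step 4 can be checked before the value is entered. The following is a rough sketch under stated assumptions: it assumes that entries are separated by commas, as in the M_BROWSE_PATH example above, and the function name classify_browse_path is hypothetical. It only classifies the syntax of each entry. It cannot tell whether a drive letter is a physical drive or a mapped one, so entries reported as drives still need a manual check when the server runs as a Windows service.

```python
import re

# A drive root such as C:\
DRIVE_ROOT = re.compile(r"^[A-Za-z]:\\$")
# A complete network path such as \\network-drive\shared\
UNC_PATH = re.compile(r"^\\\\[^\\]+\\[^\\]+")


def classify_browse_path(value: str) -> list[tuple[str, str]]:
    """Split a browse path value on commas and label each entry."""
    results = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if entry == "*":
            kind = "wildcard"  # exposes everything; the text advises limiting this
        elif UNC_PATH.match(entry):
            kind = "unc"       # complete network path
        elif DRIVE_ROOT.match(entry):
            kind = "drive"     # drive letter; may be physical or mapped
        else:
            kind = "other"     # for example, a Linux directory
        results.append((entry, kind))
    return results


if __name__ == "__main__":
    sample = "C:\\, E:\\, \\\\network-drive\\shared\\"
    for entry, kind in classify_browse_path(sample):
        print(f"{kind:8} {entry}")
```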