Big Data Platform
Data Management Platform
Data Services Platform
Real-Time Big Data Platform
Talend Studio allows you to enable runtime lineage for Standard Jobs. In a future release, the analysis capability of Talend Data Catalog can leverage this feature for runtime metadata, for example, queries with variables and schemas with dynamic columns.
When you execute a Standard Job for which runtime lineage is enabled, the information needed by Talend Data Catalog, such as the Job name, the component name, the schema, and the query, is written into a JSON file.
About this task
- Go to the Talend Studio installation directory.
- Add the -Druntime.lineage=true attribute in the corresponding .ini file according to your operating system to enable the runtime lineage feature in Talend Studio.
- Save the file and start Talend Studio.
- Click on the toolbar of the Talend Studio main window or click from the menu bar to open the Project Settings dialog box.
In the tree view of the dialog box, expand the Job Settings node and then click Runtime lineage to display the corresponding view.
Enable runtime lineage for Standard Jobs in either of the following two ways:
- To enable runtime lineage for all Standard Jobs, select the Use runtime lineage for all Jobs check box.
- To enable runtime lineage for specific Standard Jobs, select the check boxes corresponding to the Jobs in the Use runtime lineage for selected Jobs area.
In the Output path field, specify the path where you want to save the JSON files used by Talend Data Catalog.
Later, each time you execute a Standard Job for which runtime lineage is enabled, a JSON file is saved to a path of the form <output_path>/<project>/<jobname>/<version>/runtime_log_<timestamp>.json, where
- <output_path> is the path specified in the Output path field,
- <project> is the name of the project,
- <jobname> is the name of the Job,
- <version> is the version of the Job, and
- <timestamp> is the timestamp when the JSON file is generated.
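The path layout above can be sketched in Python as follows. This is a minimal illustration, not part of Talend Studio: the function names are hypothetical, and because the exact format of <timestamp> is not stated here, the sketch orders files by modification time instead of parsing their names.

```python
from pathlib import Path
from typing import List


def runtime_log_dir(output_path: str, project: str, jobname: str, version: str) -> Path:
    """Build <output_path>/<project>/<jobname>/<version> (hypothetical helper)."""
    return Path(output_path) / project / jobname / version


def list_runtime_logs(output_path: str, project: str, jobname: str, version: str) -> List[Path]:
    """Return the runtime_log_<timestamp>.json files for one Job version.

    Files are sorted by modification time, oldest first, because the
    timestamp format in the file name is not assumed to be sortable.
    """
    directory = runtime_log_dir(output_path, project, jobname, version)
    return sorted(directory.glob("runtime_log_*.json"), key=lambda p: p.stat().st_mtime)
```

For example, `list_runtime_logs("/data/lineage", "MYPROJECT", "LoadCustomers", "0.1")` would return the files for version 0.1 of a Job named LoadCustomers; all four values here are placeholders.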
You can also set the output path by adding the JVM parameter -Druntime.lineage.outputpath=<output_path> for the Job in one of the following ways:
- Add the JVM parameter for a specific Job in the advanced execution settings of the Job. For more information, see Setting advanced execution settings.
- Add the JVM parameter globally for all Jobs in the Preferences dialog box. For more information, see Debug and Job execution preferences (Talend > Run/Debug).
- Add the JVM parameter in the shell script used for building Jobs in the Project Settings dialog box. For more information, see Customizing shell command templates.
Note: The output path must be specified for saving the JSON files. If the output path value is specified in multiple places, one of them takes effect according to the following precedence: 1) the value of the JVM parameter for a specific Job, 2) the value of the Output path field, 3) the value of the JVM parameter for all Jobs, 4) the value of the JVM parameter in the shell script.
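The precedence rule stated in the note can be expressed as a short Python sketch. It only models the documented order and does not reflect how Talend Studio resolves the value internally; the function and parameter names are hypothetical.

```python
from typing import Optional


def resolve_output_path(
    job_jvm_param: Optional[str] = None,       # 1) JVM parameter for a specific Job
    output_path_field: Optional[str] = None,   # 2) Output path field in Project Settings
    global_jvm_param: Optional[str] = None,    # 3) JVM parameter for all Jobs (Preferences)
    shell_script_param: Optional[str] = None,  # 4) JVM parameter in the shell script
) -> str:
    """Return the first non-empty value in the documented precedence order."""
    for candidate in (job_jvm_param, output_path_field, global_jvm_param, shell_script_param):
        if candidate:
            return candidate
    # The note says an output path must be specified somewhere.
    raise ValueError("No output path specified for runtime lineage")
```

With this model, a value set for a specific Job always wins, and the shell script value applies only when nothing else is set.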
- Click Apply and Close to apply your changes and close the dialog box.
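The .ini change made at the start of this procedure can also be scripted. The Python sketch below is an illustration only: it assumes the .ini file follows the Eclipse launcher convention, in which JVM options are listed one per line after a -vmargs line, and the function name is hypothetical. Check the actual layout of your .ini file before relying on it.

```python
FLAG = "-Druntime.lineage=true"


def enable_runtime_lineage(ini_text: str) -> str:
    """Return the .ini content with the runtime lineage flag set exactly once.

    Assumption: every line after -vmargs is passed to the JVM, so the flag
    can be appended at the end of the file.
    """
    lines = ini_text.splitlines()
    # Remove any existing -Druntime.lineage=... line (true or false).
    # -Druntime.lineage.outputpath=... does not match this prefix and is kept.
    lines = [ln for ln in lines if not ln.strip().startswith("-Druntime.lineage=")]
    # Add the -vmargs marker if the file does not have one yet.
    if "-vmargs" not in [ln.strip() for ln in lines]:
        lines.append("-vmargs")
    lines.append(FLAG)
    return "\n".join(lines) + "\n"
```

Applying the function to content that already contains the flag returns the same content, so it is safe to run more than once. Restart Talend Studio after changing the file.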