Available in:
Big Data
Big Data Platform
Data Fabric
Data Integration
Data Management Platform
Data Services Platform
ESB
MDM Platform
Real-Time Big Data Platform
Talend Studio allows you to enable runtime lineage for Standard Jobs. In a
future release, the analysis capability of Talend Data Catalog will be able to
leverage it for runtime metadata such as queries with variables and schemas
with dynamic columns.
When you execute a Standard Job for which runtime lineage is enabled, the
information needed by Talend Data Catalog, such as the Job name, the component
name, the schema, and the query, is written to a JSON file.
Note: To fully use this feature, you must install Talend Data Catalog.
For more information on Talend Data Catalog, see the
Talend Data Catalog User Guide.
About this task
To enable runtime lineage for Standard Jobs, complete the following:
Procedure
- Go to the installation directory of your Talend Studio.
- Add the -Druntime.lineage=true attribute to the .ini file that corresponds
  to your operating system to enable the runtime lineage feature in Talend
  Studio.
- Save the file and start Talend Studio.
- Open the Project Settings dialog box, either from the toolbar of the Studio
  main window or from the menu bar.
- In the tree view of the dialog box, expand the Job Settings node and then
  click Runtime lineage to display the corresponding view.
- Enable runtime lineage for Standard Jobs in either of the following ways:
  - To enable runtime lineage for all Standard Jobs, select the
    Use runtime lineage for all Jobs check box.
  - To enable runtime lineage for specific Standard Jobs, select the check
    boxes corresponding to the Jobs in the Use runtime lineage for selected
    Jobs area.
- In the Output path field, specify the path where you want to save the JSON
  files used by Talend Data Catalog.
  Later, each time you execute a Standard Job for which runtime lineage is
  enabled, a JSON file is saved at a path of the form
  <output_path>/<project>/<jobname>/<version>/runtime_log_<timestamp>.json, where:
  - <output_path> is the path specified in the Output path field,
  - <project> is the name of the project,
  - <jobname> is the name of the Job,
  - <version> is the version of the Job, and
  - <timestamp> is the time at which the JSON file is generated.
  You can also set the output path by adding the JVM parameter
  -Druntime.lineage.outputpath=<output_path>
  for a specific Job, for all Jobs, or in the shell script.
  Note: You must specify an output path for the JSON files to be saved. If the
  output path is specified in more than one place, the value that takes effect
  is determined by the following order of precedence: 1) the value of the JVM
  parameter for the specific Job, 2) the value of the Output path field, 3) the
  value of the JVM parameter for all Jobs, 4) the value of the JVM parameter
  in the shell script.
- Click Apply and Close to apply your changes and close the dialog box.
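
The output path precedence and the JSON file location described in this procedure can be illustrated with a short Python sketch. This is not part of Talend Studio. The function names (resolve_output_path, lineage_dir, list_lineage_files) and the sample values are hypothetical and chosen for illustration only. The sketch assumes only what is stated above: the four places where an output path can be set, their order of precedence, and the <output_path>/<project>/<jobname>/<version>/runtime_log_<timestamp>.json layout.

```python
from pathlib import Path
from typing import List, Optional


def resolve_output_path(
    job_jvm_param: Optional[str] = None,
    output_path_field: Optional[str] = None,
    all_jobs_jvm_param: Optional[str] = None,
    shell_script_jvm_param: Optional[str] = None,
) -> str:
    """Return the value that takes effect, following the documented precedence.

    Order: 1) JVM parameter for the specific Job, 2) Output path field,
    3) JVM parameter for all Jobs, 4) JVM parameter in the shell script.
    """
    candidates = [
        job_jvm_param,
        output_path_field,
        all_jobs_jvm_param,
        shell_script_jvm_param,
    ]
    for value in candidates:
        if value:  # skip places where no value was specified
            return value
    raise ValueError("An output path must be specified to save the JSON files.")


def lineage_dir(output_path: str, project: str, jobname: str, version: str) -> Path:
    """Build <output_path>/<project>/<jobname>/<version>."""
    return Path(output_path) / project / jobname / version


def list_lineage_files(directory: Path) -> List[Path]:
    """Return the runtime_log_<timestamp>.json files in a directory, sorted by name."""
    return sorted(directory.glob("runtime_log_*.json"))


if __name__ == "__main__":
    # Hypothetical values: the Output path field and the all-Jobs JVM
    # parameter are both set, so the Output path field takes effect.
    effective = resolve_output_path(
        output_path_field="/data/lineage",
        all_jobs_jvm_param="/tmp/lineage",
    )
    directory = lineage_dir(effective, "MY_PROJECT", "MyJob", "0.1")
    print("Looking for lineage files in:", directory)
    for path in list_lineage_files(directory):
        print(path.name)
```

A script of this kind can help confirm, after executing a Job, that a JSON file was written where expected. The internal structure of the JSON file is not described here, so the sketch only locates the files and does not parse them.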