tRunJob - 6.1

Talend Components Reference Guide


Function

tRunJob executes the Job selected in the component's properties, within the context defined for it.

Purpose

tRunJob helps you manage complex Job systems that need to execute one Job after another.

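If you have subscribed to one of the Talend solutions with Big Data, this component is available in the following types of Jobs: Standard, Map/Reduce, and Spark Batch. The properties for each type are described in the corresponding section of this page.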

tRunJob Properties

Component family

System

 

Basic settings

Schema and Edit Schema

A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

This component offers the advantage of the dynamic schema feature. This allows you to retrieve unknown columns from source files or to copy batches of columns from a source without mapping each column individually. For further information about dynamic schemas, see Talend Studio User Guide.

The dynamic schema feature is designed for retrieving unknown columns of a table and should be used for that purpose only; it is not recommended for creating tables.

 

 

Built-In: You create and store the schema locally for this component only. Related topic: see Talend Studio User Guide.

 

 

Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. Related topic: see Talend Studio User Guide.

 

Copy Child Job Schema

Click to fetch the child Job schema.

 

Use dynamic job

Select this check box to allow multiple Jobs to be called and processed. When this option is enabled, only the latest version of the Jobs can be called and processed. An independent process will be used to run the subjob. The Context and the Use an independent process to run subjob options disappear.

Warning

The Use dynamic job option is not compatible with the Jobserver cache. Therefore, the execution may fail if you run a Job that contains tRunJob with this check box selected in Talend Administration Center.

Warning

This option is incompatible with the Use or register a shared DB Connection option of database connection components. When tRunJob works together with a database connection component, enabling both options will cause your Job to fail.

 

Context job

This field is visible only when the Use dynamic job option is selected. Enter the name of the Job that you want to call from the list of Jobs selected.

 

Job

Select the Job to be called and processed. Make sure you have already executed the called Job once, to ensure that it runs smoothly through tRunJob.

 

Version

Select the child Job version that you want to use.

 

Context

If you defined contexts and variables for the Job to be run by the tRunJob, select the applicable context entry on the list.

 

Use an independent process to run subjob

Select this check box to use an independent process to run the subjob. This helps in solving issues related to memory limits.

Warning

This option is not compatible with the Jobserver cache. Therefore, the execution may fail if you run a Job that contains tRunJob with this check box selected in Talend Administration Center.

Warning

This option is incompatible with the Use or register a shared DB Connection option of database connection components. When tRunJob works together with a database connection component, enabling both options will cause your Job to fail.
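
As a rough illustration of what running the child in an independent process implies, the sketch below assembles the command line for a separate JVM with its own heap limit. It assumes the child Job has been built as a standalone Java program that accepts the --context and --context_param arguments; the class IndependentChildLauncher, the classpath and the main class name are hypothetical and are not taken from Talend-generated code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: builds the command that starts a child Job in its own JVM. */
final class IndependentChildLauncher {

    static List<String> buildCommand(String maxHeap, String classpath, String mainClass,
                                     String context, Map<String, String> contextParams) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-Xmx" + maxHeap);          // the child gets its own memory limit
        command.add("-cp");
        command.add(classpath);
        command.add(mainClass);
        command.add("--context=" + context);    // context selected for the child
        for (Map.Entry<String, String> param : contextParams.entrySet()) {
            command.add("--context_param");     // one pair per overridden variable
            command.add(param.getKey() + "=" + param.getValue());
        }
        return command;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("File", "/data/in/customers.csv");   // assumed sample value
        List<String> command = buildCommand("512m", "lib/*:child.jar",
                "demo.child_0_1.child", "Default", params);
        // Passing this list to new ProcessBuilder(command).inheritIO().start()
        // would launch the child; here the command is only printed.
        System.out.println(String.join(" ", command));
    }
}
```

Because the child runs in a different JVM, objects held in the parent's memory, such as an open database connection, cannot be handed over, which is consistent with the shared connection warning above.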

 

Die on child error

Clear this check box to continue executing the parent Job even if an error occurs while the child Job is executed.

 

Transmit whole context

Select this check box to get all the context variables from the parent Job. Deselect it to get all the context variables from the child Job.

If this check box is selected when the parent and child Jobs have the same context variables defined:

  • variable values for the parent Job will be used during the child Job execution if no relevant values are defined in the Context Param table.

  • otherwise, values defined in the Context Param table will be used during the child Job execution.

 

Context Param

You can change the value of selected context parameters. Click the [+] button to add the parameters defined in the Context tab of the child Job. For more information on context parameters, see Talend Studio User Guide.

The values defined here will be used during the child Job execution even if Transmit whole context is selected.
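
The precedence rules described for Transmit whole context and Context Param can be modelled in a few lines of plain Java. This is only an illustrative model of the documented behavior, not code generated by Talend Studio; the class ContextResolution, the method resolve and the sample values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative model of how the child Job's context values are resolved. */
final class ContextResolution {

    /**
     * @param childDefaults values defined in the child Job's own context
     * @param parentContext values defined in the parent Job's context
     * @param contextParams values entered in the Context Param table
     * @param transmitWhole state of the Transmit whole context check box
     */
    static Map<String, String> resolve(Map<String, String> childDefaults,
                                       Map<String, String> parentContext,
                                       Map<String, String> contextParams,
                                       boolean transmitWhole) {
        Map<String, String> effective = new LinkedHashMap<>(childDefaults);
        if (transmitWhole) {
            effective.putAll(parentContext);   // parent values replace child values
        }
        effective.putAll(contextParams);       // Context Param entries always win
        return effective;
    }

    public static void main(String[] args) {
        Map<String, String> child = Map.of("File", "/data/child_default.csv");
        Map<String, String> parent = Map.of("File", "/data/parent.csv");
        Map<String, String> params = Map.of("File", "/data/context_param.csv");

        System.out.println(resolve(child, parent, Map.of(), false));
        System.out.println(resolve(child, parent, Map.of(), true));
        System.out.println(resolve(child, parent, params, true));
    }
}
```

The three calls in main correspond to the three cases in the text: the child's own value, the parent's value when the whole context is transmitted, and the Context Param value overriding both.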

Advanced settings

Propagate the child result to the output schema

Select this check box to propagate the output data stored in the buffer memory via the tBufferOutput component in the child Job to the output component in the parent Job.

This check box is cleared by default. It is invisible when the Use dynamic job or Use an independent process to run subjob check box is selected.

Print Parameters

Select this check box to display the internal and external parameters in the Console.

 

tStatCatcher Statistics

Select this check box to gather the processing metadata at the Job level as well as at each component level.

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. If the component has a Die on error check box, this variable functions only when that check box is cleared.

CHILD_RETURN_CODE: the return code of a child Job. This is an After variable and it returns an integer.

CHILD_EXCEPTION_STACKTRACE: the exception stack trace from a child Job. This is an After variable and it returns a string.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
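
A minimal sketch of how these After variables might be read once tRunJob has finished, for example in a Run if condition or a downstream Java component. The globalMap is simulated here with a plain HashMap, the key pattern tRunJob_1_CHILD_RETURN_CODE (component name, underscore, variable name) is an assumption, as is the convention that a return code of 0 means success; the class ChildResultCheck is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-alone sketch: reading tRunJob After variables from a simulated globalMap. */
final class ChildResultCheck {

    /** Builds a short status line from the After variables of the given component. */
    static String describe(Map<String, Object> globalMap, String component) {
        Integer code = (Integer) globalMap.get(component + "_CHILD_RETURN_CODE");
        String error = (String) globalMap.get(component + "_ERROR_MESSAGE");
        if (code == null) {
            // After variables are only available once the component has finished.
            return component + ": no return code available yet";
        }
        if (code != 0) {
            return component + ": child Job failed with code " + code
                    + (error != null ? " (" + error + ")" : "");
        }
        return component + ": child Job succeeded";
    }

    public static void main(String[] args) {
        // In a real Job the Studio-generated code fills globalMap; here it is simulated.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tRunJob_1_CHILD_RETURN_CODE", 1);
        globalMap.put("tRunJob_1_ERROR_MESSAGE", "sample error text");
        System.out.println(describe(globalMap, "tRunJob_1"));
    }
}
```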

Connections

Outgoing links (from this component to another):

Row: Main.

Trigger: On Subjob Ok; On Subjob Error; Run if; On Component Ok; On Component Error

Incoming links (from one component to this one):

Row: Main; Reject; Iterate.

Trigger: On Subjob Ok; On Subjob Error; Run if; On Component Ok; On Component Error; Synchronize; Parallelize.

For further information regarding connections, see Talend Studio User Guide.

Usage

This component can be used as a standalone Job, or it can help clarify a complex Job by avoiding too many subjobs grouped together in a single Job.

If you want to create a reusable group of components to be inserted in several Jobs or several times in the same Job, you can use a Joblet. Unlike the tRunJob, the Joblet uses the context variables of the Job in which it is inserted. For more information on Joblets, see Talend Studio User Guide.

This component also allows you to call a Job of a different framework, such as a Spark Batch Job or a Spark Streaming Job.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

The Use dynamic job option of tRunJob allows the main Job to call the child Job dynamically at runtime. However, this option does not work if the child Job was exported with the Axis WebService (WAR) type and deployed to Tomcat as a web service.

For an alternative solution, see https://help.talend.com/pages/viewpage.action?pageId=27029205.

Scenario 1: Executing a child Job

This scenario describes a two-component Job that calls another Job, which is the child Job, to display the content of files specified in the parent Job on the Run log console.

Creating the child Job

Dropping and linking components

  1. Drop a tFileInputDelimited and a tLogRow from the Palette to the design workspace.

  2. Connect the two components together using a Row > Main link.

Configuring the components

  1. Double-click tFileInputDelimited to open its Basic settings view and define its properties.

  2. Click in the File Name field and then press F5 to open the [New Context Parameter] dialog box and configure the context variable.

  3. In the Name field, enter a name for this new context variable, File in this example.

  4. In the Default value field, enter the full path to the default input file.

  5. Click Finish to validate the context parameter setup and fill the File Name field with the context variable.

    Note

    You can also create or edit a context parameter in the Context tab view beneath the design workspace. For more information, see Talend Studio User Guide.

  6. Click the [...] button next to Edit schema to open the [Schema] dialog box where you can configure the schema manually.

  7. In the dialog box, click the [+] button to add columns and name them according to the input file structure.

    In this example, this component will actually read files defined in the parent Job, and these files contain up to five columns. Therefore, add five string type columns and name them col_1, col_2, col_3, col_4, and col_5 respectively, and then click OK to validate the schema configuration and close the [Schema] dialog box.

  8. Double-click tLogRow to display its Basic settings view and define its properties.

  9. Select the Table option to view displayed content in table cells.

Creating the parent Job

Dropping and linking components

  1. Drop a tFileList and a tRunJob from the Palette to the design workspace.

  2. Connect the two components together using an Iterate link.

Configuring the components

  1. Double-click tFileList to open its Basic settings view and define its properties.

  2. In the Directory field, specify the path to the directory that holds the files to be processed, or click the [...] button next to the field to browse to the directory.

    In this example, the directory is called tRunJob and it holds three delimited files with up to five columns.

  3. In the FileList Type list, select Files.

  4. Check that the Use Glob Expressions as Filemask check box is selected, and then click the [+] button to add a line in the Files area and define a filter to match files. In this example, enter "*.csv" to retrieve all delimited files.

  5. Double-click tRunJob to display its Basic settings view and define its properties.

  6. Click the [...] button next to the Job field to open the [Find a Job] dialog box.

  7. Select the child Job you want to execute and click OK to close the dialog box. The name of the selected Job appears in the Job field.

  8. In the Context Param area, click the plus button to add a line and define the context parameter. The only context parameter defined in the child Job, named File, appears in the Parameter cell.

  9. Click in the Values cell, press Ctrl+Space on your keyboard to access the list of variables, and select tFileList_1.CURRENT_FILEPATH.

    The corresponding global variable appears in the Values cell: ((String)globalMap.get("tFileList_1_CURRENT_FILEPATH")).

    Note

    For more information on context variables, see Talend Studio User Guide.
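
The parent Job configured above can be pictured with the following plain Java sketch: each file matching the glob expression is published under a global variable key, and that value is handed to the child Job as its File context parameter. The class FileListIteration is hypothetical, the child Job is represented by a simple callback, and the key tFileList_1_CURRENT_FILEPATH is assumed to follow the usual component-name pattern.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical model of tFileList iterating over files and calling a child Job. */
final class FileListIteration {

    /** Calls childJob once per matching file and returns the processed file names. */
    static List<String> iterate(Path directory, String glob,
                                Consumer<Map<String, String>> childJob) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(directory, glob)) {
            for (Path candidate : stream) {
                if (Files.isRegularFile(candidate)) {   // FileList Type: Files
                    files.add(candidate);
                }
            }
        }
        Collections.sort(files);                        // stable order for the example

        Map<String, Object> globalMap = new HashMap<>();
        List<String> processed = new ArrayList<>();
        for (Path file : files) {
            // What tFileList publishes on each iteration (assumed key name).
            globalMap.put("tFileList_1_CURRENT_FILEPATH", file.toString());
            // What the Context Param table passes to the child Job.
            Map<String, String> contextParams = new HashMap<>();
            contextParams.put("File", (String) globalMap.get("tFileList_1_CURRENT_FILEPATH"));
            childJob.accept(contextParams);
            processed.add(file.getFileName().toString());
        }
        return processed;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tRunJob");
        Files.write(dir.resolve("a.csv"), "1;2;3".getBytes());
        Files.write(dir.resolve("notes.txt"), "ignored".getBytes());
        iterate(dir, "*.csv", params -> System.out.println("child reads " + params.get("File")));
    }
}
```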

Executing the parent Job

  1. Press Ctrl+S to save your Job.

  2. Press F6 to execute the Job.

    The parent Job calls the child Job, which reads the files defined in the parent Job, and the content of the files is displayed on the Run console.

Related topic: tLoop, and Scenario 1: Buffering data of the tBufferOutput component.

Scenario 2: Running a list of child Jobs dynamically

This scenario describes a Job that calls two child Jobs dynamically. When called from the parent Job, each of these simple child Jobs displays a message on the console.

Setting up the child Jobs

  1. Create a Job named child_1, and add two components by typing their names on the design workspace or dropping them from the Palette:

    • a tFixedFlowInput, to generate a message

    • a tLogRow, to display the generated message on the console.

  2. Connect the tFixedFlowInput component to the tLogRow component using a Row > Main connection.

  3. Double-click the tFixedFlowInput component to open its Basic settings view.

  4. Click the [...] button next to Edit schema to open the [Schema] dialog box and define the schema of the input data.

    In this example, the schema has only one column: Message (type string).

    When done, click OK to close the dialog box and click Yes when prompted to propagate the schema to the next component.

  5. Select the Use Inline Content option, and enter the message you want to show on the console in the Content field, Hello World! in this example.

  6. In the Basic settings view of the tLogRow component, select the Table mode to display the execution result in table cells.

  7. Create a copy of this Job, name it child_2, and enter another message in the Content field of its tFixedFlowInput component, Hello Talend! in this example.

  8. Press Ctrl+S to save the Jobs.

Setting up the parent Job

Adding and linking components

  1. Create a new Job and add the following components by typing their names on the design workspace or dropping them from the Palette:

    • a tFixedFlowInput, to specify a list of Jobs to be called dynamically,

    • a tFlowToIterate, to iterate on the input rows and store the content into an iterative global variable,

    • a tRunJob, to load and run the child Jobs

  2. Connect the tFixedFlowInput component to the tFlowToIterate component using a Row > Main connection.

  3. Connect the tFlowToIterate component to the tRunJob component using a Row > Iterate connection.

Configuring the components

  1. Double-click the tFixedFlowInput component to open its Basic settings view.

  2. Click the [...] button next to Edit schema to open the [Schema] dialog box and define the schema of the input data.

    In this example, the schema has only one column: Job (type string).

    When done, click OK to close the dialog box and click Yes when prompted to propagate the schema to the next component.

  3. Select the Use Inline Content option, and specify the names of the child Jobs to call from the parent Job in the Content field:

    child_1
    child_2

  4. Double-click the tRunJob component to open its Basic settings view.

  5. Select the Use dynamic job check box.

  6. Click in the Context job field, press Ctrl+Space and from the list of variables select the iterative global variable created by the tFlowToIterate component, Job in this example. The Context job field is then filled with ((String)globalMap.get("row1.Job")). Upon each iteration, this variable will be resolved as the name of the Job to be called.

  7. Click the [...] button next to the Job field to open the [Select Job] dialog box. Select all the Jobs you want to run and click OK to close the dialog box.

Executing the parent Job

  1. Press Ctrl+S to save the Job.

  2. Press F6 or click the Run button on the Run console to execute the Job.

    The child Jobs are called one after another and messages specified in the child Jobs are displayed on the console.
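
The dynamic call in this scenario can be pictured with the following sketch: on each iteration the Job name is stored under the row1.Job key, read back as the Context job expression, and looked up among the Jobs selected in the Job field. The class DynamicJobDispatch is hypothetical and the child Jobs are stand-in suppliers that simply return their message; this is not the code Talend Studio generates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical model of tRunJob with Use dynamic job selected. */
final class DynamicJobDispatch {

    /** Runs the named Jobs in order and returns what each one prints. */
    static List<String> runAll(List<String> jobNames,
                               Map<String, Supplier<String>> selectedJobs) {
        Map<String, Object> globalMap = new HashMap<>();
        List<String> console = new ArrayList<>();
        for (String name : jobNames) {
            globalMap.put("row1.Job", name);                          // set by tFlowToIterate
            String contextJob = (String) globalMap.get("row1.Job");   // Context job field
            Supplier<String> job = selectedJobs.get(contextJob);
            if (job == null) {
                // Only Jobs selected in the Job field can be called.
                throw new IllegalArgumentException("Job not selected: " + contextJob);
            }
            console.add(job.get());
        }
        return console;
    }

    public static void main(String[] args) {
        Map<String, Supplier<String>> selected = new LinkedHashMap<>();
        selected.put("child_1", () -> "Hello World!");
        selected.put("child_2", () -> "Hello Talend!");
        System.out.println(runAll(Arrays.asList("child_1", "child_2"), selected));
    }
}
```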

Scenario 3: Propagating the buffered output data from the child Job to the parent Job

In this scenario, a three-component Job calls a two-component child Job and displays the buffered output data of the child Job, instead of the data from the input flow of the parent Job, on the console.

Setting up the child Job

  1. Create a Job named child, and add two components by typing their names on the design workspace or dropping them from the Palette to the design workspace:

    • a tFixedFlowInput, to generate a message

    • a tBufferOutput, to store the generated message in the buffer memory

  2. Connect the tFixedFlowInput component to the tBufferOutput component using a Row > Main connection.

  3. Double-click the tFixedFlowInput component to open its Basic settings view.

  4. Click the [...] button next to Edit schema to open the [Schema] dialog box and define the schema of the input data. In this example, the schema has only one column message of the string type.

    When done, click OK to validate the changes and then click Yes in the pop-up [Propagate] dialog box to propagate the schema to the next component.

  5. In the Mode area, select the Use Single Table option, and define the corresponding value for the message column in the Values table. In this example, the value is "message from the child job".

Setting up the parent Job

  1. Create a Job, and add three components by typing their names on the design workspace or dropping them from the Palette to the design workspace:

    • a tFixedFlowInput, to generate a message

    • a tRunJob, to call the Job named child

    • a tLogRow, to display the execution result on the console

  2. Connect the tFixedFlowInput component to the tRunJob component, and the tRunJob component to the tLogRow component, using Row > Main connections.

  3. Double-click the tFixedFlowInput component to open its Basic settings view.

  4. Click the [...] button next to Edit schema to open the [Schema] dialog box and define the schema of the input data. In this example, the schema has only one column message of the string type.

    When done, click OK to validate the changes.

  5. In the Mode area, select the Use Single Table option, and define the corresponding value for the message column in the Values table. In this example, the value is "message from the parent job".

  6. Click the tRunJob component and then click the Component tab to open its Basic settings view.

  7. Click the Sync columns button and then click Yes in the pop-up [Propagate] dialog box to retrieve the schema from the preceding component.

  8. Click the [...] button next to the Job field to open the [Repository Content] dialog box.

    In the [Repository Content] dialog box, select the Job named child and then click OK to close the dialog box.

  9. In the Advanced settings view of the tRunJob component, select the Propagate the child result to the output schema check box. With this check box selected, the buffered output of the child Job will be propagated to the output component.

Executing the parent Job

  1. Press Ctrl+S to save the Job.

  2. Press F6 or click the Run button on the Run console to execute the Job.

    The child Job is called and the message specified in the child Job, rather than the message defined in the parent Job, is displayed on the console.
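
The effect of the Propagate the child result to the output schema check box can be approximated as follows. The sketch assumes that the child Job is called once per incoming row and that, when the option is cleared, the incoming row is passed on unchanged; the class BufferPropagation and its methods are hypothetical stand-ins for the child Job and the tRunJob component.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of propagating the child Job's buffered rows to the parent. */
final class BufferPropagation {

    /** Stand-in for the child Job: returns the rows it wrote to tBufferOutput. */
    static List<String> runChild() {
        List<String> buffer = new ArrayList<>();
        buffer.add("message from the child job");
        return buffer;
    }

    /** Stand-in for tRunJob: returns the rows sent to the next component. */
    static List<String> runJobComponent(List<String> inputRows, boolean propagateChildResult) {
        List<String> output = new ArrayList<>();
        for (String row : inputRows) {
            List<String> childRows = runChild();   // assumed: one child run per input row
            if (propagateChildResult) {
                output.addAll(childRows);          // buffered child rows replace the input
            } else {
                output.add(row);                   // input row is passed through
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        input.add("message from the parent job");
        System.out.println(runJobComponent(input, true));
        System.out.println(runJobComponent(input, false));
    }
}
```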

tRunJob in Talend Map/Reduce Jobs

Warning

The information in this section is only for users that have subscribed to one of the Talend solutions with Big Data and is not applicable to Talend Open Studio for Big Data users.

In a Talend Map/Reduce Job, tRunJob, as well as the other Map/Reduce components preceding it, generates native Map/Reduce code. This section presents the specific properties of tRunJob when it is used in that situation. For further information about a Talend Map/Reduce Job, see Talend Big Data Getting Started Guide.

Component family

System

 

Basic settings 

Use dynamic job

Select this check box to allow multiple Jobs to be called and processed. When this option is enabled, only the latest version of the Jobs can be called and processed. An independent process will be used to run the subjob. The Context and the Use an independent process to run subjob options disappear.

Warning

The Use dynamic job option is not compatible with the Jobserver cache. Therefore, the execution may fail if you run a Job that contains tRunJob with this check box selected in Talend Administration Center.

Warning

This option is incompatible with the Use or register a shared DB Connection option of database connection components. When tRunJob works together with a database connection component, enabling both options will cause your Job to fail.

 

Context job

This field is visible only when the Use dynamic job option is selected. Enter the name of the Job that you want to call from the list of Jobs selected.

 

Job

Select the Job to be called and processed. Make sure you have already executed the called Job once, to ensure that it runs smoothly through tRunJob.

 

Version

Select the child Job version that you want to use.

 

Context

If you defined contexts and variables for the Job to be run by the tRunJob, select the applicable context entry on the list.

 

Die on child error

Clear this check box to continue executing the parent Job even if an error occurs while the child Job is executed.

 

Transmit whole context

Select this check box to get all the context variables from the parent Job. Deselect it to get all the context variables from the child Job.

If this check box is selected when the parent and child Jobs have the same context variables defined:

  • variable values for the parent Job will be used during the child Job execution if no relevant values are defined in the Context Param table.

  • otherwise, values defined in the Context Param table will be used during the child Job execution.

 

Context Param

You can change the value of selected context parameters. Click the [+] button to add the parameters defined in the Context tab of the child Job. For more information on context parameters, see Talend Studio User Guide.

The values defined here will be used during the child Job execution even if Transmit whole context is selected.

Advanced settings

Print Parameters

Select this check box to display the internal and external parameters in the Console.

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. If the component has a Die on error check box, this variable functions only when that check box is cleared.

CHILD_RETURN_CODE: the return code of a child Job. This is an After variable and it returns an integer.

CHILD_EXCEPTION_STACKTRACE: the exception stack trace from a child Job. This is an After variable and it returns a string.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.

Usage in Map/Reduce Jobs

In a Talend Map/Reduce Job, this component is used standalone. It generates native Map/Reduce code that can be executed directly in Hadoop.

You need to use the Hadoop Configuration tab in the Run view to define the connection to a given Hadoop distribution for the whole Job.

This connection is effective on a per-Job basis.

For further information about a Talend Map/Reduce Job, see the sections describing how to create, convert and configure a Talend Map/Reduce Job of the Talend Big Data Getting Started Guide.

Note that in this documentation, unless otherwise explicitly stated, a scenario presents only Standard Jobs, that is to say traditional Talend data integration Jobs, not Map/Reduce Jobs.

Limitation

The Use dynamic job option of tRunJob allows the main Job to call the child Job dynamically at runtime. However, this option does not work if the child Job was exported with the Axis WebService (WAR) type and deployed to Tomcat as a web service.

For an alternative solution, see https://help.talend.com/pages/viewpage.action?pageId=27029205.

Related scenarios

No scenario is available for the Map/Reduce version of this component yet.

tRunJob properties in Spark Batch Jobs

Component family

System

 

Basic settings 

Use dynamic job

Select this check box to allow multiple Jobs to be called and processed. When this option is enabled, only the latest version of the Jobs can be called and processed. An independent process will be used to run the subjob. The Context and the Use an independent process to run subjob options disappear.

Warning

The Use dynamic job option is not compatible with the Jobserver cache. Therefore, the execution may fail if you run a Job that contains tRunJob with this check box selected in Talend Administration Center.

 

Context job

This field is visible only when the Use dynamic job option is selected. Enter the name of the Job that you want to call from the list of Jobs selected.

 

Job

Select the Job to be called and processed. Make sure you have already executed the called Job once, to ensure that it runs smoothly through tRunJob.

 

Version

Select the child Job version that you want to use.

 

Context

If you defined contexts and variables for the Job to be run by the tRunJob, select the applicable context entry on the list.

 

Die on child error

Clear this check box to continue executing the parent Job even if an error occurs while the child Job is executed.

 

Transmit whole context

Select this check box to get all the context variables from the parent Job. Deselect it to get all the context variables from the child Job.

If this check box is selected when the parent and child Jobs have the same context variables defined:

  • variable values for the parent Job will be used during the child Job execution if no relevant values are defined in the Context Param table.

  • otherwise, values defined in the Context Param table will be used during the child Job execution.

 

Context Param

You can change the value of selected context parameters. Click the [+] button to add the parameters defined in the Context tab of the child Job. For more information on context parameters, see Talend Studio User Guide.

The values defined here will be used during the child Job execution even if Transmit whole context is selected.

Advanced settings

Print Parameters

Select this check box to display the internal and external parameters in the Console.

Usage in Spark Batch Jobs

In a Talend Spark Batch Job, this component is used standalone. It generates native Spark code that can be executed directly in a Spark cluster.

This component, along with the Spark Batch component Palette it belongs to, appears only when you are creating a Spark Batch Job.

Note that in this documentation, unless otherwise explicitly stated, a scenario presents only Standard Jobs, that is to say traditional Talend data integration Jobs.

Spark Connection

You need to use the Spark Configuration tab in the Run view to define the connection to a given Spark cluster for the whole Job. In addition, because the Job needs its dependent jar files at execution time, exactly one file system related component from the Storage family is required in the same Job, so that Spark can use this component to connect to the file system to which those jar files are transferred.

This connection is effective on a per-Job basis.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

The Use dynamic job option of tRunJob allows the main Job to call the child Job dynamically at runtime. However, this option does not work if the child Job was exported with the Axis WebService (WAR) type and deployed to Tomcat as a web service.

For an alternative solution, see https://help.talend.com/pages/viewpage.action?pageId=27029205.

Related scenarios

No scenario is available for the Spark Batch version of this component yet.