tStatCatcher - 6.3

Talend Open Studio for Big Data Components Reference Guide

Function

Based on a predefined schema, tStatCatcher gathers Job processing metadata at the Job level and, for each component whose tStatCatcher Statistics check box is selected, at the component level.

Purpose

tStatCatcher gathers Job processing metadata at the Job level and, for each component whose tStatCatcher Statistics check box is selected, at the component level. It then transfers the log data to the subsequent component for display or storage.

tStatCatcher Properties

Component family

Logs & Errors

Basic settings

Schema

A schema is a row description that defines the fields to be processed and passed on to the next component. For this component, the schema is read-only, because the component gathers the following standard log information:

Moment: Processing time and date.

Pid: Process ID.

Father_pid: Process ID of the father Job, if applicable. Otherwise, Pid is duplicated.

Root_pid: Process ID of the root Job, if applicable. Otherwise, the Pid of the current Job is duplicated.

System_pid: Thread ID.

Project: Name of the project that the Job belongs to.

Job: Name of the current Job.

Job_repository_id: ID of the Job's .item file stored in the repository.

Job_version: Version of the current Job.

Context: Name of the current context.

Origin: Name of the component, if any.

Message_type: Begin or End.

Message: Success or Failure.

Duration: Execution time of a Job, or of a component with the tStatCatcher Statistics check box selected.
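
For readers who post-process these rows in custom code, the schema can be pictured as a simple Java holder. This is only a sketch: the class name StatRowSketch, the sample values in main and the Java field types (Date, String, Long) are assumptions made for illustration and are not taken from the code that the Studio generates.

```java
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for one tStatCatcher row. The column names follow the
// read-only schema; the Java types are assumptions for illustration.
class StatRowSketch {
    Date moment;            // processing time and date
    String pid;             // process ID
    String fatherPid;       // father Job's process ID, or pid if not applicable
    String rootPid;         // root Job's process ID, or pid if not applicable
    Long systemPid;         // thread ID
    String project;         // project the Job belongs to
    String job;             // name of the current Job
    String jobRepositoryId; // ID of the Job's .item file
    String jobVersion;      // version of the current Job
    String context;         // name of the current context
    String origin;          // component name, if any
    String messageType;     // Begin or End
    String message;         // Success or Failure
    Long duration;          // execution time

    // Returns the fields in schema order, keyed by the schema column names.
    Map<String, Object> toMap() {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("Moment", moment);
        row.put("Pid", pid);
        row.put("Father_pid", fatherPid);
        row.put("Root_pid", rootPid);
        row.put("System_pid", systemPid);
        row.put("Project", project);
        row.put("Job", job);
        row.put("Job_repository_id", jobRepositoryId);
        row.put("Job_version", jobVersion);
        row.put("Context", context);
        row.put("Origin", origin);
        row.put("Message_type", messageType);
        row.put("Message", message);
        row.put("Duration", duration);
        return row;
    }

    public static void main(String[] args) {
        StatRowSketch row = new StatRowSketch();
        row.moment = new Date();
        row.job = "demoJob";        // hypothetical Job name
        row.messageType = "Begin";
        // Print one "column: value" pair per line.
        row.toMap().forEach((column, value) -> System.out.println(column + ": " + value));
    }
}
```

A LinkedHashMap is used so that the columns keep the order in which the schema lists them.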

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. If the component has a Die on error check box, this variable is available only when that check box is cleared.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to open the variable list and select the variable to use.

For further information about variables, see Talend Studio User Guide.
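
As an illustration of reading this After variable from Java code, for example in a tJava component that runs after the monitored component, the sketch below looks the value up in a map. It assumes the usual Talend convention that component variables are stored in globalMap under a key of the form <component name>_ERROR_MESSAGE. The component name tStatCatcher_1, the sample message text and the plain HashMap standing in for globalMap are assumptions made so that the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; in a real Job, globalMap is provided by the generated code.
class ErrorMessageSketch {

    // Builds the key "<componentName>_ERROR_MESSAGE" and returns the stored
    // string, or null if the component has not recorded an error message.
    static String errorMessageOf(Map<String, Object> globalMap, String componentName) {
        return (String) globalMap.get(componentName + "_ERROR_MESSAGE");
    }

    public static void main(String[] args) {
        // Stand-in for the Job's globalMap (assumption for this sketch).
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tStatCatcher_1_ERROR_MESSAGE", "sample error text");

        System.out.println(errorMessageOf(globalMap, "tStatCatcher_1"));
    }
}
```

Because ERROR_MESSAGE is an After variable, the lookup is meaningful only once the component has finished executing.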

Usage

This component is the start component of a secondary Job which triggers automatically at the end of the main Job. The processing time is also displayed at the end of the log.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

n/a

Scenario: Displaying the statistics log of Job execution

This scenario collects the statistics log for the Job execution and displays it on the Run console. Note that, since the tStatCatcher Statistics check box is not selected for the components, the statistics log applies solely to this specific Job.

Linking the components

  1. Drop tFixedFlowInput, tFileOutputDelimited, tStatCatcher and tLogRow onto the workspace.

  2. Link tFixedFlowInput to tFileOutputDelimited using a Row > Main connection.

  3. Link tStatCatcher to tLogRow using a Row > Main connection.

Configuring the components

  1. Double-click tFixedFlowInput to open its Basic settings view.

  2. Click the Edit schema button to open the schema editor.

  3. Click the [+] button to add three columns, namely ID_Owners, Name_Customer and ID_Insurance, of the Integer and String types respectively.

  4. Click OK to validate the setup and close the editor.

  5. In the dialog box that appears, click Yes to propagate the changes to the subsequent component.

  6. Select the Use Inline Content (delimited file) option.

  7. In the Content box, enter 1;Andrew;888.

  8. Double-click tFileOutputDelimited to open its Basic settings view.

  9. In the File Name field, enter the full path of the output file that receives the data from tFixedFlowInput.

  10. Double-click tLogRow to open its Basic settings view.

  11. Select Vertical (each row is a key/value list) for a better display of the results.
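
To make the sample data concrete, the following sketch splits the inline content entered in tFixedFlowInput into the three schema columns. It assumes a semicolon field separator, as the sample row suggests, and keeps every value as text. The class and method names are hypothetical, and this is not the code that the Studio generates.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of how the inline row maps onto the schema columns.
class InlineContentSketch {
    static final String[] COLUMNS = {"ID_Owners", "Name_Customer", "ID_Insurance"};

    // Splits one delimited line on ';' and pairs each value with its column.
    static Map<String, String> parse(String line) {
        String[] values = line.split(";", -1); // -1 keeps trailing empty fields
        if (values.length != COLUMNS.length) {
            throw new IllegalArgumentException(
                "Expected " + COLUMNS.length + " fields but found " + values.length);
        }
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < COLUMNS.length; i++) {
            row.put(COLUMNS[i], values[i]);
        }
        return row;
    }

    public static void main(String[] args) {
        // Print each column as a key/value pair, one per line.
        for (Map.Entry<String, String> entry : parse("1;Andrew;888").entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```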

Executing the Job

  1. Press Ctrl + S to save the Job.

  2. Press F6 to run the Job.

    The statistics log of the Job execution is generated and displayed on the Run console.