tApacheLogInput - 6.1

Talend Components Reference Guide

EnrichVersion
6.1
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Open Studio for Big Data
Talend Open Studio for Data Integration
Talend Open Studio for Data Quality
Talend Open Studio for ESB
Talend Open Studio for MDM
Talend Real-Time Big Data Platform
task
Data Governance
Data Quality and Preparation
Design and Development
EnrichPlatform
Talend Studio

tApacheLogInput properties

Component family

File/Input

 

Function

tApacheLogInput reads the access-log file for an Apache HTTP server.

Purpose

tApacheLogInput helps you manage the Apache HTTP Server effectively by providing the feedback you need about the activity and performance of the server, as well as about any problems that may be occurring.

Basic settings

Property type

Either Built-in or Repository.

Since version 5.6, both the Built-in mode and the Repository mode are available in all Talend solutions.

Built-in: No property data is stored centrally.

 

 

Repository: Select the repository file where the properties are stored. The fields that follow are completed automatically using the data retrieved.

 

Schema and Edit Schema

A schema is a row description that defines the number of fields to be processed and passed on to the next component. The schema is either Built-in or stored remotely in the Repository.

Since version 5.6, both the Built-in mode and the Repository mode are available in all Talend solutions.

In the context of tApacheLogInput usage, the schema is read-only.

 

 

Built-in: You can create the schema and store it locally for this component. Related topic: see Talend Studio User Guide.

 

 

Repository: You have already created and stored the schema in the Repository. You can reuse it in various projects and Job flowcharts. Related topic: see Talend Studio User Guide.
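As an illustration of what a row description means for an access log, the following Java sketch parses one line into typed fields. It assumes the server writes the Apache Common Log Format (host, identity, user, timestamp, request, status, bytes). The class name AccessLogRow and its field names are hypothetical and chosen for this example only; they are not taken from the component's read-only schema, whose actual columns you can check with Edit schema.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical row type: the field names follow the Apache Common Log Format,
// not necessarily the column names of the component's read-only schema.
final class AccessLogRow {
    // host ident user [timestamp] "request" status bytes
    private static final Pattern CLF = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] \"([^\"]*)\" (\\d{3}) (\\d+|-)$");

    final String host;
    final String ident;
    final String user;
    final String timestamp;
    final String request;
    final int status;
    final long bytes; // -1 when the log records "-" instead of a size

    private AccessLogRow(Matcher m) {
        host = m.group(1);
        ident = m.group(2);
        user = m.group(3);
        timestamp = m.group(4);
        request = m.group(5);
        status = Integer.parseInt(m.group(6));
        bytes = "-".equals(m.group(7)) ? -1L : Long.parseLong(m.group(7));
    }

    // Returns an empty Optional when the line does not match the assumed format.
    static Optional<AccessLogRow> parse(String line) {
        Matcher m = CLF.matcher(line);
        return m.matches() ? Optional.of(new AccessLogRow(m)) : Optional.empty();
    }
}
```

A line that does not match the pattern yields an empty result instead of an exception, which leaves the caller free to decide whether to stop or to reject the row.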

 

File Name

Name of the file and/or the variable to be processed.

For further information about how to define and use a variable in a Job, see Talend Studio User Guide.

 

Die on error

Select this check box to stop the execution of the Job when an error occurs. Clear the check box to skip the row on error and complete the process for error-free rows. If needed, you can collect the rows on error using a Row > Reject link.
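The two behaviors of Die on error can be sketched in plain Java as follows. This is only an illustration of the logic described above, not the code that the Studio generates: the class DieOnErrorSketch, the Result holder, and the isValid check are hypothetical names, and the validity test assumes Common Log Format lines.

```java
import java.util.ArrayList;
import java.util.List;

final class DieOnErrorSketch {
    // Holds the two output flows: rows that parsed and rows that did not.
    static final class Result {
        final List<String> mainFlow = new ArrayList<>();
        final List<String> rejectFlow = new ArrayList<>();
    }

    // Hypothetical validity check standing in for the component's real parser.
    static boolean isValid(String line) {
        return line.matches(
            "^\\S+ \\S+ \\S+ \\[[^\\]]+\\] \"[^\"]*\" \\d{3} (\\d+|-)$");
    }

    static Result process(List<String> lines, boolean dieOnError) {
        Result result = new Result();
        for (String line : lines) {
            if (isValid(line)) {
                result.mainFlow.add(line);
            } else if (dieOnError) {
                // Check box selected: the first bad row stops the whole run.
                throw new IllegalStateException("Malformed log line: " + line);
            } else {
                // Check box cleared: skip the row and keep it for a reject flow.
                result.rejectFlow.add(line);
            }
        }
        return result;
    }
}
```

With dieOnError set to false, processing continues past malformed rows, and the rejectFlow list plays the role of the rows collected through a Row > Reject link.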

Advanced settings

Encoding

Select the encoding type from the list or select Custom and define it manually. This field is compulsory for DB data handling.
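To show why the encoding choice matters, the minimal Java sketch below decodes the same bytes with two different charset names. The class EncodingSketch and its decode method are hypothetical helpers written for this example; the component's own file reading is not shown here.

```java
import java.nio.charset.Charset;

final class EncodingSketch {
    // Decodes raw bytes using the charset named by encodingName.
    // Charset.forName throws an unchecked exception if the name is unknown.
    static String decode(byte[] raw, String encodingName) {
        return new String(raw, Charset.forName(encodingName));
    }

    public static void main(String[] args) {
        // The word "café" encoded as ISO-8859-1: the last byte is 0xE9.
        byte[] raw = {0x63, 0x61, 0x66, (byte) 0xE9};

        // Decoded with the charset the bytes were written in.
        System.out.println(decode(raw, "ISO-8859-1"));

        // Decoded as UTF-8: 0xE9 alone is not valid UTF-8, so the last
        // character is replaced by the Unicode replacement character.
        System.out.println(decode(raw, "UTF-8"));
    }
}
```

If accented characters in the log, for example in requested URLs or user names, appear garbled in the output, a mismatch of this kind is a likely cause.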

 

tStatCatcher Statistics

Select this check box to gather the processing metadata at the Job level as well as at each component level.

Global Variables

NB_LINE: the number of rows processed. This is an After variable and it returns an integer.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, when the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
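As a sketch of how an After variable is typically read in Java code within a Job, the example below looks up NB_LINE in a map keyed by component name and variable name. Inside a Job the map is provided by the Studio as globalMap; here it is simulated with a HashMap so that the example runs on its own. The component label tApacheLogInput_1 is an assumed default name, and the class GlobalVariableSketch and its nbLine helper are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class GlobalVariableSketch {
    // Reads the NB_LINE After variable of the given component.
    // Returns 0 when the value is absent, for example because the
    // component has not finished executing yet.
    static int nbLine(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_NB_LINE");
        return value instanceof Integer ? (Integer) value : 0;
    }

    public static void main(String[] args) {
        // Stand-in for the map that a running Job would provide.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tApacheLogInput_1_NB_LINE", 42);

        System.out.println(nbLine(globalMap, "tApacheLogInput_1"));
    }
}
```

Because NB_LINE is an After variable, reading it in a component that runs before tApacheLogInput has finished would not give a meaningful value, which is why the helper falls back to 0.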

Usage

tApacheLogInput can be used with other components or as a standalone component. It allows you to create a data flow using a Row > Main connection, or to create a reject flow to filter specified data using a Row > Reject connection. For an example of how to use these two links, see Scenario 2: Extracting correct and erroneous data from an XML field in a delimited file.

Limitation

n/a

Scenario: Reading an Apache access-log file

The following scenario creates a two-component Job that reads the access-log file of an Apache HTTP server and displays the output in the Run log console.

  1. Drop a tApacheLogInput component and a tLogRow component from the Palette onto the design workspace.

  2. Right-click the tApacheLogInput component and connect it to the tLogRow component using a Row > Main link.

  3. In the design workspace, select tApacheLogInput.

  4. Click the Component tab to define the basic settings for tApacheLogInput.

  5. If desired, click the Edit schema button to see the read-only columns.

  6. In the File Name field, enter the file path or browse to the access-log file you want to read.

  7. In the design workspace, select tLogRow and click the Component tab to define its basic settings. For more information, see tLogRow.

  8. Press F6 to execute the Job.

The log lines of the defined file are displayed on the console.
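For comparison, the overall effect of this two-component Job can be approximated by the short standalone Java program below: it reads a file line by line and prints each line to the console. This is a rough sketch and not the code that the Studio generates. The class name ReadAccessLogSketch, the display method, the default file name access.log, and the UTF-8 encoding are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

final class ReadAccessLogSketch {
    // Prints every line from the reader and returns the number of lines,
    // similar in spirit to the NB_LINE variable described earlier.
    static int display(BufferedReader reader, PrintStream out) {
        List<String> lines = reader.lines().collect(Collectors.toList());
        lines.forEach(out::println);
        return lines.size();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default path; pass the real access-log path
        // as the first command-line argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "access.log");
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            int count = display(reader, System.out);
            System.out.println(count + " line(s) read");
        }
    }
}
```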