tFileInputMSDelimited properties - 6.3

Talend Open Studio for Big Data Components Reference Guide

Component family

File/Input

Basic settings

Multi Schema Editor

The [Multi Schema Editor] helps you build and configure the data flow for a multi-structure delimited file, associating one schema with each output.

For more information, see The Multi Schema Editor.

Output

Lists all the schemas you define in the [Multi Schema Editor], along with the related record type and the field separator that corresponds to every schema, if different field separators are used.

Die on error

Select this check box to stop the execution of the Job when an error occurs. Clear the check box to skip the row on error and complete the process for error-free rows.

Advanced settings

Trim all column

Select this check box to remove leading and trailing whitespace from defined columns.

Validate date

Select this check box to check the date format strictly against the input schema.
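
As a rough illustration of what a strict date check means, the following plain Java sketch rejects values that do not match the pattern exactly or that are not real calendar dates. It is not the code generated by the component; the class name StrictDateCheck, the method isValid and the pattern dd-MM-yyyy are assumptions chosen for this example.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper, for illustration only.
final class StrictDateCheck {

    // Returns true only if the whole value matches the pattern
    // and represents a real calendar date.
    static boolean isValid(String value, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setLenient(false); // reject out-of-range fields such as February 31
        ParsePosition position = new ParsePosition(0);
        Date parsed = format.parse(value, position); // null when parsing fails
        return parsed != null && position.getIndex() == value.length();
    }

    public static void main(String[] args) {
        System.out.println(isValid("28-02-2017", "dd-MM-yyyy")); // prints true
        System.out.println(isValid("31-02-2017", "dd-MM-yyyy")); // prints false
    }
}
```

With lenient parsing, which is the SimpleDateFormat default, the second value would be rolled over to a date in March instead of being rejected.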

Advanced separator (for numbers)

Select this check box to modify the separators used for numbers:

Thousands separator: define separators for thousands.

Decimal separator: define separators for decimals.
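
To make the effect of the two separators concrete, the sketch below normalizes a number written with custom separators before converting it. It only illustrates the idea and makes no claim about how the component implements it; the class NumberSeparators and its parse method are hypothetical names.

```java
import java.math.BigDecimal;

// Hypothetical helper, for illustration only.
final class NumberSeparators {

    // Removes the thousands separator first, then maps the decimal
    // separator to '.', the form that BigDecimal expects.
    static BigDecimal parse(String text, char thousands, char decimal) {
        String normalized = text.trim()
                .replace(String.valueOf(thousands), "")
                .replace(decimal, '.');
        return new BigDecimal(normalized);
    }

    public static void main(String[] args) {
        // Thousands separator '.', decimal separator ','
        System.out.println(parse("1.234,56", '.', ',')); // prints 1234.56
        // Thousands separator ',', decimal separator '.'
        System.out.println(parse("1,234.56", ',', '.')); // prints 1234.56
    }
}
```

The thousands separator is removed before the decimal separator is converted, so that a file using '.' for thousands does not end up with two decimal points.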

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level as well as at each component level.

Global Variables

NB_LINE: the number of rows processed. This is an After variable and it returns an integer.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable is available only when the Die on error check box is cleared (if the component has this check box).

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill up a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
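
The sketch below shows how these After variables are typically read in Java code. In a Job, the globalMap object is supplied by the generated code; here it is replaced by a hand-filled HashMap so that the example can run on its own. The component name tFileInputMSDelimited_1, the row count 42 and the class GlobalVariablesSketch are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; in a Job, globalMap already exists.
final class GlobalVariablesSketch {

    // Stand-in for the Job's globalMap, filled by hand for the example.
    static Map<String, Object> sampleGlobalMap() {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFileInputMSDelimited_1_NB_LINE", 42); // assumed value
        return globalMap;
    }

    // Reads NB_LINE and ERROR_MESSAGE using keys of the form <component name>_<variable>.
    static String summary(Map<String, Object> globalMap, String componentName) {
        Integer nbLine = (Integer) globalMap.get(componentName + "_NB_LINE");
        String error = (String) globalMap.get(componentName + "_ERROR_MESSAGE");
        return "rows=" + nbLine + ", error=" + (error == null ? "none" : error);
    }

    public static void main(String[] args) {
        // prints rows=42, error=none
        System.out.println(summary(sampleGlobalMap(), "tFileInputMSDelimited_1"));
    }
}
```

Because both variables are After variables, an expression such as ((Integer)globalMap.get("tFileInputMSDelimited_1_NB_LINE")) is only meaningful in a component that runs after tFileInputMSDelimited has finished.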

Usage

Use this component to read multi-structured delimited files and separate fields contained in these files using a defined separator.
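
To show what reading a multi-structured delimited file involves, the following plain Java sketch splits each line on a separator and groups the rows by the record type held in the first column. It is a simplified illustration, not the component's generated code: the class MultiSchemaSketch, the record types ORD and DET, the sample data and the single shared separator are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch, for illustration only.
final class MultiSchemaSketch {

    // Splits every non-empty line on the separator and groups the rows
    // by the value of the first column, used here as the record type.
    static Map<String, List<String[]>> route(List<String> lines, String separator) {
        Map<String, List<String[]>> outputs = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue;
            }
            // Pattern.quote treats the separator literally; -1 keeps trailing empty fields.
            String[] fields = line.split(Pattern.quote(separator), -1);
            outputs.computeIfAbsent(fields[0], type -> new ArrayList<>()).add(fields);
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "ORD;1001;2017-01-15",
                "DET;1001;pen;3",
                "DET;1001;ink;1",
                "ORD;1002;2017-01-16");
        // prints "ORD -> 2 row(s)" then "DET -> 2 row(s)"
        route(lines, ";").forEach((type, rows) ->
                System.out.println(type + " -> " + rows.size() + " row(s)"));
    }
}
```

Each group corresponds to one output schema: ORD rows have three fields and DET rows have four, which is why a single schema cannot describe the whole file.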

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

Due to license incompatibility, one or more JARs required to use this component are not provided. You can install the missing JARs for this particular component by clicking the Install button on the Component tab view. You can also find and add all missing JARs on the Modules tab in the Integration perspective of your Studio. For details, see the article Installing External Modules on Talend Help Center (https://help.talend.com) or the section describing how to configure the Studio in the Talend Installation and Upgrade Guide.

The Multi Schema Editor

The [Multi Schema Editor] enables you to:

  • set the path to the source file,

  • define the source file properties,

  • define the data structure for each of the output schemas.

Note

When you define the data structure for each of the output schemas in the [Multi Schema Editor], the column names of the different data structures automatically appear in the input schema lists of the components that come after tFileInputMSDelimited. However, you can still define data structures directly in the Basic settings view of each of these components.

The [Multi Schema Editor] also helps you declare the schema that should act as the source schema (primary key) for the incoming data, to ensure its uniqueness. The editor uses this mapping to associate all schemas processed in the delimited file with the source schema in the same file.

Note

The editor opens with the first column, which usually holds the record type indicator, selected by default. However, once the editor is open, you can select the check box of any of the schema columns to define it as a primary key.

The figure below shows an example of the [Multi Schema Editor].

For detailed information about the usage of the Multi Schema Editor, see Scenario: Reading a multi structure delimited file.