tFileInputMSXML Standard properties

MS XML connectors

author
Talend Documentation Team
EnrichVersion
6.4
EnrichProdName
Talend Big Data Platform
Talend Real-Time Big Data Platform
Talend Big Data
Talend ESB
Talend Open Studio for MDM
Talend Data Management Platform
Talend Open Studio for Big Data
Talend Open Studio for ESB
Talend Data Fabric
Talend Data Integration
Talend Data Services Platform
Talend Open Studio for Data Integration
Talend MDM Platform
task
Design and Development > Third-party systems > XML components > MS XML connectors
Data Quality and Preparation > Third-party systems > XML components > MS XML connectors
Data Governance > Third-party systems > XML components > MS XML connectors
EnrichPlatform
Talend Studio

These properties are used to configure tFileInputMSXML running in the Standard Job framework.

The Standard tFileInputMSXML component belongs to the File and the XML families.

The component in this framework is available in all Talend products.

Basic settings

File Name

Name of the file and/or the variable to be processed.

For further information about how to define and use a variable in a Job, see Talend Studio User Guide.

Root XPath query

The root of the XML tree on which the query is based.

Enable XPath in column "Schema XPath loop" but lose the order

Select this check box to define an XPath expression in the Schema XPath loop field of the Outputs table. In exchange, the order in which the data appears in the source XML file is not preserved.

Warning:

This option takes effect only if you select the Dom4j generation mode in the Advanced settings view.

Outputs

Schema: Define as many schemas as needed.

Schema XPath loop: Enter the node of the XML tree, or the XPath expression, on which the loop is based.

XPath Queries: Enter the fields to be extracted from the structured input.

Create empty row: Select this check box if you want to create empty rows for the empty field(s) in the schema.
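The relationship between the Root XPath query, the Schema XPath loop and the XPath Queries can be illustrated with plain JDK XPath calls. The following is a minimal sketch, not the code that the component generates: the class name XPathLoopSketch, the sample XML and the element names are hypothetical, and the sketch assumes that the loop path is evaluated relative to the root node and each field query relative to the current loop node.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathLoopSketch {
    // Hypothetical sample document, used only for illustration.
    static final String XML =
        "<root><customers>"
        + "<customer><id>1</id><name>Ann</name></customer>"
        + "<customer><id>2</id><name>Bob</name></customer>"
        + "</customers></root>";

    // Returns one row per node matched by the loop path,
    // with one cell per field query.
    static List<String[]> extract(String xml, String rootXPath,
                                  String loopXPath, String[] fieldXPaths) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Root XPath query: the node that the other paths start from.
            Node root = (Node) xpath.evaluate(
                rootXPath, new InputSource(new StringReader(xml)), XPathConstants.NODE);
            // Schema XPath loop: assumed here to be relative to the root node.
            NodeList loopNodes = (NodeList) xpath.evaluate(
                loopXPath, root, XPathConstants.NODESET);
            List<String[]> rows = new ArrayList<>();
            for (int i = 0; i < loopNodes.getLength(); i++) {
                String[] row = new String[fieldXPaths.length];
                for (int j = 0; j < fieldXPaths.length; j++) {
                    // XPath Queries: evaluated relative to the current loop node.
                    row[j] = xpath.evaluate(fieldXPaths[j], loopNodes.item(i));
                }
                rows.add(row);
            }
            return rows;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath expression", e);
        }
    }

    public static void main(String[] args) {
        for (String[] row : extract(XML, "/root", "customers/customer",
                                    new String[] {"id", "name"})) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

In this sketch, a field query that matches nothing yields an empty string, which is roughly the situation that the Create empty row option addresses.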

Die on error

Select this check box to stop the execution of the Job when an error occurs. Clear the check box to skip the row on error and complete the process for error-free rows.

Advanced settings

Trim all column

Select this check box to remove leading and trailing whitespace from the defined columns.

Validate date

Select this check box to check the date format strictly against the input schema.
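The difference between a strict and a lenient date check can be sketched in plain Java. This example is only an illustration and makes no claim about how the component implements the option: the class name StrictDateSketch is hypothetical, and the sketch assumes that the schema column carries a date pattern such as yyyy-MM-dd.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

class StrictDateSketch {
    // Returns true if the value parses with the given pattern.
    // With strict == true, impossible dates such as February 30 are rejected;
    // with strict == false, they are silently rolled over to a valid date.
    static boolean isValid(String value, String pattern, boolean strict) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setLenient(!strict);
        try {
            format.parse(value);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("2023-02-30", "yyyy-MM-dd", true));
        System.out.println(isValid("2023-02-30", "yyyy-MM-dd", false));
    }
}
```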

Ignore DTD file

Select this check box to ignore the DTD file indicated in the XML file being processed.
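One common way to ignore a referenced DTD with the standard Java XML APIs is to install an entity resolver that returns an empty source instead of fetching the file. The sketch below shows that technique under stated assumptions: the class name IgnoreDtdSketch, the sample XML and the DTD name missing-file.dtd are hypothetical, and the component's generated code may work differently.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class IgnoreDtdSketch {
    // Hypothetical document that points to a DTD file that does not exist.
    static final String XML =
        "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE root SYSTEM \"missing-file.dtd\">"
        + "<root><item>42</item></root>";

    // Parses the document without reading the DTD and returns the root element name.
    static String rootName(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Answer every external entity request (including the DTD)
            // with an empty source, so nothing is loaded from disk or network.
            builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));
            Document document = builder.parse(new InputSource(new StringReader(xml)));
            return document.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML input", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootName(XML));
    }
}
```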

Generation mode

Select the appropriate generation mode according to your memory availability. The available modes are:

  • Slow and memory-consuming (Dom4j)

    Note:

This option allows you to use dom4j to process highly complex XML files.

  • Fast with low memory consumption (SAX)
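The memory trade-off between the two modes comes from the parsing model: a tree-based parser such as dom4j loads the whole document into memory, while SAX reports elements one at a time as the file is read. The following minimal sketch shows the streaming side using the SAX parser bundled with the JDK. The class name SaxCountSketch and the sample input are hypothetical, and the code is not what the component generates.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxCountSketch {
    // Counts the elements with the given name without building a tree in memory.
    static int countElements(String xml, String elementName) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                // Called once per opening tag, as the parser streams the input.
                if (qName.equals(elementName)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML input", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countElements("<a><b/><b/><c/></a>", "b"));
    }
}
```

Because the handler only sees one event at a time, queries that need to look backward or across the tree are harder to express in this model, which is one reason a tree-based mode remains useful for complex files.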

Encoding

Select the encoding type from the list or select CUSTOM and define it manually. This field is compulsory for DB data handling.

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level as well as at each component level.

Global Variables

NB_LINE: the number of rows processed. This is an After variable and it returns an integer.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.
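In Talend Jobs, these variables are typically read from the globalMap object in a downstream component such as tJava, using a key made of the component name and the variable name. The sketch below imitates that lookup with an ordinary map so that it can run outside the Studio. The component name tFileInputMSXML_1, the class name GlobalMapSketch and the stored value are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class GlobalMapSketch {
    // Builds a short summary from the After variables of one component.
    static String summary(Map<String, Object> globalMap, String componentName) {
        Integer nbLine = (Integer) globalMap.get(componentName + "_NB_LINE");
        String error = (String) globalMap.get(componentName + "_ERROR_MESSAGE");
        return "rows=" + (nbLine == null ? 0 : nbLine)
            + ", error=" + (error == null ? "none" : error);
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap that a running Job provides.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFileInputMSXML_1_NB_LINE", 3);
        System.out.println(summary(globalMap, "tFileInputMSXML_1"));
    }
}
```

Because both variables are After variables, a lookup like this is meaningful only in a part of the Job that runs once the component has finished.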

To fill up a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.