tFileOutputMSXML Properties - 6.1

Talend Components Reference Guide


Component family

File/Output

 

Function

tFileOutputMSXML writes multiple schemas to a single XML-structured file.

Purpose

tFileOutputMSXML creates a complex multi-structured XML file, using data structures (schemas) coming from several incoming Row flows.

Basic settings

File Name

Name and path of the file to be created, and/or the variable to be used.

For further information about how to define and use a variable in a Job, see Talend Studio User Guide.

 

Configure XML tree

Opens the dedicated interface to help you set the XML mapping. For details about the interface, see Defining the MultiSchema XML tree.

Advanced settings

Create directory only if not exists

This check box is selected by default. It creates the directory that holds the output file, if it does not already exist.

 

Advanced separator (for numbers)

Select this check box to modify the separators used for numbers:

Thousands separator: define separators for thousands.

Decimal separator: define separators for decimals.
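As a rough illustration of what the two separators control, the sketch below formats a number with caller-supplied thousands and decimal separators. The function name `format_number` and its parameters are hypothetical and only mirror the two fields above; this is not the code the component generates.

```python
def format_number(value: float, thousands_sep: str = ",",
                  decimal_sep: str = ".", decimals: int = 2) -> str:
    """Format a number with custom thousands and decimal separators."""
    # Python's fixed-point format always uses "," for thousands and "." for decimals.
    text = f"{value:,.{decimals}f}"
    # Swap through a placeholder so that replacing one separator
    # cannot clobber the other (e.g. when "." and "," are exchanged).
    placeholder = "\0"
    text = text.replace(",", placeholder)
    text = text.replace(".", decimal_sep)
    return text.replace(placeholder, thousands_sep)


european = format_number(1234567.891, thousands_sep=".", decimal_sep=",")
spaced = format_number(1234.5, thousands_sep=" ", decimal_sep=",")
```

Here `european` uses a period for thousands and a comma for decimals, a common convention in several European locales.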

 

Encoding

Select the encoding from the list or select Custom and define it manually. This field is compulsory for DB data handling.

 

Don't generate empty file

Select this check box if you do not want to generate empty files.

 

Trim the whitespace characters

Select this check box to remove leading and trailing whitespace from the columns.

 

Escape text

Select this check box to escape special characters.
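The effect of trimming and escaping can be sketched as follows. This assumes that "special characters" means the XML markup characters `&`, `<` and `>`, which the source does not spell out; `prepare_text` is a hypothetical helper, not part of the component.

```python
from xml.sax.saxutils import escape


def prepare_text(value: str, trim: bool = True, escape_text: bool = True) -> str:
    """Apply optional trimming and XML escaping to a column value."""
    if trim:
        # Remove leading and trailing whitespace only; inner spaces are kept.
        value = value.strip()
    if escape_text:
        # escape() replaces &, < and > with their XML entities.
        value = escape(value)
    return value


cleaned = prepare_text("  Tom & Jerry <b>  ")
untouched = prepare_text("  a < b  ", trim=False, escape_text=False)
```

Without escaping, a value containing `<` or `&` would make the output file ill-formed XML, which is the usual reason to enable such an option.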

 

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at a Job level as well as at each component level.

Global Variables

NB_LINE: the number of rows read by an input component or transferred to an output component. This is an After variable and it returns an integer.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the component has a Die on error check box and that check box is cleared.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

n/a

Defining the MultiSchema XML tree

To open the dedicated interface, double-click the tFileOutputMSXML component, or click the three-dot button on the Basic settings vertical tab of the Component tab.

To the left of the mapping interface, under Linker source, the drop-down list includes all the input schemas that should be added to the multi-schema output XML file (on the condition that more than one input flow is connected to the tFileOutputMSXML component).

Under Schema List, all the columns retrieved from the selected input data flow are listed.

The right side of the interface holds all the XML structures that you want to create in the output XML file.

You can create the XML structures manually or import them. Then, for each input schema selected under Linker source, map the input schema columns onto the corresponding elements of the XML tree.

Importing the XML tree

The easiest and most common way to fill out the XML tree panel is to import a well-formed XML file.

  1. Click once on the root tag that displays by default on the XML tree panel to rename it.

  2. Right-click on the root tag to display the contextual menu.

  3. On the menu, select Import XML tree.

  4. Browse to the file to import and click OK.

    The XML Tree column is then automatically filled out with the elements of the imported file. You can remove elements from the tree or insert elements and sub-elements into it:

  5. Select the relevant element of the tree.

  6. Right-click to display the contextual menu.

  7. Select Delete to remove the selection from the tree or select the relevant option among: Add sub-element, Add attribute, Add namespace to enrich the tree.

Creating the XML tree manually

If you do not have an XML structure already defined, you can create it manually.

  1. Click once on the root tag that displays by default on the XML tree panel to rename it.

  2. Right-click on the root tag to display the contextual menu.

  3. On the menu, select Add sub-element to create the first element of the structure.

    You can also add an attribute or a child element to any element of the tree or remove any element from the tree.

  4. Select the relevant element on the tree you just created.

  5. Right-click to the left of the element name to display the contextual menu.

  6. On the menu, select the relevant option among: Add sub-element, Add attribute, Add namespace or Delete.

Mapping XML data from multiple schema sources

Once your XML tree is ready, select the first input schema that you want to map.

You can map each input column with the relevant XML tree element or sub-element to fill out the Related Column:

  1. Click one of the Schema column names.

  2. Drag it onto the relevant sub-element to the right.

  3. Release the mouse button to implement the actual mapping.

    A light blue link displays that illustrates this mapping. If available, use the Auto-Map button, located to the bottom left of the interface, to carry out this operation automatically.

    You can disconnect any mapping on any element of the XML tree:

  4. Select the element of the XML tree that should be disconnected from its schema column.

  5. Right-click to the left of the element name to display the contextual menu.

  6. Select Disconnect link.

The light blue link disappears.

Defining the node status

Defining the XML tree and mapping the data is not sufficient. You also need to define the loop element for each of the selected sources and, if required, the group element.

Loop element

The loop element allows you to define the iterating object. Generally, the loop element is also the row generator.

To define an element as loop element:

  1. Select the relevant element on the XML tree.

  2. Right-click to the left of the element name to display the contextual menu.

  3. Select Set as Loop Element.

    The Node Status column shows the newly added status.

    Note

    There can only be one loop element at a time.
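Conceptually, a loop element is the element that is repeated once per incoming row, and each input source gets its own loop element under the shared root. The sketch below illustrates that idea with Python's standard `xml.etree.ElementTree`; the element names (`root`, `customer`, `order`) and the sample rows are made up for illustration, and the code is an analogy, not what the component executes.

```python
import xml.etree.ElementTree as ET


def build_document(customers, orders):
    """Write two input flows into one XML tree, one loop element per flow."""
    root = ET.Element("root")
    # First source: <customer> plays the role of the loop element,
    # so one <customer> is emitted per incoming row.
    for row in customers:
        customer = ET.SubElement(root, "customer", id=str(row["id"]))
        ET.SubElement(customer, "name").text = row["name"]
    # Second source: <order> is the loop element for this flow.
    for row in orders:
        order = ET.SubElement(root, "order")
        ET.SubElement(order, "customer_id").text = str(row["customer_id"])
        ET.SubElement(order, "amount").text = str(row["amount"])
    return root


doc = build_document(
    customers=[{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}],
    orders=[{"customer_id": 1, "amount": 9.5}],
)
xml_text = ET.tostring(doc, encoding="unicode")
```

Each mapped column fills one sub-element or attribute of the repeated element, which is why the loop element is usually also the row generator.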

Group element

The group element is optional. It represents a constant element on which the Groupby operation can be performed. A group element can be defined only if a loop element has already been defined.

When using a group element, the rows should be sorted, in order to be able to group by the selected node.
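The sorting requirement can be illustrated with a grouping routine that, like many streaming group-by implementations, only merges consecutive rows that share a key. The sketch below uses Python's `itertools.groupby`, which behaves that way; whether the component reacts identically to unsorted input is an assumption, and the element names and rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter
import xml.etree.ElementTree as ET


def build_grouped(rows):
    """Nest <order> loop elements under one <country> group element per key run."""
    root = ET.Element("orders")
    # groupby() starts a new group every time the key changes,
    # so equal keys must be adjacent to end up in the same group.
    for country, group_rows in groupby(rows, key=itemgetter("country")):
        group = ET.SubElement(root, "country", name=country)
        for row in group_rows:
            ET.SubElement(group, "order").text = str(row["order_id"])
    return root


rows = [
    {"country": "FR", "order_id": 1},
    {"country": "US", "order_id": 2},
    {"country": "FR", "order_id": 3},
]
unsorted_doc = build_grouped(rows)
sorted_doc = build_grouped(sorted(rows, key=itemgetter("country")))
```

With the unsorted rows, `FR` appears as two separate groups; after sorting on the group key, each country appears once with all of its orders.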

To define an element as group element:

  1. Select the relevant element on the XML tree.

  2. Right-click to the left of the element name to display the contextual menu.

  3. Select Set as Group Element.

The Node Status column shows the newly added status, and any other required group statuses are defined automatically.

Once the mapping is complete, click OK to validate the definition for this source, then perform the same operation for the other input flow sources.