tAdvancedFileOutputXML properties - 6.3

Talend Components Reference Guide

Version: 6.3

Products: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Open Studio for Big Data, Talend Open Studio for Data Integration, Talend Open Studio for Data Quality, Talend Open Studio for ESB, Talend Open Studio for MDM, Talend Real-Time Big Data Platform

Task: Data Governance, Data Quality and Preparation, Design and Development

Platform: Talend Studio

Component family

XML or File/Output

 

Basic settings

Property type

Either Built-in or Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

 

 

Built-in: No property data stored centrally.

 

 

Repository: Select the Repository file where Properties are stored. The following fields are pre-filled in using fetched data.

Use Output Stream

Select this check box to process the data flow of interest. Once you have selected it, the Output Stream field is displayed and you can type in the data flow of interest.

The data flow to be processed must be added to the flow in order for this component to fetch these data via the corresponding representative variable.

This variable could be already pre-defined in your Studio or provided by the context or the components you are using along with this component; otherwise, you could define it manually and use it according to the design of your Job, for example, using tJava or tJavaFlex.

To avoid typing the variable by hand, you can select it from the auto-completion list (Ctrl+Space) to fill the current field, provided that the variable has been properly defined.

For further information about how to use a stream, see Scenario 2: Reading data from a remote file in streaming mode.
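
As a rough illustration of defining such a variable manually, the sketch below mimics what a tJava component placed before this component might do: it opens a FileOutputStream and registers it under a key, which the Output Stream field could then reference with an expression such as (java.io.OutputStream)globalMap.get("out_file"). The key out_file, the file name and the local globalMap stand-in are assumptions made for this example; in a real Job, globalMap is provided by the Studio.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

class OutputStreamSketch {
    // Stand-in for the globalMap available inside a Job (assumption for this sketch).
    static final Map<String, Object> globalMap = new HashMap<>();

    // What a tJava component might do: create the folder, open the stream, register it.
    static void registerStream(Path target) throws IOException {
        Files.createDirectories(target.getParent());
        globalMap.put("out_file", new FileOutputStream(target.toFile(), false));
    }

    // What the Output Stream field would resolve to: the registered stream.
    static OutputStream lookupStream() {
        return (OutputStream) globalMap.get("out_file");
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical target file in a temporary folder.
        Path target = Files.createTempDirectory("xml_out").resolve("customers.xml");
        registerStream(target);
        try (OutputStream out = lookupStream()) {
            out.write("<customers/>".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

The stream must be opened before this component starts writing, so the component that defines it is placed earlier in the Job.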

 

File name

Name or path to the output file and/or the variable to be used.

This field becomes unavailable once you have selected the Use Output Stream check box.

For further information about how to define and use a variable in a Job, see Talend Studio User Guide.

 

Configure XML tree

Opens the dedicated interface to help you set the XML mapping. For details about the interface, see Defining the XML tree.

 

Schema and Edit Schema

A schema is a row description. It defines the number of fields that will be processed and passed on to the next component. The schema is either built-in or stored remotely in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

 

 

Built-in: The schema will be created and stored locally for this component only. Related topic: see Talend Studio User Guide.

 

 

Repository: The schema already exists and is stored in the Repository, hence can be reused in various projects and job designs. Related topic: see Talend Studio User Guide.

 

Sync columns

Click to synchronize the output file schema with the input file schema. The Sync function only displays once the Row connection is linked with the Output component.

 

Append the source xml file

Select this check box to add the new lines at the end of your source XML file.

 

Generate compact file

Select this check box to generate a file that does not contain any empty spaces or line separators. All elements are then written on a single line, which considerably reduces the file size.
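
To picture the difference between a compact and an indented file, the following sketch serializes the same small document both ways using the JDK's built-in XML APIs. It is not the component's own implementation, and the element names customers, customer and name are sample values chosen for this example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class CompactXmlSketch {
    // Builds a tiny sample document: <customers><customer><name>Alice</name></customer></customers>
    static Document sampleDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("customers");
        Element customer = doc.createElement("customer");
        Element name = doc.createElement("name");
        name.setTextContent("Alice");
        customer.appendChild(name);
        root.appendChild(customer);
        doc.appendChild(root);
        return doc;
    }

    // Serializes the document on one line (compact) or with line breaks (indented).
    static String serialize(Document doc, boolean compact) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty(OutputKeys.INDENT, compact ? "no" : "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(sampleDocument(), true));
        System.out.println(serialize(sampleDocument(), false));
    }
}
```

The compact form carries exactly the same data; only the whitespace between elements is dropped.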

 

Include DTD or XSL

Select this check box to add the DOCTYPE declaration, indicating the root element, the access path and the DTD file, or to add the processing instruction, indicating the type of stylesheet used (such as XSL types), along with the access path and file name.

Advanced settings

Split output in several files

If the output XML file is large, you can split it into several files, each holding a given number of rows.
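
The arithmetic behind such a split can be sketched as follows: with a fixed number of rows per file, the zero-based row index determines which file a row lands in. The naming scheme used here (base name followed by the file index) is hypothetical and only serves the example; the names actually generated by the component may differ.

```java
class SplitSketch {
    // Returns the file that would receive the row at the given zero-based index.
    // Naming scheme (baseName + file index + ".xml") is an assumption for this sketch.
    static String fileForRow(String baseName, int rowIndex, int rowsPerFile) {
        if (rowsPerFile <= 0) {
            throw new IllegalArgumentException("rowsPerFile must be positive");
        }
        int fileIndex = rowIndex / rowsPerFile; // integer division groups rows into files
        return baseName + fileIndex + ".xml";
    }

    public static void main(String[] args) {
        // With 1000 rows per file, rows 0..999 share one file and row 1000 starts the next.
        System.out.println(fileForRow("out", 999, 1000));
        System.out.println(fileForRow("out", 1000, 1000));
    }
}
```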

 

Trim data

This check box is activated when you are using the dom4j generation mode. Select this check box to trim the leading and trailing whitespace from the value of an XML element.

 

Create directory only if not exists

This check box is selected by default. It creates a directory to hold the output XML files if required.

 

Create empty element if needed

This check box is selected by default. If no column is associated with an XML node, this option creates an open/close tag in place of the expected tag.

 

Create attribute even if its value is NULL

Select this check box to generate an XML tag attribute for the associated input column whose value is null.

 

Create attribute even if it is unmapped

Select this check box to generate an XML tag attribute for the associated input column that is unmapped.

 

Create associated XSD file

If one of the XML elements is defined as a Namespace element, this option will create the corresponding XSD file.

Note

To use this option, you must select Dom4J as the generation mode.

 

Add Document type as node

Select this check box to add column(s) of the Document type as node(s) instead of escaped string(s) in the output XML file.

This check box appears only when the generation mode is set to Slow and memory-consuming (Dom4j) in the Advanced settings tab.

 

Advanced separator (for number)

Select this check box to change the expected data separator.

Thousands separator: define the thousands separator, between quotation marks.

Decimal separator: define the decimal separator, between quotation marks.
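
One way to picture the effect of these two separators is the standard Java DecimalFormat class, shown below with custom symbols. This is only an illustration of the idea, not the component's internal code; the pattern #,##0.00 and the sample value are assumptions made for this example.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class SeparatorSketch {
    // Formats a number with the given thousands and decimal separators.
    static String format(double value, char thousands, char decimal) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setGroupingSeparator(thousands);
        symbols.setDecimalSeparator(decimal);
        // Pattern chosen for the sketch: groups of three digits, two decimals.
        return new DecimalFormat("#,##0.00", symbols).format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(1234567.891, ',', '.')); // 1,234,567.89
        System.out.println(format(1234567.891, '.', ',')); // 1.234.567,89
    }
}
```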

 

Generation mode

Select the appropriate generation mode according to your memory availability. The available modes are:

  • Slow and memory-consuming (Dom4j)

    Note

    This option allows you to use dom4j to process the XML files of high complexity.

  • Fast with low memory consumption

Once you select Append the source xml file in the Basic settings view, this field disappears because the generation mode is then automatically set to dom4j.
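
The trade-off between the two modes resembles the general difference between building a whole document tree in memory and writing elements one by one as rows arrive. The sketch below shows the streaming style with the JDK's StAX writer as a stand-in; it does not reproduce the component's generated code, and the element names are sample values.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

class StreamingSketch {
    // Writes each row as soon as it is read; no document tree is kept in memory.
    static String write(List<String> names) throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        writer.writeStartElement("customers");
        for (String name : names) {
            writer.writeStartElement("customer");
            writer.writeCharacters(name);
            writer.writeEndElement();
        }
        writer.writeEndElement();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(write(Arrays.asList("Alice", "Bob")));
    }
}
```

A tree-based approach keeps every element available for later modification, which suits complex structures but costs memory in proportion to the size of the output.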

 

Encoding

Select the encoding from the list or select Custom and define it manually. This field is compulsory for DB data handling.

 

Don't generate empty file

Select the check box to avoid the generation of an empty file.

 

tStatCatcher Statistics

Select the check box to collect the log data at a Job level as well as at each component level.

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

NB_LINE: the number of rows processed. This is an After variable and it returns an integer.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
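
As a minimal sketch of reading an After variable in a later component such as tJava, the code below looks up NB_LINE by a key built from the component's unique name. The component name tAdvancedFileOutputXML_1 and the local map standing in for globalMap are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

class GlobalVariableSketch {
    // Key convention assumed here: <component unique name>_<VARIABLE>.
    static int rowsWritten(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_NB_LINE");
        // The variable is only set after the component has run, so guard against null.
        return value == null ? 0 : (Integer) value;
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap provided inside a Job.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tAdvancedFileOutputXML_1_NB_LINE", 42);
        System.out.println(rowsWritten(globalMap, "tAdvancedFileOutputXML_1"));
    }
}
```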

Usage

Use this component to write an XML file with data passed on from other components using a Row link.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

n/a

Defining the XML tree

Double-click on the tAdvancedFileOutputXML component to open the dedicated interface or click on the three-dot button on the Basic settings vertical tab of the Component Settings tab.

To the left of the mapping interface, under Schema List, all of the columns retrieved from the incoming data flow are listed (only if an input flow is connected to the tAdvancedFileOutputXML component).

To the right of the interface, define the XML structure you want to obtain as output.

You can easily import the XML structure or create it manually, then map the input schema columns onto each corresponding element of the XML tree.

Importing the XML tree

The easiest and most common way to fill out the XML tree panel is to import a well-formed XML file.

  1. Rename the root tag that displays by default on the XML tree panel, by clicking on it once.

  2. Right-click on the root tag to display the contextual menu.

  3. On the menu, select Import XML tree.

  4. Browse to the file to import and click OK.

    Note

    • You can import an XML tree from files in XML, XSD and DTD formats.

    • When importing an XML tree structure from an XSD file, you can choose an element as the root of your XML tree.

The XML Tree column is hence automatically filled out with the correct elements. You can remove and insert elements or sub-elements from and to the tree:

  1. Select the relevant element of the tree.

  2. Right-click to display the contextual menu.

  3. Select Delete to remove the selection from the tree or select the relevant option among: Add sub-element, Add attribute, Add namespace to enrich the tree.

Creating the XML tree manually

If you do not have an XML structure defined yet, you can create it manually.

  1. Rename the root tag that displays by default on the XML tree panel, by clicking on it once.

  2. Right-click on the root tag to display the contextual menu.

  3. On the menu, select Add sub-element to create the first element of the structure.

You can also add an attribute or a child element to any element of the tree or remove any element from the tree.

  1. Select the relevant element on the tree you just created.

  2. Right-click to the left of the element name to display the contextual menu.

  3. On the menu, select the relevant option among: Add sub-element, Add attribute, Add namespace or Delete.

Mapping XML data

Once your XML tree is ready, you can map each input column with the relevant XML tree element or sub-element to fill out the Related Column:

  1. Click on one of the Schema column names.

  2. Drag it onto the relevant sub-element to the right.

  3. Release to implement the actual mapping.

A light blue link is displayed to illustrate this mapping. If available, use the Auto-Map button, located at the bottom left of the interface, to carry out this operation automatically.

You can disconnect any mapping on any element of the XML tree:

  1. Select the element of the XML tree that should be disconnected from its respective schema column.

  2. Right-click to the left of the element name to display the contextual menu.

  3. Select Disconnect linker.

The light blue link disappears.

Defining the node status

Defining the XML tree and mapping the data is not sufficient. You also need to define the loop element and, if required, the group element.

Loop element

The loop element allows you to define the iterating object. Generally, the loop element is also the row generator.

To define an element as loop element:

  1. Select the relevant element on the XML tree.

  2. Right-click to the left of the element name to display the contextual menu.

  3. Select Set as Loop Element.

The Node Status column shows the newly added status.

Note

There can only be one loop element at a time.

Group element

The group element is optional. It represents a constant element on which the group-by operation can be performed. A group element can be defined only if a loop element has been defined beforehand.

When using a group element, the rows should be sorted so that they can be grouped by the selected node.

To define an element as group element:

  1. Select the relevant element on the XML tree.

  2. Right-click to the left of the element name to display the contextual menu.

  3. Select Set as Group Element.

The Node Status column shows the newly added status, and any required group statuses are automatically defined, if needed.
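
To see why sorted rows matter when a group element is used, consider the simplified sketch below. It treats a country element as the group element and a customer element as the loop element, and opens a new group whenever the group value changes from one row to the next. The element names, the sample rows and the rendering logic are assumptions made for this example and do not reproduce the component's generated code.

```java
import java.util.Arrays;
import java.util.List;

class GroupLoopSketch {
    // Each row is {country, customer}. A new <country> group starts whenever the
    // country differs from the previous row, so unsorted input repeats groups.
    // Values are assumed not to need XML escaping in this sketch.
    static String render(List<String[]> rows) {
        StringBuilder xml = new StringBuilder("<customers>");
        String currentGroup = null;
        for (String[] row : rows) {
            String country = row[0];
            String customer = row[1];
            if (!country.equals(currentGroup)) {
                if (currentGroup != null) {
                    xml.append("</country>"); // close the previous group
                }
                xml.append("<country name=\"").append(country).append("\">");
                currentGroup = country;
            }
            // One loop element per input row.
            xml.append("<customer>").append(customer).append("</customer>");
        }
        if (currentGroup != null) {
            xml.append("</country>");
        }
        return xml.append("</customers>").toString();
    }

    public static void main(String[] args) {
        List<String[]> sorted = Arrays.asList(
            new String[] {"FR", "Alice"},
            new String[] {"FR", "Bob"},
            new String[] {"US", "Carol"});
        System.out.println(render(sorted));
    }
}
```

With the rows sorted by country, each country appears once and contains all of its customers. If the same rows arrive unsorted, the FR group is opened twice, which is usually not the intended structure.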

Click OK once the mapping is complete to validate the definition and continue the Job configuration where needed.