Using the document type to create the XML tree

Talend ESB Studio User Guide

EnrichVersion: 6.4
EnrichProdName: Talend ESB
task: Design and Development
EnrichPlatform: Talend Studio

The Document data type makes defining an XML structure as easy as possible. Use this type when you need an XML tree structure to map the input flow, the output flow, or both. You can then import the XML tree structure from various XML sources and edit the tree directly in the mapping editor, which saves manual effort.

How to set up the Document type

The Document data type is one of the data types provided by Talend. This Document type is set up when you edit the schema for the corresponding data in the Schema editor. For further information about the schema editor, see Using the Schema Editor.

The following figure presents an example in which the input flow, Customer, is set up as the Document type. To replicate it, in the Map editor, you can simply click the [+] button to add one row on the input side of the Schema editor, rename it and select Document from the drop-down list of the given data types.

In most cases, tXMLMap retrieves the schema from its preceding or succeeding components, for example from a tFileInputXML component or, in the ESB use case, from a tESBProviderRequest component. This saves much of the manual effort of setting up the Document type for the XML flow to be processed. However, to further modify the XML structure that forms the content of a Document row, you still need to use the Map editor.

Note

Be aware that a Document flow carries a user-defined XML tree and is no more than a single field of a schema. Like any other schema, this schema may contain fields of different data types. For further information about how to set a schema, see Basic Settings tab.

Once the Document type is set up for a row of data, a basic XML tree structure is created automatically in the corresponding data flow table of the Map editor. This basic structure represents the minimum elements required by a valid XML tree in tXMLMap:

  • The root element: it is the minimum element required for an XML tree to be processed and, when necessary, the foundation on which to develop a more sophisticated XML tree.

  • The loop element: it determines the element over which the iteration takes place to read the hierarchical data of an XML tree. By default, the root element is set as loop element.

This figure gives an example with the input flow, Customer. Starting from this generated XML root, which is tagged root by default, you can develop the XML tree structure of interest.

To do this, you need to:

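  1. Import the custom XML tree structure from one of the following types of sources:

    • XML or XSD files (see How to import the XML tree structure from XML and XSD files)

    • the Repository (see How to import the XML tree structure from the Repository)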

    Note

    If necessary, you can develop the XML tree of interest manually using the options provided on the contextual menu.

  2. Reset the loop element for the XML tree you are creating, if necessary. You can set as many loops as you need. At this step, you may have to consider the following situations:

    • If you have to create several XML trees, you need to define the loop element for each of them.

    • If you import the XML tree from the Repository, the loop element is already set according to the source structure. You can still reset the loop element.

      For further details, see How to set or reset a loop element for an imported XML structure.

If needed, you can continue to modify the imported XML tree using the options provided in the contextual menu. The following table presents the operations you can perform through the available options.

Options

Operations

Create Sub-element and Create Attribute

Add elements or attributes to develop an XML tree. Related topic: How to add a sub-element or an attribute to an XML tree structure

Set a namespace

Add and manage given namespaces on the imported XML tree. Related topic: How to manage a namespace

Delete

Delete an element or an attribute. Related topic: How to delete an element or an attribute from the XML tree structure

Rename

Rename an element or an attribute.

As loop element

Set or reset an element as loop element. Multiple loop elements and optional loop elements are supported.

As optional loop

This option is available only for an element you have already defined as loop element.

When the corresponding element exists in the source file, an optional loop element works the same way as a normal loop element. Otherwise, its parent element is automatically reset as loop element or, if the parent element is also absent from the source file, the next higher-level element is used, up to the root element. In practice, however, when the XML tree and the source file structure differ in this way, we recommend adapting the XML tree to the source file for better performance.

As group element

On the XML tree of the output side, set an element as group element. Related topic: How to group the output data

As aggregate element

On the XML tree of the output side, set an element as aggregate element. Related topic: How to aggregate the output data

Add Choice

Set the Choice element. All child elements developed underneath it are then contained in this declaration. The Choice element originates from the XSD concept of the same name and enables tXMLMap to perform the function of the XSD Choice element when reading or writing a Document flow.

When tXMLMap processes a Choice element, the elements contained in its declaration are not output unless their mapping expressions are appropriately defined.

Note

The tXMLMap component declares automatically any Choice element set in the XSD file it imports.

Set as Substitution

Set the Substitution element to specify the element that can be substituted for a given head element defined in the corresponding XSD. The Substitution element enables tXMLMap to perform the function of the XSD Substitution element when reading or writing a Document flow.

When tXMLMap processes a Substitution element, the elements contained in its declaration are not output unless their mapping expressions are appropriately defined.

Note

The tXMLMap component declares automatically any Substitution element set in the XSD file it imports.
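The fallback behavior described for the As optional loop option can be pictured with the following Python sketch. It is a simplified model of that description only, not the component's actual algorithm; the function name `effective_loop` and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def effective_loop(root, loop_path):
    """Return the deepest prefix of loop_path that exists in the source.

    loop_path lists the tags below the root, e.g. ["Customer", "Order"].
    An empty list means the root element itself is used as loop element.
    """
    path = list(loop_path)
    while path:
        if root.find("/".join(path)) is not None:
            return path
        path.pop()  # element missing in the source: fall back to its parent
    return []

# Hypothetical source file content without any <Order> element.
source = ET.fromstring(
    "<Customers><Customer><Name>Ann</Name></Customer></Customers>"
)
print(effective_loop(source, ["Customer", "Order"]))
print(effective_loop(source, ["Client", "Order"]))
```

In this model, a missing Order element moves the loop up to Customer, and a tree whose intermediate elements are all absent ends up looping on the root.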

The following sections present more details about the process of creating the XML tree.

How to import the XML tree structure from XML and XSD files

To import the XML tree structure from an XML file, proceed as follows:

  1. In the input flow table of interest, right-click the column name to open the contextual menu. In this example, it is Customer.

  2. From this menu, select Import From File.

  3. In the pop-up dialog box, browse to the XML file you need to use to provide the XML tree structure of interest and double-click the file.
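To picture the kind of structure that can be derived from a sample XML file, the following Python sketch walks a document and lists the distinct element and attribute paths it contains. It uses only the standard library and is an illustration, not a description of how tXMLMap performs the import; the sample content and the function name `tree_paths` are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample file content; the element names are illustrative only.
SAMPLE_XML = """
<Customers>
  <Customer id="1">
    <Name>Ann</Name>
    <Address><City>Paris</City></Address>
  </Customer>
  <Customer id="2">
    <Name>Bob</Name>
  </Customer>
</Customers>
"""

def tree_paths(element, prefix=""):
    """Return the distinct element and attribute paths, in first-seen order."""
    path = f"{prefix}/{element.tag}"
    paths = [path]
    paths += [f"{path}/@{name}" for name in element.attrib]
    for child in element:
        for p in tree_paths(child, path):
            if p not in paths:  # repeated elements contribute one entry only
                paths.append(p)
    return paths

structure = tree_paths(ET.fromstring(SAMPLE_XML))
for p in structure:
    print(p)
```

Repeated elements such as Customer appear once in the result, which is why a loop element still has to be chosen afterwards to say which element repeats.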

To import the XML tree structure from an XSD file, proceed as follows:

  1. In the input flow table of interest, right-click the column name to open the contextual menu. In this example, it is Customer.

  2. From this menu, select Import From File.

  3. In the pop-up dialog box, browse to the XSD file you need to use to provide the XML tree structure of interest and double-click the file.

  4. In the dialog box that appears, select an element from the Root list as the root of your XML tree, and click OK. The XML tree described by the imported XSD file is then created.

    Note

    The root of the imported XML tree is adaptable:

    • When importing either an input or an output XML tree structure from an XSD file, you can choose an element as the root of your XML tree.

    • Once an XML structure is imported, the root tag is renamed automatically with the name of the XML source. To change this root name manually, you need to use the tree schema editor. For further information about this editor, see Editing the XML tree schema.

Then, you need to define the loop element in this XML tree structure. For further information about how to define a loop element, see How to set or reset a loop element for an imported XML structure.
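As a rough illustration of why an XSD import asks for a root, the Python sketch below lists the top-level element declarations of a schema, any of which could serve as a document root. It assumes that the Root list corresponds to these global declarations, which the text above does not state; the schema content and the function name `global_elements` are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema; element names are illustrative only.
SAMPLE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Customers">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Customer" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Order" type="xs:string"/>
</xs:schema>
"""

def global_elements(xsd_text):
    """List the names of the top-level (global) element declarations."""
    schema = ET.fromstring(xsd_text)
    # findall with a plain tag only matches direct children of <xs:schema>.
    return [el.get("name") for el in schema.findall(f"{XS}element")]

candidates = global_elements(SAMPLE_XSD)
print(candidates)
```

The nested Customer declaration is not listed because it is local to Customers, unlike Customers and Order, which are declared at the top level of the schema.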

How to import the XML tree structure from the Repository

To do this, proceed as follows:

  1. In any input flow table, right-click the column name to open the contextual menu. In this example, it is Customer.

  2. From this menu, select Import From Repository.

  3. In the pop-up repository content list, select the XML connection or the MDM connection of interest to import the corresponding XML tree structure.

    This figure presents an example of this Repository-stored XML connection.

    Note

    To import an XML tree structure from the Repository, the corresponding XML connection should have been created. For further information about how to create a file XML connection in the Repository, see Centralizing XML file metadata.

  4. Click OK to validate this selection.

The XML tree structure is created and a loop element is defined automatically, because this loop was already defined when the Repository-stored XML connection was created.

How to set or reset a loop element for an imported XML structure

You need to set at least one loop element for each XML tree that does not have one. If the tree already has one, you can reset the existing loop element when necessary.

Whether you need to set or reset a loop element, proceed as follows:

  1. In the created XML tree structure, right-click the element you need to define as loop. For example, you need to define the Customer element as loop in the following figure.

  2. From the pop-up contextual menu, select As loop element to define the selected element as loop.

    Once done, this selected element is marked with the text: loop.

Note

If you close the Map Editor without having set the required loop element for a given XML tree, its root element will be set automatically as loop element.
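The effect of choosing a loop element can be sketched in Python as follows. The sketch only illustrates the idea of iterating over one element of the tree and is not tXMLMap's implementation; the sample document and the function name `rows_for_loop` are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element names are illustrative only.
DOC = """
<Customers>
  <Customer><Name>Ann</Name><City>Paris</City></Customer>
  <Customer><Name>Bob</Name><City>Rome</City></Customer>
</Customers>
"""

def rows_for_loop(xml_text, loop_path):
    """Return one row (dict of child tag -> text) per occurrence of the loop element."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: child.text for child in occurrence}
        for occurrence in root.findall(loop_path)
    ]

# Customer as loop element: one row per customer.
customer_rows = rows_for_loop(DOC, "Customer")
# Root as loop element ("." selects the root itself): a single iteration.
root_rows = rows_for_loop(DOC, ".")
print(len(customer_rows), len(root_rows))
```

With Customer as loop element the document yields one row per customer, whereas looping on the root produces a single iteration for the whole document.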

How to add a sub-element or an attribute to an XML tree structure

In the XML tree structure view, you can manually add a sub-element or an attribute to the root or to any of the existing elements when necessary.

To do either of these operations, proceed as follows:

  1. In the XML tree you need to edit, right-click the element to which you need to add a sub-element or an attribute underneath and select Create Sub-Element or Create Attribute according to your purpose.

  2. In the pop-up [Create New Element] wizard, type in the name you need to use for the added sub-element or attribute.

  3. Click OK to validate this creation. The new sub-element or attribute displays in the XML tree structure you are editing.
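For readers who think in terms of the resulting XML, the steps above correspond roughly to the following Python sketch, which builds a small tree with the standard library. The element and attribute names are hypothetical, and the code only mirrors the outcome of the menu actions, not the Studio itself.

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")                   # default root created for a Document row
customer = ET.SubElement(root, "Customer")  # comparable to Create Sub-Element on root
customer.set("id", "")                      # comparable to Create Attribute on Customer
ET.SubElement(customer, "Name")             # a further sub-element under Customer

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The attribute is attached to the element that was right-clicked, and each sub-element becomes a child of that element.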

How to delete an element or an attribute from the XML tree structure

From an established XML tree, you may need to delete an element or an attribute. To do this, proceed as follows:

  1. In the XML tree you need to edit, right-click the element or the attribute you need to delete.

  2. In the pop-up contextual menu, select Delete.

    Then the selected element or attribute is deleted, including all of the sub-elements or the attributes attached to it underneath.
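The cascading effect of a deletion can be illustrated with this short Python sketch; it uses a hypothetical Customer tree and the standard library, and is not meant to describe the Studio's internals.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree; element names are illustrative only.
root = ET.fromstring(
    "<Customers><Customer id='1'><Name>Ann</Name>"
    "<Address><City>Paris</City></Address></Customer></Customers>"
)
customer = root.find("Customer")
customer.remove(customer.find("Address"))  # removes Address together with City
del customer.attrib["id"]                  # removes the id attribute

remaining = [el.tag for el in root.iter()]
print(remaining)
```

Removing Address also removes City, because a child element cannot exist without its parent.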

How to manage a namespace

When necessary, you can set and edit a namespace for each element in a created XML tree of the input or the output data flow.

Defining a namespace

To do this, proceed as follows:

  1. In the XML tree of the input or the output data flow you need to edit, right-click the element for which you need to declare a namespace. For example, in a Customer XML tree of the output flow, you need to set a namespace for the root.

  2. In the pop-up contextual menu, select Set a namespace. Then the [Namespace dialog] wizard displays.

  3. In this wizard, type in the URI you need to use.

  4. If you need to set a prefix for this namespace you are editing, select the Prefix check box in this wizard and type in the prefix you need. In this example, we select it and type in xhtml.

  5. Click OK to validate this declaration.
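The following Python sketch shows what a namespace with the xhtml prefix looks like once it is declared on a root element. The URI used here is an assumption (the steps above give only the prefix), and the code illustrates the resulting XML rather than the wizard.

```python
import xml.etree.ElementTree as ET

# Assumed URI for illustration; the steps above only specify the prefix xhtml.
URI = "http://www.w3.org/1999/xhtml"
ET.register_namespace("xhtml", URI)  # bind the prefix to the URI for serialization

root = ET.Element(f"{{{URI}}}root")  # root element placed in the namespace
ET.SubElement(root, "Customer")      # child left without a namespace

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The namespace declaration appears on the element it was set for, and that element is then written with the chosen prefix.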

Modifying the default value of a namespace

To do this, proceed as follows:

  1. In the XML tree that the namespace you need to edit belongs to, right-click this namespace to open the contextual menu.

  2. In this menu, select Change Namespace to open the corresponding wizard.

  3. Type in the new default value you need in this wizard.

  4. Click OK to validate this modification.

Deleting a namespace

To do this, proceed as follows:

  1. In the XML tree that the namespace you need to edit belongs to, right-click this namespace to open the contextual menu.

  2. In this menu, click Delete to validate this deletion.

How to group the output data

The tXMLMap component uses a group element to group the output data according to a given grouping condition. This allows you to wrap elements matching the same condition with this group element.

To set a group element, two restrictions must be respected:

  1. the root node cannot be set as group element;

  2. the group element must be the parent of the loop element.

Note

The option of setting group element is not visible until you have set the loop element; this option is also invisible if an element is not allowed to be set as group element.

Once the group element is set, all of its sub-elements except the loop one are used as conditions to group the output data.

Design the XML tree carefully to make the best use of a given group element. For further information about how to use a group element, see tXMLMap at https://help.talend.com.

Note

tXMLMap provides the group element and the aggregate element to classify data in the XML tree structure. When handling one row of data (one complete XML flow), the behavioral difference between them is:

  • The group element always processes the data within one single flow.

  • The aggregate element splits this flow into separate and complete XML flows.
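To make the grouping idea concrete, the Python sketch below wraps customers that share the same state in one group element inside a single document. The row data, the element names and the function name `build_grouped` are hypothetical. The sketch also merges rows with the same key regardless of their order, which is a simplification and may not match the component's behavior with unsorted input.

```python
import xml.etree.ElementTree as ET

# Hypothetical input rows: (state, customer name).
ROWS = [("CA", "Ann"), ("NY", "Bob"), ("CA", "Cid")]

def build_grouped(rows):
    """Wrap customers sharing the same state in one <State> group element."""
    root = ET.Element("Customers")
    groups = {}
    for state, name in rows:
        group = groups.get(state)
        if group is None:
            group = ET.SubElement(root, "State")       # group element (not the root)
            ET.SubElement(group, "Code").text = state  # sub-element used as condition
            groups[state] = group
        customer = ET.SubElement(group, "Customer")    # loop element, child of the group
        ET.SubElement(customer, "Name").text = name
    return root

doc = build_grouped(ROWS)
print(ET.tostring(doc, encoding="unicode"))
```

Both restrictions listed earlier are visible here: the group element is not the root, and it is the parent of the loop element.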

Setting a group element

To set a group element, proceed as follows:

  1. In the XML tree view on the output side of the Map editor, right-click the element you need to set as group element.

  2. From the opened contextual menu, select As group element.

    The selected element then becomes the group element. The following figure presents an example of an XML tree with the group element.

Revoking a defined group element

To revoke a defined group element, proceed as follows:

  1. In the XML tree view on the output side of the Map editor, right-click the element you have defined as group element.

  2. From the opened contextual menu, select Remove group element.

    Then the defined group element is revoked.

How to aggregate the output data

With tXMLMap, you can define as many aggregate elements as required in the output XML tree to classify the XML data accordingly. The component then outputs these classes, each as one complete XML flow.

  1. To define an element as aggregate element, simply right-click this element of interest in the XML tree view on the output side of the Map editor and from the contextual menu, select As aggregate element.

    Then this element becomes the aggregate element and the text aggregate is added to it in red. The following figure presents an example.

  2. To revoke the definition of the aggregate element, simply right-click the defined aggregate element and from the contextual menu, select Remove aggregate element.

Note

To define an element as aggregate element, ensure that this element has no child element and that the All in one feature is disabled. The As aggregate element option is not available in the contextual menu until both conditions are met. For further information about the All in one feature, see How to output elements into one document.

For an example about how to use the aggregate element with tXMLMap, see the tXMLMap documentation at https://help.talend.com.

Note

tXMLMap provides the group element and the aggregate element to classify data in the XML tree structure. When handling one row of data (one complete XML flow), the behavioral difference between them is:

  • The group element always processes the data within one single flow.

  • The aggregate element splits this flow into separate and complete XML flows.
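The splitting behavior of an aggregate element can be sketched in Python as follows: rows are classified by the value of a leaf element, and each class is written as its own complete XML document. The row data, the element names and the function name `build_aggregated` are hypothetical, and the sketch is a simplified model rather than the component's implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical input rows: (state, customer name).
ROWS = [("CA", "Ann"), ("NY", "Bob"), ("CA", "Cid")]

def build_aggregated(rows):
    """Return one complete XML document (as a string) per distinct state."""
    docs = {}
    for state, name in rows:
        root = docs.get(state)
        if root is None:
            root = ET.Element("Customers")
            ET.SubElement(root, "State").text = state  # aggregate element: no children
            docs[state] = root
        customer = ET.SubElement(root, "Customer")
        ET.SubElement(customer, "Name").text = name
    return [ET.tostring(root, encoding="unicode") for root in docs.values()]

documents = build_aggregated(ROWS)
for text in documents:
    print(text)
```

Compared with a group element, which keeps every class inside one document, this model produces one separate document per class.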