Setting up the Document type - Cloud - 8.0

Talend Studio User Guide

Version: Cloud 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development
Last publication date: 2024-02-29

About this task

The Document data type is one of the data types provided by Talend. You set it up when you edit the schema for the corresponding data in the Schema editor. For further information about the Schema editor, see Using the Schema Editor.

The following figure presents an example in which the input flow, Customer, is set up as the Document type. To replicate it, in the Map editor, click the [+] button to add a row on the input side of the Schema editor, rename the row, and select Document from the drop-down list of data types.

Document type from the Schema editor.

In most cases, tXMLMap retrieves the schema from its preceding or succeeding components, for example from a tFileInputXML component or, in an ESB use case, from a tESBProviderRequest component. This spares you much of the manual effort of setting up the Document type for the XML flow to be processed. However, to further modify the XML structure held in a Document row, you still need to use the Map editor.

Note: A Document flow carries a user-defined XML tree but occupies only a single field of a schema. As in any other schema, the fields of that schema may each have a different data type. For further information about how to set a schema, see Basic settings tab.
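
To make the note above concrete, the following minimal Java sketch models a row in which one column holds an XML tree while the other columns hold ordinary values. It is an illustration only: the class CustomerRowSketch, its fields, and the use of org.w3c.dom.Document as a stand-in for the Document type are assumptions made for this example, not code generated by Talend Studio.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Illustrative only: a hypothetical row type, not a class generated by Talend Studio.
final class CustomerRowSketch {
    final int id;            // ordinary integer column
    final String region;     // ordinary string column
    final Document customer; // the single Document-typed column, holding a whole XML tree

    CustomerRowSketch(int id, String region, Document customer) {
        this.id = id;
        this.region = region;
        this.customer = customer;
    }

    // Parses an XML string into a DOM tree (stand-in for a Document value).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    public static void main(String[] args) {
        CustomerRowSketch row =
            new CustomerRowSketch(1, "EMEA", parse("<root><name>Ann</name></root>"));
        // The XML tree is just one field of the row, next to id and region.
        System.out.println(row.customer.getDocumentElement().getNodeName()); // prints "root"
    }
}
```
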
Once the Document type is set up for a row of data, a basic XML tree structure is created automatically in the corresponding data flow table of the Map editor. This basic structure contains the minimum elements that tXMLMap requires for a valid XML tree:
  • The root element: the minimum element an XML tree requires in order to be processed and, when needed, the foundation on which to build a more sophisticated XML tree.

  • The loop element: the element over which iteration takes place when reading the hierarchical data of an XML tree. By default, the root element is set as the loop element.
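
The effect of choosing a loop element can be sketched in plain Java with the standard DOM and XPath APIs. This is a hedged illustration of the iteration idea only, not how tXMLMap is implemented; the class LoopElementSketch, the sample XML, and the element names customer, id, and name are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: mimics how a loop element drives iteration over an XML tree.
final class LoopElementSketch {

    // Hypothetical sample document; the element names are assumptions.
    static final String SAMPLE =
        "<root>"
      + "<customer><id>1</id><name>Ann</name></customer>"
      + "<customer><id>2</id><name>Bob</name></customer>"
      + "</root>";

    // Returns one entry per occurrence of the loop element, that is, one "row" each.
    static List<String> rows(String xml, String loopXPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList hits = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate(loopXPath, doc, XPathConstants.NODESET);
            List<String> rows = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                rows.add(hits.item(i).getTextContent());
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read the sample XML", e);
        }
    }

    public static void main(String[] args) {
        // Loop on the root element (the default): a single iteration.
        System.out.println(rows(SAMPLE, "/root"));          // prints [1Ann2Bob]
        // Loop on customer: one iteration per customer element.
        System.out.println(rows(SAMPLE, "/root/customer")); // prints [1Ann, 2Bob]
    }
}
```

With the loop left on the root element, the whole tree is read in one pass; moving the loop to a repeating element yields one iteration per occurrence.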

The root and loop elements.

This figure shows an example with the input flow, Customer. Starting from the generated XML root element, which is tagged root by default, you can develop the XML tree structure of interest.

To do this, you need to:

Procedure

  1. Import the custom XML tree structure from one of the following types of sources:
    Note: If needed, you can develop the XML tree of interest manually using the options provided in the contextual menu.
  2. Reset the loop element for the XML tree you are creating, if needed. You can set as many loop elements as you need. At this step, consider the following situations:
    • If you create several XML trees, you need to define the loop element for each of them.

    • If you import the XML tree from the Repository, the loop element is already set according to the source structure, but you can still reset it.

      For further details, see Setting or resetting a loop element for an imported XML structure.

  3. Optional: Continue to modify the imported XML tree using the options provided in the contextual menu. The following table presents the operations you can perform through the available options.
    | Options | Operations |
    | --- | --- |
    | Create Sub-element and Create Attribute | Add elements or attributes to develop an XML tree. Related topic: Adding a sub-element or an attribute to an XML tree structure |
    | Set a namespace | Add and manage namespaces on the imported XML tree. Related topic: Managing a namespace |
    | Delete | Delete an element or an attribute. Related topic: Deleting an element or an attribute from the XML tree structure |
    | Rename | Rename an element or an attribute. |
    | As loop element | Set or reset an element as the loop element. Multiple loop elements and optional loop elements are supported. |
    | As optional loop | This option is available only for an element that you have already defined as a loop element. When the corresponding element exists in the source file, an optional loop element works the same way as a normal loop element. Otherwise, its parent element is automatically reset as the loop element or, if the parent element is also absent from the source file, the next higher-level element is used, up to the root element. In practice, when the XML tree and the source file structure differ in this way, we recommend adapting the XML tree to the source file for better performance. |
    | As group element | On the XML tree of the output side, set an element as a group element. Related topic: Grouping the output data |
    | As aggregate element | On the XML tree of the output side, set an element as an aggregate element. Related topic: Aggregating the output data |
    | Add Choice | Set the Choice element. All child elements developed underneath it are contained in this declaration. The Choice element originates from the XSD concept of the same name and enables tXMLMap to perform the function of the XSD Choice element when reading or writing a Document flow. When tXMLMap processes a Choice element, the elements contained in its declaration are not output unless their mapping expressions are appropriately defined. Note: tXMLMap automatically declares any Choice element set in the XSD file it imports. |
    | Set as Substitution | Set the Substitution element to specify the element that can be substituted for a given head element defined in the corresponding XSD. The Substitution element enables tXMLMap to perform the function of the XSD Substitution element when reading or writing a Document flow. When tXMLMap processes a Substitution element, the elements contained in its declaration are not output unless their mapping expressions are appropriately defined. Note: tXMLMap automatically declares any Substitution element set in the XSD file it imports. |

    The following sections present more details about the process of creating the XML tree.
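
The fallback behavior described for the As optional loop option can be sketched as follows. This is one reading of that description expressed in plain Java with the standard DOM and XPath APIs, not the actual tXMLMap implementation; the class OptionalLoopSketch, the method effectiveLoopPath, and the sample element names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: models the optional-loop fallback, not tXMLMap internals.
final class OptionalLoopSketch {

    // Returns the path that would act as the loop element: the configured path
    // if it exists in the source, otherwise the nearest existing ancestor,
    // stopping at the root level.
    static String effectiveLoopPath(String xml, String loopPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            String path = loopPath;
            while (true) {
                NodeList hits =
                    (NodeList) xpath.evaluate(path, doc, XPathConstants.NODESET);
                if (hits.getLength() > 0) {
                    return path; // the element exists: loop here
                }
                int cut = path.lastIndexOf('/');
                if (cut <= 0) {
                    return path; // already at the root level: nothing higher to try
                }
                path = path.substring(0, cut); // fall back to the parent element
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not read the sample XML", e);
        }
    }

    public static void main(String[] args) {
        String configured = "/root/customers/customer";
        // customer is missing from the source, so its parent becomes the loop.
        System.out.println(effectiveLoopPath("<root><customers/></root>", configured));
        // customers is missing too, so the loop moves up to the root element.
        System.out.println(effectiveLoopPath("<root/>", configured));
    }
}
```

Each fallback step is an extra lookup against the source, which is consistent with the recommendation to adapt the XML tree to the source file rather than rely on optional loops.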