Setting up XML metadata for an output file - 6.5

Talend Big Data Studio User Guide

This section describes how to define a file connection and upload an XML schema for an output file. To define and upload an XML schema for an input file, see Setting up XML metadata for an input file.

Defining the general properties

In this step, the general metadata properties such as the Name, Purpose and Description are set.

  1. In the file metadata setup wizard, fill in the Name field, which is mandatory, and the Purpose and Description fields if you choose to do so. The information you provide in the Description field will appear as a tooltip when you move your mouse pointer over the file connection.

    Note

    When you enter the general properties of the metadata to be created, you need to define the type of connection as either input or output. It is therefore advisable to enter information that will help you distinguish between your input and output schemas.

  2. If needed, set the version and status in the Version and Status fields respectively. You can also manage the version and status of a repository item in the [Project Settings] dialog box. For more information, see Version management and Status management respectively.

  3. If needed, click the Select button next to the Path field to select a folder under the File XML node to hold your newly created file connection. Note that you cannot select a folder if you are editing an existing connection, but you can drag and drop it to a new folder whenever you want.

  4. Click Next to select the type of metadata.

Setting the type of metadata (output)

In this step, the type of metadata is set as either input or output. For this procedure, the metadata of interest is output.

  1. From the dialog box, select Output XML.

  2. Click Next to define the output file, either from an XML or XSD file or from scratch.

Defining the output file structure using an existing XML file

In this step, you choose whether to create your file manually or from an existing XML or XSD file. If you choose the Create manually option, you will have to configure your schema, source and target columns yourself at step 4 of the wizard. The file will be created in a Job using an XML output component such as tAdvancedFileOutputXML.

In this procedure, we will create the output file structure by loading an existing XML. To create the output XML structure from an XSD file, see Defining the output file structure using an XSD file.

To create the output XML structure from an XML file, do the following:

  1. Select the Create from a file option.

  2. Click the Browse... button next to the XML or XSD File field, browse to the XML file whose structure is to be applied to the output file, and double-click the file.

    The File Viewer area displays a preview of the XML structure, and the File Content area displays a maximum of the first 50 rows of the file.

  3. Enter the Encoding type in the corresponding field if the system does not detect it automatically.

  4. In the Limit field, enter the number of columns on which the XPath query is to be executed, or enter 0 if you want it to be run against all of the columns.

  5. In the Output File field, in the Output File Path zone, browse to or enter the path to the output file. If the file does not yet exist, it will be created during the execution of a Job using a tAdvancedFileOutputXML component. If the file already exists, it will be overwritten.

  6. Click Next to define the schema.
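
To get a feel for what a structure preview of an XML file represents, here is a minimal sketch that derives a tree outline from a sample document using Python's standard library. It is not how Talend Studio builds its File Viewer preview; the sample content and the `outline` helper are hypothetical and serve only as an illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; any well-formed XML document would do.
SAMPLE = """<customers>
  <customer id="1">
    <name>Ann</name>
    <city>Paris</city>
  </customer>
  <customer id="2">
    <name>Bob</name>
    <city>Lyon</city>
  </customer>
</customers>"""


def outline(element, depth=0, lines=None):
    """Return one line per distinct element or attribute, indented by depth."""
    if lines is None:
        lines = []
    lines.append("  " * depth + element.tag)
    for attr in element.attrib:
        lines.append("  " * (depth + 1) + "@" + attr)
    seen = set()
    for child in element:
        # Repeated siblings (such as each <customer>) are listed only once.
        if child.tag not in seen:
            seen.add(child.tag)
            outline(child, depth + 1, lines)
    return lines


structure = outline(ET.fromstring(SAMPLE))
print("\n".join(structure))
```

The repeated `customer` element appears once in the outline, which is the kind of element you would typically consider as a loop element in the next step of the wizard.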

Defining the output file structure using an XSD file

This procedure describes how to define the output XML file structure from an XSD file. To define the XML structure from an XML file, see Defining the output file structure using an existing XML file.

Note

When loading an XSD file,

  • the data will be saved in the Repository, and therefore the metadata will not be affected by the deletion or displacement of the file.

  • you can choose an element as the root of your XML tree.

To create the output XML structure from an XSD file, do the following:

  1. Select the Create from a file option.

  2. Click the Browse... button next to the XML or XSD File field, browse to the XSD file whose structure is to be applied to the output file, and double-click the file.

  3. In the dialog box that appears, select an element from the Root list as the root of your XML tree, and click OK.

    The File Viewer area displays a preview of the XML structure, and the File Content area displays a maximum of the first 50 rows of the file.

  4. Enter the Encoding type in the corresponding field if the system does not detect it automatically.

  5. In the Limit field, enter the number of columns on which the XPath query is to be executed, or enter 0 if you want it to be run against all of the columns.

  6. In the Output File field, in the Output File Path zone, browse to or enter the path to the output file. If the file does not yet exist, it will be created during the execution of a Job using a tAdvancedFileOutputXML component. If the file already exists, it will be overwritten.

  7. Click Next to define the schema.
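
An XSD file can declare several top-level elements, which is why a root has to be chosen. The following sketch lists the global element declarations of a small XSD using Python's standard library. The sample schema and the `candidate_roots` helper are hypothetical, and the assumption that the Root list corresponds to these global declarations is ours, not something the wizard documents.

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema vocabulary, in ElementTree's {uri}tag notation.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema with two global element declarations.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="customers">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="customer" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="orders" type="xs:string"/>
</xs:schema>"""


def candidate_roots(xsd_text):
    """Return the names of elements declared directly under <xs:schema>."""
    schema = ET.fromstring(xsd_text)
    # findall with a plain tag matches direct children only, so the nested
    # <xs:element name="customer"> is not reported.
    return [el.get("name") for el in schema.findall(XS + "element")]


roots = candidate_roots(SAMPLE_XSD)
print(roots)
```

In this sample, `customers` and `orders` are both possible roots, while the nested `customer` element is not.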

Defining the schema

Upon completion of the previous operations, the columns in the Linker Source area are automatically mapped to the corresponding ones in the Linker Target area, as indicated by blue arrow links.

In this step, you need to define the output schema. The following table describes how:

To...

Perform the following...

Create a schema from scratch or edit the source schema columns to pass to the output schema

In the Linker Source area, click the Schema Management button to open the schema editor.

Define a loop element

In the Linker Target area, right-click the element of interest and select Set As Loop Element from the contextual menu.

Note

Defining an element to run a loop on is mandatory.

Define a group element

In the Linker Target area, right-click the element of interest and select Set As Group Element from the contextual menu.

Note

You can set a parent element of the loop element as a group element on the condition that the parent element is not the root of the XML tree.

Create a child element for an element

In the Linker Target area, do either of the following:

  • Right-click the element of interest and select Add Sub-element from the contextual menu, enter a name for the sub-element in the dialog box that appears, and click OK.

  • Select the element of interest, click the [+] button at the bottom, select Create as sub-element in the dialog box that appears, and click OK. Then, enter a name for the sub-element in the next dialog box and click OK.

Create an attribute for an element

In the Linker Target area, do either of the following:

  • Right-click the element of interest and select Add Attribute from the contextual menu, enter a name for the attribute in the dialog box that appears, and click OK.

  • Select the element of interest, click the [+] button at the bottom, select Create as attribute in the dialog box that appears, and click OK. Then, enter a name for the attribute in the next dialog box and click OK.

Create a name space for an element

In the Linker Target area, do either of the following:

  • Right-click the element of interest and select Add Name Space from the contextual menu, enter a name for the name space in the dialog box that appears, and click OK.

  • Select the element of interest, click the [+] button at the bottom, select Create as name space in the dialog box that appears, and click OK. Then, enter a name for the name space in the next dialog box and click OK.

Delete one or more elements/attributes/name spaces

In the Linker Target area, do one of the following:

  • Right-click the element(s)/attribute(s)/name space(s) of interest and select Delete from the contextual menu.

  • Select the element(s)/attribute(s)/name space(s) of interest and click the [x] button at the bottom.

  • Select the element(s)/attribute(s)/name space(s) of interest and press the Delete key.

    Note

    Deleting an element will also delete its children, if any.

Adjust the order of one or more elements

In the Linker Target area, select the element(s) of interest and click the up and down arrow buttons.

Set a static value for an element/attribute/name space

In the Linker Target area, right-click the element/attribute/name space of interest and select Set A Fix Value from the contextual menu.

Note

  • The value you set will replace any value retrieved for the corresponding column from the incoming data flow in your Job.

  • You can set a static value for a child element of the loop element only, on the condition that the element does not have its own children and does not have a source-target mapping on it.

Create a source-target mapping

Select the column of interest in the Linker Source area and drop it onto the node of interest in the Linker Target area. In the dialog box that appears, select Create as sub-element of target node, Create as attribute of target node, or Add linker to target node as needed, and click OK.

If you choose an option that is not permitted for the target node, you will see a warning message and your operation will fail.

Remove a source-target mapping

In the Linker Target area, right-click the node of interest and select Disconnect Linker from the contextual menu.

Create an XML tree from another XML or XSD file

Right-click any schema item in the Linker Target area and select Import XML Tree from the contextual menu to load another XML or XSD file. You then need to create the source-target mappings manually and define the output schema all over again.

Note

You can select and drop several fields at a time by holding down the Ctrl or Shift key to make multiple selections, which makes mapping faster. You can also make multiple selections for right-click operations.
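
As a rough illustration of how the loop element, source-target mappings and a fixed value relate to the generated output, the sketch below builds an XML tree from a few rows with Python's standard library. It is a conceptual analogy only, not the code generated by a Job; the row data, the element names and the `status` value are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical incoming rows; the keys play the role of source columns.
rows = [
    {"id": "1", "name": "Ann"},
    {"id": "2", "name": "Bob"},
]

root = ET.Element("customers")
for row in rows:
    # <customer> plays the role of the loop element: one per incoming row.
    customer = ET.SubElement(root, "customer")
    # Source column "id" mapped to an attribute of the loop element.
    customer.set("id", row["id"])
    # Source column "name" mapped to a sub-element of the loop element.
    ET.SubElement(customer, "name").text = row["name"]
    # A childless, unmapped child of the loop element with a fixed value.
    ET.SubElement(customer, "status").text = "active"

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Each incoming row produces one `customer` element, and the fixed `status` value is the same in every repetition regardless of the incoming data.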

  1. In the Linker Target area, right-click the element you want to run a loop on and select Set As Loop Element from the contextual menu.

  2. Define other output file properties as needed, and then click Next to view and customize the end schema.
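
The role of a group element, set on a parent of the loop element, can be pictured with the following sketch. It assumes that a group element is emitted once per distinct value of its mapped column and that rows are sorted on that column beforehand; this is our reading of the feature, not a description of the generated Job code. The row data and element names are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import groupby
from operator import itemgetter

# Hypothetical incoming rows.
rows = [
    {"city": "Paris", "name": "Ann"},
    {"city": "Lyon", "name": "Bob"},
    {"city": "Paris", "name": "Cy"},
]

by_city = itemgetter("city")
root = ET.Element("customers")
# groupby only merges adjacent rows, so sort on the grouping column first.
for city, members in groupby(sorted(rows, key=by_city), key=by_city):
    # <city> plays the role of the group element: a parent of the loop
    # element that is not the root of the tree.
    group = ET.SubElement(root, "city", name=city)
    for row in members:
        # <customer> plays the role of the loop element.
        ET.SubElement(group, "customer").text = row["name"]

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Without the sort, the two Paris rows would end up in separate `city` elements, which is why sorted input matters in this kind of grouping.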

Finalizing the end schema

Step 5 of the wizard displays the end schema generated and allows you to further define the schema.

  1. If needed, rename the metadata in the Name field (metadata, by default), add a Comment, and make further modifications, for example:

    • Redefine the columns by editing the relevant fields.

    • Add or delete a column using the [+] and [x] buttons.

    • Change the order of the columns using the up and down arrow buttons.

  2. If the XML file which the schema is based on has been changed, click the Guess button to generate the schema again. Note that if you have customized the schema, the Guess feature does not retain these changes.

  3. Click Finish. The new file connection, along with its schema, is displayed under the relevant File XML metadata node in the Repository tree view.

Now you can drag and drop the file connection or any schema of it from the Repository tree view onto the design workspace as a new tAdvancedFileOutputXML component or onto an existing component to reuse the metadata.

To modify an existing file connection, right-click it in the Repository tree view and select Edit file xml to open the file metadata setup wizard.

To add a new schema to an existing file connection, right-click the connection in the Repository tree view and select Retrieve Schema from the contextual menu.

To edit an existing file schema, right-click the schema in the Repository tree view and select Edit Schema from the contextual menu.