Scenario: Extracting the structure of an XML file and inserting it into the fields of a database table - 6.3

Talend Open Studio for Big Data Components Reference Guide

EnrichVersion
6.3
EnrichProdName
Talend Open Studio for Big Data
task
Data Governance
Data Quality and Preparation
Design and Development
EnrichPlatform
Talend Studio

This three-component scenario allows to read an XML file, extract the XML structure, and finally outputs the structure to the fields of a database table.

  1. Drop the following components from the Palette onto the design workspace: tFileInputXml, tWriteXMLField, and tMysqlOutput.

    Connect the three components using Main links.

  2. Double-click tFileInputXml to open its Basic settings view and define its properties.

  3. If you have already stored the input schema in the Repository tree view, select Repository first from the Property Type list and then from the Schema list to display the [Repository Content] dialog box where you can select the relevant metadata.

    For more information about storing schema metadata in the Repository tree view, see Talend Studio User Guide.

  4. If you have not stored the input schema locally, select Built-in in the Property Type and Schema fields and fill in the fields that follow manually. For more information about tFileInputXML properties, see tFileInputXML.

    If you have selected Built-in, click the three-dot button next to the Edit schema field to open a dialog box where you can manually define the structure of your file.

  5. In the Look Xpath query field, enter the node of the structure where the loop is based. In this example, the loop is based on the customer node. Column in the Mapping table will be automatically populated with the defined file content.

    In the Xpath query column, enter between inverted commas the node of the XML file that holds the data corresponding to each of the Column fields.

  6. In the design workspace, click tWriteXMLField and then in the Component view, click Basic settings to open the relevant view where you can define the component properties.

  7. Click the three-dot button next to the Edit schema field to open a dialog box where you can add a line by clicking the plus button.

  8. Click in the line and enter the name of the output column where you want to write the XML content, CustomerDetails in this example.

    Define the type and length in the corresponding fields, String and 255in this example.

    Click Ok to validate your output schema and close the dialog box.

    In the Basic settings view and from the Output Column list, select the column you already defined where you want to write the XML content.

  9. Click the three-dot button next to Configure Xml Tree to open the interface that helps to create the XML structure.

  10. In the Link Target area, click rootTag and rename it as CustomerDetails.

    In the Linker source area, drop CustomerName and CustomerAddress to CustomerDetails. A dialog box displays asking what type of operation you want to do.

    Select Create as sub-element of target node to create a sub-element of the CustomerDetails node.

    Right-click CustomerName and select from the contextual menu Set As Loop Element.

    Click OK to validate the XML structure you defined.

  11. Double-click tMysqlOutput to open its Basic settings view and define its properties.

  12. If you have already stored the schema in the DB Connection node in the Repository tree view, select Repository from the Schema list to display the [Repository Content] dialog box where you can select the relevant metadata.

    For more information about storing schema metadata in the Repository tree view, see Talend Studio User Guide.

    If you have not stored the schema locally, select Built-in in the Property Type and Schema fields and enter the database connection and data structure information manually. For more information about tMysqlOutput properties, see tMysqlOutput.

    In the Table field, enter the name of the database table to be created, where you want to write the extracted XML data.

    From the Action on table list, select Create table to create the defined table.

    From the Action on data list, select Insert to write the data.

    Click Sync columns to retrieve the schema from the preceding component. You can click the three-dot button next to Edit schema to view the schema.

  13. Save your Job and click F6 to execute it.

tWriteXMLField fills every field of the CustomerDetails column with the XML structure of the input file: the XML processing instruction <?xml version=""1.0"" encoding=""ISO-8859-15""?>, the first node that separates each client <CustomerDetails> and finally customer information <CustomerAddress> and <CustomerName>.