Scenario 1: Extracting XML data from a field in a database table - 6.3

Talend Open Studio for Big Data Components Reference Guide

This three-component scenario reads the XML structure stored in a field of a database table and then extracts data from it.

  1. Drop the following components from the Palette onto the design workspace: tMysqlInput, tExtractXMLField, and tFileOutputDelimited.

    Connect the three components using Main links.

  2. Double-click tMysqlInput to display its Basic settings view and define its properties.

  3. If you have already stored the input schema in the Repository tree view, select Repository first from the Property Type list and then from the Schema list to display the [Repository Content] dialog box where you can select the relevant metadata.

    For more information about storing schema metadata in the Repository tree view, see Talend Studio User Guide.

    If you have not stored the input schema locally, select Built-in in the Property Type and Schema fields and enter the database connection and the data structure information manually. For more information about tMysqlInput properties, see tMysqlInput.

  4. In the Table Name field, enter the name of the table holding the XML data, customerdetails in this example.

    Click Guess Query to display the query corresponding to your schema.

  5. Double-click tExtractXMLField to display its Basic settings view and define its properties.

  6. Click Sync columns to retrieve the schema from the preceding component. You can click the three-dot button next to Edit schema to view/modify the schema.

    The Column field in the Mapping table will be automatically populated with the defined schema.

  7. In the Xml field list, select the column from which you want to extract the XML data. In this example, the field holding the XML data is called CustomerDetails.

    In the Loop XPath query field, enter the node of the XML tree on which to loop to retrieve data.

    In the Xpath query column, enter, between quotation marks, the node of the XML field that holds the data you want to extract, CustomerName in this example.

  8. Double-click tFileOutputDelimited to display its Basic settings view and define its properties.

  9. In the File Name field, define or browse to the path of the output file to which you want to write the extracted data.

    Click Sync columns to retrieve the schema from the preceding component. If needed, click the three-dot button next to Edit schema to view the schema.

  10. Save your Job and press F6 to execute it.

tExtractXMLField reads the CustomerDetails field of the defined database table and extracts the client names found under the CustomerName node.
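
The data flow of this Job can be approximated outside the Studio with a short Python sketch. It is an illustration only, not the code that Talend generates. The sample rows, the XML layout (a Customers root with Customer children), the loop path "./Customer", the semicolon separator, and the helper names extract_xml_field and write_delimited are all assumptions, since the scenario does not show the stored XML. Note that Python's ElementTree supports only a subset of XPath, so the loop path is written relative to the root element.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical rows standing in for the customerdetails table.
# The XML layout is assumed; the scenario does not show the stored documents.
ROWS = [
    {"CustomerDetails": "<Customers>"
                        "<Customer><CustomerName>Alice Martin</CustomerName></Customer>"
                        "</Customers>"},
    {"CustomerDetails": "<Customers>"
                        "<Customer><CustomerName>Bob Jones</CustomerName></Customer>"
                        "<Customer><CustomerName>Carol Smith</CustomerName></Customer>"
                        "</Customers>"},
]


def extract_xml_field(rows, xml_field, loop_xpath, mapping):
    """Yield one record per node matched by loop_xpath in each row's XML field.

    mapping plays the role of the Mapping table: output column -> XPath query,
    evaluated relative to the loop node.
    """
    for row in rows:
        root = ET.fromstring(row[xml_field])
        for node in root.findall(loop_xpath):
            yield {column: node.findtext(xpath) for column, xpath in mapping.items()}


def write_delimited(records, columns, stream, separator=";"):
    """Write records as delimited text, one line per record."""
    writer = csv.writer(stream, delimiter=separator, lineterminator="\n")
    for record in records:
        writer.writerow([record[column] for column in columns])


# Loop on each Customer node and pull out the CustomerName child.
records = list(
    extract_xml_field(
        ROWS,
        xml_field="CustomerDetails",
        loop_xpath="./Customer",
        mapping={"CustomerName": "CustomerName"},
    )
)

# An in-memory buffer stands in for the output file.
buffer = io.StringIO()
write_delimited(records, ["CustomerName"], buffer)
print(buffer.getvalue(), end="")
```

In this sketch, one input row can produce several output records when the loop path matches more than one node, which is the reason the loop node and the extracted node are configured separately.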