Scenario: Extracting information from an MDM record in XML - 6.3

Talend Open Studio for Big Data Components Reference Guide

The following scenario describes a simple Job that extracts the information of interest from an MDM record in XML and displays it on the console.

Scenario prerequisites

A data container named Product and a data model named Product have been created and deployed to the MDM server. The Product and Store data entities are defined, and some data records already exist in them.

The Product and Store entities are linked by a foreign key, which is the Name of the Store.

This example is designed to obtain the store information for a new product. Therefore, when you create a new Product record, make sure that the Store information is also added for the new Product record.

The entities and their attributes are shown below.

For more information about MDM working principles, see the MDM part of the Talend Studio User Guide.

Dropping and linking the components

  1. Drop the following components from the Palette onto the design workspace: tMDMReceive and tLogRow.

  2. Connect tMDMReceive to tLogRow using a Row > Main link.

  3. Rename the components to better identify their functions.

Configuring the components

Defining a context variable

  1. From the Contexts tab, click the [+] button to add one variable and name it exchangeMessage.

  2. Fill in the variable value in the Value field.

    Note that the XML record must conform to a particular schema. For more information about the schema, see the description of processes and schemas used in MDM processes to call Jobs in the Talend Studio User Guide.

    A sample XML record from the Update Report is as follows:

    <exchange xmlns:mdm="java:com.amalto.core.plugin.base.xslt.MdmExtension">
    <report>
    <Update>
    <UserName>administrator</UserName>
    <Source>genericUI</Source>
    <TimeInMillis>1381486872930</TimeInMillis>
    <OperationType>ACTION</OperationType>
    <RevisionID>null</RevisionID>
    <DataCluster>Product</DataCluster>
    <DataModel>Product</DataModel>
    <Concept>Product</Concept>
    <Key>2</Key>
    </Update>
    </report>
    <item><Product><Id>001</Id><Name>Computer</Name><Description>Laptop series</Description><Availability>true</Availability><Price>400</Price><OnlineStore>TalendShop@@http://www.cafepress.com/Talend.2</OnlineStore><Stores><Store>[Dell]</Store><Store>[Lenovo]</Store></Stores></Product></item>
    </exchange>

    In this example, the XML record is trimmed like this:

    <exchange><report/><item><Product><Id>001</Id><Name>Computer</Name><Description>Laptop series</Description><Availability>true</Availability><Price>400</Price><OnlineStore>TalendShop@@http://www.cafepress.com/Talend.2</OnlineStore><Stores><Store>[Dell]</Store><Store>[Lenovo]</Store></Stores></Product></item></exchange>

  3. Press Ctrl+S to save your changes.
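
The trimming described in step 2 can also be done programmatically. The following Python sketch is only an illustration and is not part of the Job or of any Talend API: the function name trim_exchange is hypothetical, and the sample record is shortened to a few of the Update fields shown above. It keeps the item element and replaces the report content with an empty report element.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample Update Report record (assumption: only a few
# of the <Update> fields are kept here to keep the example brief).
FULL_RECORD = """<exchange xmlns:mdm="java:com.amalto.core.plugin.base.xslt.MdmExtension">
<report>
<Update>
<UserName>administrator</UserName>
<DataCluster>Product</DataCluster>
<Concept>Product</Concept>
<Key>2</Key>
</Update>
</report>
<item><Product><Id>001</Id><Name>Computer</Name><Stores><Store>[Dell]</Store><Store>[Lenovo]</Store></Stores></Product></item>
</exchange>"""


def trim_exchange(full_record: str) -> str:
    """Return the record with an empty <report/> and the original <item>."""
    root = ET.fromstring(full_record)
    trimmed = ET.Element("exchange")
    ET.SubElement(trimmed, "report")  # empty report element
    item = root.find("item")
    if item is not None:
        item.tail = None  # drop the whitespace that followed </item>
        trimmed.append(item)
    return ET.tostring(trimmed, encoding="unicode")


if __name__ == "__main__":
    print(trim_exchange(FULL_RECORD))
```

ElementTree may serialize the empty element as `<report />` rather than `<report/>`; both forms are equivalent XML.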

Configuring tMDMReceive and tLogRow

  1. Double-click the tMDMReceive component to open its Basic settings view in the Component tab.

  2. Click the [...] button next to Edit schema to define the desired data structure. In this example, three columns are added: Product_ID, Product_Name, and Store_Name.

  3. In the XML Record field, fill in the context variable context.exchangeMessage.

  4. From the XPath Prefix list, select "/exchange/item".

  5. In the Loop XPath query field, type in the XPath of the element to loop on, relative to the XPath prefix. In this example, type in "/Product/Stores/Store".

  6. The Column column in the Mapping table is populated with the columns defined in the schema. In the XPath query column, enter the XPath query for each column. In this example, the product ID, product name, and store name are extracted.

  7. Double-click the tLogRow component to open its Basic settings view in the Component tab.

  8. Select Table (print values in cells of a table) in the Mode area.
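
To picture what these settings amount to, the following Python sketch loops over each Store element under /exchange/item/Product/Stores and emits one row per store with the three schema columns. It is an approximation under stated assumptions, not the code the component generates: the function name extract_rows is hypothetical, and because the actual XPath queries of the Mapping table are not reproduced in this text, the sketch reads Id and Name from the enclosing Product element.

```python
import xml.etree.ElementTree as ET

# Trimmed XML record used as the value of context.exchangeMessage.
RECORD = (
    "<exchange><report/><item><Product><Id>001</Id><Name>Computer</Name>"
    "<Description>Laptop series</Description><Availability>true</Availability>"
    "<Price>400</Price>"
    "<OnlineStore>TalendShop@@http://www.cafepress.com/Talend.2</OnlineStore>"
    "<Stores><Store>[Dell]</Store><Store>[Lenovo]</Store></Stores>"
    "</Product></item></exchange>"
)


def extract_rows(xml_record: str) -> list:
    """Return one row per <Store>, carrying the product ID and name."""
    root = ET.fromstring(xml_record)  # root is the <exchange> element
    rows = []
    # XPath prefix "/exchange/item" followed by the loop "/Product/Stores/Store".
    for product in root.findall("./item/Product"):
        for store in product.findall("./Stores/Store"):
            rows.append({
                "Product_ID": product.findtext("Id"),
                "Product_Name": product.findtext("Name"),
                "Store_Name": store.text,
            })
    return rows


if __name__ == "__main__":
    for row in extract_rows(RECORD):
        print(row)
```

The nested loop is used because ElementTree cannot navigate from a Store element back up to its Product ancestor. A full XPath engine could instead use parent steps relative to the loop element.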

Saving and executing the Job

  1. Press Ctrl+S to save your Job.

  2. Execute the Job by pressing F6 or clicking Run on the Run tab.

    The product information of interest extracted from the XML record is displayed on the console.
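
As a rough picture of a table-style display, the following Python sketch formats rows as a plain-text table. The rows are those that the sample record is assumed to yield (one per store), the function name format_table is hypothetical, and the layout is only loosely modeled on a console table. It is not the exact output of tLogRow.

```python
COLUMNS = ["Product_ID", "Product_Name", "Store_Name"]

# Assumed rows: one per <Store> element of the sample record.
ROWS = [
    {"Product_ID": "001", "Product_Name": "Computer", "Store_Name": "[Dell]"},
    {"Product_ID": "001", "Product_Name": "Computer", "Store_Name": "[Lenovo]"},
]


def format_table(rows, columns):
    """Render rows as a bordered text table, one line per row."""
    # Each column is as wide as its longest value or its header.
    widths = [max([len(c)] + [len(str(r[c])) for r in rows]) for c in columns]

    def line(values):
        cells = (str(v).ljust(w) for v, w in zip(values, widths))
        return "|" + "|".join(cells) + "|"

    separator = "+" + "+".join("-" * w for w in widths) + "+"
    out = [separator, line(columns), separator]
    out += [line([r[c] for c in columns]) for r in rows]
    out.append(separator)
    return "\n".join(out)


if __name__ == "__main__":
    print(format_table(ROWS, COLUMNS))
```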