Defining MDM schema

Talend Open Studio for Data Integration User Guide

Version: 6.5
Product: Talend Open Studio for Data Integration
Task: Design and Development
Platform: Talend Studio

Defining Input MDM schema

This section describes how to define and download an input MDM XML schema. To define and download an output MDM XML schema, see Defining output MDM schema.

To set the values to be fetched from one or more entities linked to a specific MDM connection, complete the following:

  1. In the Repository tree view, expand Metadata and right-click the MDM connection for which you want to retrieve the entity values.

  2. Select Retrieve Entity from the contextual menu.

    A dialog box pops up.

  3. Select the Input MDM option in order to download an input XML schema and then click Next to proceed to the following step.

  4. From the Entities field, select the business entity (XML schema) from which you want to retrieve values.

    The name is displayed automatically in the Name field.

    Note

    You are free to enter any text in this field, although you would likely put the name of the entity from which you are retrieving the schema.

  5. Click Next to proceed to the next step.

    Note

    The schema of the entity you selected is automatically displayed in the Source Schema panel.

    Here, you can set the parameters to be taken into account for the XML schema definition.

    The schema dialog box is divided into four panels:

    Panel            Description
    Source Schema    Tree view of the uploaded entity.
    Target schema    Extraction and iteration information.
    Preview          Target schema preview.
    File viewer      Raw data viewer.

  6. In the Xpath loop expression area, enter the absolute XPath expression leading to the XML structure node on which to apply the iteration. Alternatively, drag the node from the source schema and drop it onto the Xpath loop expression field of the target schema. The resulting link is displayed in orange.

    Note

    The Xpath loop expression field is compulsory.

  7. If required, define a Loop limit to restrict the iteration to a number of nodes.

    In this example, Features is used as the element to loop on because it is repeated within the Product entity as follows:

    <Product>
        <Id>1</Id>
        <Name>Cup</Name>
        <Description/>
        <Features>
            <Feature>Color red</Feature>
            <Feature>Size maxi</Feature>
        </Features>
        ...
    </Product>
    <Product>
        <Id>2</Id>
        <Name>Cup</Name>
        <Description/>
        <Features>
            <Feature>Color blue</Feature>
            <Feature>Thermos</Feature>
        </Features>
        ...
    </Product>

    By doing so, the tMDMInput component that uses this MDM connection creates a new row for each item with a different feature.

  8. To define the fields to extract, drop the relevant node from the source schema to the Relative or absolute XPath expression field.

    Note

    Use the [+] button to add rows to the table and select as many fields to extract as necessary. To select several nodes at once, hold down Ctrl (separate nodes) or Shift (adjacent nodes), then drop them onto the table.

  9. If required, enter a name for each of the retrieved columns in the Column name field.

    Note

    You can prioritize the order of the fields to extract by selecting the field and using the up and down arrows. The link of the selected field is blue, and all other links are grey.

  10. Click Finish to validate your modifications and close the dialog box.

    The newly created schema is listed under the corresponding MDM connection in the Repository tree view.
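
The effect of the loop expression, the loop limit, and the extracted fields in steps 6 to 8 can be illustrated outside the Studio. The following Python sketch is not Talend code and does not reproduce how tMDMInput is implemented. It assumes that the loop resolves to each Feature node under Features, and it wraps the sample Product items in a hypothetical Products root element so that they parse as a single document. The function name extract_rows and its loop_limit parameter are likewise made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample data modeled on the Product entity shown above. The <Products>
# wrapper is an assumption added so the two items form one XML document.
SAMPLE = """
<Products>
  <Product>
    <Id>1</Id>
    <Name>Cup</Name>
    <Description/>
    <Features>
      <Feature>Color red</Feature>
      <Feature>Size maxi</Feature>
    </Features>
  </Product>
  <Product>
    <Id>2</Id>
    <Name>Cup</Name>
    <Description/>
    <Features>
      <Feature>Color blue</Feature>
      <Feature>Thermos</Feature>
    </Features>
  </Product>
</Products>
""".strip()


def extract_rows(xml_text, loop_limit=None):
    """Return one row per looped <Feature> node, up to loop_limit rows."""
    root = ET.fromstring(xml_text)
    rows = []
    for product in root.findall("Product"):
        # Loop element: every Feature under Product/Features.
        for feature in product.findall("Features/Feature"):
            if loop_limit is not None and len(rows) >= loop_limit:
                return rows
            rows.append({
                # Fields taken from the enclosing Product item.
                "Id": product.findtext("Id"),
                "Name": product.findtext("Name"),
                # Field taken from the loop node itself.
                "Feature": feature.text,
            })
    return rows


rows = extract_rows(SAMPLE)
for row in rows:
    print(row)
```

With the sample data, each Product contributes two rows, one per Feature, and the Id and Name values are repeated on each of them. Passing loop_limit=3 stops the iteration after three rows, which mirrors the purpose of the Loop limit field.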

To modify the created schema, complete the following:

  1. In the Repository tree view, expand Metadata and Talend MDM and then browse to the schema you want to modify.

  2. Right-click the schema name and select Edit Entity from the contextual menu.

    A dialog box is displayed.

  3. Modify the schema as needed.

    You can change the name of the schema according to your needs. You can also customize the schema structure in the schema panel. The tool bar allows you to add, remove or move columns in your schema.

    Make sure the data type in the Type column is correctly defined.

    For more information regarding Java data types, including date pattern, see Java API Specification.

    Below are the commonly used Talend data types:

    • Object: a generic Talend data type that allows processing data without regard to its content. For example, a data file not otherwise supported can be processed with a tFileInputRaw component by specifying that it has a data type of Object.

    • List: a space-separated list of primitive type elements in an XML Schema definition, defined using the xsd:list element.

    • Document: a data type that allows processing an entire XML document without regard to its content.

  4. Click Finish to close the dialog box.

    The MDM input connection (tMDMInput) is now ready to be dropped in any of your Jobs.
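
As an illustration of the List type described above, a value of an xsd:list type is written as a single string of items separated by whitespace. The following Python sketch shows how such a value can be split into typed items. It is not part of Talend, and the helper name parse_xsd_list is hypothetical.

```python
def parse_xsd_list(text, item_type=str):
    """Split an xsd:list lexical value on whitespace and convert each item."""
    # str.split() without arguments ignores leading and trailing whitespace
    # and treats any run of whitespace as a single separator.
    return [item_type(token) for token in text.split()]


sizes = parse_xsd_list("10  20 30", int)
print(sizes)
```

An empty string yields an empty list, and omitting item_type keeps the items as strings.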

Defining output MDM schema

This section describes how to define and download an output MDM XML schema. To define and download an input MDM XML schema, see Defining Input MDM schema.

To set the values to be written in one or more entities linked to a specific MDM connection, complete the following:

  1. In the Repository tree view, expand Metadata and right-click the MDM connection for which you want to write the entity values.

  2. Select Retrieve Entity from the contextual menu.

    A dialog box pops up.

  3. Select the Output MDM option in order to define an output XML schema and then click Next to proceed to the following step.

  4. From the Entities field, select the business entity (XML schema) in which you want to write values.

    The name is displayed automatically in the Name field.

    Note

    You are free to enter any text in this field, although you would likely put the name of the entity from which you are retrieving the schema.

  5. Click Next to proceed to the next step.

    Note

    A schema identical to that of the entity you selected is automatically created in the Linker Target panel, and columns are automatically mapped from the source panel to the target panel. The wizard automatically defines the item Id as the looping element. You can always choose to loop on another element.

    Here, you can set the parameters to be taken into account for the XML schema definition.

  6. Click Schema Management to display a dialog box.

  7. Make the necessary modifications to define the XML schema you want to write in the selected entity.

    Your Linker Source schema must correspond to the Linker Target schema. That is, it must define the elements in which you want to write values.

  8. Click OK to close the dialog box.

    The defined schema is displayed under Schema list.

  9. In the Linker Target panel, right-click the element you want to define as a loop element and select Set as loop element. This will restrict the iteration to one or more nodes.

    By doing so, the tMDMOutput component that uses this MDM connection creates a new row for each item with a different feature.

    Note

    You can prioritize the order of the fields to write by selecting the field and using the up and down arrows.

  10. Click Finish to validate your modifications and close the dialog box.

    The newly created schema is listed under the corresponding MDM connection in the Repository tree view.
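
The idea of mapping flat source columns onto a target XML structure with a loop element can be illustrated as follows. This Python sketch is not Talend code and makes no claim about how tMDMOutput behaves internally. It assumes, for illustration only, that Feature is set as the loop element, that incoming rows carry Id, Name and Feature columns, and that rows sharing an Id belong to the same Product item. The function name build_products is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_products(rows):
    """Group flat rows by Id and build one <Product> element per Id."""
    products = {}  # Id -> Product element, in first-seen order
    for row in rows:
        product = products.get(row["Id"])
        if product is None:
            # First row for this Id: create the non-repeating elements.
            product = ET.Element("Product")
            ET.SubElement(product, "Id").text = row["Id"]
            ET.SubElement(product, "Name").text = row["Name"]
            ET.SubElement(product, "Features")
            products[row["Id"]] = product
        # Loop element: one <Feature> is appended per incoming row.
        ET.SubElement(product.find("Features"), "Feature").text = row["Feature"]
    return list(products.values())


rows = [
    {"Id": "1", "Name": "Cup", "Feature": "Color red"},
    {"Id": "1", "Name": "Cup", "Feature": "Size maxi"},
    {"Id": "2", "Name": "Cup", "Feature": "Color blue"},
]
documents = [ET.tostring(p, encoding="unicode") for p in build_products(rows)]
for doc in documents:
    print(doc)
```

In this sketch, the three input rows produce two Product elements. The first contains two Feature elements because two rows share the Id value 1.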

To modify the created schema, complete the following:

  1. In the Repository tree view, expand Metadata and Talend MDM and then browse to the schema you want to modify.

  2. Right-click the schema name and select Edit Entity from the contextual menu.

    A dialog box is displayed.

  3. Modify the schema as needed.

    You can change the name of the schema according to your needs. You can also customize the schema structure in the schema panel. The tool bar allows you to add, remove or move columns in your schema.

  4. Click Finish to close the dialog box.

    The MDM output connection (tMDMOutput) is now ready to be dropped in any of your Jobs.

Defining Receive MDM schema

This section describes how to define a receive MDM XML schema based on the MDM connection.

To set the XML schema you want to receive in accordance with a specific MDM connection, complete the following:

  1. In the Repository tree view, expand Metadata and right-click the MDM connection for which you want to retrieve the entity values.

  2. Select Retrieve Entity from the contextual menu.

    A dialog box is displayed.

  3. Select the Receive MDM option in order to define a receive XML schema and then click Next to proceed to the following step.

  4. From the Entities field, select the business entity (XML schema) according to which you want to receive the XML schema.

    The name is displayed automatically in the Name field.

    Note

    You can enter any text in this field, although you would likely put the name of the entity according to which you want to receive the XML schema.

  5. Click Next to proceed to the next step.