Removing master data partially from the MDM hub - 6.1

Talend Open Studio for Big Data Components Reference Guide

This scenario describes how to partially remove master data that was written to the MDM server in Scenario: Writing master data in an MDM hub.

In this example, one agency office is removed from the Agency business entity. The agency currently has an ID, a name, and three offices located in different cities.

For more information about entities, see the Talend Studio User Guide.

Dropping and linking the components

  1. From the Palette, drop tFixedFlowInput and tMDMOutput onto the design workspace.

  2. Connect the components using a Row > Main link.

Configuring the components

Specifying the data to be removed from the MDM server

  1. Double-click tFixedFlowInput to view its Basic settings in the Component tab.

  2. In the Schema list, select Built-In and then click the three-dot button next to Edit schema. A dialog box opens in which you can define the structure of the data that identifies what to remove from the MDM server.

  3. Click the [+] button and add three columns of the type String.

    In this example, name the columns Id, Name, and Remove_Office.

  4. Click OK to save your changes.

  5. In the Number of rows field, enter the number of rows you want to generate.

  6. In the Mode area, select the Use Single Table option.

  7. In the Value fields, enter values which correspond to each of the schema columns.

    In this example, the office in Paris will be removed.
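As a rough illustration of the data this component supplies, the sketch below pairs one row of values with the three schema columns. It is not Talend code. The Id PA05 and the city Paris are the values used in this scenario; the agency name ExampleAgency and the helper as_records are hypothetical, and a single generated row is assumed.

```python
# Column names as defined in the schema editor for this scenario.
SCHEMA = ("Id", "Name", "Remove_Office")

# One row is assumed (Number of rows = 1).
# "PA05" and "Paris" come from this scenario; "ExampleAgency" is a
# hypothetical placeholder because the scenario does not state the name.
ROWS = [("PA05", "ExampleAgency", "Paris")]


def as_records(schema, rows):
    """Pair each row's values with the schema column names."""
    records = []
    for row in rows:
        if len(row) != len(schema):
            raise ValueError("row does not match schema")
        records.append(dict(zip(schema, row)))
    return records


records = as_records(SCHEMA, ROWS)
print(records[0]["Remove_Office"])
```

Each record then carries the key of the agency (Id) together with the value that identifies the office to remove (Remove_Office).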

Basic settings of tMDMOutput

  1. In the design workspace, click tMDMOutput to open its Basic settings view.

  2. In the Input Schema list, select Built-In and then click Sync columns.

    After receiving data from the previous component, tMDMOutput generates an XML document, writes it to an output field, and then sends it to the MDM server.

  3. Click OK to proceed to the next step.

    The Result of the XML serialization list in the Basic settings view is automatically filled in with the output xml column.

  4. In the URL field, enter the URL to access the MDM server.

  5. In the Username and Password fields, enter the authentication information required to connect to the MDM server.

  6. In the Data Model field, enter, between quotes, the name of the data model against which the master data is validated.

  7. In the Data Container field, enter, between quotes, the name of the data container that holds the master data to be updated.

  8. In the Partial Update area, select the Use Partial Update check box.

    In the Source Name field that appears once the check box is selected, enter the name to be used in the modification report.

  9. In the Pivot field, enter the XPath to the multi-occurrence sub-element from which data needs to be removed.

    In this example, enter "Agency/Offices/Office".

  10. Select the Delete check box, and then enter "." in the Key field.
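The effect of the Pivot, Delete, and Key settings can be pictured with the sketch below. It is only a conceptual model written with Python's standard library, not the MDM server's implementation. It assumes that each Office element holds the city as its text, so that a Key of "." (the pivot element itself) is what gets compared. The function partial_delete, the agency name, and the cities CityA and CityB are hypothetical.

```python
import xml.etree.ElementTree as ET


def partial_delete(record, update, pivot, key="."):
    """Remove from `record` each pivot element whose key value also
    appears on a pivot element of `update`. Returns the number removed."""
    # The pivot is written from the entity root ("Agency/Offices/Office");
    # ElementTree paths are relative to the root, so drop the first step.
    root_name, _, rel_path = pivot.partition("/")
    if not rel_path or record.tag != root_name or update.tag != root_name:
        raise ValueError("pivot does not match the documents")
    parent_path, _, leaf = rel_path.rpartition("/")

    def key_of(elem):
        # "." means the pivot element itself supplies the key value.
        target = elem if key == "." else elem.find(key)
        return None if target is None else (target.text or "").strip()

    to_remove = {key_of(e) for e in update.findall(rel_path)}
    to_remove.discard(None)

    parents = record.findall(parent_path) if parent_path else [record]
    removed = 0
    for parent in parents:
        for child in list(parent.findall(leaf)):
            if key_of(child) in to_remove:
                parent.remove(child)
                removed += 1
    return removed


# Hypothetical stored record with three offices.
stored = ET.fromstring(
    "<Agency><Id>PA05</Id><Name>ExampleAgency</Name>"
    "<Offices><Office>Paris</Office><Office>CityA</Office>"
    "<Office>CityB</Office></Offices></Agency>"
)
# Document sent by the Job: only the office to delete is listed.
update = ET.fromstring(
    "<Agency><Id>PA05</Id><Name>ExampleAgency</Name>"
    "<Offices><Office>Paris</Office></Offices></Agency>"
)

removed = partial_delete(stored, update, "Agency/Offices/Office", ".")
remaining = [office.text for office in stored.findall("Offices/Office")]
print(removed, remaining)
```

Only the occurrence that matches on the key is dropped; the other offices and the Id and Name elements stay untouched, which is the point of a partial update.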

Advanced settings of tMDMOutput

  1. In the Component view, click Advanced settings to set the advanced parameters for the tMDMOutput component.

  2. Click the [...] button next to Configure XML Tree to open the tMDMOutput editor.

    Alternatively, double-click tMDMOutput to open the editor.

  3. In the Link target area to the right, click in the XML Tree field and then replace rootTag with the name of the business entity in which you want to remove data partially, Agency in this example.

  4. In the Linker source area, select the two schema columns Id and Name and drop them on the Agency node.

    The [Selection] dialog box is displayed.

    Select the Create as sub-element of target node option so that the two columns are linked to the two XML sub-elements of the Agency node.

  5. Right-click the root node Agency and then select Add Sub-element.

    In the dialog box that pops up, enter a name for the new sub-element, Offices in this example.

    Repeat the same procedure to create a new sub-element named Office under the Offices node. It corresponds to the multi-occurrence element Offices of the Agency business entity.

  6. In the Linker source area, select the schema column whose corresponding data entry you want to remove, Remove_Office in this example, and drop it on the new Office node.

    The [Selection] dialog box is displayed.

    Select the Create as sub-element of target node option so that the column is linked to the XML sub-element of the Offices node.

  7. Click OK to proceed to the next step.

  8. In the Link target area, right-click the element you want to set as the loop element and select Set As Loop Element from the contextual menu.

    In this example, Id is the iterating object.

  9. Click OK to validate your changes and close the dialog box.
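The XML tree configured above can be approximated with the sketch below, which builds one Agency document from one input row. It is an illustration under assumptions, not the code the Studio generates: it assumes that the Remove_Office value ends up as the text of the Office element, which is what a Key of "." suggests; if your mapping nests the value one level deeper, the resulting tree differs accordingly. The function build_agency_document and the agency name are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_agency_document(row):
    """Build the Agency element for a single input row."""
    agency = ET.Element("Agency")
    # Id and Name are mapped as sub-elements of the root node.
    ET.SubElement(agency, "Id").text = row["Id"]
    ET.SubElement(agency, "Name").text = row["Name"]
    # Offices and Office are the sub-elements added by hand in the editor.
    offices = ET.SubElement(agency, "Offices")
    # Assumption: the office to remove is carried as the text of Office.
    ET.SubElement(offices, "Office").text = row["Remove_Office"]
    return agency


# "ExampleAgency" is a hypothetical placeholder name.
row = {"Id": "PA05", "Name": "ExampleAgency", "Remove_Office": "Paris"}
xml_text = ET.tostring(build_agency_document(row), encoding="unicode")
print(xml_text)
```

Calling the function once per input row gives one document per agency, which is a simplified view of what setting a loop element achieves.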

Saving and executing the Job

  1. Press Ctrl+S to save your Job.

  2. Execute the Job by pressing F6 or clicking Run on the Run tab.

    The agency office located in Paris with the agency Id PA05 is removed from the Agency business entity in the DStar data container on the MDM server.