Scenario 5: Mapping data using a group element - 6.3

Talend Open Studio for Big Data Components Reference Guide


Based on Scenario 2: Launching a lookup flow to join complementary data, this scenario shows how to set an element as the group element in the Map Editor of tXMLMap in order to group the output data. For more information about how to group the output data using tXMLMap, see the Talend Studio User Guide.

The objective of this scenario is to group the customer id and the customer name information according to the states the customers come from. You need to reconstruct the XML tree view of the Customer output table by considering the following factors:

  • The elements tagging the customer id and the customer name information should be located under the loop element, that is, they are sub-elements of the loop element.

  • The loop element, together with its sub-elements, should sit directly under the group element.

  • The element tagging the state information, which is used as the grouping condition, should also sit directly under the group element.

  • The group element cannot be the root element.

Based on this analysis, the output data should be structured as follows: the customers node is the root element, the customer node is set as the group element, and the output data is grouped according to the LabelState element.
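The target structure described above can be sketched outside the Studio. The following minimal Python example only illustrates the intended nesting; it uses the standard xml.etree.ElementTree module, and the outline helper is a hypothetical name introduced here, not part of tXMLMap.

```python
import xml.etree.ElementTree as ET

# Target skeleton: customers (root) > customer (group) > LabelState + Name (loop).
customers = ET.Element("customers")
customer = ET.SubElement(customers, "customer")   # group element, not the root
ET.SubElement(customer, "LabelState")             # grouping condition
name = ET.SubElement(customer, "Name")            # loop element
ET.SubElement(name, "id")                         # sub-elements of the loop
ET.SubElement(name, "CustomerName")


def outline(elem, depth=0):
    """Return the tag names of a tree as indented lines (illustrative helper)."""
    lines = ["  " * depth + elem.tag]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(customers)))
# customers
#   customer
#     LabelState
#     Name
#       id
#       CustomerName
```

The printed outline mirrors the tree view that the steps of this scenario build in the Customer output table.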

To put a group element into effect, the XML data to be processed must already be sorted, for example using your XML tools, on the element that will be used as the grouping condition. In this example, the customers that have the same state id should be placed next to each other. The input data in the XML file Customer.xml should read as follows:

<?xml version="1.0" encoding="ISO-8859-15"?>
<Customers>
	<Customer RegisterTime="2001-01-17 06:26:40.000">
		<Name>
			<id>1</id>
			<CustomerName>Griffith Paving and Sealcoatin</CustomerName>
		</Name>
		<Address>
			<CustomerAddress>talend@apres91</CustomerAddress>
			<idState>2</idState>
		</Address>
		<Revenue>
			<Sum1>67852</Sum1>
		</Revenue>
	</Customer>
	<Customer RegisterTime="1987-02-23 17:33:20.000">
		<Name>
			<id>3</id>
			<CustomerName>Glenn Oaks Office Supplies</CustomerName>
		</Name>
		<Address>
			<CustomerAddress>1859 Green Bay Rd.</CustomerAddress>
			<idState>2</idState>
		</Address>
		<Revenue>
			<Sum1>1225.</Sum1>
		</Revenue>
	</Customer>
	<Customer RegisterTime="2002-06-07 09:40:00.000">
		<Name>
			<id>2</id>
			<CustomerName>Bill's Dive Shop</CustomerName>
		</Name>
		<Address>
			<CustomerAddress>511 Maple Ave. Apt. 1B</CustomerAddress>
			<idState>3</idState>
		</Address>
		<Revenue>
			<Sum1>88792</Sum1>
		</Revenue>
	</Customer>
	<Customer RegisterTime="1992-04-28 23:26:40.000">
		<Name>
			<id>4</id>
			<CustomerName>DBN Bank</CustomerName>
		</Name>
		<Address>
			<CustomerAddress>456 Grossman Ln.</CustomerAddress>
			<idState>3</idState>
		</Address>
		<Revenue>
			<Sum1>64493</Sum1>
		</Revenue>
	</Customer>
</Customers>
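
The sorting requirement can be illustrated with a small Python analogy. It assumes that a new group starts each time the grouping value changes between consecutive records, which is how itertools.groupby behaves; this is an illustration of why adjacent records matter, not a description of the internals of tXMLMap. The unsorted list is a hypothetical reordering of the same records.

```python
from itertools import groupby

# (idState, id) pairs in the order they appear in Customer.xml above.
sorted_rows = [("2", "1"), ("2", "3"), ("3", "2"), ("3", "4")]
# Same records, interleaved by state (hypothetical unsorted input).
unsorted_rows = [("2", "1"), ("3", "2"), ("2", "3"), ("3", "4")]


def consecutive_groups(rows):
    """Group adjacent rows that share the same state id."""
    return [
        (state, [customer_id for _, customer_id in members])
        for state, members in groupby(rows, key=lambda row: row[0])
    ]


print(consecutive_groups(sorted_rows))
# [('2', ['1', '3']), ('3', ['2', '4'])]
print(consecutive_groups(unsorted_rows))
# [('2', ['1']), ('3', ['2']), ('2', ['3']), ('3', ['4'])]
```

With the sorted input, each state yields a single group. With the interleaved input, the same state is split across several groups, which is the situation that sorting the data beforehand avoids.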

  1. In your Studio, open the Job used in Scenario 2: Launching a lookup flow to join complementary data to display it in the design workspace, and double-click the tXMLMap component to open its Map Editor.

  2. In the XML tree view of the Customer output table, right-click the customer (loop) node and select Delete from the contextual menu. This removes all of the elements under the customers root node, so that you can rebuild an XML tree view suited to grouping the output data of interest.

  3. Right-click the customers root node and select Create Sub-Element from the contextual menu. In the pop-up dialog box, enter the name of the new sub-element. In this example, it is customer.

    Click OK to validate the changes and close the dialog box. A customer node is added under the customers root node in the output table.

  4. In the row2 lookup input table, select the LabelState node and drop it onto the customer node in the output table. In the pop-up dialog box, select Create as sub-element of target node and click OK to close the dialog box. A LabelState node is added under the customer node in the output table.

  5. Right-click the customer node in the output table and select Create Sub-Element from the contextual menu. In the pop-up dialog box, enter the name of the new sub-element. In this example, it is Name.

    Click OK to validate the changes and close the dialog box. A Name node is added under the customer node in the output table.

  6. In the row1 main input table, select the id and CustomerName nodes and drop them onto the Name node in the output table. In the pop-up dialog box, select Create as sub-element of target node and click OK to close the dialog box. An id node and a CustomerName node are added under the Name node in the output table.

  7. In the output table, right-click the Name node and select As loop element from the contextual menu to set it as the loop element. Then right-click the customer node and select As group element from the contextual menu to group the output data according to the LabelState element.

  8. Click OK to validate the changes and close the map editor.

  9. Press Ctrl+S to save the Job and then F6 to run the Job.

    In the execution result, the id element and the CustomerName element contained in the loop are grouped according to the LabelState element. The group element customer marks the start and the end of each group.
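
The shape of the grouped result can be approximated outside the Studio with the following Python sketch. It is not the code generated by the Job. The SOURCE string is a trimmed copy of Customer.xml, the group_customers function is a hypothetical helper, and the state labels are placeholders, because the lookup data of Scenario 2 is not reproduced in this scenario.

```python
import xml.etree.ElementTree as ET
from itertools import groupby

# Trimmed copy of Customer.xml: only the elements used by the mapping.
SOURCE = """<Customers>
  <Customer><Name><id>1</id><CustomerName>Griffith Paving and Sealcoatin</CustomerName></Name>
    <Address><idState>2</idState></Address></Customer>
  <Customer><Name><id>3</id><CustomerName>Glenn Oaks Office Supplies</CustomerName></Name>
    <Address><idState>2</idState></Address></Customer>
  <Customer><Name><id>2</id><CustomerName>Bill's Dive Shop</CustomerName></Name>
    <Address><idState>3</idState></Address></Customer>
  <Customer><Name><id>4</id><CustomerName>DBN Bank</CustomerName></Name>
    <Address><idState>3</idState></Address></Customer>
</Customers>"""

# Placeholder labels: the real values come from the lookup flow of Scenario 2.
STATE_LABELS = {"2": "label-of-state-2", "3": "label-of-state-3"}


def group_customers(xml_text, labels):
    """Build customers > customer (group) > LabelState + Name (loop) from sorted input."""
    source = ET.fromstring(xml_text)
    rows = [
        (
            labels[c.findtext("Address/idState")],
            c.findtext("Name/id"),
            c.findtext("Name/CustomerName"),
        )
        for c in source.findall("Customer")
    ]
    customers = ET.Element("customers")
    # One customer element per run of adjacent rows sharing the same label.
    for label, members in groupby(rows, key=lambda row: row[0]):
        customer = ET.SubElement(customers, "customer")
        ET.SubElement(customer, "LabelState").text = label
        for _, customer_id, customer_name in members:
            name = ET.SubElement(customer, "Name")
            ET.SubElement(name, "id").text = customer_id
            ET.SubElement(name, "CustomerName").text = customer_name
    return customers


result = group_customers(SOURCE, STATE_LABELS)
print(ET.tostring(result, encoding="unicode"))
```

Each customer element in the printed result contains one LabelState element followed by one Name element per customer of that state, which matches the grouping described above.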