Scenario: Creating a dimension with elements - 6.3

Talend Open Studio for Big Data Components Reference Guide

The Job in this scenario creates a date dimension with a simple element hierarchy composed of three levels: Year, Month, and Date.

To replicate this scenario, proceed as follows:

Setting up the Job

  1. Drop tPaloConnection, tRowGenerator, tMap, and tPaloDimension from the component Palette onto the design workspace.

  2. Right-click tPaloConnection to open the contextual menu and select Trigger > On Subjob Ok to link it to tRowGenerator.

  3. Right-click tRowGenerator to open the contextual menu and select Row > Main to link it to tMap.


    tRowGenerator generates random rows to keep this example simple. In a real-world Job, you can use one of the other input components to load your actual data.

  4. Right-click tMap to open the contextual menu and select Row > New output to link it to tPaloDimension, then name the link out1 in the dialog box that pops up.

Setting up the DB connection

  1. Double-click the tPaloConnection component to open its Component view.

  2. In the Host name field, type in the host name or the IP address of the host server, localhost for this example.

  3. In the Server Port field, type in the listening port number of the Palo server. In this scenario, it is 7777.

  4. In the Username field and the Password field, type in the authentication information. In this example, both of them are admin.
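
Before running the Job, it can help to confirm that something is listening on the host and port entered above. The following is a minimal sketch in plain Java, not part of Talend or Palo: the class name PaloPortCheck and the method isReachable are hypothetical, and the check only opens a TCP connection, so it does not verify the admin credentials.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PaloPortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    // This only proves that a listener exists; it performs no Palo login.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port taken from this scenario: localhost and 7777.
        System.out.println("Palo server reachable: " + isReachable("localhost", 7777, 2000));
    }
}
```

If the check reports false, review the Host name and Server Port values before looking at the rest of the Job.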

Configuring the input component

  1. Double-click tRowGenerator to open its editor.

  2. On the upper part of the editor, click the plus button to add one column and rename it random_date in the Column column.

  3. In the newly added row, select Date in the Type column and getRandomDate in the Functions column.

  4. In the Function parameters view on the lower part of this editor, type in the new minimum date and maximum date values in the Value column. In this example, the minimum is 2010-01-01, the maximum is 2010-12-31.

  5. Click OK to validate your modifications and close the editor.

  6. In the dialog box that pops up, click OK to propagate your changes.
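
To picture what the random_date column contains, the following sketch picks random dates within the range used in this example. It is an illustration in plain Java rather than the code behind getRandomDate: the class RandomDateSketch and the method randomDate are hypothetical, and treating both bounds as inclusive is an assumption of the sketch.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Random;

public class RandomDateSketch {

    // Picks a date between min and max, both inclusive (an assumption of this sketch).
    static LocalDate randomDate(LocalDate min, LocalDate max, Random rng) {
        int span = (int) ChronoUnit.DAYS.between(min, max); // whole days from min to max
        return min.plusDays(rng.nextInt(span + 1));          // offset in 0..span
    }

    public static void main(String[] args) {
        LocalDate min = LocalDate.parse("2010-01-01");
        LocalDate max = LocalDate.parse("2010-12-31");
        Random rng = new Random();
        // Print a few sample rows, similar in spirit to what the generator emits.
        for (int i = 0; i < 5; i++) {
            System.out.println(randomDate(min, max, rng));
        }
    }
}
```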

Configuration in the tMap editor

  1. Double-click tMap to open its editor.

  2. On the Schema editor view on the lower part of the tMap editor, under the out1 table, click the plus button to add three rows.

  3. In the Column column of the out1 table, type in the new names for the three newly added rows. They are Year, Month, and Date. These rows are then added automatically into the out1 table on the upper part of the tMap editor.

  4. In the out1 table on the upper part of the tMap editor, click the Expression column in the Year row to place the cursor there.

  5. Press Ctrl+space to open the drop-down variable list.

  6. Double-click TalendDate.formatDate to select it from the list. The expression to get the date displays in the Year row under the Expression column. The expression is TalendDate.formatDate("yyyy-MM-dd HH:mm:ss",myDate).

  7. Replace the default expression with TalendDate.formatDate("yyyy",row1.random_date).

  8. Repeat steps 4 to 7 for the Month row and the Date row: insert the default expression, then replace it with TalendDate.formatDate("MM",row1.random_date) for the Month row and with TalendDate.formatDate("dd-MM-yyyy",row1.random_date) for the Date row.

  9. Click OK to validate this modification and accept the propagation by clicking OK in the dialog box that pops up.
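
The three expressions above turn one input date into the three level values. The sketch below reproduces them outside the Studio, assuming that TalendDate.formatDate follows the pattern rules of java.text.SimpleDateFormat. The class DateLevelsSketch and its methods are hypothetical, and the fixed English locale is an assumption made only to keep the output reproducible.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

public class DateLevelsSketch {

    // Stand-in for TalendDate.formatDate(pattern, date) in this sketch.
    static String format(String pattern, Date date) {
        return new SimpleDateFormat(pattern, Locale.ENGLISH).format(date);
    }

    // Returns the Year, Month, and Date values that out1 would carry for one row.
    static String[] levels(Date randomDate) {
        return new String[] {
            format("yyyy", randomDate),       // Year level
            format("MM", randomDate),         // Month level
            format("dd-MM-yyyy", randomDate)  // Date level
        };
    }

    public static void main(String[] args) {
        Date sample = new GregorianCalendar(2010, Calendar.MARCH, 15).getTime();
        String[] row = levels(sample);
        System.out.println(row[0]); // prints 2010
        System.out.println(row[1]); // prints 03
        System.out.println(row[2]); // prints 15-03-2010
    }
}
```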

Configuring the tPaloDimension component

  1. On the workspace, double-click tPaloDimension to open its Component view.

  2. Select the Use an existing connection check box. Then tPaloConnection_1 displays automatically in the Connection configuration field.

  3. In the Database field, type in the name of the database in which the new dimension is to be created, talendDatabase for this scenario.

  4. In the Dimension field, type in the name you want to use for the dimension to be created, for example, Date.

  5. In the Action on dimension field, select the action to be performed. In this scenario, select Create dimension if not exist.

  6. Select the Create dimension elements check box.

  7. In the Consolidation Type area, select the Normal check box.

  8. Under the element hierarchy table in the Consolidation Type area, click the plus button to add three rows into the table.

  9. In the Input column column of the element hierarchy table, select Year from the drop-down list for the first row, Month for the second, and Date for the third. The row order determines the hierarchy level that each input column supplies: Year at the top, Month below it, and Date at the bottom.
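
The effect of ordering the input columns as Year, Month, Date can be pictured as a parent-child tree. The sketch below groups a few sample rows that way. It illustrates the idea only and is not how tPaloDimension is implemented: the class HierarchySketch, the method build, and the sample rows are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class HierarchySketch {

    // Groups rows of {Year, Month, Date} into Year -> Month -> Date elements.
    static Map<String, Map<String, Set<String>>> build(List<String[]> rows) {
        Map<String, Map<String, Set<String>>> tree = new TreeMap<>();
        for (String[] row : rows) {
            tree.computeIfAbsent(row[0], year -> new TreeMap<>())   // level 1: Year
                .computeIfAbsent(row[1], month -> new TreeSet<>())  // level 2: Month
                .add(row[2]);                                       // level 3: Date
        }
        return tree;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"2010", "01", "05-01-2010"});
        rows.add(new String[] {"2010", "01", "20-01-2010"});
        rows.add(new String[] {"2010", "03", "15-03-2010"});

        // Print the tree with one indentation step per level.
        build(rows).forEach((year, months) -> {
            System.out.println(year);
            months.forEach((month, dates) -> {
                System.out.println("  " + month);
                dates.forEach(date -> System.out.println("    " + date));
            });
        });
    }
}
```

Duplicate values collapse into a single element at each level, which is why many random rows for the same month still produce one Month element in the sketch.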

Job execution

Press F6 to run the Job.

A new dimension is then created in your Palo database talendDatabase.