tPaloDimension - 6.3

Talend Components Reference Guide

Function

This component creates, drops or recreates dimensions with or without dimension elements inside a Palo database.

Purpose

This component manages Palo dimensions, and optionally their elements, inside a database.

tPaloDimension Properties

Component family

Business Intelligence/Cube OLAP/Palo

 

Basic settings

Use an existing connection

Select this check box and in the Component List click the relevant connection component to reuse the connection details you already defined.

Note that when a Job contains a parent Job and a child Job, the Component List presents only the connection components at the same Job level.

Connection configuration

Note

Unavailable when using an existing connection.

Host Name

Enter the host name or the IP address of the host server.

 

Server Port

Type in the listening port number of the Palo server.

 

Username and Password

Enter the Palo user authentication data.

To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.

 

Database

Type in the name of the database in which the dimensions are managed.

 

Dimension

Type in the name of the dimension on which the given operation should take place.

 

Action on dimension

Select the operation you want to perform on the dimension of interest:

- None: no action is taken on this dimension.

- Create dimension: the dimension does not exist and will be created.

- Create dimension if not exists: this dimension is created only when it does not exist.

- Delete dimension if exists and create: this dimension is deleted if it exists, and then a new one is created.

- Delete dimension: this dimension is removed from the database.
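The five actions can be read as decisions that depend on whether the dimension already exists. The following is a minimal sketch of that reading in plain Java. The DimensionActionSketch class, the DimensionAction enum and the plan method are hypothetical names used for illustration only and are not part of the Palo or Talend API. How the component reacts when Create dimension meets an existing dimension, or when Delete dimension meets a missing one, is not stated here, so the sketch simply issues the requested operation in those cases.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the "Action on dimension" options; not a Palo or Talend API.
final class DimensionActionSketch {

    enum DimensionAction { NONE, CREATE, CREATE_IF_NOT_EXISTS, DELETE_IF_EXISTS_AND_CREATE, DELETE }

    // Returns the operations that would be issued, in order, for the chosen action.
    static List<String> plan(DimensionAction action, boolean dimensionExists) {
        List<String> operations = new ArrayList<>();
        switch (action) {
            case NONE:
                break;
            case CREATE:
                operations.add("create");
                break;
            case CREATE_IF_NOT_EXISTS:
                if (!dimensionExists) {
                    operations.add("create");
                }
                break;
            case DELETE_IF_EXISTS_AND_CREATE:
                if (dimensionExists) {
                    operations.add("delete");
                }
                operations.add("create");
                break;
            case DELETE:
                operations.add("delete");
                break;
        }
        return operations;
    }

    public static void main(String[] args) {
        System.out.println(plan(DimensionAction.DELETE_IF_EXISTS_AND_CREATE, true));
    }
}
```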

 

Create dimension elements

Select this check box to activate the dimension management fields and create dimension elements along with the creation of this dimension.

Note

The fields below are available only when the Create dimension elements check box is selected.

Dimension type

Note

Available only when the action on dimension is None.

Select the type of the dimension to be created. The type may be:

- Normal

- User info

- System

- Attribute

 

Commit size

Type in the number of elements which will be created before saving them inside the dimension.
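As a rough illustration of what a commit size means, the sketch below groups incoming element names into consecutive batches of at most that size, each batch standing for one save. The CommitSizeSketch class and its batches helper are hypothetical; the component's actual internal batching is not described in this guide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of committing elements in fixed-size batches.
final class CommitSizeSketch {

    // Splits the elements into consecutive batches of at most commitSize items.
    static List<List<String>> batches(List<String> elements, int commitSize) {
        if (commitSize <= 0) {
            throw new IllegalArgumentException("commitSize must be positive");
        }
        List<List<String>> result = new ArrayList<>();
        for (int start = 0; start < elements.size(); start += commitSize) {
            int end = Math.min(start + commitSize, elements.size());
            result.add(new ArrayList<>(elements.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Five elements with a commit size of 2 give three batches.
        System.out.println(batches(Arrays.asList("a", "b", "c", "d", "e"), 2));
    }
}
```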

 

Schema and Edit Schema

A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

 

 

Built-in: The schema is created and stored locally for this component only. Related topic: see Talend Studio User Guide.

 

 

Repository: The schema already exists and is stored in the Repository, hence can be reused. Related topic: see Talend Studio User Guide.

 

Consolidation type - None

Note

With this option, you activate the corresponding parameter fields to be completed.

Select this check box to move the incoming elements directly into the given dimension. With this option, you do not define any consolidation or hierarchy.

 

 

Input Column: select a column from the drop-down list. The columns in the drop-down list are those you defined for the schema. The values from the selected column are used to process dimension elements.

 

 

Element type: Select the type of elements. It may be:

- Numeric

- Text

 

 

Creation mode: Select the creation mode for the elements to be processed. This mode may be:

- Add: simply add an element to the dimension.

- Force add: force the creation of this element. If the element exists, it will be recreated.

- Update: update this element if it exists.

- Add or Update: if this element does not exist, it will be created, otherwise it will be updated. This is the default option.

- Delete: delete this element from the dimension.
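The creation modes can be pictured as operations on an in-memory map of element names to element types. The sketch below is only a model under stated assumptions: CreationModeSketch, CreationMode and apply are hypothetical names, and the behavior of Add on an element that already exists is assumed (left untouched) because this guide does not specify it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the element creation modes; not a Palo or Talend API.
final class CreationModeSketch {

    enum CreationMode { ADD, FORCE_ADD, UPDATE, ADD_OR_UPDATE, DELETE }

    // Applies one incoming element to a dimension modeled as element name -> element type.
    static void apply(Map<String, String> dimension, CreationMode mode, String name, String type) {
        switch (mode) {
            case ADD:
                // Assumption: an element that already exists is left untouched.
                dimension.putIfAbsent(name, type);
                break;
            case FORCE_ADD:
                // Recreate the element if it is already there.
                dimension.remove(name);
                dimension.put(name, type);
                break;
            case UPDATE:
                if (dimension.containsKey(name)) {
                    dimension.put(name, type);
                }
                break;
            case ADD_OR_UPDATE:
                dimension.put(name, type);
                break;
            case DELETE:
                dimension.remove(name);
                break;
        }
    }

    public static void main(String[] args) {
        Map<String, String> dimension = new LinkedHashMap<>();
        apply(dimension, CreationMode.ADD_OR_UPDATE, "2010", "Text");
        apply(dimension, CreationMode.UPDATE, "2011", "Text"); // no effect: 2011 does not exist
        System.out.println(dimension.keySet());
    }
}
```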

 

Consolidation type - Normal

Note

With this option, you activate the corresponding parameter fields to be completed.

Select this check box to create elements and consolidate them inside the given dimension. This consolidation structures the created elements in different levels.

 

 

Input Column: select a column from the drop-down list. The columns in the drop-down list are those you defined for the schema. The values from the selected column are used to process dimension elements.

 

 

Element type: Select the type of elements. It may be:

- Numeric

- Text

 

 

Creation mode: Select the creation mode for the elements to be created. This mode may be:

- Add: simply add an element to the dimension.

- Force add: force the creation of this element. If the element exists, it will be recreated.

- Update: updates this element if it exists.

- Add or Update: if this element does not exist, it will be created, otherwise it will be updated. This is the default option.

 

Consolidation type - Self-referenced

Note

With this option, you activate the corresponding parameter fields to be completed.

Select this check box to create elements and structure them based on a parent-child relationship. The input stream is responsible for the grouping of the consolidation.

 

Element's type

Select the type of elements. It may be:

- Numeric

- Text

 

Creation mode

Select the creation mode for the elements to be created. This mode may be:

- Add: simply add an element to the dimension.

- Force add: force the creation of this element. If the element exists, it will be recreated.

- Update: update this element if it exists.

- Add or Update: if this element does not exist, it will be created, otherwise it will be updated. This is the default option.

 

 

Input Column: select a column from the drop-down list. The columns in the drop-down list are those you defined for the schema. The values from the selected column are used to process dimension elements.

 

 

Hierarchy Element: select the type and the relationship of this input column in the consolidation.

- Parent: set the input value as parent element.

- Child: relate the input value to the parent value and build the consolidation.

- Factor: define the factor for this consolidation.
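To make the Parent, Child and Factor roles concrete, the sketch below folds input rows of the form parent, child, factor into a parent-to-children map. It is a minimal model only: the SelfReferencedSketch class, the consolidate method and the sample element names are hypothetical and do not reflect the component's internal implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a self-referenced consolidation built from an input flow.
final class SelfReferencedSketch {

    // Each row is {parent, child, factor}; the result maps parent -> (child -> factor).
    static Map<String, Map<String, Double>> consolidate(String[][] rows) {
        Map<String, Map<String, Double>> consolidation = new LinkedHashMap<>();
        for (String[] row : rows) {
            String parent = row[0];
            String child = row[1];
            double factor = Double.parseDouble(row[2]);
            consolidation.computeIfAbsent(parent, key -> new LinkedHashMap<>()).put(child, factor);
        }
        return consolidation;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"2010", "Q1", "1"},
            {"2010", "Q2", "1"},
            {"Q1", "Jan", "1"}
        };
        System.out.println(consolidate(rows));
    }
}
```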

Advanced settings

tStatCatcher Statistics

Select this check box to collect log data at the component level.

Global Variables

DIMENSIONNAME: the name of the dimension. This is an After variable and it returns a string.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill up a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.
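In the Java code of a Job, component variables are generally read from globalMap with a key made of the component's unique name followed by the variable name. The sketch below simulates globalMap with a plain HashMap so that it runs outside the Studio; the component name tPaloDimension_1 and the helper dimensionName are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-alone simulation of reading an After variable; globalMap is mocked with a HashMap.
final class GlobalVariableSketch {

    // Builds the key <componentName>_DIMENSIONNAME and casts the stored value to String.
    static String dimensionName(Map<String, Object> globalMap, String componentName) {
        return (String) globalMap.get(componentName + "_DIMENSIONNAME");
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        // Assumption: the component is named tPaloDimension_1 and has already run.
        globalMap.put("tPaloDimension_1_DIMENSIONNAME", "Date");
        System.out.println(dimensionName(globalMap, "tPaloDimension_1"));
    }
}
```

Inside a Job, the equivalent expression would typically be ((String)globalMap.get("tPaloDimension_1_DIMENSIONNAME")), used in a component that runs after tPaloDimension has finished.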

For further information about variables, see Talend Studio User Guide.

Connections

Outgoing links (from this component to another):

Trigger: Run if; On Subjob Ok; On Subjob Error; On Component Ok; On Component Error.

Incoming links (from one component to this one):

Row: Main; Iterate

Trigger: Run if; On Subjob Ok; On Subjob Error; On Component Ok; On Component Error.

For further information regarding connections, see Talend Studio User Guide.

Usage

This component can be used as a standalone component or as the end component of a process.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

Deletion of dimension elements is only possible with the consolidation type None. Only the consolidation type Self-referenced allows a factor to be placed on the consolidation.

Due to license incompatibility, one or more JARs required to use this component are not provided. You can install the missing JARs for this particular component by clicking the Install button on the Component tab view. You can also find and add all missing JARs on the Modules tab in the Integration perspective of your Studio. For details, see the article Installing External Modules on Talend Help Center (https://help.talend.com), or see how to configure the Studio in the Talend Installation Guide.

Scenario: Creating a dimension with elements

The Job in this scenario creates a date dimension with a simple element hierarchy composed of three levels: Year, Month, Date.

To replicate this scenario, proceed as follows:

Setting up the Job

  1. Drop tPaloConnection, tRowGenerator, tMap, and tPaloDimension from the component Palette onto the design workspace.

  2. Right-click tPaloConnection to open the contextual menu and select Trigger > On Subjob Ok to link it to tRowGenerator.

  3. Right-click tRowGenerator to open the contextual menu and select Row > Main to link it to tMap.

    Note

    tRowGenerator is used to generate random rows in order to simplify this process. In a real-world case, you can use one of the other input components to load your actual data.

  4. Right-click tMap to open the contextual menu and select Row > New output to link it to tPaloDimension, then name the link out1 in the dialog box that pops up.

Setting up the DB connection

  1. Double-click the tPaloConnection component to open its Component view.

  2. In the Host name field, type in the host name or the IP address of the host server, localhost for this example.

  3. In the Server Port field, type in the listening port number of the Palo server. In this scenario, it is 7777.

  4. In the Username field and the Password field, type in the authentication information. In this example, both of them are admin.

Configuring the input component

  1. Double-click tRowGenerator to open its editor.

  2. On the upper part of the editor, click the plus button to add one column and rename it as random_date in the Column column.

  3. In the newly added row, select Date in the Type column and getRandomDate in the Functions column.

  4. In the Function parameters view on the lower part of this editor, type in the new minimum date and maximum date values in the Value column. In this example, the minimum is 2010-01-01, the maximum is 2010-12-31.

  5. Click OK to validate your modifications and close the editor.

  6. On the dialog box that pops up, click OK to propagate your changes.
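For readers who want to reproduce the input data outside the Studio, the sketch below picks a random date between the two bounds used in this example. It relies on java.time and is not the implementation of getRandomDate; the RandomDateSketch class and randomDate method are hypothetical, and treating both bounds as inclusive is an assumption of the sketch.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Random;

// Stand-alone approximation of generating a random date within a range.
final class RandomDateSketch {

    // Picks a date uniformly between min and max, both inclusive (assumption of this sketch).
    static LocalDate randomDate(LocalDate min, LocalDate max, Random random) {
        if (max.isBefore(min)) {
            throw new IllegalArgumentException("max must not be before min");
        }
        int days = (int) ChronoUnit.DAYS.between(min, max);
        return min.plusDays(random.nextInt(days + 1));
    }

    public static void main(String[] args) {
        LocalDate min = LocalDate.of(2010, 1, 1);
        LocalDate max = LocalDate.of(2010, 12, 31);
        System.out.println(randomDate(min, max, new Random()));
    }
}
```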

Configuration in the tMap editor

  1. Double-click tMap to open its editor.

  2. On the Schema editor view on the lower part of the tMap editor, under the out1 table, click the plus button to add three rows.

  3. In the Column column of the out1 table, type in the new names for the three newly added rows. They are Year, Month, and Date. These rows are then added automatically into the out1 table on the upper part of the tMap editor.

  4. In the out1 table on the upper part of the tMap editor, click the Expression column in the Year row to place the cursor there.

  5. Press Ctrl+Space to open the drop-down variable list.

  6. Double-click TalendDate.formatDate to select it from the list. The expression to get the date displays in the Year row under the Expression column. The expression is TalendDate.formatDate("yyyy-MM-dd HH:mm:ss",myDate).

  7. Replace the default expression with TalendDate.formatDate("yyyy",row1.random_date).

  8. Do the same for the Month row and the Date row: add the default expression, then replace it with TalendDate.formatDate("MM",row1.random_date) for the Month row and with TalendDate.formatDate("dd-MM-yyyy", row1.random_date) for the Date row.

  9. Click OK to validate this modification and accept the propagation by clicking OK in the dialog box that pops up.
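The three expressions above each format the same input date with a different pattern. The sketch below shows the resulting values for one sample date, using java.text.SimpleDateFormat as a stand-in on the assumption that TalendDate.formatDate follows the same pattern letters; the DateLevelsSketch class and its methods are hypothetical and only mirror the out1 columns.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// Stand-alone approximation of the three tMap expressions for Year, Month and Date.
final class DateLevelsSketch {

    // Stand-in for TalendDate.formatDate(pattern, date), assumed to use SimpleDateFormat patterns.
    static String formatDate(String pattern, Date date) {
        return new SimpleDateFormat(pattern, Locale.ROOT).format(date);
    }

    // Returns the Year, Month and Date values computed for the out1 columns.
    static String[] levels(Date randomDate) {
        return new String[] {
            formatDate("yyyy", randomDate),
            formatDate("MM", randomDate),
            formatDate("dd-MM-yyyy", randomDate)
        };
    }

    public static void main(String[] args) {
        Date sample = new GregorianCalendar(2010, Calendar.MARCH, 15).getTime();
        // Prints 2010, 03 and 15-03-2010 on separate lines.
        for (String level : levels(sample)) {
            System.out.println(level);
        }
    }
}
```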

Configuring the tPaloDimension component

  1. On the workspace, double-click tPaloDimension to open its Component view.

  2. Select the Use an existing connection check box. Then tPaloConnection_1 displays automatically in the Connection configuration field.

  3. In the Database field, type in the database in which the new dimension is created, talendDatabase for this scenario.

  4. In the Dimension field, type in the name you want to use for the dimension to be created, for example, Date.

  5. In the Action on dimension field, select the action to be performed. In this scenario, select Create dimension if not exists.

  6. Select the Create dimension elements check box.

  7. In the Consolidation Type area, select the Normal check box.

  8. Under the element hierarchy table in the Consolidation Type area, click the plus button to add three rows into the table.

  9. In the Input column column of the element hierarchy table, select Year from the drop-down list for the first row, Month for the second and Date for the third. This determines the levels of the elements taken from the different columns of the input schema.
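With the Normal consolidation type, the order of the rows in the element hierarchy table sets the levels: Year at the top, Month below it, Date at the bottom. The sketch below groups sample rows the same way to show the shape of the resulting hierarchy. It is a model only: the NormalConsolidationSketch class and hierarchy method are hypothetical, and how Palo handles identically named elements under different parents is outside the scope of the sketch.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical model of a three-level hierarchy built from Year, Month and Date columns.
final class NormalConsolidationSketch {

    // Each row holds the Year, Month and Date values, in hierarchy order.
    static Map<String, Map<String, Set<String>>> hierarchy(String[][] rows) {
        Map<String, Map<String, Set<String>>> years = new TreeMap<>();
        for (String[] row : rows) {
            years.computeIfAbsent(row[0], year -> new TreeMap<>())
                 .computeIfAbsent(row[1], month -> new TreeSet<>())
                 .add(row[2]);
        }
        return years;
    }

    public static void main(String[] args) {
        String[][] rows = {
            {"2010", "01", "05-01-2010"},
            {"2010", "01", "17-01-2010"},
            {"2010", "03", "15-03-2010"}
        };
        System.out.println(hierarchy(rows));
    }
}
```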

Job execution

Press F6 to run the Job.

A new dimension is then created in your Palo database talendDatabase.