tMondrianInput - 6.3

Talend Open Studio for Big Data Components Reference Guide

Function

tMondrianInput reads data from relational databases and produces multidimensional data sets based on an MDX query.

Purpose

tMondrianInput executes a Multidimensional Expressions (MDX) query that matches the dataset structure and schema definition, then passes the resulting multidimensional dataset to the next component via a Main row link.

tMondrianInput Properties

Component family

Business Intelligence/OLAP Cube

 

Basic settings

Mondrian Version

Select the Mondrian version you are using.

 

DB type

Select the relevant type of relational database.

 

Property type

Either Built-in or Repository.

 

 

Built-in: No property data stored centrally.

 

 

Repository: Select the Repository file where the properties are stored. The fields that follow are pre-filled with the fetched data.

 

Datasource

Name and path of the file containing the data.

 

Username and Password

DB user authentication data.

To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.

 

Schema and Edit Schema

A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

 

 

Built-in: The schema is created and stored locally for this component only. Related topic: see Talend Studio User Guide.

 

 

Repository: The schema already exists and is stored in the Repository, so it can be reused. Related topic: see Talend Studio User Guide.

 

Catalog

Path to the catalog (structure of the data warehouse).

 

MDX Query

Type in the MDX query, paying particular attention to the order of the fields so that it matches the schema definition and the data warehouse structure.

 

Encoding

Select the encoding from the list or select Custom and define it manually. This field is compulsory for DB data handling.

Advanced settings

tStatCatcher Statistics

Select this check box to collect log data at the component level.

Global Variables

NB_LINE: the number of rows read by an input component or transferred to an output component. This is a Flow variable and it returns an integer.

QUERY: the query statement being processed. This is a Flow variable and it returns a string.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
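As an illustration of how these variables are typically read from Java code in a Job, the sketch below looks them up in a map keyed by component name and variable name. It is a minimal stand-alone sketch, not the component's generated code: the key pattern `<component name>_<VARIABLE>`, the component name `tMondrianInput_1`, and the sample values are assumptions made for the example, and the `HashMap` only stands in for the `globalMap` that a running Job provides.

```java
import java.util.HashMap;
import java.util.Map;

class GlobalVariablesSketch {
    // Key pattern assumed here: "<component unique name>_<VARIABLE>".
    static String key(String component, String variable) {
        return component + "_" + variable;
    }

    // Returns NB_LINE for the given component, or 0 if it has not been set.
    static int rowCount(Map<String, Object> globalMap, String component) {
        Object value = globalMap.get(key(component, "NB_LINE"));
        return value == null ? 0 : (Integer) value;
    }

    // Returns ERROR_MESSAGE for the given component, or "" if no error was recorded.
    static String errorMessage(Map<String, Object> globalMap, String component) {
        Object value = globalMap.get(key(component, "ERROR_MESSAGE"));
        return value == null ? "" : (String) value;
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap of a running Job; the values are made up.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tMondrianInput_1_NB_LINE", 12);
        globalMap.put("tMondrianInput_1_QUERY",
                "select {[Measures].[Unit Sales]} on columns from Sales");

        System.out.println("Rows read: " + rowCount(globalMap, "tMondrianInput_1"));
        System.out.println("Query: " + globalMap.get("tMondrianInput_1_QUERY"));
        System.out.println("Error: " + errorMessage(globalMap, "tMondrianInput_1"));
    }
}
```

Because ERROR_MESSAGE is an After variable, a lookup of this kind is only meaningful once the component has finished executing.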

Usage

This component covers MDX queries for multi-dimensional datasets.

Limitation

This component requires the installation of its related jar files. For more information about installing these missing jar files, see the section on configuring the Studio in the Talend Installation and Upgrade Guide.

Scenario: Cross-join tables

This Job extracts multi-dimensional datasets from relational database tables stored in a MySQL database. The data are retrieved using a multidimensional expression (MDX query). You need to know the structure of your data, or at least have a structure description (catalog) as a reference for the dataset to be retrieved in the various dimensions.

Setting up the Job

  1. Drop tMondrianInput and tLogRow from the Palette to the design workspace.

  2. Connect tMondrianInput to the tLogRow output component using a Row Main connection.

Setting up the DB connection

  1. Double-click the tMondrianInput component to display its Basic settings view.

  2. In the DB type field, select the relational database you are using with Mondrian.

  3. Select the relevant Repository entry as Property type, if you store your DB connection details centrally. In this example the properties are built-in.

  4. Fill in the connection details for your DB: Host, Port, Database name, User Name and Password.

  5. Select the relevant Schema in the Repository if you store it centrally. In this example, the schema is built-in, so it has to be defined in the component.
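For orientation, the connection details above, together with the Catalog path, correspond to the properties of a Mondrian connect string. The sketch below assembles such a string for MySQL using plain string handling only. It is a hedged illustration: the source does not state how tMondrianInput builds its connection internally, and the class name, method name, host, port, database name, credentials, and catalog path are all hypothetical placeholders.

```java
class MondrianConnectStringSketch {
    // Assembles a Mondrian-style connect string from the connection details.
    // Values containing ';' would need quoting, which this sketch does not handle.
    static String connectString(String host, int port, String database,
                                String user, String password, String catalog) {
        String jdbcUrl = "jdbc:mysql://" + host + ":" + port + "/" + database;
        return "Provider=mondrian"
                + ";Jdbc=" + jdbcUrl
                + ";JdbcDrivers=com.mysql.jdbc.Driver"
                + ";JdbcUser=" + user
                + ";JdbcPassword=" + password
                + ";Catalog=" + catalog;
    }

    public static void main(String[] args) {
        // All values below are placeholders chosen for the example.
        System.out.println(connectString(
                "localhost", 3306, "foodmart",
                "demo_user", "demo_password", "/path/to/catalog.xml"));
    }
}
```

The Catalog property points at the XML file that describes the cubes, dimensions and measures, which is why the MDX query can refer to names such as Sales or [Measures].[Unit Sales] rather than to table columns.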

Configuring the DB query

  1. The relational data we want to query contains five columns: media, drink, unit_sales, store_cost and store_sales.

  2. The query retrieves the unit_sales, store_cost and store_sales figures for various media / drink combinations by means of an MDX query.

  3. Back on the Basic settings tab of the tMondrianInput component, set the Catalog path to the data warehouse. This catalog describes the structure of the warehouse.

  4. Then type in the MDX query, for example:

    "select
       {[Measures].[Unit Sales], [Measures].[Store Cost], [Measures].[Store Sales]} on columns,
       CrossJoin(
         { [Promotion Media].[All Media].[Radio],
           [Promotion Media].[All Media].[TV],
           [Promotion Media].[All Media].[Sunday Paper],
           [Promotion Media].[All Media].[Street Handout] },
         [Product].[All Products].[Drink].children) on rows
     from Sales
     where ([Time].[1997])"

  5. Finally, select the Encoding type from the list.
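The CrossJoin call in the query above pairs every member of the first set (the four promotion media) with every member of the second set (the children of Drink), and each pair becomes one candidate row carrying the three measures. The sketch below reproduces that pairing with plain Java lists. It does not call Mondrian; the class and method names are hypothetical, and the three Drink children and their order are assumed from the result description in this scenario.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CrossJoinSketch {
    // Pairs each member of the first set with each member of the second set.
    // The first set is the outer loop, so it varies slowest in the result.
    static List<String[]> crossJoin(List<String> first, List<String> second) {
        List<String[]> tuples = new ArrayList<>();
        for (String a : first) {
            for (String b : second) {
                tuples.add(new String[] {a, b});
            }
        }
        return tuples;
    }

    public static void main(String[] args) {
        // Members named explicitly in the MDX query.
        List<String> media = Arrays.asList("Radio", "TV", "Sunday Paper", "Street Handout");
        // Assumed children of [Product].[All Products].[Drink].
        List<String> drinks = Arrays.asList("Beverages", "Dairy", "Alcoholic beverages");

        for (String[] tuple : crossJoin(media, drinks)) {
            System.out.println(tuple[0] + " | " + tuple[1]);
        }
    }
}
```

With four media and three Drink children, the pairing yields 4 x 3 = 12 row tuples, each of which the query fills with Unit Sales, Store Cost and Store Sales for 1997.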

Job execution

  1. Select the tLogRow component and select the Print header check box to display the column names on the console.

  2. Then press F6 to run the Job.

The console shows, in table form, the unit_sales, store_cost and store_sales results for each type of Drink (Beverages, Dairy, Alcoholic beverages) crossed with each media (TV, Sunday Paper, Street handout).