tELTMysqlMap - 6.1

Talend Components Reference Guide

tELTMysqlMap properties

The three ELT MySQL components are closely related in terms of their operating conditions. Use them together to handle MySQL DB schemas and generate Insert statements, including clauses, which are executed against the defined DB output table.

Component family

ELT/Map/Mysql

 

Function

Helps you graphically build the SQL statement using the tables provided as input.

Purpose

Uses the tables provided as input to feed the parameters of the built statement. The statement can include inner or outer joins between tables or between one table and its aliases.

Basic settings

Use an existing connection

Select this check box and in the Component List click the relevant connection component to reuse the connection details you already defined.

Note

When a Job contains a parent Job and a child Job and you need to share an existing connection between the two levels, for example, to share the connection created by the parent Job with the child Job, you have to:

  1. In the parent level, register the database connection to be shared in the Basic settings view of the connection component which creates that very database connection.

  2. In the child level, use a dedicated connection component to read that registered database connection.

For an example about how to share a database connection across Job levels, see Talend Studio User Guide.

 

ELT Mysql Map editor

The ELT Map editor lets you define the output schema and graphically build the SQL statement to be executed. The column names of the schema can differ from the column names in the database.

 

Style link

Select the way in which links are displayed.

Auto: By default, the links between the input and output schemas and the Web service parameters are in the form of curves.

Bezier curve: Links between the schema and the Web service parameters are in the form of curves.

Line: Links between the schema and the Web service parameters are in the form of straight lines. This option slightly optimizes performance.

 

Property type

Either Built-in or Repository.

Since version 5.6, both the Built-in mode and the Repository mode are available in any of the Talend solutions.

Built-in: No property data stored centrally.

Repository: Select the Repository file where Properties are stored. The following fields are pre-filled in using fetched data.

 

Host

Database server IP address.

 

Port

Listening port number of the DB server.

 

Database

Name of the database.

 

Username and Password

DB user authentication data.

To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.

Dynamic settings

Click the [+] button to add a row in the table and fill the Code field with a context variable to choose your database connection dynamically from multiple connections planned in your Job. This feature is useful when you need to access database tables that have the same data structure but reside in different databases, especially when you are working in an environment where you cannot change your Job settings, for example, when your Job has to be deployed and executed independently of Talend Studio.

The Dynamic settings table is available only when the Use an existing connection check box is selected in the Basic settings view. Once a dynamic parameter is defined, the Component List box in the Basic settings view becomes unusable.

For examples on using dynamic parameters, see Scenario 3: Reading data from MySQL databases through context-based dynamic connections and Scenario: Reading data from different MySQL databases using dynamically loaded connection parameters. For more information on Dynamic settings and context variables, see Talend Studio User Guide.
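
The idea behind a dynamic connection can be sketched outside Talend Studio as a lookup keyed by a context value. The sketch below is only an analogy, not the code Talend generates: the connection names conn_dev and conn_prod, the context dictionary, the customers table, and the use of in-memory SQLite databases in place of MySQL are all assumptions made for illustration.

```python
import sqlite3


def make_db(label):
    """Create an in-memory database holding one table with a fixed structure."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT)")
    conn.execute("INSERT INTO customers VALUES (?)", (label,))
    return conn


# Hypothetical connections planned in the Job; both expose the same table structure.
connections = {
    "conn_dev": make_db("dev customer"),
    "conn_prod": make_db("prod customer"),
}

# Hypothetical context variable: its value names the connection to use at run time.
context = {"db_connection": "conn_prod"}

active = connections[context["db_connection"]]
rows = active.execute("SELECT name FROM customers").fetchall()
print(rows)
```

Changing only the context value switches the database that is read, while the query itself stays the same.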

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.

Usage

tELTMysqlMap is used along with a tELTMysqlInput and a tELTMysqlOutput. Note that the name of the Output link used with these components must match the table name exactly.

Note

The ELT components do not handle actual data flow but only schema information.

Connecting ELT components

The ELT components do not handle any data as such, only the table schema information that is used to build the SQL query to execute.

Therefore, the only connection required between these components is a simple link.

Note

The output name you give to this link when creating it should always be the exact name of the table to be accessed, because this name is used in the generated SQL statement.
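
Why the link name matters can be sketched as follows. This is a minimal illustration, assuming an in-memory SQLite database as a stand-in for MySQL; the select_from helper and the sample row are hypothetical and only mimic the way a link name ends up as a table name in the statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE owners (ID_Owner INTEGER, Name TEXT)")
conn.execute("INSERT INTO owners VALUES (1, 'Ann')")


def select_from(link_name):
    """Use the link name verbatim as the table name of the statement."""
    return conn.execute(f"SELECT Name FROM {link_name}").fetchall()


# A link named exactly like the table produces a valid statement.
rows = select_from("owners")

# A descriptive but inexact link name refers to a table that does not exist.
try:
    select_from("owners_link")
    misnamed_failed = False
except sqlite3.OperationalError:
    misnamed_failed = True

print(rows, misnamed_failed)
```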

Related topic: see Talend Studio User Guide.

Mapping and joining tables

In the ELT Mapper, you can select specific columns from input schemas and include them in the output schema.

  • As in the regular Map editor, drag and drop the content from the input schema to the defined output table.

  • Use the Ctrl and Shift keys to select multiple contiguous or non-contiguous table columns.

You can implement explicit joins to retrieve various data from different tables.

  • Select the Explicit join check box for the relevant column, and select a type of join from the Join list.

  • Possible joins are Inner Join, Left Outer Join, Right Outer Join, Full Outer Join, and Cross Join.

  • By default the Inner Join is selected.
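
The effect of the join type on the rows returned can be sketched with two of the listed joins. This is an illustration only, assuming an in-memory SQLite database in place of MySQL and made-up sample rows; the run_join helper is hypothetical, and the statement Talend generates may be laid out differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE owners (ID_Owner INTEGER, Name TEXT)")
conn.execute("CREATE TABLE cars (Registration TEXT, ID_Owner INTEGER)")
conn.executemany("INSERT INTO owners VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.execute("INSERT INTO cars VALUES ('AB-123', 1)")  # Bob owns no car


def run_join(join_type):
    """Run the same select with the given join keyword and return all rows."""
    sql = (
        "SELECT owners.Name, cars.Registration FROM owners "
        f"{join_type} cars ON (cars.ID_Owner = owners.ID_Owner) "
        "ORDER BY owners.ID_Owner"
    )
    return conn.execute(sql).fetchall()


inner_rows = run_join("INNER JOIN")       # only owners that have a matching car
left_rows = run_join("LEFT OUTER JOIN")   # every owner, NULL where no car matches

print(inner_rows)
print(left_rows)
```

The inner join drops the owner without a car, whereas the left outer join keeps that owner and fills the car column with a null value.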

You can also create Alias tables to retrieve various data from the same table.

  • In the Input area, click on the [+] button to create an Alias.

  • Define the table to base the alias on.

  • Type in a new name for the alias table, preferably not the same as the main table.

Adding where and other clauses

You can also restrict the Select statement based on a Where clause and/or other clauses such as Group By, Order By, etc. by clicking the Add filter row button at the top of the output table in the map editor.

To add a restriction based on a Where clause, click the Add filter row button and select Add a WHERE clause from the pop-up menu.

To add a restriction based on Group By, Order By etc., click the Add filter row button and select Add an other(GROUP...) clause from the pop-up menu.

Make sure that all input components are linked correctly to the ELT Map component to be able to implement all inclusions, joins and clauses.
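
How a WHERE clause and another clause combine with the Select statement can be sketched as plain string assembly. The example assumes an in-memory SQLite database as a stand-in for MySQL, invented sample rows, and hypothetical variable names; it does not reproduce the editor's internal logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resellers (Name_Reseller TEXT, City TEXT)")
conn.executemany(
    "INSERT INTO resellers VALUES (?, ?)",
    [("R1", "Augusta"), ("R2", "Augusta"), ("R3", "Boston")],
)

base_select = "SELECT City, COUNT(*) FROM resellers"
where_clause = "resellers.City = 'Augusta'"   # added as a WHERE clause
other_clause = "GROUP BY City ORDER BY City"  # added as an other (GROUP...) clause

# The WHERE clause comes first, followed by the grouping and ordering clauses.
statement = f"{base_select} WHERE {where_clause} {other_clause}"
rows = conn.execute(statement).fetchall()
print(statement)
print(rows)
```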

Generating the SQL statement

Mapping elements from the input schemas to the output schema instantly creates the corresponding Select statement.

The clauses are also included automatically.

Scenario 1: Aggregating table columns and filtering

This scenario describes a Job that gathers several input DB table schemas and implements a clause to filter the output using an SQL statement.

  • Drop the following components from the Palette onto the design workspace: three tELTMysqlInput components, a tELTMysqlMap, and a tELTMysqlOutput. Label these components to best describe their functionality.

  • Double-click the first tELTMysqlInput component to display its Basic settings view.

  • Select Repository from the Schema list, click the three-dot button preceding Edit schema, and select your DB connection and the desired schema from the [Repository Content] dialog box.

    The selected schema name appears in the Default Table Name field automatically.

    In this use case, the DB connection is Talend_MySQL and the schema for the first input component is owners.

  • Set the second and third tELTMysqlInput components in the same way but select cars and resellers respectively as their schema names.

Note

In this use case, all the involved schemas are stored in the Metadata node of the Repository tree view for easy retrieval. For further information concerning metadata, see Talend Studio User Guide.

You can also select the three input components by dropping the relevant schemas from the Metadata area onto the design workspace and double-clicking tELTMysqlInput from the [Components] dialog box. Doing so allows you to skip the steps of labeling the input components and defining their schemas manually.

  • Connect the three tELTMysqlInput components to the tELTMysqlMap component using links named strictly after the actual DB table names: owners, cars and resellers.

  • Connect the tELTMysqlMap component to the tELTMysqlOutput component and name the link agg_result, which is the name of the database table you will save the aggregation result to.

  • Click the tELTMysqlMap component to display its Basic settings view.

  • Select Repository from the Property Type list, and select the same DB connection that you use for the input components.

    All the database details are automatically retrieved.

  • Leave all the other settings as they are.

  • Double-click the tELTMysqlMap component to launch the ELT Map editor to set up joins between the input tables and define the output flow.

  • Add the input tables by clicking the green plus button at the upper left corner of the ELT Map editor and selecting the relevant table names in the [Add a new alias] dialog box.

  • Drop the ID_Owner column from the owners table to the corresponding column of the cars table.

  • In the cars table, select the Explicit join check box in front of the ID_Owner column.

    As the default join type, INNER JOIN is displayed on the Join list.

  • Drop the ID_Reseller column from the cars table to the corresponding column of the resellers table to set up the second join, and define the join as an inner join in the same way.

  • Select the columns to be aggregated into the output table, agg_result.

  • Drop the ID_Owner, Name, and ID_Insurance columns from the owners table to the output table.

  • Drop the Registration, Make, and Color columns from the cars table to the output table.

  • Drop the Name_Reseller and City columns from the resellers table to the output table.

  • With the relevant columns selected, the mappings are displayed in yellow and the joins are displayed in dark violet.

  • Set up a filter in the output table. Click the Add filter row button on top of the output table to display the Additional clauses expression field, drop the City column from the resellers table to the expression field, and complete a WHERE clause that reads resellers.City = 'Augusta'.

  • Click the Generated SQL Select query tab to display the corresponding SQL statement.

  • Click OK to save the ELT Map settings.

  • Double-click the tELTMysqlOutput component to display its Basic settings view.

  • Select an action from the Action on data list as needed.

  • Select Repository as the schema type, and define the output schema in the same way as you defined the input schemas. In this use case, select agg_result as the output schema, which is the name of the database table used to store the mapping result.

Note

You can also use a built-in output schema and retrieve the schema structure from the preceding component; however, make sure that you specify an existing target table having the same data structure in your database.

  • Leave all the other settings as they are.

  • Save your Job and press F6 to launch it.

    All selected data is inserted in the agg_result table as specified in the SQL statement.
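
The statement built in this scenario can be approximated and tried out as follows. This is a sketch under stated assumptions, not the exact text shown in the Generated SQL Select query tab: it uses an in-memory SQLite database instead of MySQL, the column types and sample rows are invented, and only the table names, column names, joins, and filter come from the steps above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE owners (ID_Owner INTEGER, Name TEXT, ID_Insurance TEXT);
CREATE TABLE cars (Registration TEXT, Make TEXT, Color TEXT,
                   ID_Owner INTEGER, ID_Reseller INTEGER);
CREATE TABLE resellers (ID_Reseller INTEGER, Name_Reseller TEXT, City TEXT);
CREATE TABLE agg_result (ID_Owner INTEGER, Name TEXT, ID_Insurance TEXT,
                         Registration TEXT, Make TEXT, Color TEXT,
                         Name_Reseller TEXT, City TEXT);
-- Invented sample rows.
INSERT INTO owners VALUES (1, 'Ann', 'INS-1'), (2, 'Bob', 'INS-2');
INSERT INTO cars VALUES ('AB-123', 'Ford', 'Red', 1, 10),
                        ('CD-456', 'Audi', 'Blue', 2, 20);
INSERT INTO resellers VALUES (10, 'Best Cars', 'Augusta'),
                             (20, 'City Motors', 'Boston');
""")

# Two inner joins plus the WHERE filter, feeding an Insert into the output table.
statement = """
INSERT INTO agg_result
    (ID_Owner, Name, ID_Insurance, Registration, Make, Color, Name_Reseller, City)
SELECT owners.ID_Owner, owners.Name, owners.ID_Insurance,
       cars.Registration, cars.Make, cars.Color,
       resellers.Name_Reseller, resellers.City
FROM owners
INNER JOIN cars ON (owners.ID_Owner = cars.ID_Owner)
INNER JOIN resellers ON (cars.ID_Reseller = resellers.ID_Reseller)
WHERE resellers.City = 'Augusta'
"""
conn.execute(statement)

result_rows = conn.execute("SELECT * FROM agg_result").fetchall()
print(result_rows)
```

No rows travel through the client in this pattern: the database reads, joins, filters, and inserts in a single statement, which is the point of the ELT approach.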

Scenario 2: ELT using an Alias table

This scenario describes a Job that maps information from two input tables and an alias table, serving as a virtual input table, to an output table. The employees table contains employees' IDs, their department numbers, their names, and the IDs of their respective managers. The managers are also considered as employees and hence included in the employees table. The dept table contains the department information. The alias table retrieves the names of the managers from the employees table.

  • Drop two tELTMysqlInput components, a tELTMysqlMap component, and a tELTMysqlOutput component to the design workspace, and label them to best describe their functionality.

  • Double-click the first tELTMysqlInput component to display its Basic settings view.

  • Select Repository from the Schema list, and define the DB connection and schema by clicking the three-dot button preceding Edit schema.

    The DB connection is Talend_MySQL and the schema for the first input component is employees.

Note

In this use case, all the involved schemas are stored in the Metadata node of the Repository tree view for easy retrieval. For further information concerning metadata, see Talend Studio User Guide.

  • Set the second tELTMysqlInput component in the same way but select dept as its schema.

  • Double-click the tELTMysqlOutput component to display its Basic settings view.

  • Select an action from the Action on data list as needed, Insert in this use case.

  • Select Repository as the schema type, and define the output schema in the same way as you defined the input schemas. In this use case, select result as the output schema, which is the name of the database table used to store the mapping result.

    The output schema contains all the columns of the input schemas plus a ManagerName column.

  • Leave all the other parameters as they are.

  • Connect the two tELTMysqlInput components to the tELTMysqlMap component using Link connections named strictly after the actual input table names, employees and dept in this use case.

  • Connect the tELTMysqlMap component to the tELTMysqlOutput component using a Link connection. When prompted, click Yes to allow the ELT Mapper to retrieve the output table structure from the output schema.

  • Click the tELTMysqlMap component and select the Component tab to display its Basic settings view.

  • Select Repository from the Property Type list, and select the same DB connection that you use for the input components.

    All the DB connection details are automatically retrieved.

  • Leave all the other parameters as they are.

  • Click the three-dot button next to ELT Mysql Map Editor or double-click the tELTMysqlMap component on the design workspace to launch the ELT Map editor.

    With the tELTMysqlMap component connected to the output component, the output table is displayed in the output area.

  • Add the input tables, employees and dept, in the input area by clicking the green plus button and selecting the relevant table names in the [Add a new alias] dialog box.

  • Create an alias table based on the employees table by selecting employees from the Select the table to use list and typing in Managers in the Type in a valid alias field in the [Add a new alias] dialog box.

  • Drop the DeptNo column from the employees table to the dept table.

  • Select the Explicit join check box in front of the DeptNo column of the dept table to set up an inner join.

  • Drop the ManagerID column from the employees table to the ID column of the Managers table.

  • Select the Explicit join check box in front of the ID column of the Managers table and select LEFT OUTER JOIN from the Join list to allow the output rows to contain Null values.

  • Drop all the columns from the employees table to the corresponding columns of the output table.

  • Drop the DeptName and Location columns from the dept table to the corresponding columns of the output table.

  • Drop the Name column from the Managers table to the ManagerName column of the output table.

  • Click on the Generated SQL Select query tab to display the SQL query statement to be executed.

  • Save your Job and press F6 to run it.

    The output database table result contains all the information about the employees, including the names of their respective managers.
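
The role of the Managers alias can be sketched with an equivalent statement. As before, this is an approximation under assumptions: an in-memory SQLite database replaces MySQL, the column types and sample rows are invented, and the layout of the statement Talend generates may differ. The table, alias, and column names are taken from the steps above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (ID INTEGER, DeptNo INTEGER, Name TEXT, ManagerID INTEGER);
CREATE TABLE dept (DeptNo INTEGER, DeptName TEXT, Location TEXT);
CREATE TABLE result (ID INTEGER, DeptNo INTEGER, Name TEXT, ManagerID INTEGER,
                     DeptName TEXT, Location TEXT, ManagerName TEXT);
-- Invented sample rows: Alice has no manager, Bob reports to Alice.
INSERT INTO dept VALUES (1, 'Sales', 'Paris');
INSERT INTO employees VALUES (1, 1, 'Alice', NULL), (2, 1, 'Bob', 1);
""")

# employees is read twice: once under its own name and once under the alias Managers.
statement = """
INSERT INTO result
    (ID, DeptNo, Name, ManagerID, DeptName, Location, ManagerName)
SELECT employees.ID, employees.DeptNo, employees.Name, employees.ManagerID,
       dept.DeptName, dept.Location, Managers.Name
FROM employees
INNER JOIN dept ON (employees.DeptNo = dept.DeptNo)
LEFT OUTER JOIN employees Managers ON (employees.ManagerID = Managers.ID)
"""
conn.execute(statement)

result_rows = conn.execute("SELECT * FROM result ORDER BY ID").fetchall()
for row in result_rows:
    print(row)
```

Because the join on the alias is a left outer join, the employee without a manager is kept and receives a null ManagerName; an inner join would drop that row.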

Related scenario

For a related scenario using subquery, see Scenario: Mapping data using a subquery.