tSQLTemplateMerge - 6.1

Talend Components Reference Guide

tSQLTemplateMerge properties

Component family

ELT/SQLTemplate

 

Function

This component creates an SQL MERGE statement to merge data into a database table.

Purpose

This component is used to merge data into a database table directly on the DBMS by creating and executing a MERGE statement.

Basic settings

Database Type

Select the type of database you want to work on from the drop-down list.

 

Component list

Select the relevant DB connection component from the list if you use more than one connection in the current Job.

 

Source table name

Name of the database table holding the data you want to merge into the target table.

 

Target table name

Name of the table you want to merge data into.

 

Schema and Edit schema

This component involves two schemas: source schema and target schema.

A schema is a row description, that is to say, it defines the number of fields to be processed and passed on to the next component. The schema is either built-in or remotely stored in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

 

 

Built-in: The schema is created and stored locally for this component only. Related topic: see Talend Studio User Guide.

 

 

Repository: The schema already exists and is stored in the Repository, hence can be reused. Related topic: see Talend Studio User Guide.

 

Merge ON

Specify the target and source columns you want to use as the primary keys.

 

Use UPDATE (WHEN MATCHED)

Select this check box to update existing records. With the check box selected, the UPDATE Columns table appears, allowing you to define the columns in which records are to be updated.

 

Specify additional output columns

Select this check box to update records in additional columns other than those listed in the UPDATE Columns table. With this check box selected, the Additional UPDATE Columns table appears, allowing you to specify additional columns.

 

Specify UPDATE WHERE clause

Select this check box and type in a WHERE clause in the WHERE clause field to filter data during the update operation.

Note

This option may not work with certain database versions, including Oracle 9i.

 

Use INSERT (WHEN NOT MATCHED)

Select this check box to insert new records. With the check box selected, the INSERT Columns table appears, allowing you to specify the columns to be involved in the insert operation.

 

Specify additional output columns

Select this check box to insert records into additional columns other than those listed in the INSERT Columns table. With this check box selected, the Additional INSERT Columns table appears, allowing you to specify additional columns.

 

Specify INSERT WHERE clause

Select this check box and type in a WHERE clause in the WHERE clause field to filter data during the insert operation.

Note

This option may not work with certain database versions, including Oracle 9i.
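
To make the relationship between these settings and the resulting statement more concrete, the following Python sketch assembles a MERGE statement of the general shape accepted by databases such as Oracle 10g and later. It is an illustration only, not the code of the Talend SQL templates: the function build_merge_statement, its parameters, and the Name column are hypothetical, and the exact statement the component generates depends on the selected database type and SQL templates. The table names are borrowed from the scenario later in this section.

```python
def build_merge_statement(target, source, on_columns, update_columns=None,
                          update_where=None, insert_columns=None,
                          insert_where=None):
    """Assemble a generic MERGE statement (hypothetical helper, not Talend code).

    Each *_columns argument is a list of (target_column, source_column) pairs,
    mirroring the lines of the Merge ON, UPDATE Columns and INSERT Columns tables.
    """
    # Merge ON: the key columns that decide whether a source row matches.
    on_clause = " AND ".join(
        f"{target}.{t} = {source}.{s}" for t, s in on_columns)
    lines = [f"MERGE INTO {target}", f"USING {source}", f"ON ({on_clause})"]

    # Use UPDATE (WHEN MATCHED), optionally filtered by a WHERE clause.
    if update_columns:
        assignments = ", ".join(
            f"{target}.{t} = {source}.{s}" for t, s in update_columns)
        lines.append(f"WHEN MATCHED THEN UPDATE SET {assignments}")
        if update_where:
            lines.append(f"  WHERE {update_where}")

    # Use INSERT (WHEN NOT MATCHED), optionally filtered by a WHERE clause.
    if insert_columns:
        targets = ", ".join(t for t, _ in insert_columns)
        values = ", ".join(f"{source}.{s}" for _, s in insert_columns)
        lines.append(
            f"WHEN NOT MATCHED THEN INSERT ({targets}) VALUES ({values})")
        if insert_where:
            lines.append(f"  WHERE {insert_where}")

    return "\n".join(lines)


statement = build_merge_statement(
    target="customer_info_merge",
    source="new_customer_info",
    on_columns=[("ID", "ID")],
    update_columns=[("Name", "Name")],  # Name is an assumed column
    update_where="customer_info_merge.ID >= 4",
    insert_columns=[("ID", "ID"), ("Name", "Name")],
)
print(statement)
```

Note that the key column listed in on_columns is left out of update_columns, which matches the rule that primary key columns are not updated.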

Advanced settings

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level as well as at the component level.

SQL Template

SQL Template List

To add a default system SQL template: Click the Add button to add the default system SQL template(s) in the SQL Template List.

Click in the SQL template field and then click the arrow to display the system SQL template list. Select the desired system SQL template provided by Talend.

Note: You can create your own SQL templates and add them to the SQL Template List.

To create a user-defined SQL template:

- Select a system template from the SQL Template List and click its code in the code box. The system prompts you to create a new template.

- Click Yes to open the SQL template wizard.

- Define your new SQL template in the corresponding fields and click Finish to close the wizard. An SQL template editor opens where you can enter the template code.

- Click the Add button to add the newly created template to the SQL Template List.

For more information, see Talend Studio User Guide.

Global Variables

NB_LINE: the number of rows read by an input component or transferred to an output component. This is an After variable and it returns an integer.

NB_LINE_MERGED: the number of rows merged. This is an After variable and it returns an integer.

QUERY: the SQL query statement being processed. This is a Flow variable and it returns a string.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. If the component has a Die on error check box, this variable functions only when that check box is cleared.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.

Usage

This component is used as an intermediate component with other relevant DB components, especially the DB connection and commit components.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Scenario: Merging data directly on the DBMS

This scenario describes a simple Job that opens a connection to a MySQL database, merges data from a source table into a target table according to customer IDs, and displays the contents of the target table before and after the merge action. A WHERE clause is used to filter data during the merge operation.

  • Drop a tMysqlConnection component, a tSQLTemplateMerge component, two tMysqlInput components and two tLogRow components from the Palette onto the design workspace.

  • Connect the tMysqlConnection component to the first tMysqlInput component using a Trigger > OnSubjobOK connection.

  • Connect the first tMysqlInput component to the first tLogRow component using a Row > Main connection. This row will display the initial contents of the target table on the console.

  • Connect the first tMysqlInput component to the tSQLTemplateMerge component, and the tSQLTemplateMerge component to the second tMysqlInput component using Trigger > OnSubjobOK connections.

  • Connect the second tMysqlInput component to the second tLogRow component using a Row > Main connection. This row will display the merge result on the console.

  • Double-click the tMysqlConnection component to display its Basic settings view.

  • Set the database connection details manually or select Repository from the Property Type list and select your DB connection if it has already been defined and stored in the Metadata area of the Repository tree view.

    For more information about Metadata, see Talend Studio User Guide.

  • Double-click the first tMysqlInput component to display its Basic settings view.

  • Select the Use an existing connection check box. If you are using more than one DB connection component in your Job, select the component you want to use from the Component List.

  • Click the three-dot button next to Edit schema and define the data structure of the target table, or select Repository from the Schema list and select the target table if the schema has already been defined and stored in the Metadata area of the Repository tree view.

    In this scenario, we use built-in schemas.

  • Define the columns as shown above, and then click OK to propagate the schema structure to the output component and close the schema dialog box.

  • Fill the Table Name field with the name of the target table, customer_info_merge in this scenario.

  • Click the Guess Query button, or type in "SELECT * FROM customer_info_merge" in the Query area, to retrieve all the table columns.

  • Define the properties of the second tMysqlInput component, using exactly the same settings as for the first tMysqlInput component.

  • In the Basic settings view of each tLogRow component, select the Table option in the Mode area so that the contents will be displayed in table cells on the console.

  • Double-click the tSQLTemplateMerge component to display its Basic settings view.

  • Type in the names of the source table and the target table in the relevant fields.

    In this scenario, the source table is new_customer_info, which contains eight records; the target table is customer_info_merge, which contains five records, and both tables have the same data structure.

Note

The source table and the target table may have different schema structures. In this case, however, make sure that the source column and target column specified in each line of the Merge ON table, the UPDATE Columns table, and the INSERT Columns table have the same data type, and that the target column is long enough to hold the data from the corresponding source column.

  • Define the source schema manually, or select Repository from the Schema list and select the relevant table if the schema has already been defined and stored in the Metadata area of the Repository tree view.

    In this scenario, we use built-in schemas.

  • Define the columns as shown above and click OK to close the schema dialog box, and do the same for the target schema.

  • Click the green plus button beneath the Merge ON table to add a line, and select the ID column as the primary key.

  • Select the Use UPDATE check box to update existing data during the merge operation, and define the columns to be updated by clicking the green plus button and selecting the desired columns.

    In this scenario, we want to update all the columns according to the customer IDs. Therefore, we select all the columns except the ID column.

Warning

The columns defined as the primary key must not be updated, so do not add them to the UPDATE Columns table.

  • Select the Specify UPDATE WHERE clause check box and type in "customer_info_merge.ID >= 4", including the double quotation marks, in the WHERE clause field so that only existing records with an ID equal to or greater than 4 are updated.

  • Select the Use INSERT check box and, in the INSERT Columns table, define the source columns to take data from and the target columns to insert data into.

    In this example, we want to insert all the records that do not exist in the target table.

  • Select the SQL Template view to display and add the SQL templates to be used.

    By default, the tSQLTemplateMerge component uses two system SQL templates: MergeUpdate and MergeInsert.

Note

In the SQL Template tab, you can add system SQL templates or create your own and use them within your Job to carry out the coded operation. For more information, see tSQLTemplateFilterColumns Properties.

  • Click the Add button to add a line and select Commit from the template list to commit the merge result to your database.

    Alternatively, you can connect the tSQLTemplateMerge component to a tSQLTemplateCommit or tMysqlCommit component using a Trigger > OnSubjobOK connection to commit the merge result to your database.

  • Save your Job and press F6 to run it.

    Both the original contents of the target table and the merge result are displayed on the console. In the target table, records No. 4 and No. 5 contain the updated information, and records No. 6 through No. 8 contain the inserted information.
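
The outcome of this scenario can be reproduced outside the Studio with a small Python sketch. It uses an in-memory SQLite database as a stand-in for MySQL and emulates the merge with two plain statements, an UPDATE for matched records and an INSERT for unmatched ones, loosely mirroring the split between the MergeUpdate and MergeInsert templates. Everything here is an assumption made for illustration: the single Name column, the old_ and new_ sample values, and the IDs 1 to 5 in the target table and 1 to 8 in the source table. It is not the SQL that the component itself generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_info_merge (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE new_customer_info (ID INTEGER PRIMARY KEY, Name TEXT);
""")

# Assumed sample data: five target records and eight source records.
conn.executemany("INSERT INTO customer_info_merge VALUES (?, ?)",
                 [(i, f"old_{i}") for i in range(1, 6)])
conn.executemany("INSERT INTO new_customer_info VALUES (?, ?)",
                 [(i, f"new_{i}") for i in range(1, 9)])

# Matched records: update only those that satisfy the WHERE clause (ID >= 4).
conn.execute("""
    UPDATE customer_info_merge
    SET Name = (SELECT s.Name FROM new_customer_info AS s
                WHERE s.ID = customer_info_merge.ID)
    WHERE customer_info_merge.ID >= 4
      AND EXISTS (SELECT 1 FROM new_customer_info AS s
                  WHERE s.ID = customer_info_merge.ID)
""")

# Unmatched records: insert source rows whose ID is absent from the target.
conn.execute("""
    INSERT INTO customer_info_merge (ID, Name)
    SELECT s.ID, s.Name FROM new_customer_info AS s
    WHERE NOT EXISTS (SELECT 1 FROM customer_info_merge AS t
                      WHERE t.ID = s.ID)
""")
conn.commit()

rows = conn.execute(
    "SELECT ID, Name FROM customer_info_merge ORDER BY ID").fetchall()
for row in rows:
    print(row)
```

With this sample data, records 1 to 3 keep their old values because the WHERE clause excludes them, records 4 and 5 take the values from the source table, and records 6 to 8 are inserted.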