tMSSqlConnection - 6.3

Talend Open Studio for Big Data Components Reference Guide

Function

tMSSqlConnection opens a connection to a Microsoft SQL Server database or a Microsoft Azure SQL database.

Purpose

This component is used to open a connection to the specified database that can then be reused in the subsequent subjob or subjobs.

tMSSqlConnection properties

Component family

Databases/MS SQL Server

 

Basic settings

Property type

Either Built-in or Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

Built-in: No property data stored centrally.

Repository: Select the repository file in which the properties are stored. The fields that follow are completed automatically using the data retrieved.

 

JDBC Provider

Select the provider of the JDBC driver to be used, either Microsoft (recommended) or Open source JTDS.

Note that when Microsoft is selected, you need to download the Microsoft JDBC Driver for SQL Server from Microsoft Download Center, unpack the downloaded zip file, choose a jar in the unzipped folder based on your JRE version, rename the jar to mssql-jdbc.jar, and install it manually. For more information about choosing the jar, see the System Requirements information on Microsoft Download Center.

Host

IP address or hostname of the database server.

Port

Listening port number of the database server.

Schema

Schema name.

 

Database

Name of the database.

 

Username and Password

DB user authentication data.

To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.

 

Additional JDBC parameters

Specify additional connection properties for the database connection you are creating. The properties are separated by semicolons, and each property is a key-value pair. For example, for an Azure SQL database connection: encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;
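
As a rough illustration of the format described above, the following Python sketch splits such a string into key-value pairs and joins the pairs back together. The helper names parse_jdbc_params and format_jdbc_params are hypothetical; they are not part of Talend or of any JDBC driver.

```python
def parse_jdbc_params(text: str) -> dict:
    """Split 'key=value;key=value;' into a dict, ignoring blanks around entries."""
    params = {}
    for entry in text.split(";"):
        entry = entry.strip()
        if not entry:                      # tolerate a trailing semicolon
            continue
        key, sep, value = entry.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"not a key=value pair: {entry!r}")
        params[key.strip()] = value.strip()
    return params


def format_jdbc_params(params: dict) -> str:
    """Join the pairs back into the semicolon-separated form."""
    return "".join(f"{key}={value};" for key, value in params.items())


# The Azure SQL example from the field description (hypothetical usage).
azure = parse_jdbc_params(
    "encrypt=true;trustServerCertificate=false;"
    "hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
)
```

A check of this kind can help catch a stray space or a missing equals sign before the string is pasted into the field.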

 

Use or register a shared DB Connection

Select this check box to share your connection or fetch a connection shared by a parent or child Job. This allows you to share one single DB connection among several DB connection components from different Job levels that can be either parent or child.

Warning

This option is incompatible with the Use dynamic job and Use an independent process to run subjob options of the tRunJob component. Using a shared connection together with a tRunJob component with either of these two options enabled will cause your Job to fail.

This check box is not visible when the Specify a data source alias check box is selected.

 

Shared DB Connection Name

Enter the shared connection name.

This field is available only when the Use or register a shared DB Connection check box is selected.
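
The idea of using or registering a connection under a shared name can be pictured with the conceptual Python sketch below. It is not Talend's implementation: the registry, the use_or_register function, and the "shared_mssql" name are hypothetical, and an in-memory sqlite3 database stands in for an MSSQL connection.

```python
import sqlite3

# Hypothetical in-process registry: shared connection name -> open connection.
_shared_connections = {}


def use_or_register(name, open_connection):
    """Return the connection registered under `name`, opening it on first use."""
    if name not in _shared_connections:
        _shared_connections[name] = open_connection()
    return _shared_connections[name]


def open_stand_in():
    # sqlite3 in-memory database, used only as a stand-in for MSSQL.
    return sqlite3.connect(":memory:")


# A parent Job and a child Job asking for the same name get the same connection.
parent_conn = use_or_register("shared_mssql", open_stand_in)
child_conn = use_or_register("shared_mssql", open_stand_in)
```

A registry of this kind lives inside a single process, which would be consistent with the warning above about running a subjob in an independent process.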

 

Specify a data source alias

Select this check box and specify the alias of a data source created on the Talend Runtime side to use the shared connection pool defined in the data source configuration. This option works only when you deploy and run your Job in Talend Runtime. For a related use case, see Scenario 4: Retrieving data from a MySQL database using the data source on Talend Runtime side to set up the database connection.

This check box is not visible when the Use or register a shared DB Connection check box is selected.

 

Data source alias

Enter the alias of the data source created on the Talend Runtime side.

This field is available only when the Specify a data source alias check box is selected.

Advanced settings

Auto Commit

Select this check box to commit changes to the database automatically as each statement is executed.

With this check box selected, you cannot use the corresponding commit component to commit changes to the database; likewise, when using the corresponding commit component, this check box has to be cleared. By default, the auto commit function is disabled and changes must be committed explicitly using the corresponding commit component.

Note that the auto commit function commits each SQL statement as a single transaction immediately after the statement is executed, while the commit component does not commit until all of the statements are executed. For this reason, if you need more room to manage your transactions in a Job, it is recommended to use the commit component.
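
The difference between the two modes can be sketched in Python. This is a minimal illustration, assuming an in-memory sqlite3 database as a stand-in for MSSQL; the function name rows_after_rollback is hypothetical. With auto commit, each INSERT is already final when the rollback is attempted; with explicit commits, both INSERTs belong to one open transaction and are undone together.

```python
import sqlite3


def rows_after_rollback(autocommit: bool) -> int:
    """Insert two rows, roll back, and report how many rows survived."""
    # isolation_level=None puts sqlite3 in autocommit mode; "" (the default)
    # opens a transaction implicitly before the first INSERT.
    conn = sqlite3.connect(":memory:", isolation_level=None if autocommit else "")
    conn.execute("CREATE TABLE t (id INTEGER)")
    conn.commit()
    conn.execute("INSERT INTO t VALUES (1)")
    conn.execute("INSERT INTO t VALUES (2)")
    conn.rollback()                        # nothing to undo in autocommit mode
    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return count


print(rows_after_rollback(autocommit=True))    # 2: each INSERT was committed at once
print(rows_after_rollback(autocommit=False))   # 0: both INSERTs were rolled back
```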

 

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level as well as at each component level.

Usage

This component is more commonly used with other tMSSql* components, especially with the tMSSqlCommit and tMSSqlRollback components.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

Due to license incompatibility, one or more JARs required to use this component are not provided. You can install the missing JARs for this particular component by clicking the Install button on the Component tab view. You can also find and add all missing JARs on the Modules tab in the Integration perspective of your studio. For details, see the article Installing External Modules on Talend Help Center (https://help.talend.com) or the section describing how to configure the Studio in the Talend Installation and Upgrade Guide.

Scenario: Inserting data into a database table and extracting useful information from it

The scenario describes a Job that reads the employee data from a text file, inserts the data into a table of an MSSQL database, then extracts useful data from the table, and displays the information on the console.

This scenario involves the following components:

  • tMSSqlConnection: establishes a connection to the MSSQL server.

  • tFileInputDelimited: reads the input file, defines the data structure and sends it to the next component.

  • tMSSqlOutput: writes data it receives from the preceding component into a table of an MSSQL database.

  • tMSSqlInput: extracts data from the table based on an SQL query.

  • tLogRow: displays the information it receives from the preceding component on the console.

  • tMSSqlCommit: commits the transaction in the connected MSSQL server.

Setting up the Job

  1. Drop the following components from the Palette onto the design workspace: tMSSqlConnection, tFileInputDelimited, tMSSqlOutput, tMSSqlInput, tLogRow, and tMSSqlCommit.

  2. Connect tMSSqlConnection to tFileInputDelimited using a Trigger > OnSubjobOk link.

  3. Do the same to connect tFileInputDelimited to tMSSqlInput and tMSSqlInput to tMSSqlCommit.

  4. Connect tFileInputDelimited to tMSSqlOutput using a Row > Main link.

  5. Do the same to connect tMSSqlInput to tLogRow.
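
The OnSubjobOk links above chain four subjobs so that each one starts only after the previous one has finished without error. The Python sketch below models that ordering only; run_chain, ok, and fail are hypothetical names, and the subjob bodies are placeholders rather than real components.

```python
def run_chain(subjobs):
    """Run subjobs in order; each starts only if the previous one succeeded."""
    completed = []
    for name, action in subjobs:
        try:
            action()
        except Exception:
            break                          # later subjobs are not triggered
        completed.append(name)
    return completed


def ok():
    pass                                   # placeholder for a subjob that succeeds


def fail():
    raise RuntimeError("subjob failed")    # placeholder for a subjob that fails


job = [
    ("tMSSqlConnection", ok),
    ("tFileInputDelimited -> tMSSqlOutput", ok),
    ("tMSSqlInput -> tLogRow", ok),
    ("tMSSqlCommit", ok),
]
print(run_chain(job))
```

In this model, a failure while writing the data means that the read and commit subjobs never run.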

Configuring the components

Opening a connection to the MSSQL server

  1. Double-click the tMSSqlConnection component to open its Basic settings view in the Component tab.

  2. In the Host field, type in the IP address or hostname of the MSSQL server, 192.168.30.47 in this example.

  3. In the Port field, type in the port number of the database server, 1433 in this example.

  4. In the Schema field, type in the schema name, dbo in this example.

  5. In the Database field, type in the database name, talend in this example.

  6. In the Username and Password fields, enter the credentials for the MSSQL connection.
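
For orientation, the values entered above correspond to a JDBC URL along the following lines. This Python sketch follows the URL syntax documented for the Microsoft and jTDS drivers; the exact string that the Studio generates is an assumption, and mssql_jdbc_url is a hypothetical helper.

```python
def mssql_jdbc_url(provider, host, port, database, extra_params=""):
    """Build a JDBC URL for the chosen provider from the Basic settings values."""
    if provider == "Microsoft":
        url = f"jdbc:sqlserver://{host}:{port};databaseName={database}"
    elif provider == "jTDS":
        url = f"jdbc:jtds:sqlserver://{host}:{port}/{database}"
    else:
        raise ValueError(f"unknown provider: {provider!r}")
    # Additional JDBC parameters, if any, are appended as ;key=value pairs.
    extra = extra_params.strip().strip(";")
    return f"{url};{extra}" if extra else url


print(mssql_jdbc_url("Microsoft", "192.168.30.47", 1433, "talend"))
# jdbc:sqlserver://192.168.30.47:1433;databaseName=talend
print(mssql_jdbc_url("jTDS", "192.168.30.47", 1433, "talend"))
# jdbc:jtds:sqlserver://192.168.30.47:1433/talend
```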

Reading the input data

  1. Double-click the tFileInputDelimited component to open its Component view.

  2. Click the [...] button next to the File Name/Stream field to browse to the input file. In this example, it is D:/Input/Employee_Wage.txt. This text file holds three columns: id, name and wage.

    id;name;wage
    51;Harry;2300
    40;Ronald;3796
    17;Theodore;2174
    21;James;1986
    2;George;2591
    89;Calvin;2362
    84;Ulysses;3383
    4;Lyndon;2264
    17;Franklin;1780
    86;Lyndon;3999

  3. In the Header field, type in 1 to skip the first row of the input file.

  4. Click Edit schema to define the data to pass on to the tMSSqlOutput component. In this example, we define id as the key, and specify the length and precision for each column respectively.

    Click OK to close the schema editor. A dialog box opens, and you can choose to propagate the schema to the next component.

    Related topic: tFileInputDelimited.
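
What this configuration does to the sample file can be sketched in Python: the file is split on semicolons, one header row is skipped, and each remaining line becomes a typed row. The column types used here (integer id and wage, string name) are an assumption, since the schema editor values are not listed in this scenario, and read_employees is a hypothetical name.

```python
import csv
import io

# Contents of Employee_Wage.txt as shown above.
SAMPLE = """id;name;wage
51;Harry;2300
40;Ronald;3796
17;Theodore;2174
21;James;1986
2;George;2591
89;Calvin;2362
84;Ulysses;3383
4;Lyndon;2264
17;Franklin;1780
86;Lyndon;3999
"""


def read_employees(stream, header_rows=1):
    """Parse semicolon-delimited rows, skipping `header_rows` leading lines."""
    reader = csv.reader(stream, delimiter=";")
    for _ in range(header_rows):
        next(reader, None)
    # Assumed schema: id (integer), name (string), wage (integer).
    return [(int(emp_id), name, int(wage)) for emp_id, name, wage in reader]


employees = read_employees(io.StringIO(SAMPLE))
print(len(employees), employees[0])
```

Note that the sample data contains the id 17 twice, which matters once id is defined as the key.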

Writing the data into the database table

  1. Double-click the tMSSqlOutput component to open its Basic settings view in the Component tab.

  2. Type in required information for the connection or use the existing connection you have configured before. In this example, we select the Use an existing connection check box. If multiple connections are available, select the connection you want to use from the Component List drop-down list.

  3. In the Table field, type in the name of the table you want to write the data to: Wage_Info in this example. You can also click the [...] button next to the Table field to open a dialog box and select a proper table.

  4. Select Create table if not exists from the Action on table drop-down list.

  5. Select Insert if not exists from the Action on data drop-down list.

  6. Click Sync columns to retrieve the schema from the preceding component.
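
The combined effect of the two actions selected above can be approximated with the Python sketch below. It assumes that Insert if not exists skips a row whose key is already present, and it uses an in-memory sqlite3 database as a stand-in for MSSQL; the column types are assumptions as well. The CREATE TABLE IF NOT EXISTS statement is SQLite syntax, not T-SQL.

```python
import sqlite3

ROWS = [
    (51, "Harry", 2300), (40, "Ronald", 3796), (17, "Theodore", 2174),
    (21, "James", 1986), (2, "George", 2591), (89, "Calvin", 2362),
    (84, "Ulysses", 3383), (4, "Lyndon", 2264), (17, "Franklin", 1780),
    (86, "Lyndon", 3999),
]

conn = sqlite3.connect(":memory:")         # stand-in for the MSSQL connection

# Action on table: create the table only if it is missing (SQLite syntax).
conn.execute(
    "CREATE TABLE IF NOT EXISTS Wage_Info ("
    "id INTEGER PRIMARY KEY, name TEXT, wage INTEGER)"
)

# Action on data: insert a row only when no row has the same key.
for emp_id, name, wage in ROWS:
    conn.execute(
        "INSERT INTO Wage_Info (id, name, wage) "
        "SELECT ?, ?, ? "
        "WHERE NOT EXISTS (SELECT 1 FROM Wage_Info WHERE id = ?)",
        (emp_id, name, wage, emp_id),
    )
conn.commit()

stored = conn.execute("SELECT id, name FROM Wage_Info ORDER BY id").fetchall()
print(len(stored))                         # 9: the second row with id 17 is skipped
```

Under that assumption, Theodore is stored under id 17 and the later Franklin row is not inserted.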

Extracting useful information from the table

  1. Double-click the tMSSqlInput component to open its Basic settings view in the Component tab.

  2. Select the Use an existing connection check box. If multiple connections are available, select the connection you want to use from the Component List drop-down list.

  3. Click Edit schema to define the data structure to be read from the table. In this example, we need to read all three columns from the table.

  4. In the Table Name field, type in the name of the table you want to read the data from: Wage_Info in this example.

  5. In the Query field, fill in the SQL query to be executed on the table specified. To obtain the data of employees whose wages are above the average value and order them by id, enter the SQL query as follows:

    SELECT    *
    FROM      Wage_Info
    WHERE     wage > (SELECT avg(wage)
                      FROM   Wage_Info)
    ORDER BY  id
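
To see what this query selects from the sample data, it can be tried against a stand-in database. The Python sketch below loads the ten sample rows into an in-memory sqlite3 table, which is an assumption made for illustration (no key constraint, not MSSQL), and runs the same query.

```python
import sqlite3

ROWS = [
    (51, "Harry", 2300), (40, "Ronald", 3796), (17, "Theodore", 2174),
    (21, "James", 1986), (2, "George", 2591), (89, "Calvin", 2362),
    (84, "Ulysses", 3383), (4, "Lyndon", 2264), (17, "Franklin", 1780),
    (86, "Lyndon", 3999),
]

conn = sqlite3.connect(":memory:")         # stand-in for the MSSQL database
conn.execute("CREATE TABLE Wage_Info (id INTEGER, name TEXT, wage INTEGER)")
conn.executemany("INSERT INTO Wage_Info VALUES (?, ?, ?)", ROWS)

QUERY = """
SELECT    *
FROM      Wage_Info
WHERE     wage > (SELECT avg(wage) FROM Wage_Info)
ORDER BY  id
"""

above_average = conn.execute(QUERY).fetchall()
for row in above_average:
    print(row)
# (40, 'Ronald', 3796)
# (84, 'Ulysses', 3383)
# (86, 'Lyndon', 3999)
```

The same three rows are selected whether or not the second row with id 17 is stored, because the average stays between 2591 and 3383 in both cases.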
    

Displaying information on the console

  1. Double-click the tLogRow component to open its Basic settings view.

  2. In the Mode area, select Table (print values in cells of a table).
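
Table mode prints each value in its own cell. The Python sketch below gives a rough idea of such a layout for the three rows that the above-average query selects from the sample file; the border style is an assumption and does not reproduce the exact tLogRow output, and format_table is a hypothetical helper.

```python
def format_table(header, rows):
    """Render rows as a plain-text table with one cell per value."""
    cells = [[str(v) for v in header]] + [[str(v) for v in row] for row in rows]
    widths = [max(len(r[i]) for r in cells) for i in range(len(header))]
    rule = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def line(values):
        padded = (f" {v.ljust(w)} " for v, w in zip(values, widths))
        return "|" + "|".join(padded) + "|"

    out = [rule, line(cells[0]), rule]
    out += [line(r) for r in cells[1:]]
    out.append(rule)
    return "\n".join(out)


table = format_table(
    ("id", "name", "wage"),
    [(40, "Ronald", 3796), (84, "Ulysses", 3383), (86, "Lyndon", 3999)],
)
print(table)
```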

Committing the transaction and closing the connection

  1. Double-click the tMSSqlCommit component to open its Basic settings view.