tMemSQLInput - 6.1

Talend Components Reference Guide

EnrichVersion
6.1
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Open Studio for Big Data
Talend Open Studio for Data Integration
Talend Open Studio for Data Quality
Talend Open Studio for ESB
Talend Open Studio for MDM
Talend Real-Time Big Data Platform
task
Data Governance
Data Quality and Preparation
Design and Development
EnrichPlatform
Talend Studio

Warning

This component will be available in the Palette of Talend Studio on the condition that you have subscribed to one of the Talend solutions with Big Data.

tMemSQLInput properties

Component family

Big Data/MemSQL

 

Function

tMemSQLInput connects to a given database and extracts fields based on a user-defined query.

Purpose

tMemSQLInput executes a DB query with a strictly defined order which must correspond to the schema definition. Then it passes the selected data to the next component via a Main row link.

Basic settings

Property type

Either Built-in or Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

 

 

Built-in: No property data stored centrally.

 

 

Repository: Select the repository file in which the properties are stored. The fields that follow are completed automatically using the data retrieved.

 

Use an existing connection

Select this check box and in the Component List click the relevant connection component to reuse the connection details you already defined.

Note

When a Job contains the parent Job and the child Job, if you need to share an existing connection between the two levels, for example, to share the connection created by the parent Job with the child Job, you have to:

  1. In the parent level, register the database connection to be shared in the Basic settings view of the connection component which creates that very database connection.

  2. In the child level, use a dedicated connection component to read that registered database connection.

For an example about how to share a database connection across Job levels, see Talend Studio User Guide.

 

Host

Enter the location of the server of the MemSQL database to be used.

 

Port

Enter the listening port number of the server of the MemSQL database to be used.

 

Database

Enter the name of the MemSQL database to be used.

 

Username and Password

Enter the authentication data used to connect to the MemSQL database to be used.

 

Schema and Edit Schema

A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository.

This component offers the advantage of the dynamic schema feature. This allows you to retrieve unknown columns from source files or to copy batches of columns from a source without mapping each column individually. For further information about dynamic schemas, see Talend Studio User Guide.

This dynamic schema feature is designed for the purpose of retrieving unknown columns of a table and is recommended to be used for this purpose only; it is not recommended for the use of creating tables.

 

 

Built-In: You create and store the schema locally for this component only. Related topic: see Talend Studio User Guide.

 

 

Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. Related topic: see Talend Studio User Guide.

  

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

 

Table Name

Name of the table to be read.

 

Query type and Query

Enter your DB query paying particularly attention to properly sequence the fields in order to match the schema definition.

Warning

If using the dynamic schema feature, the SELECT query must include the * wildcard, to retrieve all of the columns from the table selected.

Advanced settings

Additional JDBC parameters

Specify additional connection properties for the DB connection you are creating. This option is not available if you have selected the Use an existing connection check box in the Basic settings.

Note

When you need to handle data of the time-stamp type 0000-00-00 00:00:00 using this component, set the parameter as:

noDatetimeStringSync=true&zeroDa√ā¬≠teTimeBehavior=convertToNull.

 

Trim all the String/Char columns

Select this check box to remove leading and trailing whitespace from all the String/Char columns.

 

Trim column

Remove leading and trailing whitespace from defined columns.

Note

Clear Trim all the String/Char columns to enable Trim columns in this field.

 

tStatCatcher Statistics

Select this check box to collect log data at the component level.

Dynamic settings

Click the [+] button to add a row in the table and fill the Code field with a context variable to choose your database connection dynamically from multiple connections planned in your Job. This feature is useful when you need to access database tables having the same data structure but in different databases, especially when you are working in an environment where you cannot change your Job settings, for example, when your Job has to be deployed and executed independent of Talend Studio.

The Dynamic settings table is available only when the Use an existing connection check box is selected in the Basic settings view. Once a dynamic parameter is defined, the Component List box in the Basic settings view becomes unusable.

For examples on using dynamic parameters, see Scenario 3: Reading data from MySQL databases through context-based dynamic connections and Scenario: Reading data from different MySQL databases using dynamically loaded connection parameters. For more information on Dynamic settings and context variables, see Talend Studio User Guide.

Global Variables 

NB_LINE: the number of rows processed. This is an After variable and it returns an integer.

QUERY: the SQL query statement being processed. This is a Flow variable and it returns a string.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill up a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.

Usage

This component covers all possible SQL queries for MemSQL databases.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

Due to license incompatibility, one or more JARs required to use this component are not provided. You can install the missing JARs for this particular component by clicking the Install button on the Component tab view. You can also find out and add all missing JARs easily on the Modules tab in the Integration perspective of your studio. For details, see https://help.talend.com/display/KB/How+to+install+external+modules+in+the+Talend+products or the section describing how to configure the Studio in the Talend Installation Guide.

Scenario: Writing data to and reading data from a MemSQL database table

In this scenario a Job is built to write a table to a MemSQL database, and then retrieve the data from the table.

Dragging and dropping components and linking them together

  1. Drop a tMemSQLConnection, a tRowGenerator, a tMemSQLInput, a tMemSQLOutput, a tMemSQLClose, and a tLogRow from the Palette onto the workspace.

  2. Link the tMemSQLConnection to the tRowGenerator using a Row > OnSubjobOk connection.

  3. Link the tRowGenerator to the tMemSQLOutput using a Row > Main connection.

  4. Link the tRowGenerator to the tMemSQLInput using a Row > OnSubjobOk connection.

  5. Link the tMemSQLInput to the tLogRow using a Row > Main connection.

  6. Link the tMemSQLInput to the tMemSQLClose using a Row > OnSubjobOk connection.

Configuring the components

Opening and closing connection to the database

  1. Double-click the tMemSQLConnection component to open its Basic settings in the Component tab.

  2. In the Property Type list, select Built-In. Specify the connection details in the relevant fields, including the host name and listening port number of the database server, the user name and password for your database authentication, and the database name.

  3. Double-click tMemSQLClose to set its Basic settings in the Component tab.

  4. The Component List is filled in automatically with tMemSQLConnection_1.

Writing data in a database table

  1. Double-click the tRowGenerator component to open the Row Generator editor.

  2. Add three columns as shown below to generate 10 data rows:

    • id, Integer type, using Numeric.sequence from the Function list to generate sequence numbers.

    • firstName, String type, using TalendDataGenerator.getFirstName from the Function list to generate random first names.

    • lastName, String type, using TalendDataGenerator.getLastName from the Function list to generate random family names.

  3. In the Basic settings view of tRowGenerator in the Component tab, click the Edit Schema button to open the schema editor. Set the length of the id, firstName, lastName column to 4, 10, and 10.

    When done, click OK to close the editor and click Yes when prompted to propagate the schema to the next component.

    For more information about how to configure the tRowGenerator component, see tRowGenerator.

  4. Double-click tMemSQLOutput to open its Basic Settings view in the Component tab.

  5. Select the Use Existing Connection check box. The Component List is filled in automatically with tMemSQLConnection_1.

    Fill the Table field with the database table, customers in this example.

    Select Drop table if exists and create in the Action on table list, and Insert in the Action on table list.

Reading data from a database table

  1. Double-click tMemSQLInput to open its Basic Settings view in the Component tab.

  2. Select the Use Existing Connection check box. The Component List is filled in automatically with tMemSQLConnection_1.

  3. Set the Schema as Built-in and click Edit schema to define the data structure of the database table to read data from.

    Click the [+] button to add the rows that you will use to define the schema, three columns in this example id, firstName, and lastName.

    Under Column, click in the fields to enter the corresponding column names.

    Click the field under Type to define the type of data.

    Click OK to close the schema editor.

  4. Fill the Table field with the database table, customers in this example.

    Alternatively, click the [...] button next to the Table Name field to select the database table of interest. A dialog box opens showing a tree diagram of all the tables in the selected database. Select customers and click OK to close the dialog box.

  5. Set the Query Type as Built-In. Click the Guess Query button. The Query box is filled in automatically to retrieve all columns from the selected table.

  6. Double-click tLogRow to set its Basic settings in the Component tab.

  7. In the Mode area, select Table (print values in cells of a table) for a better display of the results.

  8. Save the Job.

Executing the Job

Press F6 to run the Job and see the results in the console.

As shown above, the data with the desired column names is retrieved from the database table.