Scenario 2: Updating data in a database table - 6.3

Talend Open Studio for Big Data Components Reference Guide

This Java scenario describes a two-component Job that updates the data in a MySQL table with the data from a delimited file.

Setting up the Job

  • Drop tFileInputDelimited and tAmazonMysqlOutput from the Palette onto the design workspace. Connect the two components using a Row > Main link.

Configuring the input component

  1. Double-click tFileInputDelimited to display its Basic settings view and define the component properties.

  2. From the Property Type list, select Repository if you have already stored the metadata of the delimited file in the Metadata node in the Repository tree view. Otherwise, select Built-In to define the metadata of the delimited file manually.

    For more information about storing metadata, see the Talend Studio User Guide.

  3. In the File Name field, click the [...] button and browse to the source delimited file that contains the modifications to propagate to the MySQL table.

    In this example, we use the customer_update file, which holds four columns: id, CustomerName, CustomerAddress and idState. Some of the data in these four columns differs from the data in the MySQL table.

  4. Define the row and field separators used in the source file in the corresponding fields. If needed, set Header, Footer and Limit.

    In this example, Header is set to 1 because the first row holds the column names and must be skipped. The number of processed lines is limited to 2000.

  5. Select Built-In from the Schema list, then click the [...] button next to Edit schema to open a dialog box where you can describe the data structure of the source delimited file to pass to the following component.

  6. Select the Key check box(es) next to the column name(s) you want to define as key column(s).

    Note

    At least one column must be defined as a key column for the Job to execute correctly. Otherwise, the Job is automatically interrupted and an error message is displayed on the console.
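
The Header and Limit settings described above can be pictured with a small Java sketch. It is only an illustration of the idea, not the code that Talend generates for tFileInputDelimited. The class name DelimitedReadSketch, the method readRows, the ";" field separator and the sample row are all assumptions made for this example, and the sketch does not handle quoted or escaped fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend: mimics the effect of Header and Limit.
class DelimitedReadSketch {

    /**
     * Skips the first `header` lines, then returns at most `limit` rows,
     * each split on the given field separator.
     */
    static List<String[]> readRows(BufferedReader in, String fieldSeparator,
                                   int header, int limit) throws IOException {
        List<String[]> rows = new ArrayList<>();
        String line;
        int lineNo = 0;
        while (rows.size() < limit && (line = in.readLine()) != null) {
            lineNo++;
            if (lineNo <= header) {
                continue; // header line(s) hold column names, not data
            }
            // Pattern.quote treats the separator literally; -1 keeps trailing empty fields
            rows.add(line.split(Pattern.quote(fieldSeparator), -1));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample content with the four columns named in this scenario
        String sample = "id;CustomerName;CustomerAddress;idState\n"
                + "1;Example Name;1 Example Street;5\n";
        List<String[]> rows =
                readRows(new BufferedReader(new StringReader(sample)), ";", 1, 2000);
        System.out.println(rows.size() + " row(s) read, first id = " + rows.get(0)[0]);
    }
}
```

With Header set to 1 and Limit set to 2000, the first line is skipped and reading stops once 2000 data rows have been collected.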

Configuring the output component

  1. In the design workspace, double-click tAmazonMysqlOutput to open its Basic settings view where you can define its properties.

  2. Click Sync columns to retrieve the schema of the preceding component. If needed, click the [...] button next to Edit schema to open a dialog box where you can check the retrieved schema.

  3. From the Property Type list, select Repository if you have already stored the connection metadata in the Metadata node in the Repository tree view. Otherwise, select Built-In to define the connection information manually.

    For more information about storing metadata, see the Talend Studio User Guide.

  4. In the Table field, enter the name of the table to update.

  5. From the Action on table list, select the operation you want to perform. In this example, select None because the table already exists.

  6. From the Action on data list, select the operation you want to perform on the data. In this example, select Update.
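
To see why a key column is required for the Update action, it can help to look at the kind of SQL statement an update driven by a schema amounts to. The following Java sketch builds a parameterized UPDATE statement in which key columns go into the WHERE clause and all other columns go into the SET clause. It is an assumption-laden illustration, not the code Talend generates: the class UpdateSqlSketch and the method buildUpdate are hypothetical, and the choice of id as the key column is assumed for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Talend: derives an UPDATE statement from a schema.
class UpdateSqlSketch {

    /**
     * Builds a parameterized UPDATE statement.
     * `columns` maps each column name to true if it is a key column.
     */
    static String buildUpdate(String table, Map<String, Boolean> columns) {
        List<String> setParts = new ArrayList<>();
        List<String> whereParts = new ArrayList<>();
        for (Map.Entry<String, Boolean> col : columns.entrySet()) {
            String assignment = "`" + col.getKey() + "` = ?";
            if (col.getValue()) {
                whereParts.add(assignment); // key columns identify the row to update
            } else {
                setParts.add(assignment);   // non-key columns receive the new values
            }
        }
        if (whereParts.isEmpty()) {
            // Without a key there is no way to tell which row each record targets
            throw new IllegalStateException("At least one key column is required");
        }
        if (setParts.isEmpty()) {
            throw new IllegalStateException("At least one non-key column is required");
        }
        return "UPDATE `" + table + "` SET " + String.join(", ", setParts)
                + " WHERE " + String.join(" AND ", whereParts);
    }

    public static void main(String[] args) {
        Map<String, Boolean> schema = new LinkedHashMap<>();
        schema.put("id", true); // assumed key column
        schema.put("CustomerName", false);
        schema.put("CustomerAddress", false);
        schema.put("idState", false);
        System.out.println(buildUpdate("customers", schema));
        // prints: UPDATE `customers` SET `CustomerName` = ?, `CustomerAddress` = ?, `idState` = ? WHERE `id` = ?
    }
}
```

In a real JDBC program, a statement of this shape would typically be passed to Connection.prepareStatement and executed once per input row, with the values bound in the same order as the placeholders.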

Job execution

Save your Job and press F6 to execute it.

Using your DB browser, you can verify that the MySQL table, customers, has been modified according to the delimited file.

In the above example, the database table still has the same four columns (id, CustomerName, CustomerAddress and idState), but certain fields have been modified according to the data in the delimited file.
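
The effect of the Update action on the table can be modeled in memory with a short Java sketch. It only mirrors the general behavior of an SQL UPDATE keyed on id: rows whose key already exists receive the new values, and records whose key is not found in the table are not inserted. The class UpdateSemanticsSketch, the method applyUpdates and all sample values are hypothetical and are not taken from the actual customers table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory model of an update keyed on id; no database is involved.
class UpdateSemanticsSketch {

    /**
     * Overwrites the non-key values of every table row whose id appears in fileRows.
     * Returns the number of table rows that were matched.
     */
    static int applyUpdates(Map<Integer, String[]> table, Map<Integer, String[]> fileRows) {
        int matched = 0;
        for (Map.Entry<Integer, String[]> record : fileRows.entrySet()) {
            if (table.containsKey(record.getKey())) {
                // Existing key: replace the values, the set of columns stays the same
                table.put(record.getKey(), record.getValue().clone());
                matched++;
            }
            // Unknown key: an update never creates a new row, so the record is skipped
        }
        return matched;
    }

    public static void main(String[] args) {
        // Values per row: CustomerName, CustomerAddress, idState (made-up sample data)
        Map<Integer, String[]> table = new LinkedHashMap<>();
        table.put(1, new String[] {"Name A", "Old Address A", "3"});
        table.put(2, new String[] {"Name B", "Old Address B", "7"});

        Map<Integer, String[]> fileRows = new LinkedHashMap<>();
        fileRows.put(2, new String[] {"Name B", "New Address B", "8"});
        fileRows.put(9, new String[] {"Name Z", "Address Z", "1"}); // no such id in the table

        int matched = applyUpdates(table, fileRows);
        System.out.println("Rows matched: " + matched + ", rows in table: " + table.size());
    }
}
```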