Scenario: Handling data with Redshift - 6.1

Talend Components Reference Guide

This scenario describes a Job that writes personal information into Redshift, then retrieves that information from Redshift and displays it on the console.

The scenario requires the following six components:

  • tRedshiftConnection: opens a connection to Redshift.

  • tFixedFlowInput: defines the personal information data structure, and sends the data to the next component.

  • tRedshiftOutput: writes the data it receives from the preceding component into Redshift.

  • tRedshiftInput: reads the data from Redshift.

  • tLogRow: displays the data it receives from the preceding component on the console.

  • tRedshiftClose: closes the connection to Redshift.

Dropping and linking the components

  1. Drop the six components listed previously from the Palette onto the design workspace.

  2. Connect tFixedFlowInput to tRedshiftOutput using a Row > Main connection.

  3. Connect tRedshiftInput to tLogRow also using a Row > Main connection.

  4. Connect tRedshiftConnection to tFixedFlowInput using a Trigger > OnSubjobOk connection.

  5. Connect tFixedFlowInput to tRedshiftInput, and tRedshiftInput to tRedshiftClose, also using Trigger > OnSubjobOk connections.
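
The trigger links above determine the order in which the subjobs run. As a rough illustration only (plain Python, not code that Talend generates; run_chain and the placeholder actions are hypothetical names), an OnSubjobOk chain behaves like a sequence in which each subjob starts only if the previous one finished without error:

```python
# Hypothetical sketch of the OnSubjobOk chain built in the steps above.
# Each entry stands for one subjob; the labels mirror the components in this scenario.

def run_chain(subjobs):
    """Run subjobs in order and stop at the first one that raises an error."""
    completed = []
    for name, action in subjobs:
        try:
            action()
        except Exception as error:
            # OnSubjobOk fires only on success, so the remaining subjobs are skipped.
            print(f"{name} failed: {error}")
            break
        completed.append(name)
    return completed


subjobs = [
    ("tRedshiftConnection", lambda: None),
    ("tFixedFlowInput -> tRedshiftOutput", lambda: None),
    ("tRedshiftInput -> tLogRow", lambda: None),
    ("tRedshiftClose", lambda: None),
]

order = run_chain(subjobs)
print(order)
```

Note that in such a chain a failure in an earlier subjob would also keep tRedshiftClose from running.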

Configuring the components

Opening a connection to Redshift

  1. Double-click tRedshiftConnection to open its Basic settings view.

  2. Select Built-In from the Property Type drop-down list.

    In the Host, Port, Database, Schema, Username, and Password fields, enter the information required for the connection to Redshift.

  3. In the Advanced settings view, select the Auto Commit check box to commit any changes to Redshift upon each transaction.
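
The Host, Port, and Database fields together identify the Redshift endpoint. The sketch below shows how such values are commonly combined into the URL form used by the Amazon Redshift JDBC driver. It is an assumption that this matches what the component does internally, and the host and database values are hypothetical placeholders:

```python
# Illustrative only: combine connection fields into a Redshift-style JDBC URL.
# The endpoint and database name below are made-up placeholders.

def redshift_jdbc_url(host, port, database):
    """Return a URL of the form jdbc:redshift://host:port/database."""
    return f"jdbc:redshift://{host}:{port}/{database}"


url = redshift_jdbc_url(
    host="examplecluster.example.us-west-2.redshift.amazonaws.com",  # hypothetical
    port=5439,  # default Redshift port
    database="dev",  # hypothetical
)
print(url)
```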

Defining the input data

  1. Double-click tFixedFlowInput to open its Basic settings view.

  2. Click the [...] button next to Edit schema to open the schema editor.

  3. In the schema editor, click the [+] button to add three columns: id of the integer type, name of the string type, and age of the integer type.

  4. Click OK to validate the changes and accept the propagation prompted by the pop-up [Propagate] dialog box.

  5. In the Mode area, select Use Inline Content (delimited file) and enter the following personal information in the Content field.

    1;Arthur;16
    2;Ford;18
    3;Jackson;17
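
To make the mapping between the inline content and the schema concrete, here is a minimal Python sketch. It assumes one row per line and a semicolon as the field separator, as in the content above. parse_rows is a hypothetical helper, not part of Talend:

```python
# Illustrative only: how the delimited inline content maps onto the
# three-column schema (id, name, age) defined in the schema editor.

CONTENT = """1;Arthur;16
2;Ford;18
3;Jackson;17"""


def parse_rows(content, field_separator=";"):
    """Split each line into id, name, and age, converting id and age to int."""
    rows = []
    for line in content.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        id_text, name, age_text = line.split(field_separator)
        rows.append({"id": int(id_text), "name": name, "age": int(age_text)})
    return rows


rows = parse_rows(CONTENT)
for row in rows:
    print(row)
```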

Writing the data into Redshift

  1. Double-click tRedshiftOutput to open its Basic settings view.

  2. Select the Use an existing connection check box, and then select the connection you have already configured for tRedshiftConnection from the Component List drop-down list.

  3. In the Table field, enter or browse to the table into which you want to write the data, redshiftexample in this scenario.

  4. Select Drop table if exists and create from the Action on table drop-down list, and select Insert from the Action on data drop-down list.

  5. Click Sync columns to retrieve the schema from the preceding component tFixedFlowInput.
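
Taken together, the two actions selected above amount to dropping and recreating the table and then inserting one row per incoming record. The following sketch builds roughly equivalent SQL statements. The column types (INTEGER, VARCHAR(255)) and the helper name build_write_statements are assumptions for illustration, and the SQL that the component actually issues may differ:

```python
# Illustrative only: approximate SQL for "Drop table if exists and create"
# followed by "Insert". Column types are assumed, not taken from the component.

def build_write_statements(table, rows):
    """Return (sql, parameters) pairs approximating the two selected actions."""
    statements = [
        (f"DROP TABLE IF EXISTS {table}", ()),
        (f"CREATE TABLE {table} (id INTEGER, name VARCHAR(255), age INTEGER)", ()),
    ]
    insert_sql = f"INSERT INTO {table} (id, name, age) VALUES (?, ?, ?)"
    for row in rows:
        # One parameterized INSERT per incoming row.
        statements.append((insert_sql, (row["id"], row["name"], row["age"])))
    return statements


sample_rows = [
    {"id": 1, "name": "Arthur", "age": 16},
    {"id": 2, "name": "Ford", "age": 18},
    {"id": 3, "name": "Jackson", "age": 17},
]

statements = build_write_statements("redshiftexample", sample_rows)
for sql, parameters in statements:
    print(sql, parameters)
```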

Retrieving the data from Redshift

  1. Double-click tRedshiftInput to open its Basic settings view.

  2. Select the Use an existing connection check box, and then select the connection you have already configured for tRedshiftConnection from the Component List drop-down list.

  3. Click the [...] button next to Edit schema to open the schema editor.

  4. In the schema editor, click the [+] button to add three columns: id of the integer type, name of the string type, and age of the integer type. The data structure is the same as the structure you have defined for tFixedFlowInput.

  5. Click OK to validate the changes and accept the propagation prompted by the pop-up [Propagate] dialog box.

  6. In the Table Name field, enter or browse to the table into which you wrote the data, redshiftexample in this scenario.

  7. Click the Guess Query button to generate the query. The Query field will be filled with the automatically generated query.
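
The generated query selects every schema column from the named table. The exact text that Guess Query produces depends on the Studio version and on the connection settings, so the sketch below only approximates its shape. guess_query is a hypothetical helper written for this illustration:

```python
# Illustrative only: build a SELECT over the schema columns of a table,
# similar in spirit to what the Guess Query button fills in.

def guess_query(table, columns):
    """Return a SELECT statement listing each column qualified by the table name."""
    column_list = ", ".join(f"{table}.{column}" for column in columns)
    return f"SELECT {column_list} FROM {table}"


query = guess_query("redshiftexample", ["id", "name", "age"])
print(query)
```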

Displaying the defined information

  1. Double-click tLogRow to open its Basic settings view.

  2. In the Mode area, select Table (print values in cells of a table) for a better view of the results.
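
Table mode aligns each value in a column-width cell. The sketch below gives a simplified idea of that layout for the three rows used in this scenario. The border characters and the format_table helper are assumptions and do not reproduce the exact tLogRow output:

```python
# Illustrative only: render rows as a fixed-width text table.
# This is a simplified layout, not the exact format printed by tLogRow.

def format_table(columns, rows):
    """Return the rows as a text table whose columns fit the widest value."""
    widths = [
        max([len(column)] + [len(str(row[index])) for row in rows])
        for index, column in enumerate(columns)
    ]

    def line(values):
        cells = [str(value).ljust(width) for value, width in zip(values, widths)]
        return "|" + "|".join(cells) + "|"

    border = "+" + "+".join("-" * width for width in widths) + "+"
    body = [line(row) for row in rows]
    return "\n".join([border, line(columns), border] + body + [border])


table = format_table(
    ["id", "name", "age"],
    [(1, "Arthur", 16), (2, "Ford", 18), (3, "Jackson", 17)],
)
print(table)
```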

Closing the connection to Redshift

  1. Double-click tRedshiftClose to open its Basic settings view.

  2. From Component List, select the connection you have already configured for tRedshiftConnection.

Saving and executing the Job

  1. Press Ctrl+S to save the Job.

  2. Press F6 to execute the Job.

    The personal information is written to the specified target Redshift database, and then the data is retrieved from the database and displayed on the console.