Scenario: Matching first names with a reference index - 6.3

Talend Components Reference Guide

This scenario describes a four-component Job that matches the name column of an input flow against the reference index.

The result of the first name match is displayed in the FIRSTNAMEMATCH output column, along with all other columns defined in the input schema of the tFirstnameMatch component.

Dropping the components and linking them together

To drop and link the components of interest, proceed as follows:

  1. Drop the following components from the Palette to the design workspace: tFixedFlowInput, tFilterColumns, tFirstnameMatch and tLogRow.

  2. Connect the first three components using Row > Main links.

  3. Connect tFirstnameMatch to tLogRow using a Row > Output link.

Configuring the input data

To configure the input data, perform the following operations:

  1. Double-click tFixedFlowInput to display the Basic settings view and define the component properties.

  2. From the Schema list, set the schema type to Built-In and click the three-dot button next to Edit Schema. A dialog box opens.

  3. Click the plus button to add as many lines as needed for the input schema you want to create from internal variables.

    In this example, the input data flow is made of several columns including one for first names (name), two for country codes (iso2 and iso3) and one for gender (gender).

  4. Click OK to close the dialog box.

    The defined columns are displayed in the Mode area of the component's Basic settings view.

  5. In the Mode area, select the Use Inline Content (delimited file) option to display the corresponding view.

  6. In the corresponding fields, set the row and field separators to use in your input flow.

  7. In the Content area, type in the data for the input flow according to the schema you defined earlier.
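As an illustration of what the inline content amounts to, the following minimal sketch splits a delimited string into rows keyed by the schema columns. It is not Talend-generated code: the class name InlineContentSketch, the separators ("\n" and ";") and the sample rows are assumptions made for this example only, and the schema is reduced to the four columns named above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend: mimics reading inline delimited content.
class InlineContentSketch {

    // Splits the content into rows, then each row into fields keyed by column name.
    public static List<Map<String, String>> parse(String content, String rowSep,
                                                  String fieldSep, List<String> columns) {
        List<Map<String, String>> rows = new ArrayList<>();
        for (String line : content.split(Pattern.quote(rowSep))) {
            if (line.isEmpty()) {
                continue; // skip empty rows
            }
            // -1 keeps trailing empty fields so column positions stay aligned
            String[] fields = line.split(Pattern.quote(fieldSep), -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < columns.size(); i++) {
                row.put(columns.get(i), i < fields.length ? fields[i] : "");
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed schema and sample data, for illustration only
        List<String> columns = List.of("name", "iso2", "iso3", "gender");
        String content = "Jean;FR;FRA;M\nMaria;ES;ESP;F";
        for (Map<String, String> row : parse(content, "\n", ";", columns)) {
            System.out.println(row);
        }
    }
}
```

The separators are passed through Pattern.quote so that characters such as "|" or "." are treated literally rather than as regular expression syntax.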

Configuring the process of matching data

To configure the matching process, select the data columns of interest and then match them using tFirstnameMatch.

  1. Click the tFilterColumns component to display its Basic settings view and define the component properties.

    The tFilterColumns component enables you to build the output schema based on the column names of the input schema.

  2. Click the three-dot button next to Edit schema to display a dialog box where you can define the output schema.

  3. Select the name and gender columns from the input schema and move them to the output schema.

  4. Click OK to validate your changes and close the dialog box.

  5. Click tFirstnameMatch to display the Basic settings view and define the component properties.

  6. If required, click the three-dot button next to Edit schema to view the input and output schemas, and then click OK to close the dialog box.

    The output schema of this component is the same as the input schema plus one fixed column: FIRSTNAMEMATCH.
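The data flow configured above can be sketched in plain Java: keep the name and gender columns, then append a FIRSTNAMEMATCH column to each row. This is only a rough illustration. The class FirstnameMatchSketch, the in-memory reference list and the case-insensitive exact comparison are assumptions made for the example; they stand in for the component's reference index and its matching logic, which this scenario does not describe.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not the tFirstnameMatch implementation.
class FirstnameMatchSketch {

    // Keeps only the requested columns, in the requested order
    // (the role tFilterColumns plays in this Job).
    public static Map<String, String> filterColumns(Map<String, String> row, List<String> keep) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String column : keep) {
            out.put(column, row.get(column));
        }
        return out;
    }

    // Copies the input row and appends one fixed column, FIRSTNAMEMATCH.
    // Assumption: a case-insensitive exact lookup in a small in-memory list
    // replaces the real reference index.
    public static Map<String, String> match(Map<String, String> row, List<String> referenceNames) {
        Map<String, String> out = new LinkedHashMap<>(row);
        String name = row.get("name");
        String found = ""; // empty when no reference name matches
        if (name != null) {
            for (String reference : referenceNames) {
                if (reference.equalsIgnoreCase(name.trim())) {
                    found = reference;
                    break;
                }
            }
        }
        out.put("FIRSTNAMEMATCH", found);
        return out;
    }

    public static void main(String[] args) {
        // Assumed sample row and reference names, for illustration only
        Map<String, String> input = new LinkedHashMap<>();
        input.put("name", "maria");
        input.put("iso2", "ES");
        input.put("iso3", "ESP");
        input.put("gender", "F");

        Map<String, String> filtered = filterColumns(input, List.of("name", "gender"));
        System.out.println(match(filtered, List.of("Jean", "Maria")));
    }
}
```

The output row carries every column of the filtered input row followed by FIRSTNAMEMATCH, which mirrors the schema rule stated above.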