The differences between Unique match, First match and All matches

author: Talend Documentation Team
EnrichVersion: 6.4, 6.3, 6.2, 6.1
EnrichProdName: Talend Open Studio for ESB, Talend Data Fabric, Talend ESB, Talend Big Data Platform, Talend Open Studio for MDM, Talend Big Data, Talend Open Studio for Data Integration, Talend Real-Time Big Data Platform, Talend Data Integration, Talend MDM Platform, Talend Open Studio for Big Data, Talend Data Services Platform, Talend Data Management Platform
task: Data Governance > Third-party systems > Processing components (Integration); Data Quality and Preparation > Third-party systems > Processing components (Integration); Design and Development > Third-party systems > Processing components (Integration)
EnrichPlatform: Talend Studio

This article uses an example to illustrate the differences between the three match models.

The three match models named in the title are provided by the tMap component when it performs a JOIN (Inner Join or Left Outer Join) operation on data from homogeneous or heterogeneous sources.

The following example shows how to use the match models with the tMap component.

This article assumes that you are already familiar with the Talend Studio interface and are able to create Talend Jobs with components and links.

For further information about how to create a Talend Job, see the Getting started with a basic Job section of the Talend Studio User Guide.

For further information about the tMap component, see tMap and the tMap operation section of the Talend Studio User Guide.

Example Job implementing the different match models

Source data

The main source reads as follows:

ID Name
1 Shong
2 Elisa
3 Sabrina

The lookup source reads as follows:

ID Email
1 Shong1@talend.com
1 Shong2@talend.com
2 Elisa@talend.com
3 Sabrina1@talend.com

The goal is to perform an inner join between the main source and the lookup source on the ID column, and to produce output with the following structure:

ID Name Email

The result varies depending on the match model used.

Creating the Job

Use a tFixedFlowInput component to generate the main source.

Use a second tFixedFlowInput component to generate the lookup source.

Use a tMap component to perform the inner join, and send the result to a tLogRow component (in Table mode) that prints it on the console.

Using the match models to generate different results

Unique match: this is the default option for the JOIN operation. When several lookup records match a main record, it outputs only the last matching record of the lookup source.

The result of the JOIN by the Unique match model reads as follows:

Starting job tMap_Match_modes at 17:46 25/09/2013.

[statistics] connecting to socket on port 3367
[statistics] connected
.--+-------+-------------------.
|          tLogRow_2           |
|=-+-------+------------------=|
|ID|Name   |Email              |
|=-+-------+------------------=|
|1 |Shong  |Shong2@talend.com  |
|2 |Elisa  |Elisa@talend.com   |
|3 |Sabrina|Sabrina1@talend.com|
'--+-------+-------------------'
[statistics] disconnected
Job tMap_Match_modes ended at 17:46 25/09/2013. [exit code=0]
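
The Unique match behavior can be imitated outside Talend Studio with a few lines of Python. This is only a minimal sketch of the observable result, not how tMap is implemented internally. The names `main_rows`, `lookup_rows` and `join_unique_match` are hypothetical, and the in-memory lists stand in for the two tFixedFlowInput sources of the example.

```python
# Hypothetical in-memory stand-ins for the main and lookup sources.
main_rows = [(1, "Shong"), (2, "Elisa"), (3, "Sabrina")]
lookup_rows = [
    (1, "Shong1@talend.com"),
    (1, "Shong2@talend.com"),
    (2, "Elisa@talend.com"),
    (3, "Sabrina1@talend.com"),
]


def join_unique_match(main, lookup):
    """Inner join on ID that keeps only the last lookup row for each key."""
    last_by_id = {}
    for key, email in lookup:
        # A later lookup row with the same ID overwrites the earlier one.
        last_by_id[key] = email
    # Inner join: main rows without a lookup match are dropped.
    return [(key, name, last_by_id[key]) for key, name in main if key in last_by_id]


result = join_unique_match(main_rows, lookup_rows)
for row in result:
    print(row)
```

Because the dictionary entry for ID 1 is overwritten by the second lookup row, Shong is paired with Shong2@talend.com, as in the console output above.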

First match: when several lookup records match a main record, it outputs only the first matching record of the lookup source.

The result of the JOIN by the First match model reads as follows:

Starting job tMap_Match_modes at 17:51 25/09/2013.

[statistics] connecting to socket on port 3942
[statistics] connected
.--+-------+-------------------.
|          tLogRow_2           |
|=-+-------+------------------=|
|ID|Name   |Email              |
|=-+-------+------------------=|
|1 |Shong  |Shong1@talend.com  |
|2 |Elisa  |Elisa@talend.com   |
|3 |Sabrina|Sabrina1@talend.com|
'--+-------+-------------------'
[statistics] disconnected
Job tMap_Match_modes ended at 17:51 25/09/2013. [exit code=0]
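
As a rough illustration of the First match behavior, the following Python sketch keeps the first lookup row seen for each ID and ignores later ones. It only mimics the result shown above and makes no claim about the code that tMap generates. The names `main_rows`, `lookup_rows` and `join_first_match` are hypothetical.

```python
# Hypothetical in-memory stand-ins for the main and lookup sources.
main_rows = [(1, "Shong"), (2, "Elisa"), (3, "Sabrina")]
lookup_rows = [
    (1, "Shong1@talend.com"),
    (1, "Shong2@talend.com"),
    (2, "Elisa@talend.com"),
    (3, "Sabrina1@talend.com"),
]


def join_first_match(main, lookup):
    """Inner join on ID that keeps only the first lookup row for each key."""
    first_by_id = {}
    for key, email in lookup:
        # setdefault stores the value only if the key is not present yet,
        # so later rows with the same ID are ignored.
        first_by_id.setdefault(key, email)
    # Inner join: main rows without a lookup match are dropped.
    return [(key, name, first_by_id[key]) for key, name in main if key in first_by_id]


result = join_first_match(main_rows, lookup_rows)
for row in result:
    print(row)
```

Here the second lookup row for ID 1 is ignored, so Shong is paired with Shong1@talend.com.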

All matches: it outputs all matching records of the lookup source, one output row per match.

The result of the JOIN by the All matches model reads as follows:

Starting job tMap_Match_modes at 17:58 25/09/2013.

[statistics] connecting to socket on port 3381
[statistics] connected
.--+-------+-------------------.
|          tLogRow_2           |
|=-+-------+------------------=|
|ID|Name   |Email              |
|=-+-------+------------------=|
|1 |Shong  |Shong1@talend.com  |
|1 |Shong  |Shong2@talend.com  |
|2 |Elisa  |Elisa@talend.com   |
|3 |Sabrina|Sabrina1@talend.com|
'--+-------+-------------------'
[statistics] disconnected
Job tMap_Match_modes ended at 17:58 25/09/2013. [exit code=0]
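
The All matches behavior corresponds to an ordinary one-to-many inner join. The Python sketch below groups the lookup rows by ID and emits one output row per match. It is an illustration under the same assumptions as the example data, not a description of tMap internals, and the names `main_rows`, `lookup_rows` and `join_all_matches` are hypothetical.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the main and lookup sources.
main_rows = [(1, "Shong"), (2, "Elisa"), (3, "Sabrina")]
lookup_rows = [
    (1, "Shong1@talend.com"),
    (1, "Shong2@talend.com"),
    (2, "Elisa@talend.com"),
    (3, "Sabrina1@talend.com"),
]


def join_all_matches(main, lookup):
    """Inner join on ID that emits one output row per matching lookup row."""
    emails_by_id = defaultdict(list)
    for key, email in lookup:
        # Keep every lookup row, in input order, under its ID.
        emails_by_id[key].append(email)
    joined = []
    for key, name in main:
        # .get avoids creating empty entries; unmatched main rows yield nothing.
        for email in emails_by_id.get(key, []):
            joined.append((key, name, email))
    return joined


result = join_all_matches(main_rows, lookup_rows)
for row in result:
    print(row)
```

ID 1 has two lookup rows, so Shong appears twice in the result, which gives four output rows in total.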