Defining a matching key with the VSR algorithm - Cloud - 8.0

Talend Studio User Guide

Last publication date
2024-02-29
Available in...

Big Data Platform

Cloud API Services Platform

Cloud Big Data Platform

Cloud Data Fabric

Cloud Data Management Platform

Data Fabric

Data Management Platform

Data Services Platform

MDM Platform

Real-Time Big Data Platform

Procedure

  1. In the Record linkage algorithm section, select Simple VSR if it is not selected by default.
  2. In the Data section, click the Select Matching Key tab, then click the names of the columns on which you want to apply the match algorithm.
    Matching keys that have the exact names of the selected input columns are listed in the Matching Key table.
    Figure: Examples of matching keys and their parameters in the Matching Key section.
    To remove a column from this table, right-click it and select Delete, or click its name in the Data table.
  3. Select the match algorithms you want to use from the Matching Function column and the null operator from the Handle Null column.
    In this example, two match keys are defined: the Levenshtein match method is applied to first names and the Jaro-Winkler match method to last names, in order to identify duplicate records.
    If you want to use an external user-defined matching algorithm, select Custom and use the Custom Matcher column to load the JAR file of the user-defined algorithm.
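
To make the effect of these settings more concrete, the following is a minimal Python sketch of what two matching keys like the ones in the example compute for a pair of records. It is an illustration only, not the Talend Studio implementation: the function names, the record layout, and the option strings `null_match_null`, `null_match_none` and `null_match_all` are assumptions chosen for this sketch, and the exact scores Talend Studio produces may differ.

```python
from typing import Callable, Optional


def levenshtein_similarity(a: str, b: str) -> float:
    """Normalized Levenshtein similarity in [0, 1]; 1.0 means identical."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return 1.0 - previous[-1] / max(len(a), len(b))


def jaro_similarity(a: str, b: str) -> float:
    """Jaro similarity in [0, 1]."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_matched[j] and b[j] == ca:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    out_of_order = 0
    j = 0
    for i in range(len(a)):
        if a_matched[i]:
            while not b_matched[j]:
                j += 1
            if a[i] != b[j]:
                out_of_order += 1
            j += 1
    transpositions = out_of_order // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler_similarity(a: str, b: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    jaro = jaro_similarity(a, b)
    prefix = 0
    for ca, cb in zip(a[:4], b[:4]):
        if ca != cb:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1.0 - jaro)


def score_key(a: Optional[str], b: Optional[str],
              matcher: Callable[[str, str], float],
              handle_null: str = "null_match_null") -> float:
    """Score one matching key; handle_null values are hypothetical names."""
    if a is None or b is None:
        if handle_null == "null_match_all":
            return 1.0   # a null value matches anything
        if handle_null == "null_match_null":
            return 1.0 if a is None and b is None else 0.0
        return 0.0       # null_match_none: a null value never matches
    return matcher(a, b)


# Hypothetical matching keys: column name -> (matching function, null operator).
matching_keys = {
    "first_name": (levenshtein_similarity, "null_match_null"),
    "last_name": (jaro_winkler_similarity, "null_match_null"),
}

record_1 = {"first_name": "Jon", "last_name": "Smith"}
record_2 = {"first_name": "John", "last_name": "Smyth"}

scores = {
    column: score_key(record_1[column], record_2[column], matcher, null_op)
    for column, (matcher, null_op) in matching_keys.items()
}
print(scores)
```

Each matching key yields a similarity between 0 and 1 for the pair of records. How these per-key scores are combined into a final match decision is outside the scope of this sketch.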