Defining a matching key with the VSR algorithm - 7.3

Talend Open Studio User Guide

Version: 7.3
Language: English
Product: Talend Open Studio for Big Data, Talend Open Studio for Data Integration, Talend Open Studio for Data Quality, Talend Open Studio for ESB
Module: Talend Studio
Content: Design and Development
Last publication date: 2023-10-11
Available in: Open Studio for Data Quality

Procedure

  1. In the Record linkage algorithm section, select Simple VSR Matcher if it is not selected by default.
  2. In the Data section, click the Select Matching Key tab, then click the name of each column to which you want to apply the match algorithm.
    For each selected input column, a matching key with exactly the same name is listed in the Matching Key table.
    To remove a column from this table, right-click it and select Delete, or click its name in the Data table.
  3. For each matching key, select the match algorithm to use from the Matching Function column and the null operator from the Handle Null column.
    In this example, two match keys are defined: the Levenshtein method is applied to first names and the Jaro-Winkler method to last names, in order to find duplicate records.
    To use an external user-defined matching algorithm, select Custom and use the Custom Matcher column to load the JAR file of the user-defined algorithm.
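
To give a feel for what the two match methods in this example measure, the following is a minimal Python sketch of a normalized Levenshtein similarity and a Jaro-Winkler similarity, applied to first names and last names respectively. It is not Talend's implementation: the function names, the record layout (`first_name`, `last_name`) and the `score_pair` helper are assumptions made for illustration only, and the Studio may normalize or score values differently.

```python
def levenshtein_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1], computed as 1 - edit_distance / longer_length."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Classic dynamic-programming edit distance, keeping only two rows.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return 1.0 - prev[-1] / max(len(a), len(b))


def jaro_similarity(a: str, b: str) -> float:
    """Jaro similarity in [0, 1], based on matching characters and transpositions."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    # Characters only match if they are within this distance of each other.
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        lo = max(0, i - window)
        hi = min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_matched[j] and b[j] == ca:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as transpositions.
    a_seq = [c for c, m in zip(a, a_matched) if m]
    b_seq = [c for c, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler_similarity(a: str, b: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    jaro = jaro_similarity(a, b)
    prefix = 0
    for ca, cb in zip(a[:4], b[:4]):
        if ca != cb:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1.0 - jaro)


def score_pair(rec1: dict, rec2: dict) -> dict:
    """Hypothetical helper: one score per matching key, as in the example above."""
    return {
        "first_name": levenshtein_similarity(rec1["first_name"], rec2["first_name"]),
        "last_name": jaro_winkler_similarity(rec1["last_name"], rec2["last_name"]),
    }


if __name__ == "__main__":
    r1 = {"first_name": "Jon", "last_name": "Martha"}
    r2 = {"first_name": "John", "last_name": "Marhta"}
    print(score_pair(r1, r2))
```

Both functions return values between 0 and 1, where 1 means the two strings are identical. Levenshtein counts single-character edits, so it suits short typos, while Jaro-Winkler tolerates swapped characters and rewards a shared prefix, which is often useful for surnames.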