Defining a matching key with the VSR algorithm - 7.1

Talend Real-Time Big Data Platform Studio User Guide

author: Talend Documentation Team
EnrichVersion: 7.1
EnrichProdName: Talend Real-Time Big Data Platform
task: Design and Development
EnrichPlatform: Talend Studio

Procedure

  1. In the Record linkage algorithm section, select Simple VSR Matcher if it is not selected by default.
  2. In the Data section, click the Select Matching Key tab and then click the name of the column(s) on which you want to apply the match algorithm.
    The Matching Key table lists one matching key for each selected input column, named exactly after that column.
    To remove a column from this table, right-click it and select Delete, or click its name in the Data table.
  3. Select the match algorithms you want to use from the Matching Function column and the null operator from the Handle Null column.
    In this example, two match keys are defined: the Levenshtein method is applied to first names and the Jaro-Winkler method to last names, in order to find the duplicate records.
    If you want to use an external user-defined matching algorithm, select Custom and use the Custom Matcher column to load the JAR file of the user-defined algorithm.
    For further information about the match rule algorithms and parameters, see the tMatchGroup documentation in the Talend Components Reference Guide.
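
To give a rough idea of what the two match methods in this example compute, the following Python sketch implements textbook versions of the Levenshtein and Jaro-Winkler similarities and applies them to first names and last names respectively. It is an illustration only, not the Studio's implementation: the function names, the record layout with `first_name` and `last_name` fields, the normalization of the Levenshtein distance by the longer string length, and the 0.1 prefix scale are all assumptions made for this sketch, and the scores the Studio reports may differ.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Distance scaled to [0, 1] by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def jaro_similarity(a: str, b: str) -> float:
    """Standard Jaro similarity between two strings."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_matched[j] and b[j] == ca:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [c for c, m in zip(a, a_matched) if m]
    b_seq = [c for c, m in zip(b, b_matched) if m]
    # Matched characters that appear in a different order, counted in pairs.
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def jaro_winkler_similarity(a: str, b: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    jaro = jaro_similarity(a, b)
    prefix = 0
    for ca, cb in zip(a[:4], b[:4]):
        if ca != cb:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1.0 - jaro)


def score_pair(record_a: dict, record_b: dict) -> dict:
    """Score two records: Levenshtein on first names, Jaro-Winkler on last names.

    The field names are hypothetical and only mirror the example in the procedure.
    """
    return {
        "first_name": levenshtein_similarity(record_a["first_name"],
                                             record_b["first_name"]),
        "last_name": jaro_winkler_similarity(record_a["last_name"],
                                             record_b["last_name"]),
    }
```

For instance, `score_pair({"first_name": "Jon", "last_name": "Smith"}, {"first_name": "John", "last_name": "Smyth"})` returns one score between 0 and 1 per matching key. How such per-key scores are weighted and compared against a threshold to decide that two records are duplicates is governed by the match rule parameters described in the tMatchGroup documentation, not by this sketch.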