Creating and defining a match rule - 8.0

Master Data Management Examples

Version: 8.0
Language: English
Product: Talend Data Fabric, Talend MDM Platform
Module: Talend Data Stewardship, Talend MDM Server, Talend MDM Web UI, Talend Studio
Content: Data Governance > Validating data; Data Quality and Preparation > Deduplicating data; Data Quality and Preparation > Matching data
Last publication date: 2023-09-19

In this scenario, you create and define a match rule named MatchCustomer that matches the staging data records belonging to the Customer entity based on the fname and lname fields.

In MDM, match rules are used to decide whether two or more data records match, and how to handle them if they do.

Procedure

  1. In the MDM Repository tree view, right-click Match Rule and then select New from the contextual menu.
  2. In the dialog box that opens, enter a name for the new match rule, MatchCustomer in this example.
    If needed, enter information in the Purpose and Description fields to better describe your match rule.
  3. Click Finish to close the dialog box.
    The newly created match rule is displayed under the Match Rule node. You need to further define the characteristics of the match rule in the Match Rule Editor that opens.
  4. In the Record linkage algorithm section, select T-Swoosh.
    You can use the T-Swoosh algorithm to find duplicates and to define how two similar records are merged to create a master record, using a survivorship function.
  5. In the Match and Survivor section, define the criteria to use when matching staging data records.
    In this example, add two match keys, Firstname and Lastname, select Jaro-Winkler as the matching function for each, set both thresholds to 0.8, and select Longest (for strings) as the survivorship function.
  6. In the Default Survivorship Rules section, define how values from matched records survive for certain data types: Boolean, Number and Date.
    If you do not specify the behavior for a data type, the default behavior is applied to that type.

Once you have defined the match rule, you must attach it to a specific entity of a data model.
You cannot deploy a match rule directly to the MDM server. Instead, match rules are deployed along with the data model to which they are attached.
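
To give an intuition for what the settings in step 5 mean, the following Python sketch computes a textbook Jaro-Winkler similarity, compares two records on fname and lname against a threshold of 0.8, and merges them with a "longest string wins" survivorship function. This is an illustration only, not the Talend implementation: the function names (jaro, jaro_winkler, records_match, survive_longest) are hypothetical, and the sketch assumes that two records match when every match key reaches its own threshold, which may differ from how T-Swoosh combines scores in practice.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


# Assumption: one threshold per match key, as in the example rule.
THRESHOLDS = {"fname": 0.8, "lname": 0.8}


def records_match(a: dict, b: dict, thresholds: dict = THRESHOLDS) -> bool:
    """Assumed rule: every key must reach its own threshold."""
    return all(
        jaro_winkler(a[key], b[key]) >= limit
        for key, limit in thresholds.items()
    )


def survive_longest(a: dict, b: dict, keys=("fname", "lname")) -> dict:
    """'Longest' survivorship: keep the longer string for each key."""
    return {key: max(a[key], b[key], key=len) for key in keys}


if __name__ == "__main__":
    rec1 = {"fname": "Jon", "lname": "Smith"}
    rec2 = {"fname": "Jonathan", "lname": "Smyth"}
    if records_match(rec1, rec2):
        print(survive_longest(rec1, rec2))
```

With these definitions, "Jonathan" and "Jonathon" score about 0.95 and "Smith" and "Smyth" about 0.89, so both pairs clear the 0.8 threshold, whereas "Jonathan" and "Maria" score about 0.55 and do not.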