
Configuring the tBlockedFuzzyJoin component

Availability note: Deprecated

Procedure

  1. Double-click tBlockedFuzzyJoin to display its Basic settings view and define its properties.
  2. Click the Edit schema button to open a dialog box. Here you can define the data you want to pass to the output components.

    In this example we want to pass the four input columns to the output components in addition to the new column ref_firstname.

  3. Click OK to close the dialog box and proceed to the next step.
  4. In the Key definition area of the Basic settings view of tBlockedFuzzyJoin, click the plus button to add two columns to the list.
  5. From the Input key attribute and Lookup key attribute lists respectively, select the input columns and the lookup (reference) columns you want to do the fuzzy matching on, grp and firstname in this example.
  6. Click in the first cell of the Matching type column and select from the list the method to be used to check the incoming data against the reference data, Exact match in this example. No minimum or maximum distance needs to be set for this matching type.
  7. Set the matching type for the second column, Levenshtein in this example.
  8. Then set the minimum and maximum distances. With this method, the distance is the number of single-character changes (insertions, deletions or substitutions) needed for the entry to fully match the reference. In this example, we want the min. distance to be 0 and the max. distance to be 2. This outputs all entries in the firstname column that either match the reference exactly or differ from it by at most two character changes.
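
The matching logic configured above can be illustrated with a minimal Python sketch. It is not the component's actual implementation: the function names levenshtein and blocked_fuzzy_join are hypothetical, only the grp and firstname columns are modelled, and details such as case sensitivity or how several candidate matches are handled are assumptions. Rows are first restricted to reference rows with the same grp value (Exact match), then kept only if the Levenshtein distance between the firstname values lies between the minimum and maximum distances.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def blocked_fuzzy_join(main_rows, lookup_rows, min_dist=0, max_dist=2):
    """Hypothetical sketch: exact match on grp, Levenshtein on firstname.

    Each row is a dict with at least the keys "grp" and "firstname".
    Every accepted pair yields the input row plus a ref_firstname column.
    """
    # Group the reference rows by the exact-match key.
    blocks = {}
    for ref in lookup_rows:
        blocks.setdefault(ref["grp"], []).append(ref)

    matches = []
    for row in main_rows:
        # Only reference rows with an identical grp value are compared.
        for ref in blocks.get(row["grp"], []):
            distance = levenshtein(row["firstname"], ref["firstname"])
            if min_dist <= distance <= max_dist:
                matches.append({**row, "ref_firstname": ref["firstname"]})
    return matches
```

With a minimum distance of 0 and a maximum distance of 2, an input row with firstname "Jon" would be paired with a reference row "John" in the same grp (one insertion), while a reference row in a different grp would never be compared at all.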
