Working principle

This component implements the MapReduce model and groups records by the blocking keys defined in the Blocking definition table of the Basic settings view.

This implementation proceeds as follows:

  1. Splits the input rows into groups of a given size.

  2. Implements a Map class that maps each blocking key to a list of records.

  3. Shuffles the records to group those with the same key together.

  4. Applies, to each key, the algorithm defined in the Key definition table of the Basic settings view.

    For each key, this component then reads the records, compares each one with the existing master records, groups the similar ones, and classifies each remaining record as a new master record.

  5. Outputs the groups of similar records with their group IDs, group sizes, matching distances and scores.
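
The five steps above can be sketched in plain Python. This is only an illustration of the flow, not the component's actual code: the function names, the `name` column, the first-letter blocking key, the `difflib` similarity measure, the 0.8 threshold, and the output field names (`group_id`, `group_size`, `is_master`, `score`) are all assumptions made for the example. Matching distances are omitted for brevity.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def split_rows(rows, size):
    """Step 1: cut the input rows into chunks of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def map_chunk(chunk, blocking_key):
    """Step 2: emit one (blocking key, record) pair per record."""
    return [(blocking_key(rec), rec) for rec in chunk]


def shuffle(pairs):
    """Step 3: bring records that share a blocking key together."""
    blocks = defaultdict(list)
    for key, rec in pairs:
        blocks[key].append(rec)
    return blocks


def similarity(a, b):
    """Hypothetical matching function standing in for the Key definition."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def reduce_block(records, threshold):
    """Step 4: compare each record with the masters found so far.

    A record joins the most similar master's group if the score reaches
    the threshold; otherwise it becomes a new master record.
    """
    groups = []
    for rec in records:
        best, best_score = None, 0.0
        for group in groups:
            score = similarity(rec["name"], group["master"]["name"])
            if score >= threshold and score > best_score:
                best, best_score = group, score
        if best is None:
            groups.append({"master": rec, "members": [(rec, 1.0)]})
        else:
            best["members"].append((rec, best_score))
    return groups


def match_groups(rows, blocking_key, chunk_size=2, threshold=0.8):
    """Run steps 1 to 5 and return the records with their group details."""
    pairs = []
    for chunk in split_rows(rows, chunk_size):
        pairs.extend(map_chunk(chunk, blocking_key))

    output = []
    for key, records in shuffle(pairs).items():
        for index, group in enumerate(reduce_block(records, threshold)):
            # Step 5: attach group ID, group size and score to each record.
            for rec, score in group["members"]:
                output.append({
                    **rec,
                    "group_id": f"{key}-{index}",
                    "group_size": len(group["members"]),
                    "is_master": rec is group["master"],
                    "score": round(score, 2),
                })
    return output


if __name__ == "__main__":
    sample = [{"name": "Jon Smith"}, {"name": "John Smith"},
              {"name": "Jane Doe"}, {"name": "Sam Lee"}]
    for row in match_groups(sample, lambda rec: rec["name"][0].upper()):
        print(row)
```

In this sketch, records are only ever compared with other records in the same block, which is what keeps the number of comparisons manageable on large inputs. The trade-off is that two similar records with different blocking keys are never compared.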
