Extracting matching features using tMatchModel

You can use the labeled sample of suspect duplicate pairs as input to the tMatchModel component.

You must specify the set of columns the model is built on and the column that contains the label. The algorithm computes different measures, called features, to capture as much information as possible from this set of columns.
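To make the idea of features more concrete, here is a minimal Python sketch of how a pair of records can be turned into numeric measures, one group per selected column. It is an illustration only: the function name pair_features, the sample records, and the three measures (exact match, similarity ratio, length difference) are assumptions chosen for the example, not the features that tMatchModel actually computes.

```python
from difflib import SequenceMatcher


def pair_features(rec_a, rec_b, columns):
    """Compute illustrative similarity features for one record pair.

    Hypothetical helper: the measures below are examples, not the
    features computed by tMatchModel.
    """
    features = {}
    for col in columns:
        # Normalize both values so that case and padding do not matter.
        a = str(rec_a.get(col, "")).strip().lower()
        b = str(rec_b.get(col, "")).strip().lower()
        features[f"{col}_exact"] = 1.0 if a == b else 0.0
        features[f"{col}_ratio"] = SequenceMatcher(None, a, b).ratio()
        features[f"{col}_len_diff"] = float(abs(len(a) - len(b)))
    return features


# One suspect pair from a labeled sample (made-up data).
pair = (
    {"name": "John Smith", "city": "Paris"},
    {"name": "Jon Smith", "city": "paris"},
)
label = "YES"  # value of the label column for this pair

features = pair_features(pair[0], pair[1], ["name", "city"])
```

Each labeled pair then becomes one row of numeric features plus its label, which is the form a classification model can be trained on.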

The tMatchModel component uses the Random forest algorithm to build the model. This algorithm extends decision trees: it trains many trees, each on a random sample of the data, and combines their votes to classify each pair.
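The following Python sketch shows the general principle of a random forest on a tiny set of labeled pairs. It is a simplified teaching example, not the implementation used by tMatchModel: the function names, the two-feature rows, the depth limit, and the number of trees are all assumptions made for illustration.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(rows, labels, rng, n_features, depth=0, max_depth=3):
    """Grow one decision tree; a leaf is simply a class label."""
    if depth >= max_depth or len(set(labels)) == 1:
        return Counter(labels).most_common(1)[0][0]
    # Random forests consider only a random subset of features per split.
    candidates = rng.sample(range(len(rows[0])), n_features)
    best = None
    for f in candidates:
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return Counter(labels).most_common(1)[0][0]
    _, f, t, left, right = best
    return (
        f, t,
        build_tree([rows[i] for i in left], [labels[i] for i in left],
                   rng, n_features, depth + 1, max_depth),
        build_tree([rows[i] for i in right], [labels[i] for i in right],
                   rng, n_features, depth + 1, max_depth),
    )


def predict_tree(tree, row):
    """Walk down the tree until a leaf (a label) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the labeled rows."""
    rng = random.Random(seed)
    n_features = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng, n_features))
    return forest


def predict_forest(forest, row):
    """Classify a row by majority vote over all trees."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]


# Made-up feature rows: [name similarity, city exact match].
rows = [[0.95, 1.0], [0.90, 1.0], [0.88, 1.0],
        [0.30, 0.0], [0.25, 0.0], [0.40, 0.0]]
labels = ["YES", "YES", "YES", "NO", "NO", "NO"]

forest = train_forest(rows, labels)
prediction = predict_forest(forest, [0.92, 1.0])
```

Because every tree sees a different bootstrap sample and a different random choice of features, individual trees can disagree; the majority vote is what makes the combined model more stable than a single decision tree.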
