For more technologies supported by Talend, see Talend components.
tMatchModel reads the sample of suspect pairs computed on a list of duplicate childhood education centers and labeled by data stewards in Talend Data Stewardship. It generates several matching models, searches for the best combination of learning parameters, and keeps the model that performs best in cross-validation.
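The parameter search described above can be illustrated with a minimal, standard-library Python sketch. It is not how tMatchModel is implemented: the sample data, the similarity features, the weighted-score model, and every name in it (`PAIRS`, `GRID`, `fit`, `cross_validate`, `grid_search`) are assumptions made for illustration only. It shows the general idea of trying each combination of learning parameters, scoring each one with k-fold cross-validation, and keeping the best.

```python
from itertools import product

# Hypothetical labeled sample: (name_similarity, address_similarity, is_match).
PAIRS = [
    (0.95, 0.90, True), (0.88, 0.35, True), (0.91, 0.80, True), (0.85, 0.60, True),
    (0.40, 0.85, False), (0.30, 0.20, False), (0.55, 0.50, False), (0.20, 0.90, False),
    (0.92, 0.70, True), (0.45, 0.30, False), (0.83, 0.75, True), (0.35, 0.65, False),
]

# Hypothetical grid of learning parameters to try.
GRID = {"name_weight": [0.0, 0.5, 1.0], "steps": [4, 10]}


def score(pair, name_weight):
    """Weighted similarity score of one pair."""
    name_sim, addr_sim, _ = pair
    return name_weight * name_sim + (1 - name_weight) * addr_sim


def accuracy(pairs, name_weight, threshold):
    """Share of pairs whose predicted label equals the steward's label."""
    hits = sum((score(p, name_weight) >= threshold) == p[2] for p in pairs)
    return hits / len(pairs)


def fit(train, name_weight, steps):
    """Learn the decision threshold that maximizes accuracy on the training pairs."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda t: accuracy(train, name_weight, t))


def cross_validate(pairs, k, name_weight, steps):
    """Mean held-out accuracy over k folds for one parameter combination."""
    fold_scores = []
    for fold in range(k):
        test = [p for i, p in enumerate(pairs) if i % k == fold]
        train = [p for i, p in enumerate(pairs) if i % k != fold]
        threshold = fit(train, name_weight, steps)
        fold_scores.append(accuracy(test, name_weight, threshold))
    return sum(fold_scores) / k


def grid_search(pairs, grid, k=3):
    """Try every parameter combination and keep the best cross-validated one."""
    best_params, best_score = None, -1.0
    names = list(grid)
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        cv_score = cross_validate(pairs, k, **params)
        if cv_score > best_score:
            best_params, best_score = params, cv_score
    return best_params, best_score


if __name__ == "__main__":
    params, cv_score = grid_search(PAIRS, GRID)
    final_threshold = fit(PAIRS, **params)  # refit on the full sample
    print(params, round(cv_score, 3), final_threshold)
```

Scoring each combination on held-out folds rather than on the training pairs is what keeps the selected model from simply memorizing the labeled sample.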
- You have generated the suspect data pairs by using the tMatchPairing component and labeled them in Talend Data Stewardship. For further information, see Computing suspect pairs and writing a sample in Talend Data Stewardship.
For further information about handling grouping tasks to decide on the relationship between pairs of records, see Talend Data Stewardship Examples.