Setting up the Job
Procedure
- Drop the following components from the Palette onto the design workspace: tFileInputDelimited, tMatchPredict and tFileOutputDelimited.
- Connect tFileInputDelimited to tMatchPredict using the Main link.
- Connect tMatchPredict to tFileOutputDelimited using the Suspect duplicates link.
- In the Run > Spark Configuration view, check that the connection to the Spark cluster is defined and that checkpointing is activated, as described in Computing suspect pairs and suspect sample from source data.
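The layout produced by the steps above can be sketched as a small graph. This is only an illustration of the wiring in plain Python, not Talend code or a Talend API: the `Link` tuple and the `flow_order` helper are hypothetical names introduced here, and only the component and link names come from the procedure.

```python
# Illustrative model of the Job layout only; this is not a Talend API.
from collections import namedtuple

# Hypothetical structure: one entry per link drawn in the design workspace.
Link = namedtuple("Link", ["source", "target", "kind"])

components = ["tFileInputDelimited", "tMatchPredict", "tFileOutputDelimited"]

links = [
    Link("tFileInputDelimited", "tMatchPredict", "Main"),
    Link("tMatchPredict", "tFileOutputDelimited", "Suspect duplicates"),
]


def flow_order(components, links):
    """Return the components in data-flow order for a single linear chain."""
    targets = {link.target for link in links}
    next_of = {link.source: link.target for link in links}
    # The starting component is the only one that no link points to.
    starts = [name for name in components if name not in targets]
    if len(starts) != 1:
        raise ValueError("expected exactly one input component")
    order = [starts[0]]
    while order[-1] in next_of:
        order.append(next_of[order[-1]])
    return order


print(" -> ".join(flow_order(components, links)))
# prints: tFileInputDelimited -> tMatchPredict -> tFileOutputDelimited
```

A check like this only confirms that the three components form one chain from input to output; the Spark connection and checkpointing settings still have to be verified in Studio.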
Results
![Job with tFileInputDelimited, tMatchPredict and tFileOutputDelimited connected](/en-US/data-matching/8.0/Content/Resources/images/use_case-tmatchpredict.png)