Big Data Platform
Cloud API Services Platform
Cloud Big Data Platform
Cloud Data Fabric
Cloud Data Management Platform
Data Management Platform
Data Services Platform
Real-Time Big Data Platform
You can use the T-Swoosh algorithm to find duplicates and to define, through a survivorship function, how two similar records are merged into a master record. The merged records are then used in turn to find further duplicates.
The T-Swoosh algorithm differs from the VSR algorithm in the following ways. When using the T-Swoosh algorithm:
- The master record is in general a new record that does not exist in the list of input records.
- You can define a survivorship function for each column to create a master record.
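The following minimal Python sketch illustrates these two points. It is not the product's implementation: the column names, the `similar` matching rule, the 0.8 threshold, and the `SURVIVORSHIP` table are all hypothetical, chosen only to show how per-column survivorship functions can produce a master record that is not among the input records, and how a merged record is fed back in to look for further duplicates.

```python
from difflib import SequenceMatcher

# Hypothetical matching rule: two records match when their names are similar enough.
def similar(a, b, threshold=0.8):
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio >= threshold

# Hypothetical survivorship functions, one per column.
SURVIVORSHIP = {
    "name": lambda x, y: max(x, y, key=len),  # keep the longest value
    "email": lambda x, y: x or y,             # keep the first non-empty value
    "visits": lambda x, y: x + y,             # add the values together
}

def merge(a, b):
    """Build a master record by applying each column's survivorship function."""
    return {col: fn(a[col], b[col]) for col, fn in SURVIVORSHIP.items()}

def swoosh(records):
    """Swoosh-style loop: every merged record is compared again with the others."""
    pending = list(records)
    masters = []
    while pending:
        current = pending.pop()
        match = next((m for m in masters if similar(current, m)), None)
        if match is None:
            masters.append(current)
        else:
            # Replace the matched record with the merged one and re-examine it.
            masters.remove(match)
            pending.append(merge(current, match))
    return masters

records = [
    {"name": "Jon Smith", "email": "", "visits": 2},
    {"name": "John Smith", "email": "john@example.com", "visits": 3},
    {"name": "Alice Wong", "email": "alice@example.com", "visits": 1},
]
masters = swoosh(records)
```

In this sample, the two "Smith" records collapse into one master record whose `visits` value is the sum of both inputs, so the master record matches neither input record exactly.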