tMatchIndexPredict

Continuous matching

author
Talend Documentation Team
EnrichVersion
6.5
EnrichProdName
Talend Data Fabric
Talend Big Data Platform
Talend Real-Time Big Data Platform
task
Design and Development > Third-party systems > Data Quality components > Matching components > Continuous matching components
Data Governance > Third-party systems > Data Quality components > Matching components > Continuous matching components
Data Quality and Preparation > Third-party systems > Data Quality components > Matching components > Continuous matching components
EnrichPlatform
Talend Studio
Talend Data Stewardship

Compares a new data set with a lookup data set that tMatchIndex has indexed in Elasticsearch. tMatchIndexPredict writes unique records and potential duplicates to separate output files.

In the potential duplicates output, each record contains the fields from the source records and the fields from the potentially matching lookup records.
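
The following Python sketch illustrates only the shape of the two outputs. It is not the component's implementation: the exact, case-insensitive comparison on a single field stands in for the real matching step, which is not described here, and the names split_records, key and the lookup_ prefix are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def split_records(
    new_records: List[Record],
    lookup_records: List[Record],
    key: str,
) -> Tuple[List[Record], List[Record]]:
    """Split new records into unique records and potential duplicates.

    Assumption: an exact, case-insensitive comparison on one field
    replaces the real matching logic, purely for illustration.
    """
    unique: List[Record] = []
    suspects: List[Record] = []
    for source in new_records:
        matches = [
            lookup
            for lookup in lookup_records
            if lookup[key].strip().lower() == source[key].strip().lower()
        ]
        if not matches:
            # No lookup record resembles this one: it goes to the unique output.
            unique.append(source)
            continue
        for lookup in matches:
            # A potential duplicate carries the source fields plus the fields
            # of the lookup record. The "lookup_" prefix is a hypothetical
            # naming choice that keeps the two sets of fields apart.
            combined = dict(source)
            combined.update({f"lookup_{name}": value for name, value in lookup.items()})
            suspects.append(combined)
    return unique, suspects


lookup = [{"id": "1", "name": "Ann Lee"}]
new = [{"id": "a", "name": "ann lee "}, {"id": "b", "name": "Bob Ray"}]
unique, suspects = split_records(new, lookup, key="name")
```

In this toy run, the record for "Bob Ray" lands in the unique output, while the record for "ann lee " lands in the potential duplicates output together with the id and name of the lookup record it resembles.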

For more information about tMatchIndex, see tMatchIndex.

This component runs only with Spark 2.0 or later and Elasticsearch 5 or later.

For more technologies supported by Talend, see Talend components.