Executing the Job - 7.1

Identification

author
Talend Documentation Team
EnrichVersion
7.1
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Open Studio for Big Data
Talend Open Studio for Data Integration
Talend Open Studio for ESB
Talend Open Studio for MDM
Talend Real-Time Big Data Platform
EnrichPlatform
Talend Studio

Procedure

Save your Job and press F6 to execute it.
The output includes the T_GEN_KEY column, which holds the functional key generated by the tGenKey component.
Records that share the same functional key are grouped together in blocks ("groups"). The GID column lists the identifier of the group each record belongs to. The GRP_SIZE column gives the number of records in each group and is computed only on the master record. The MASTER column indicates with true or false whether the record is the master record of its group. The SCORE column lists the distance between the input record and the master record, calculated according to the Jaro-Winkler matching algorithm.
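The following Python sketch illustrates how the output columns described above relate to each other. It is not Talend code and does not reproduce the internal logic of the tMatchGroup component. It rests on several assumptions: the sample records and the name field are hypothetical, the first record of each block is taken as the master, GID is a simple sequential number, GRP_SIZE is set to 0 on non-master records, every block is treated as a single group with no match threshold applied, and the score is the standard Jaro-Winkler similarity with a prefix scale of 0.1.

```python
from collections import defaultdict


def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity between two strings, in the range [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (
        matches / len1 + matches / len2 + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a common prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def group_by_key(records, match_field="name"):
    """Add GID, GRP_SIZE, MASTER and SCORE to records blocked on T_GEN_KEY."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[rec["T_GEN_KEY"]].append(rec)

    output = []
    for gid, block in enumerate(blocks.values(), start=1):
        master = block[0]  # assumption: the first record of a block is the master
        for rec in block:
            is_master = rec is master
            output.append({
                **rec,
                "GID": gid,
                # The group size is reported on the master record only.
                "GRP_SIZE": len(block) if is_master else 0,
                "MASTER": is_master,
                "SCORE": round(
                    jaro_winkler(rec[match_field], master[match_field]), 4
                ),
            })
    return output


if __name__ == "__main__":
    # Hypothetical sample data; the key values are made up for illustration.
    sample = [
        {"name": "MARTHA", "T_GEN_KEY": "M6"},
        {"name": "MARHTA", "T_GEN_KEY": "M6"},
        {"name": "DIXON", "T_GEN_KEY": "D5"},
    ]
    for row in group_by_key(sample):
        print(row)
```

In this sketch, the two records that share the key M6 receive the same GID, only the first of them carries a non-zero GRP_SIZE and a MASTER value of true, and a record with a unique key forms a group of size 1 in which it is its own master.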
For an example of creating data partitions based on different blocking keys and using them with multiple tMatchGroup components, see tMatchGroup.