Procedure
- From the Profiling perspective, right-click Metadata and create a file connection to the duplicated_records output file generated by the Job.
For more information, see the Data Profiling section of the Talend Studio User Guide.
- Expand the new file connection under Metadata and select Analyze matches.
- Follow the steps in the wizard to define the analysis metadata and click Finish to open the analysis editor.
- In the Matching Key table, define a match key on the Code column to group records by their identification: records that have the same code are grouped together.
- Click Chart below the table to show the duplicates generated according to the Bernoulli distribution selected previously in the Job.
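To sanity-check the grouping outside the Studio, the effect of a match key on the Code column can be approximated with a short script. The following is a minimal sketch, not the Studio's own matching logic: it assumes the duplicated_records file is semicolon-delimited with a header row containing a Code column, and it uses a small hypothetical in-memory sample in place of the real file. It counts how many records share each code, then how many groups exist for each group size.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the duplicated_records file.
# The real delimiter and column layout depend on the Job's output schema.
SAMPLE = """Code;Name
A1;Alice
B2;Bob
A1;Alice
C3;Carol
B2;Bob
A1;Alice
"""


def group_sizes_by_code(text, key="Code", delimiter=";"):
    """Return a mapping of code -> number of records sharing that code."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return Counter(row[key] for row in reader)


def group_size_distribution(sizes):
    """Return a mapping of group size -> number of groups of that size."""
    return Counter(sizes.values())


sizes = group_sizes_by_code(SAMPLE)
distribution = group_size_distribution(sizes)

# Print one line per group size, smallest groups first.
for size in sorted(distribution):
    print(f"groups with {size} record(s): {distribution[size]}")
```

To run this against the actual output file, read the file's contents and pass them to `group_sizes_by_code`, adjusting `key` and `delimiter` to match the schema defined in the Job.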