- In the Profiling perspective, click Analysis Results at the bottom of the editor.
- In the Simple Statistics results of the email column, right-click the duplicate count bar in the chart and select
The Integration perspective opens in Talend Studio showing the generated Job with the corresponding components.
For more information on these components, see the Talend Components Reference Guide.
The database input and tUniqueRow components are already configured according to your connection and the columns you are analyzing.
- Save the Job and press F6 to execute it.
Duplicate values are written to the specified output database and file.
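As a conceptual illustration only, the split that the generated Job performs on the analyzed column can be sketched outside Talend Studio. The following Python snippet is not the code Talend generates; the function name `split_unique_rows` and the sample rows are hypothetical, and it assumes that the first occurrence of a value is kept as unique, that later occurrences are treated as duplicates, and that values are compared as exact, case-sensitive strings.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, str]


def split_unique_rows(rows: Iterable[Row], key: str) -> Tuple[List[Row], List[Row]]:
    """Split rows into (uniques, duplicates) based on the value of one column.

    Hypothetical helper: the first row seen for each key value is kept as
    unique; every later row with the same key value is a duplicate.
    """
    seen = set()
    uniques: List[Row] = []
    duplicates: List[Row] = []
    for row in rows:
        value = row[key]
        if value in seen:
            duplicates.append(row)
        else:
            seen.add(value)
            uniques.append(row)
    return uniques, duplicates


# Made-up sample data standing in for the analyzed table.
rows = [
    {"id": "1", "email": "ann@example.com"},
    {"id": "2", "email": "bob@example.com"},
    {"id": "3", "email": "ann@example.com"},
]

uniques, duplicates = split_unique_rows(rows, key="email")
```

In this sketch, `uniques` and `duplicates` play the role of the two output flows that the Job writes to its configured outputs.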
What to do next
You can follow the same procedure to remove duplicates from the postal column.
For further information on using the Profiling perspective to identify and remove corrupt, incomplete, or inaccurate data, see the Data Cleansing chapter in the Talend Studio User Guide.