For more technologies supported by Talend, see Talend components.
This scenario describes a basic Job that generates a sample of duplicate data from an input flow, using probability theory and specific criteria applied to three columns: Name, City and DOB (date of birth).
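To make the idea concrete, the following is a minimal Python sketch of probability-driven duplicate generation over the Name, City and DOB columns. It is not the Job's actual implementation: the function names, the `duplicate_rate` and `modify_prob` parameters, the character-swap perturbation and the sample rows are all assumptions made for illustration.

```python
import random

# The three columns named in the scenario.
COLUMNS = ("Name", "City", "DOB")


def swap_adjacent(value, rng):
    """Swap two neighbouring characters to imitate a typing error (assumed perturbation)."""
    if len(value) < 2:
        return value
    i = rng.randrange(len(value) - 1)
    chars = list(value)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def generate_duplicates(rows, duplicate_rate=0.5, modify_prob=0.5, seed=0):
    """Return the input rows plus randomly generated, slightly altered copies.

    duplicate_rate: probability that a row receives one duplicate (hypothetical parameter).
    modify_prob: probability that each column of a duplicate is altered (hypothetical parameter).
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    output = []
    for row in rows:
        output.append(dict(row))  # always keep the original row
        if rng.random() < duplicate_rate:
            duplicate = dict(row)
            for column in COLUMNS:
                if rng.random() < modify_prob:
                    duplicate[column] = swap_adjacent(duplicate[column], rng)
            output.append(duplicate)
    return output


# Invented sample rows, for illustration only.
sample = [
    {"Name": "Alice Martin", "City": "Paris", "DOB": "1980-04-12"},
    {"Name": "Bob Stone", "City": "Boston", "DOB": "1975-11-30"},
]
result = generate_duplicates(sample, duplicate_rate=1.0, modify_prob=0.5, seed=42)
```

With `duplicate_rate=1.0`, every input row is followed by exactly one duplicate, so `result` holds four rows. Lowering the rate yields a sparser sample of duplicates.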
Below is a capture of sample data from the input flow: