From the Profiling perspective of the studio, you can generate ready-to-use Jobs for specific files defined in the studio metadata. You can generate a Job to:
remove all the duplicates from a delimited file,
match the data in a delimited file against the data in another data source.
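
To make the two operations concrete, the following is a minimal Python sketch of what removing duplicates and matching against another source amount to conceptually. It is not the Job that the studio generates: the function names (`read_delimited`, `remove_duplicates`, `match_against`), the semicolon delimiter, the `id` key column and the sample data are all assumptions made for illustration only.

```python
import csv
import io


def read_delimited(text, delimiter=";"):
    """Parse delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def remove_duplicates(rows):
    """Keep the first occurrence of each fully identical row."""
    seen = set()
    unique = []
    for row in rows:
        fingerprint = tuple(row.items())
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


def match_against(rows, reference_rows, key):
    """Split rows by whether their key value exists in the reference source."""
    reference_keys = {ref[key] for ref in reference_rows}
    matched = [row for row in rows if row[key] in reference_keys]
    unmatched = [row for row in rows if row[key] not in reference_keys]
    return matched, unmatched


# Hypothetical sample data standing in for a delimited file and a second source.
customers = read_delimited("id;name\n1;Ana\n2;Bo\n1;Ana\n3;Cy\n")
reference = read_delimited("id;name\n1;Ana\n3;Cy\n")

unique_customers = remove_duplicates(customers)
matched, unmatched = match_against(unique_customers, reference, key="id")
```

In this sketch a row counts as a duplicate only when every column is identical, and matching is an exact comparison on a single key column. A generated Job may use different criteria, so treat the comparison rules here as placeholders.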