Configuring the input component - 6.5

Deduplication

Author: Talend Documentation Team
Version: 6.5
Products:
  Talend Big Data
  Talend Big Data Platform
  Talend Data Fabric
  Talend Data Integration
  Talend Data Management Platform
  Talend Data Services Platform
  Talend ESB
  Talend MDM Platform
  Talend Open Studio for Big Data
  Talend Open Studio for Data Integration
  Talend Open Studio for ESB
  Talend Open Studio for MDM
  Talend Real-Time Big Data Platform
Platform: Talend Studio

Procedure

  1. Double-click tFileInputDelimited to open its Basic settings view.

    The input data must be the suspect pairs that the tMatchPredict component has labeled and grouped.

  2. Click the [...] button next to Edit schema and use the [+] button in the dialog box to add columns.

    The input schema must match the schema of the suspect pairs output by the tMatchPredict component.

  3. Click OK in the dialog box and, when prompted, accept the propagation of the changes.
  4. In the Folder/File field, set the path to the input file.
  5. Set the row and field separators in the corresponding fields, and specify the number of header and footer rows to skip, if any.
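
To illustrate what the separator, header, and footer settings in this procedure control, the following is a minimal Python sketch of reading a delimited file. It is not the code that Talend Studio generates for tFileInputDelimited. The function name read_delimited, the file name suspect_pairs.csv, the column names, and the sample values are all hypothetical and chosen only for illustration.

```python
import tempfile
from pathlib import Path


def read_delimited(path, row_separator="\n", field_separator=";",
                   header=0, footer=0, encoding="utf-8"):
    """Split a delimited file into rows and fields.

    `header` and `footer` are the numbers of leading and trailing rows to skip.
    This is an illustrative sketch, not Talend's implementation.
    """
    # newline="" prevents Python from translating line endings, so the
    # row separator is matched exactly as it appears in the file.
    with open(path, encoding=encoding, newline="") as handle:
        text = handle.read()

    rows = text.split(row_separator)
    # A file that ends with a row separator yields one empty last element.
    if rows and rows[-1] == "":
        rows.pop()

    # Skip the header rows at the top and the footer rows at the bottom.
    end = max(header, len(rows) - footer)
    return [row.split(field_separator) for row in rows[header:end]]


# Made-up sample content: one header row, two data rows, one footer row.
sample = "id;name;label\n1;John Doe;YES\n2;Jon Doe;YES\n-- end of file --\n"

with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "suspect_pairs.csv"
    with open(sample_path, "w", encoding="utf-8", newline="") as out:
        out.write(sample)
    records = read_delimited(sample_path, row_separator="\n",
                             field_separator=";", header=1, footer=1)

print(records)
```

In this sketch, setting header to 1 drops the column-name row and setting footer to 1 drops the trailing marker row, so only the two data rows remain. Each remaining row is split into three fields, which corresponds to a three-column schema.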