tFileCompare - 6.1

Talend Components Reference Guide

tFileCompare properties

Component family




Compares two files and provides comparison data (based on a read-only schema)


Helps control the data quality of the files being processed.

Basic settings

Schema and Edit Schema

A schema is a row description. It defines the number of fields to be processed and passed on to the next component.

The schema of this component is read-only.


File to compare

Path to the file to be checked.


Reference file

Path to the file on which the comparison is based.


If differences are detected, display and If no difference detected, display

Type in a message to be displayed in the Run console based on the result of the comparison.


Print to console

Select this check box to display the message.

Advanced settings


Encoding

Select the encoding from the list, or select Custom and define it manually. This field is compulsory for DB data handling.


tStatCatcher Statistics

Select this check box to gather the Job processing metadata at a Job level as well as at each component level.


This component can be used as a standalone component, but it is usually linked to an output component to gather the log data.

Global Variables

DIFFERENCE: the result of the comparison. This is a Flow variable and it returns a boolean.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component, while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use.

For further information about variables, see Talend Studio User Guide.
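As a rough illustration of how these variables might be read from a Java expression, the sketch below uses a plain `HashMap` as a stand-in for the Studio's `globalMap`. The key names `tFileCompare_1_DIFFERENCE` and `tFileCompare_1_ERROR_MESSAGE` are assumptions that follow the `<component name>_<variable>` pattern used elsewhere in this guide, and the class and helper names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class GlobalMapSketch {

    // Reads the DIFFERENCE flag for the given component.
    // A missing entry is treated as "no difference recorded".
    static boolean filesDiffer(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_DIFFERENCE");
        return Boolean.TRUE.equals(value);
    }

    // Reads the ERROR_MESSAGE string; returns null when no error was recorded.
    static String errorMessage(Map<String, Object> globalMap, String componentName) {
        return (String) globalMap.get(componentName + "_ERROR_MESSAGE");
    }

    public static void main(String[] args) {
        // Stand-in for the map that the Studio maintains at run time (assumption).
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFileCompare_1_DIFFERENCE", Boolean.TRUE);

        if (filesDiffer(globalMap, "tFileCompare_1")) {
            System.out.println("Files differ");
        } else {
            System.out.println("Files are identical");
        }
        System.out.println("Error message: " + errorMessage(globalMap, "tFileCompare_1"));
    }
}
```

Inside the Studio, the equivalent expression would presumably be a direct cast such as `((Boolean)globalMap.get("tFileCompare_1_DIFFERENCE"))`, for example as the condition of a Run if trigger.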


Outgoing links (from this component to another):

Row: Main.

Trigger: On Subjob Ok; On Subjob Error; Run if; On Component Ok; On Component Error.

Incoming links (from one component to this one):

Row: Main; Reject; Iterate.

Trigger: Run if; On Subjob Ok; On Subjob Error; On Component Ok; On Component Error; Synchronize; Parallelize.

For further information regarding connections, see Talend Studio User Guide.


If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache log4j documentation.



Scenario: Comparing unzipped files

This scenario describes a Job that unarchives a file and compares it to a reference file to make sure it has not changed. The output of the comparison is stored in a delimited file and a message is displayed in the console.

  1. Drag and drop the following components: tFileUnarchive, tFileCompare, and tFileOutputDelimited.

  2. Link the tFileUnarchive to the tFileCompare with an Iterate connection.

  3. Connect the tFileCompare to the output component, using a Main row link.

  4. In the tFileUnarchive component Basic settings, fill in the path to the archive to unzip.

  5. In the Extraction Directory field, fill in the destination folder for the unarchived file.

  6. In the tFileCompare Basic settings, set the File to compare. Press Ctrl+Space to display the list of global variables. Select $_globals{tFileUnarchive_1}{CURRENT_FILEPATH} (Perl) or ((String)globalMap.get("tFileUnarchive_1_CURRENT_FILEPATH")) (Java), according to the language you work with, to fetch the file path from the tFileUnarchive component.

  7. Set the Reference file on which to base the comparison.

  8. In the messages fields, set the messages you want to see if the files differ or if the files are identical, for example: "[job " + JobName + "] Files differ".

  9. Select the Print to Console check box so that the defined message is displayed at the end of the execution.

  10. The schema is read-only and contains standard information data. Click Edit schema to have a look at it.

  11. Then set the output component as usual, with a semicolon as the field separator.

  12. Save your Job and press F6 to run it.

    The defined message is displayed in the console and the output file contains the schema information data.
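To make the idea behind the scenario concrete, here is a minimal standalone Java sketch of comparing a file against a reference file and building a console message. It is not the component's implementation: it assumes a simple byte-for-byte comparison, and the class name, method names, temporary files, and the "Files are identical" wording are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class FileCompareSketch {

    // Returns true when the two files do not have identical content (byte for byte).
    static boolean differs(Path fileToCompare, Path referenceFile) throws IOException {
        byte[] checked = Files.readAllBytes(fileToCompare);
        byte[] reference = Files.readAllBytes(referenceFile);
        return !Arrays.equals(checked, reference);
    }

    // Builds a console message in the style of the one set in step 8.
    static String message(String jobName, boolean difference) {
        return "[job " + jobName + "] " + (difference ? "Files differ" : "Files are identical");
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the unarchived file and the reference file.
        Path unzipped = Files.createTempFile("unzipped", ".csv");
        Path reference = Files.createTempFile("reference", ".csv");
        try {
            Files.write(unzipped, "id;name\n1;Alice\n".getBytes(StandardCharsets.UTF_8));
            Files.write(reference, "id;name\n1;Alice\n".getBytes(StandardCharsets.UTF_8));

            boolean difference = differs(unzipped, reference);
            System.out.println(message("demoJob", difference));
        } finally {
            Files.deleteIfExists(unzipped);
            Files.deleteIfExists(reference);
        }
    }
}
```

Reading both files fully into memory keeps the sketch short; for large files, a streamed comparison would be the more practical choice.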