Avoiding reading the same file more than once (idempotent consumer) - 6.3

Talend ESB Mediation Developer Guide

The Camel file component has built-in support for the Idempotent Consumer pattern, so it can skip files that have already been processed. Enable this feature by setting the idempotent=true option.

from("file://inbox?idempotent=true").to("...");

Camel uses the absolute file name as the idempotent key to detect duplicate files. From Camel 2.11 onwards, you can customize this key by using an expression in the idempotentKey option. For example, to use both the file name and the file size as the key:

<route>
   <from uri="file://inbox?idempotent=true&amp;idempotentKey=${file:name}-${file:size}"/>
   <to uri="bean:processInbox"/>
</route>
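
To illustrate why the choice of key matters, here is a minimal sketch in plain Java of how a combined name-and-size key separates duplicates from changed files. It does not use Camel: the class NameSizeKeyDemo and its methods are hypothetical names introduced only for this illustration, and the HashSet merely stands in for the idempotent store.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; this is not part of the Camel API.
final class NameSizeKeyDemo {
    // Keys seen so far; stands in for the idempotent store.
    private final Set<String> seen = new HashSet<>();

    // Builds a key shaped like the expression ${file:name}-${file:size}.
    static String key(String name, long sizeInBytes) {
        return name + "-" + sizeInBytes;
    }

    // Returns true if the file should be processed, false if its key was already seen.
    boolean accept(String name, long sizeInBytes) {
        return seen.add(key(name, sizeInBytes));
    }

    public static void main(String[] args) {
        NameSizeKeyDemo demo = new NameSizeKeyDemo();
        System.out.println(demo.accept("report.txt", 120)); // true: first time this key is seen
        System.out.println(demo.accept("report.txt", 120)); // false: same name and size
        System.out.println(demo.accept("report.txt", 340)); // true: same name, different size
    }
}
```

With such a key, a file that reappears under the same name but with a different size is treated as new, whereas a key based on the name alone would cause it to be skipped.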

By default, Camel keeps track of consumed files in an in-memory store: a least recently used (LRU) cache that holds up to 1000 entries. You can plug in your own implementation of this store by using the idempotentRepository option. Prefix the value with the # sign to indicate that it refers to a bean in the Registry with the specified id.

<!-- Define our store as a plain Spring bean -->
<bean id="myStore" class="com.mycompany.MyIdempotentStore"/>

<route>
   <from uri="file://inbox?idempotent=true&amp;idempotentRepository=#myStore"/>
   <to uri="bean:processInbox"/>
</route>
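
The behavior of a bounded least recently used store can be sketched in plain Java as follows. This is an illustration of the caching idea only, not Camel's implementation: LruKeyStore is a hypothetical class, and a real custom store referenced through idempotentRepository would need to implement Camel's IdempotentRepository interface instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of a bounded LRU key store; not Camel code.
final class LruKeyStore {
    private final Map<String, Boolean> keys;

    LruKeyStore(final int capacity) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.keys = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                // Evict the least recently used key once the capacity is exceeded.
                return size() > capacity;
            }
        };
    }

    // Records the key; returns false if the key is already in the store.
    synchronized boolean add(String key) {
        if (keys.get(key) != null) {
            return false; // get() also marks the key as recently used
        }
        keys.put(key, Boolean.TRUE);
        return true;
    }

    synchronized boolean contains(String key) {
        return keys.get(key) != null;
    }

    public static void main(String[] args) {
        LruKeyStore store = new LruKeyStore(2); // small capacity to make eviction visible
        store.add("a.txt");
        store.add("b.txt");
        store.add("c.txt");                           // evicts "a.txt"
        System.out.println(store.contains("a.txt")); // false
        System.out.println(store.add("a.txt"));      // true: the key had been forgotten
    }
}
```

Because a bounded cache forgets its oldest keys, a file whose key has been evicted would be treated as new if it were encountered again. The store is also lost when the process stops. These are common reasons to plug in a larger or persistent store.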

Camel will log at DEBUG level if it skips a file because it has been consumed before:

DEBUG FileConsumer is idempotent and the file has been consumed before. This will skip this file: target\idempotent\report.txt