Lucene search engine troubleshooting

The Talend Data Catalog search capabilities are implemented by a Lucene search engine with indexes located in <TDC_HOME>\TalendDataCatalog\data\search\.

If the Lucene search index directory has been lost, the Talend Data Catalog server will automatically recreate it.

If the Lucene search index has been corrupted for any reason (such as a power outage during indexing, an out-of-memory error, or concurrent writes to the index), you can delete the search index directory and the server will automatically recreate it.
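As a cautious variation on deleting the directory, the following minimal Python sketch renames the search index directory instead, so that the old index can still be inspected or restored. The function name set_aside_index and the ".corrupt-<timestamp>" suffix are hypothetical, and the sketch assumes that the Talend Data Catalog server is stopped while the directory is moved, which this page does not state explicitly.

```python
import shutil
import time
from pathlib import Path


def set_aside_index(search_dir: Path) -> Path:
    """Rename the search index directory so the server can recreate it.

    Hypothetical helper: the directory is renamed rather than deleted,
    and the new location of the old index is returned.
    """
    if not search_dir.is_dir():
        raise FileNotFoundError(f"No search index directory at {search_dir}")
    # Timestamped suffix (an assumption) keeps earlier copies distinguishable.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = search_dir.with_name(f"{search_dir.name}.corrupt-{stamp}")
    shutil.move(str(search_dir), str(target))
    return target


# Example call; replace <TDC_HOME> with your installation directory:
# set_aside_index(Path(r"<TDC_HOME>\TalendDataCatalog\data\search"))
```

Once the server has rebuilt the index and search works again, the renamed directory can be deleted.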

Although not officially supported, an administrator can attempt to use the Lucene CheckIndex tool to "exorcise" (remove) corrupted segments, along with the documents they contain, from the index. Follow these steps:

  1. Back up your Lucene index in the directory <TDC_HOME>\TalendDataCatalog\data\search\lucene_xxxxxxxx.
    Note: Replace lucene_xxxxxxxx with the actual directory name of your search index.
  2. Change the directory to a temporary directory, such as c:\temp.
  3. Run the following commands:
    mkdir CheckIndex
    cd CheckIndex
    <TDC_HOME>\TalendDataCatalog\jre\bin\jar -xvf <TDC_HOME>\TalendDataCatalog\tomcat\webapps\MM.war
    cd WEB-INF
    java -classpath "lib/*" -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex <TDC_HOME>\TalendDataCatalog\data\search\lucene_xxxxxxxx
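  4. Check the output of the last command for any corrupted segments.
  5. If a corrupted segment is reported, run the same CheckIndex command again with the additional option -exorcise. This option removes the corrupted segments from the index, so the documents they contain are lost.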
    java -classpath "lib/*" -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex <TDC_HOME>\TalendDataCatalog\data\search\lucene_xxxxxxxx -exorcise
  6. Delete the CheckIndex directory once finished.
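Steps 1 and 4 above can be sketched in Python as follows. This is an illustrative sketch, not part of the product: the helper names backup_index and broken_segment_count are hypothetical, and the two message patterns are assumptions based on the text that Lucene CheckIndex typically prints ("WARNING: N broken segments ..." and "No problems were detected with this index."). Confirm them against the actual output of your Lucene version before relying on the result.

```python
import re
import shutil
from pathlib import Path
from typing import Optional

# Assumed CheckIndex messages; verify against your own output.
BROKEN_PATTERN = re.compile(r"WARNING: (\d+) broken segments")
CLEAN_MARKER = "No problems were detected with this index"


def backup_index(index_dir: Path, backup_root: Path) -> Path:
    """Step 1: copy the index directory into backup_root before any repair."""
    destination = backup_root / index_dir.name
    # copytree fails if the destination exists, so a backup is never overwritten.
    shutil.copytree(index_dir, destination)
    return destination


def broken_segment_count(checkindex_output: str) -> Optional[int]:
    """Step 4: read the number of broken segments from saved CheckIndex output.

    Returns 0 for a clean report, a positive count when broken segments are
    reported, and None when neither assumed message is found.
    """
    match = BROKEN_PATTERN.search(checkindex_output)
    if match:
        return int(match.group(1))
    if CLEAN_MARKER in checkindex_output:
        return 0
    return None
```

A result of None means the output did not match either assumed message and should be read manually before deciding whether to run the command with -exorcise.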

For more information, see https://lucene.apache.org/core/7_7_2/core/org/apache/lucene/index/CheckIndex.html.
