Scenario: Adding contacts to the mailRetrieval index pool - 6.3

Talend Open Studio for Big Data Components Reference Guide

This scenario describes a batch job that adds contacts to the mailRetrieval index pool. Before a contact is added, the job checks whether it already exists in the pool.

The input file for this scenario is already saved in the Repository, so that all schema metadata is available.


Note that all data from the input source must relate to the same country.
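The same-country requirement can be verified before the job runs. The following is a minimal sketch in plain Java, not part of the Talend job or the Uniserv API. The class name `CountryCheck`, the `;` separator, and the position of the country column are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical pre-check; the column layout and separator are assumptions.
class CountryCheck {

    // Collects the distinct country values found at the given column index.
    // Assumes ';' as the field separator and no quoted fields.
    // Rows that are too short to contain the column are skipped.
    static Set<String> distinctCountries(List<String> rows, int countryColumn) {
        Set<String> countries = new LinkedHashSet<>();
        for (String row : rows) {
            String[] fields = row.split(";", -1);
            if (countryColumn < fields.length) {
                countries.add(fields[countryColumn].trim().toUpperCase(Locale.ROOT));
            }
        }
        return countries;
    }

    // True when exactly one distinct country value occurs (false for empty input).
    static boolean isSingleCountry(List<String> rows, int countryColumn) {
        return distinctCountries(rows, countryColumn).size() == 1;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
            "1;Anna;Schmidt;DE",
            "2;Jan;Meyer;DE",
            "3;Luc;Martin;FR");
        // Reports whether the sample rows share one country, then lists the values found.
        System.out.println(isSingleCountry(rows, 3));
        System.out.println(distinctCountries(rows, 3));
    }
}
```

If the check fails, one option is to split the input file by country and run the job once per file.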

Dropping and connecting the components

  1. In the Repository view, expand the Metadata node and the directory in which the file is saved. Then drag this file into the design workspace.

    A dialog box appears.

  2. Select tFileInputDelimited and then click OK to close the dialog box.

    The component is displayed in the workspace.

  3. Drag the following components from the Palette into the design workspace: two tMap components, tUniservRTMailSearch and tUniservRTMailOutput.

  4. Connect tMap with tUniservRTMailSearch first.

    Accept the schema from tUniservRTMailSearch by clicking Yes on the prompt window.

  5. Connect the other components via Row > Main.

Configuring the components

  1. Double-click tMap_1 to open the schema mapping window. On the left is the structure of the input file and on the right is the schema of tUniservRTMailSearch. At the bottom lies the Schema Editor, where you can find the attributes of the individual columns and edit them.

  2. Assign the columns of the input file to the respective columns of tUniservRTMailSearch. For this purpose, select a column of the input source and drag it onto the appropriate column on the right side.

  3. If your input list contains a reference ID, carry it over. To do so, create a new column IN_DBREF in the Schema Editor and connect it to your reference ID.

    Click OK to close the window.

  4. Double-click tUniservRTMailSearch to open its Basic settings view.

  5. Under Maximum of displayed "duplicates", enter 0 to display all the duplicates.

    Select Define rejects to open the rejects definition window.

  6. Click the [+] button to insert a new line in the window. Select Duplicate count under the element column, > under the operator column, and 0 under the value column. With this rule, every contact that already exists (duplicate count greater than 0) is rejected, and only new contacts are added to the index pool.

  7. Open the Advanced settings view and check the parameters. Reasonable values are preset. Detailed information can be found in the mailRetrieval manual.

  8. Double-click tMap_3 to open the schema mapping window. On the left is the schema of tUniservRTMailSearch and on the right is the schema of tUniservRTMailOutput.

  9. Click Auto map! to assign the fields automatically.

  10. The only field that must be assigned manually is the reference ID. In order to do so, drag OUT-DBREF from the left side onto the field IN_DBREF on the right side.
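The reject rule configured above (Duplicate count > 0) can be read as a simple filter. The sketch below is plain Java and only illustrates that logic. It does not call the Uniserv components, and the names `RejectRuleSketch`, `SearchResult`, `dbRef` and `duplicateCount` are hypothetical stand-ins for a row leaving the search step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: mimics the reject rule, not the Talend or Uniserv API.
class RejectRuleSketch {

    // Hypothetical stand-in for one row produced by the search step.
    static final class SearchResult {
        final String dbRef;        // reference ID carried through the job
        final int duplicateCount;  // number of matches found in the index pool

        SearchResult(String dbRef, int duplicateCount) {
            this.dbRef = dbRef;
            this.duplicateCount = duplicateCount;
        }
    }

    // Applies the rule "Duplicate count > 0 => reject" and returns
    // the reference IDs of the rows that would be added to the pool.
    static List<String> referencesToAdd(List<SearchResult> results) {
        List<String> toAdd = new ArrayList<>();
        for (SearchResult r : results) {
            boolean rejected = r.duplicateCount > 0;
            if (!rejected) {
                toAdd.add(r.dbRef);
            }
        }
        return toAdd;
    }

    public static void main(String[] args) {
        List<SearchResult> results = Arrays.asList(
            new SearchResult("C-001", 0),   // not found in the pool
            new SearchResult("C-002", 2),   // two existing matches
            new SearchResult("C-003", 0));  // not found in the pool
        // Lists the reference IDs that pass the reject rule.
        System.out.println(referencesToAdd(results));
    }
}
```

Carrying the reference ID along with each row, as in steps 3 and 10, is what allows the rows that pass the rule to be traced back to the input records.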