Configuring the key generation for the first pass - 7.0

Data matching

author
Talend Documentation Team
EnrichVersion
7.0
EnrichProdName
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
task
Data Governance > Third-party systems > Data Quality components > Matching components > Data matching components
Data Quality and Preparation > Third-party systems > Data Quality components > Matching components > Data matching components
Design and Development > Third-party systems > Data Quality components > Matching components > Data matching components
EnrichPlatform
Talend Studio

Procedure

  1. Double-click the first tGenKey to open the Component view.
  2. Click the import icon to import blocking keys from match rules created and tested in the Profiling perspective of Talend Studio, and use them in your Job. Otherwise, define the blocking key parameters as described in the steps below.
  3. Under the Algorithm table, click the [+] button to add two rows to the table.
  4. In the table column named column, click the first newly added row and select from the list the input column you want to process using an algorithm. In this example, select lname.
  5. Do the same on the second row to select postal_code.
  6. In the pre-algorithm column, click the newly added row and select from the list the pre-algorithm you want to apply to the corresponding column.
    In this example, select remove diacritical marks and convert to upper case to remove any diacritical marks and convert the fields of the lname column to upper case.
    This conversion does not change your raw data.
  7. In the algorithm column, click the newly added row and select from the list the algorithm you want to apply to the corresponding column. In this example, select N first characters of each word.
    Select the Show help check box to display instructions on how to set the algorithm and option parameters.
  8. Do the same for the second row in the algorithm column and select first N characters of the string.
  9. Click in the Value column next to the algorithm column and enter the value for the selected algorithm, when needed.
    In this scenario, enter 1 for both rows. The key is then generated from the first character of each word in the lname column and the first character of the postal_code column.
    Make sure to set a value for each algorithm that needs one; otherwise, you may get a compilation error when you run the Job.
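
The key logic configured in the steps above can be sketched outside Talend Studio as follows. This is a minimal Python illustration, not the tGenKey implementation. The function names and the sample record are hypothetical, and the sketch assumes that the per-column results are concatenated in table order and that postal_code can be read as a string.

```python
import unicodedata


def remove_diacritics_upper(text: str) -> str:
    """Pre-algorithm sketch: strip diacritical marks, then convert to upper case."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.upper()


def first_n_chars_of_each_word(text: str, n: int) -> str:
    """Algorithm sketch: keep the first n characters of each whitespace-separated word."""
    return "".join(word[:n] for word in text.split())


def first_n_chars_of_string(text: str, n: int) -> str:
    """Algorithm sketch: keep the first n characters of the whole string."""
    return text[:n]


def blocking_key(record: dict, n: int = 1) -> str:
    """Build a key from lname and postal_code (concatenation order is an assumption)."""
    # The pre-algorithm works on a copy; the raw record is left unchanged.
    lname = remove_diacritics_upper(record["lname"])
    lname_part = first_n_chars_of_each_word(lname, n)
    postal_part = first_n_chars_of_string(str(record["postal_code"]), n)
    return lname_part + postal_part


# Hypothetical sample record, with the value 1 set for both rows.
sample = {"lname": "Müller", "postal_code": "75001"}
print(blocking_key(sample))  # M7
```

In this sketch, records that share a key such as M7 would fall into the same block. The input record is never modified, in line with the note that the pre-algorithm conversion does not change your raw data.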