Editing the mapping of the verification codes from address validation providers to Talend verification levels - 7.3

Address standardization

Version
7.3
Language
English
Product
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Studio
Last publication date
2024-02-21

The tAddressRowCloud and tBatchAddressRowCloud components enable you to verify addresses using online services such as Melissa Data and Loqate. To change how the verification codes returned by Melissa Data and Loqate are mapped to Talend verification levels, edit the melissaVerifLevelConf.xml and loqateVerifLevelConf.xml files.

In a Job using the tAddressRowCloud component to parse addresses against Melissa Data, the values in the VerificationLevel column in the output look like this:

In this example, melissaVerifLevelConf.xml contains the default values:
<Provider name="melissadata">
  <VerifyLevel>
    <Verified match="startsWith">AV2</Verified>
    <PartiallyVerified match="startsWith">AV1</PartiallyVerified>
    <Unverified match="startsWith">AE01,AE02,AE03</Unverified>
    <Ambiguous match="startsWith">AE05,AE09,AE11,AE13,AE14,AE17</Ambiguous>
    <Conflict match="startsWith">AE08,AE10,AE12</Conflict>
    <Reverted></Reverted>
  </VerifyLevel>
</Provider>
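
To make the default mapping above concrete, here is a minimal Python sketch of how a startsWith rule could be evaluated against a returned verification code. It is not Talend's implementation: the helper names load_prefixes and verification_level are hypothetical, and the sketch assumes that each comma-separated value is tested as a prefix and that the first matching level in file order wins. The match attribute is not inspected because the file shown only uses startsWith.

```python
import xml.etree.ElementTree as ET

# Default mapping, copied from the melissaVerifLevelConf.xml content shown above.
DEFAULT_CONF = """\
<Provider name="melissadata">
  <VerifyLevel>
    <Verified match="startsWith">AV2</Verified>
    <PartiallyVerified match="startsWith">AV1</PartiallyVerified>
    <Unverified match="startsWith">AE01,AE02,AE03</Unverified>
    <Ambiguous match="startsWith">AE05,AE09,AE11,AE13,AE14,AE17</Ambiguous>
    <Conflict match="startsWith">AE08,AE10,AE12</Conflict>
    <Reverted></Reverted>
  </VerifyLevel>
</Provider>
"""


def load_prefixes(conf_xml):
    """Return a list of (level, prefixes) pairs in the order they appear in the file."""
    root = ET.fromstring(conf_xml)
    rules = []
    for node in root.find("VerifyLevel"):
        text = node.text or ""  # an empty node such as <Reverted> has no text
        prefixes = [p.strip() for p in text.split(",") if p.strip()]
        rules.append((node.tag, prefixes))
    return rules


def verification_level(code, rules):
    """Hypothetical helper: return the first level with a matching prefix, else None."""
    for level, prefixes in rules:
        if any(code.startswith(prefix) for prefix in prefixes):
            return level
    return None


if __name__ == "__main__":
    rules = load_prefixes(DEFAULT_CONF)
    for code in ("AV24", "AV13", "AE05", "AE08", "AS01"):
        print(code, "->", verification_level(code, rules))
```

Under these assumptions, a code such as AE05 resolves to Ambiguous and AE08 to Conflict, while a code with no matching prefix resolves to no level at all.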

For more technologies supported by Talend, see Talend components.

Procedure

  1. Go to <StudioPath>\plugins\org.talend.designer.components.tdqprovider\components\tAddressRowCloud, where <StudioPath> is the installation directory of Talend Studio.
  2. Extract the contents of the org.talend.dataquality.address.jar file.
  3. Open melissaVerifLevelConf.xml or loqateVerifLevelConf.xml in a text editor.
  4. Change the verification codes mapped to the different verification levels. Separate multiple verification codes with commas.
    For example, to map verification codes starting with BBB to the verification level PartiallyVerified, replace AV1 with BBB in the PartiallyVerified node of the melissaVerifLevelConf.xml file:
    <PartiallyVerified match="startsWith">BBB</PartiallyVerified>
  5. Update org.talend.dataquality.address.jar with the modified configuration file (melissaVerifLevelConf.xml in this example).
  6. Delete the cached org.talend.dataquality.address.jar files located in <StudioPath>/configuration/.m2/repository/org/talend/libraries/org.talend.dataquality.address/6.0.0 and <StudioPath>/workspace/.Java.
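
Steps 2, 5, and 6 can also be scripted. The following Python sketch is one possible way to do it, not an official Talend tool: replace_jar_entry and delete_cached_jars are hypothetical helper names, the location of the configuration file inside the JAR is not stated here and is therefore found by file name, and matching cached files by the org.talend.dataquality.address prefix and .jar suffix is an assumption. Back up the original JAR before trying anything like this.

```python
import os
import tempfile
import zipfile

JAR_PREFIX = "org.talend.dataquality.address"  # assumption: cached copies start with this


def replace_jar_entry(jar_path, conf_path):
    """Rewrite the JAR so that the entry named like conf_path holds the edited file.

    zipfile cannot overwrite an entry in place, so every entry is copied to a
    temporary archive that then replaces the original. Returns the entry names
    that were replaced.
    """
    conf_name = os.path.basename(conf_path)
    jar_dir = os.path.dirname(os.path.abspath(jar_path))
    fd, tmp_path = tempfile.mkstemp(suffix=".jar", dir=jar_dir)
    os.close(fd)
    replaced = []
    try:
        with zipfile.ZipFile(jar_path) as src, \
                zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
            for item in src.infolist():
                if item.filename.split("/")[-1] == conf_name:
                    with open(conf_path, "rb") as fh:
                        dst.writestr(item.filename, fh.read())
                    replaced.append(item.filename)
                else:
                    dst.writestr(item, src.read(item.filename))
        if not replaced:
            raise FileNotFoundError(conf_name + " not found in " + jar_path)
        os.replace(tmp_path, jar_path)
    finally:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return replaced


def delete_cached_jars(cache_dirs):
    """Delete cached copies of the JAR directly inside each existing directory."""
    removed = []
    for directory in cache_dirs:
        if not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            if name.startswith(JAR_PREFIX) and name.endswith(".jar"):
                path = os.path.join(directory, name)
                os.remove(path)
                removed.append(path)
    return removed
```

A typical call would pass the path of the JAR from step 1 and the edited melissaVerifLevelConf.xml to replace_jar_entry, then pass the two cache directories named in step 6 to delete_cached_jars.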

Results

After you restart Talend Studio, the output of a Job using the tAddressRowCloud component to parse addresses against Melissa Data looks like this:

For the third and the sixth rows, Ambiguous is returned in the VerificationLevel column because the AE05 verification code returned in the AddressVerificationCode column is mapped to Ambiguous in melissaVerifLevelConf.xml.

For the fourth row, Conflict is returned in the VerificationLevel column because the AE08 verification code returned in the AddressVerificationCode column is mapped to Conflict in melissaVerifLevelConf.xml.

The other verification codes returned for these rows are not mapped to any Talend verification level.