tLoqateAddressRow Standard properties - 7.2

Loqate address standardization

Version
7.2
Language
English (United States)
Product
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Studio

These properties are used to configure tLoqateAddressRow running in the Standard Job framework.

The Standard tLoqateAddressRow component belongs to the Data Quality family.

This component is available in Talend Data Management Platform, Talend Big Data Platform, Talend Real-Time Big Data Platform, Talend Data Services Platform, Talend MDM Platform, and Talend Data Fabric.

Basic settings

Schema

A schema is a row description. It defines the number of fields to be processed and passed on to the next component. The schema is either Built-in or stored remotely in the Repository.

Built-in: You create the schema and store it locally for this component only. Related topic: see Talend Studio User Guide.

Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. Related topic: see Talend Studio User Guide.

Edit Schema

Click the [...] button and define the input and output schema of the address data.

Make sure the output schema defines every column needed to hold the formatted data you want from tLoqateAddressRow.

Input Address

Address field: add lines to the table and select from the component's predefined list the fields that will hold the input address.

tLoqateAddressRow provides a long list of individual fields because some countries have more complex addressing structures than others. For further information about the input fields, see Address fields in tLoqateAddressRow.

Input Column: add lines to the table and select from the list the columns that hold the input address. The input schema can have one or multiple columns and can have columns that do not represent address data.

Output Address

Address field: add lines to the table and select from the component's predefined list the fields that will hold the output address. The component maps the values of these fields to the output columns you set in the table.

tLoqateAddressRow provides a long list of individual fields because some countries have more complex addressing structures than others. For further information about the output fields, see Address fields in tLoqateAddressRow.

Output Column: add lines to the table and select from the list the columns that will hold the output address.

If an output column in the Output Address table has exactly the same name as an input column, the input column value is overwritten by the value returned by tLoqateAddressRow.
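The overwrite rule above can be pictured with a small Java sketch. It is only an illustration: the helper buildOutputRow, the column names and the sample values are hypothetical and are not part of the Talend or Loqate API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OutputMappingSketch {

    // Hypothetical helper (not Talend/Loqate API): combines an input row with
    // the values produced for the columns mapped in the Output Address table.
    static Map<String, String> buildOutputRow(Map<String, String> inputRow,
                                              Map<String, String> standardized) {
        // Start from a copy of the input row so the caller's map is untouched.
        Map<String, String> out = new LinkedHashMap<>(inputRow);
        // A mapped column with the same name as an input column replaces the
        // input value; any other mapped column is added to the row.
        out.putAll(standardized);
        return out;
    }

    public static void main(String[] args) {
        // Illustrative values only.
        Map<String, String> input = new LinkedHashMap<>();
        input.put("id", "1");
        input.put("address", "raw address text");

        Map<String, String> standardized = new LinkedHashMap<>();
        standardized.put("address", "standardized address text"); // same name: overwrites
        standardized.put("city", "standardized city");            // new output column

        System.out.println(buildOutputRow(input, standardized));
    }
}
```

To keep the original input value alongside the standardized one, give the output column a name that differs from the input column.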

The output schema contains two standard output columns that are read-only:

-STATUS: returns the status of processing input addresses. For further information about process status, see Process status in tLoqateAddressRow.

-ACCURACYCODE: returns the verification code for the processed address. For further information about what values this code is made up of and the implications of each segment, see Address verification codes in tLoqateAddressRow.
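As a rough sketch of how a downstream component might use ACCURACYCODE, the Java snippet below reads the last segment of the code as a match score. It assumes the code has four hyphen-separated segments ending in a numeric score, as in the sample "V44-I44-P6-100". That layout and the sample value are assumptions here, so check them against Address verification codes in tLoqateAddressRow. The helper matchScore is hypothetical.

```java
public class AccuracyCodeSketch {

    // Hypothetical helper: returns the trailing numeric segment of the code.
    // Assumes four hyphen-separated segments; anything else is rejected.
    static int matchScore(String accuracyCode) {
        String[] parts = accuracyCode.trim().split("-");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Unexpected code layout: " + accuracyCode);
        }
        return Integer.parseInt(parts[3]);
    }

    public static void main(String[] args) {
        // Sample value for illustration only.
        System.out.println(matchScore("V44-I44-P6-100"));
    }
}
```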

Loqate Data Path

Set the path to the Loqate Global Knowledge Repository provided by Loqate and installed locally.

You must order and download the Loqate Local API and the Global Knowledge Repository from http://www.loqate.com/. tLoqateAddressRow uses the Q2.2 2016 release.

Advanced settings

Server options

Set the server options as follows:

-Address Line Separator: define the string which will separate the output address components within the output address fields. The default separator is the line break string (<BR>).

-Default Country: select the country whose ISO 3166-1 alpha-3 code should be used when parsing data if no identifiable country is found in an input record.

-Forced Country: select the country name for which the ISO 3166-1 alpha-3 code should be used for all input records when parsing data.

-Output Script: use this option to transliterate the output address.

Select Latin to encode the parsing results in Latin (Western) characters.

Select Native to encode the parsing results using the country script.

Below is a list of the character sets (scripts) and languages tLoqateAddressRow can transliterate:

Latn - Latin (Western characters),

Cyrl - Cyrillic (Russia),

Grek - Greek (Greece),

Hebr - Hebrew (Israel),

Hani - Kanji (Japan),

Hans - simplified Chinese (China),

Arab - Arabic (United Arab Emirates),

Thai - Thai (Thailand),

Hang - Hangul (South Korea),

Native - output in the native script wherever possible.

-Minimum match score: specify the minimum match score a record must reach in order not to be reverted. The default value is zero, and valid values are between zero and 100.

Use this option when you want the output fields to contain the original input data whenever the required level of verification (the minimum match score) is not reached.
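The interplay of the Default Country, Forced Country and Minimum match score options can be sketched in Java as follows. This is a simplified reading of the descriptions above, not the component's actual implementation. The helpers resolveCountry and applyMinimumMatchScore are hypothetical, and null stands for "option not set" or "no country identified".

```java
public class ServerOptionsSketch {

    // Forced Country applies to every record. Default Country applies only
    // when no country could be identified in the record.
    static String resolveCountry(String detected, String defaultCountry, String forcedCountry) {
        if (forcedCountry != null) {
            return forcedCountry;
        }
        if (detected != null) {
            return detected;
        }
        return defaultCountry; // may be null if no default is set
    }

    // Keeps the verified value when the score reaches the threshold,
    // otherwise reverts to the original input value.
    static String applyMinimumMatchScore(String input, String verified,
                                         int score, int minimumMatchScore) {
        if (minimumMatchScore < 0 || minimumMatchScore > 100) {
            throw new IllegalArgumentException("Minimum match score must be between 0 and 100");
        }
        return score >= minimumMatchScore ? verified : input;
    }

    public static void main(String[] args) {
        // ISO 3166-1 alpha-3 codes used as sample values.
        System.out.println(resolveCountry(null, "FRA", null));   // no country found: default is used
        System.out.println(resolveCountry("DEU", "FRA", "USA")); // forced country wins

        // With the default threshold of 0, a record is never reverted.
        System.out.println(applyMinimumMatchScore("raw", "verified", 40, 0));
        // With a threshold of 80, a score of 40 reverts to the input.
        System.out.println(applyMinimumMatchScore("raw", "verified", 40, 80));
    }
}
```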

tStatCatcher Statistics

Select this check box to collect log data at the component level.

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use.

For further information about variables, see Talend Studio User Guide.
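As an illustration of reading an After variable in Java code, the sketch below looks up ERROR_MESSAGE in a map that stands in for the Job's globalMap. The key pattern (component unique name followed by _ERROR_MESSAGE) and the component name tLoqateAddressRow_1 are assumptions based on how Talend Studio usually names global variables. Confirm the exact key with Ctrl + Space in the Studio. The helper readErrorMessage is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class ErrorMessageSketch {

    // Hypothetical helper: returns the error message stored for a component,
    // or null if the component reported no error.
    static String readErrorMessage(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_ERROR_MESSAGE");
        return value == null ? null : value.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap that a Talend Job provides at run time.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tLoqateAddressRow_1_ERROR_MESSAGE", "sample error text");

        String message = readErrorMessage(globalMap, "tLoqateAddressRow_1");
        if (message != null) {
            System.out.println("Address standardization failed: " + message);
        }
    }
}
```

Because ERROR_MESSAGE is an After variable, read it in a component that runs after tLoqateAddressRow has finished, not inside the same row flow.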

Usage

Usage rule

This component is an intermediary step. It requires an input flow and an output flow.