This scenario describes a Job which uses:
the tFixedFlowInput component to generate the address data to be analyzed,
the tBatchAddressRowCloud component to parse, standardize and format the addresses in the Cloud through the Address Validation API,
the tFileOutputExcel component to output the correctly formatted addresses in an .xls file.
You must have an internet connection to use tBatchAddressRowCloud.
Drop the following components from the Palette onto the design workspace: tFixedFlowInput, tBatchAddressRowCloud and tFileOutputExcel.
Connect the three components together using the Main links.
Double-click tFixedFlowInput to open its Basic settings view in the Component tab.
Create the schema through the Edit Schema button.
In the dialog box that opens, click the [+] button and add the columns that will hold the information in the input address. For this example, add ID, Organization, Address1 x 8, Locality, AdministrativeArea, PostalCode and Country.
In the Number of rows field, enter 1.
In the Mode area, select the Use Inline Content option.
In the Content table, enter the address data you want to analyze, for example:
1000 23 girdwood road london sw18 GBR
1001 1111 bayhill drive ste 290 san bruno ca USA
1002 23 girdwood road london sw18 GBR
1003 1111 bayhill drive ste 290 san bruno ca USA
1004 23 girdwood road london sw18 GBR
1005 1111 bayhill drive ste 290 san bruno ca USA
1006 23 girdwood road london sw18 GBR
1007 1111 bayhill drive ste 290 san bruno ca USA
1008 23 girdwood road london sw18 GBR
1009 1111 bayhill drive ste 290 san bruno ca USA
1010 23 girdwood road london sw18 GBR
...
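The sample content above alternates between two addresses with consecutive IDs. As a rough illustration only, the following Python sketch rebuilds those rows outside the Studio. It assumes that each record consists of an ID, a single free-form address line (taken here to be Address1) and a country code, and that fields are joined with a semicolon; the function names and the separator are hypothetical and are not part of the component.

```python
# Hypothetical reconstruction of the sample input. The split into ID,
# Address1 and Country is an assumption, not something the component defines.
SAMPLE_ADDRESSES = [
    ("23 girdwood road london sw18", "GBR"),
    ("1111 bayhill drive ste 290 san bruno ca", "USA"),
]


def build_rows(first_id=1000, count=11):
    """Return `count` records that alternate between the two sample addresses."""
    rows = []
    for offset in range(count):
        address, country = SAMPLE_ADDRESSES[offset % len(SAMPLE_ADDRESSES)]
        rows.append(
            {"ID": first_id + offset, "Address1": address, "Country": country}
        )
    return rows


def to_inline_content(rows, separator=";"):
    """Join each record into one delimited line (the separator is an assumption)."""
    return "\n".join(
        separator.join([str(row["ID"]), row["Address1"], row["Country"]])
        for row in rows
    )


if __name__ == "__main__":
    print(to_inline_content(build_rows()))
```

Generating the rows this way can be convenient when a larger test set is needed than is practical to type into the Content table.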
Setting the schema and selecting an address provider
Double-click tBatchAddressRowCloud to display the Basic settings view and define the component properties.
If required, click Sync columns to retrieve the schema defined in the input component.
Click the Edit schema button to open the schema dialog box.
tBatchAddressRowCloud proposes several predefined read-only address columns, as shown in the capture below.
The STATUS column returns the processing status of the input addresses. For further information about process status, see Process status in tLoqateAddressRow.
The AddressVerificationCode column returns the verification code for the processed address. For further information about the values that make up this code and the implications of each segment, see Address verification codes in tLoqateAddressRow.
The VerificationLevel output column provides you with a verification status of the processed addresses. For further information, see Address verification levels in tAddressRowCloud.
To show any of the input columns in the verification results, move them to the output schema, then click OK and accept the prompt to propagate the changes.
You can also add columns directly in the output schema to retrieve additional address information from the provider repository.
Select from the Address Provider list the provider of the reference data against which you want to validate and format input addresses, Loqate in this example.
You can also validate addresses against the MelissaData online service.
In the License/API key field, enter the license key provided by Loqate.
In the Batch job name field, enter between quotation marks a name of your choice for the batch files that will be generated and saved on the Loqate server.
In the Number of rows in each batch file field, set the number of address records you want to group in each batch file.
Enter the login and password provided by Loqate in the Loqate website login and Loqate website password fields, respectively.
From the Processing Mode list, select:
Verify and Geocode (selected by default): standardizes and corrects addresses and enriches them with latitude and longitude.
Combining address verification and geocoding costs extra credits. For further information, see Cloud Price Card.
The other mode standardizes and corrects addresses without enriching them with latitude and longitude.
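The effect of the Number of rows in each batch file setting can be pictured as splitting the input flow into fixed-size groups. The Python sketch below illustrates only that grouping idea; it is not the component's implementation, and the function name `split_into_batches` is hypothetical.

```python
def split_into_batches(records, rows_per_batch):
    """Group records into consecutive batches of at most `rows_per_batch` rows.

    Illustrative only: how the component itself builds its batch files is not
    described here.
    """
    if rows_per_batch < 1:
        raise ValueError("rows_per_batch must be at least 1")
    return [
        records[start:start + rows_per_batch]
        for start in range(0, len(records), rows_per_batch)
    ]


if __name__ == "__main__":
    record_ids = list(range(1000, 1011))  # the 11 sample IDs, 1000 to 1010
    for number, batch in enumerate(split_into_batches(record_ids, 5), start=1):
        print(number, batch)
```

With 11 records and 5 rows per batch, this grouping yields two full batches and a final batch holding the single remaining record.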
Defining address mapping and setting advanced parameters
In the Input Mapping table:
Use the [+] button to add lines in the table.
Click in the Address Field column and select from the predefined list the fields that hold the input address, Address in this example.
The component will map the values of these fields to the input columns you set in this table.
tBatchAddressRowCloud provides a list of individual fields because some countries have more complex addressing structures than others.
Click in the Input Column column and select from the list of the input schema the columns that hold the input address you want to parse, Address1 in this example.
If required, select the Use Additional Output check box and define in the table which extra address fields you want to retrieve from the provider repository and add to the parsing results. For an example of how to use this table, see Defining additional address fields.
The Address field column holds predefined address fields which vary according to the provider you select. The Output Column column holds the fields you want to use to output the extra information. You must first add these additional columns to the component schema through the Edit Schema button.
Click the Advanced settings tab and set the parameters in this view according to your needs.
In this example:
Select the Use mockup mode (no credit consumption) check box.
This check box enables you to simulate execution and responses from the Loqate server by using as input a batch file that has already been processed by the Job and saved on the server.
Access the Loqate server at Online Batch Cleansing and fetch the identifier of the batch file you want to use as output in your Job.
Set the identifier in the Batch ID field.
This option is used only for testing or for development needs.
Leave all other default parameters as they are.
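The Input Mapping table configured earlier in this section pairs each provider address field with a column of the input schema. As a minimal sketch of that idea, and not of the component's internals, the mapping can be thought of as a dictionary. The names `INPUT_MAPPING` and `apply_input_mapping` are hypothetical, and the sample row assumes the address text sits in Address1.

```python
# Address Field -> Input Column, mirroring the single line used in this example.
INPUT_MAPPING = {"Address": "Address1"}


def apply_input_mapping(row, mapping):
    """Return the provider address fields filled from the mapped input columns."""
    missing = [column for column in mapping.values() if column not in row]
    if missing:
        raise KeyError(f"input row has no column(s): {missing}")
    return {field: row[column] for field, column in mapping.items()}


if __name__ == "__main__":
    sample_row = {
        "ID": 1000,
        "Address1": "23 girdwood road london sw18",  # assumed column layout
        "Country": "GBR",
    }
    print(apply_input_mapping(sample_row, INPUT_MAPPING))
```

Input columns that are not listed in the mapping, such as ID in this sketch, are simply not passed on as address fields.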
Double-click the tFileOutputExcel component to display the Basic settings view and define the component properties.
Set the destination file name as well as the sheet name and then select the Define all columns auto size check box.
Save your Job and press F6 to execute it.
The tBatchAddressRowCloud component parses addresses using batch processing. It corrects addresses using the online batch service of Loqate and writes the result in batch files on the Loqate server.
Right-click the output component and select Data Viewer to display the formatted address data.
tBatchAddressRowCloud matches input address data against the Loqate repository.
The STATUS output column returns the OK status for all address rows. This means that the component completed the verification process successfully for every address row. For further information about process status, see Process status in tLoqateAddressRow.
The VerificationLevel output column provides the verification level, as defined by Talend, of each processed address. For further information, see Address verification levels in tAddressRowCloud.
The AddressVerificationCode output column returns a verification code for each of the processed address rows. For further information about the values that make up this code and the implications of each segment, see Address verification codes in tLoqateAddressRow.
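When reviewing results outside the Studio, for example after loading the output rows into a script, it can help to separate the rows whose STATUS is OK from the rest. The following Python sketch assumes the rows are available as dictionaries keyed by column name. The function name and the non-OK placeholder value are hypothetical; the actual status values are listed in Process status in tLoqateAddressRow.

```python
def split_by_status(rows, ok_value="OK"):
    """Partition rows into those whose STATUS equals `ok_value` and all others."""
    succeeded = [row for row in rows if row.get("STATUS") == ok_value]
    other = [row for row in rows if row.get("STATUS") != ok_value]
    return succeeded, other


if __name__ == "__main__":
    # Hypothetical rows: "NOT_OK" is a placeholder, not a documented status.
    results = [
        {"ID": 1000, "STATUS": "OK"},
        {"ID": 1001, "STATUS": "OK"},
        {"ID": 1002, "STATUS": "NOT_OK"},
    ]
    succeeded, other = split_by_status(results)
    print(len(succeeded), "row(s) OK,", len(other), "row(s) to review")
```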