tAddLocationFromIP - 6.3

Talend Open Studio for Big Data Components Reference Guide

Function

tAddLocationFromIP replaces IP addresses with geographical locations.

Purpose

tAddLocationFromIP helps you geolocate visitors through their IP addresses. It uses an IP address lookup database file to identify a visitor's geographical location: country, region, city, latitude, longitude, ZIP code, and so on.

tAddLocationFromIP Properties

Component family

Misc

 

Basic settings

Schema and Edit schema

A schema is a row description: it defines the fields to be processed and passed on to the next component. The schema of this component is read-only.

 

 

Built-in: You create and store the schema locally for this component only. Related topic: see Talend Studio User Guide.

 

 

Repository: Select the Repository file where Properties are stored. When selected, the fields that follow are pre-defined using fetched data.

 

Database Filepath

The path to the IP address lookup database file.

 

Input parameters

Input column: Select the input column from which the input values are to be taken.

 

 

input value is a hostname: Select this option if the input column holds hostnames.

 

 

input value is an IP address: Select this option if the input column holds IP addresses.
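To illustrate the distinction between the two input options, the following is a minimal sketch that classifies a string as a dotted-decimal IPv4 literal or as a value to be treated as a hostname. It is not the component's own code: the class and method names are hypothetical, and IPv6 addresses are deliberately ignored.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Talend or geoip.jar.
final class InputKind {
    // One decimal octet in the range 0-255.
    private static final String OCTET = "(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)";
    // Four octets separated by dots, with nothing before or after.
    private static final Pattern IPV4 =
        Pattern.compile("^" + OCTET + "(\\." + OCTET + "){3}$");

    // Returns true only for well-formed dotted-decimal IPv4 literals.
    static boolean isIPv4Literal(String value) {
        return value != null && IPV4.matcher(value.trim()).matches();
    }

    public static void main(String[] args) {
        for (String v : new String[] {"192.0.2.10", "www.example.com"}) {
            System.out.println(v + " -> " + (isIPv4Literal(v) ? "IP address" : "hostname"));
        }
    }
}
```

A check of this kind can help you decide which of the two options matches the data in your input column before you run the Job.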

 

Location type

Country code: Select this option to replace the IP address with the country code.

 

 

Country name: Select this option to replace the IP address with the country name.
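The difference between the two location types can be illustrated with the JDK alone. The sketch below maps an ISO 3166-1 two-letter country code to an English country name using java.util.Locale. It assumes that the country codes in question are ISO two-letter codes, and the names in the lookup database file may be spelled differently from the JDK's. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; the component itself reads
// both the code and the name from the lookup database file.
final class CountryLabel {
    // Returns the English display name for an ISO 3166-1 alpha-2 code.
    static String nameForCode(String iso2) {
        Locale region = new Locale.Builder().setRegion(iso2).build();
        return region.getDisplayCountry(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        System.out.println("FR -> " + nameForCode("FR"));
        System.out.println("DE -> " + nameForCode("DE"));
    }
}
```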

Global Variables

NB_LINE: the number of rows processed. This is an After variable and it returns an integer.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
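As a rough sketch of how an After variable such as NB_LINE is typically read from Java code in a Job (for example in a tJava component placed after the subjob), the example below uses a plain HashMap as a stand-in for the Job's globalMap. It assumes that the component instance is named tAddLocationFromIP_1 and that the key follows the usual componentName_VARIABLE pattern. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustration only: a HashMap stands in for the globalMap of a generated Job.
final class GlobalVarDemo {
    // Reads <componentName>_NB_LINE; returns 0 when the variable is absent,
    // which is the case until the component has finished executing.
    static int rowsProcessed(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_NB_LINE");
        return value instanceof Integer ? (Integer) value : 0;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        // Assumed key: component instance name followed by the variable name.
        globalMap.put("tAddLocationFromIP_1_NB_LINE", 1);
        System.out.println("Rows processed: "
            + rowsProcessed(globalMap, "tAddLocationFromIP_1"));
    }
}
```

Inside a real Job, the equivalent expression would be a cast on the globalMap lookup, such as ((Integer) globalMap.get("tAddLocationFromIP_1_NB_LINE")), under the same naming assumption.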

Usage

This component is an intermediate step in the data flow that replaces an IP address with geolocation information. It cannot be a start component because it requires an input flow. It also requires an output component.

Limitation

Due to license incompatibility, the following JAR required to use this component is not provided. You can easily add the JAR by following the How to install external modules section of Talend Studio User Guide.

  • geoip.jar

Scenario: Identifying the real-world geographic location of an IP address

The following scenario creates a three-component Job that associates an IP address with a geographical location. It obtains a site visitor's geographical location based on the visitor's IP address.

Dropping and linking components

  1. Drop the following components from the Palette onto the design workspace: tFixedFlowInput, tAddLocationFromIP, and tLogRow.

  2. Connect the three components using Row Main links.

Configuring the components

  1. In the design workspace, select tFixedFlowInput, and click the Component tab to define the basic settings for tFixedFlowInput.

  2. Click the [...] button next to Edit Schema to define the structure of the data you want to use as input. In this scenario, the schema is made of one column that holds an IP address.

  3. Click OK to close the dialog box, and accept propagating the changes when prompted by the system. The defined column is displayed in the Values panel of the Basic settings view.

  4. In the Number of rows field, enter the number of rows to be generated. Then click in the Value cell and set the value for the IP address.

  5. In the design workspace, select tAddLocationFromIP and click the Component tab to define the basic settings for tAddLocationFromIP.

  6. Click the Sync columns button to synchronize the schema with the input schema set with tFixedFlowInput.

  7. Browse to the GeoIP.dat file to set its path in the Database Filepath field.

    Note

    Make sure that you download the latest version of the IP address lookup database file from the relevant site, as indicated in the Basic settings view of tAddLocationFromIP.

  8. In the Input parameters panel, set your input parameters as needed. In this scenario, the input column is the ip column defined earlier that holds an IP address.

  9. In the Location type panel, set location type as needed. In this scenario, we want to display the country name.

  10. In the design workspace, select tLogRow and click the Component tab to define the basic settings for tLogRow as needed. In this scenario, we want to display values in cells of a table.

Saving and executing the Job

  1. Press Ctrl+S to save your Job.

  2. Press F6 or click Run in the Run tab to execute the Job.

One row is generated to display the country name that is associated with the set IP address.
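The data flow of this Job can be sketched in plain Java as follows. This is an illustration of the principle only, not the code that the Studio generates: a small in-memory table of IP ranges stands in for the GeoIP.dat file, the addresses come from ranges reserved for documentation, and the country labels, class name, and method names are all made up.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustration only: fixed input -> IP-to-country lookup -> console output.
final class IpToCountrySketch {
    // One range of the made-up lookup table.
    private static final class Range {
        final long end;
        final String country;
        Range(long end, String country) { this.end = end; this.country = country; }
    }

    // Range start -> range; real data would come from the lookup database file.
    private static final TreeMap<Long, Range> RANGES = new TreeMap<>();
    static {
        RANGES.put(toLong("192.0.2.0"), new Range(toLong("192.0.2.255"), "Country A"));
        RANGES.put(toLong("198.51.100.0"), new Range(toLong("198.51.100.255"), "Country B"));
    }

    // Converts a well-formed dotted-decimal IPv4 address to a number.
    static long toLong(String ip) {
        long result = 0;
        for (String octet : ip.split("\\.")) {
            result = (result << 8) | Integer.parseInt(octet);
        }
        return result;
    }

    // Finds the range whose start is closest below the address, then checks its end.
    static String lookup(String ip) {
        long value = toLong(ip);
        Map.Entry<Long, Range> entry = RANGES.floorEntry(value);
        if (entry != null && value <= entry.getValue().end) {
            return entry.getValue().country;
        }
        return "Unknown";
    }

    public static void main(String[] args) {
        String[] inputRows = {"192.0.2.44"};               // role of tFixedFlowInput
        for (String ip : inputRows) {
            String country = lookup(ip);                   // role of tAddLocationFromIP
            System.out.println("| " + country + " |");     // role of tLogRow
        }
    }
}
```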