tPOP - 6.1

Talend Components Reference Guide

tPOP properties

Component family

Internet

 

Function

The tPOP component fetches one or more email messages from a server using the POP3 or IMAP protocol.

Purpose

The tPOP component uses the POP3 or IMAP protocol to connect to a specific email server. It then fetches one or more email messages and writes the retrieved information to the specified files. Parameters in the Advanced settings view allow you to filter the messages to retrieve.

Basic settings

Host

Host name or IP address of the email server you want to connect to.

 

Port

Port number of the email server.

 

Username and Password

User authentication data for the email server.

Username: enter the username you use to access your email box.

Password: enter the password you use to access your email box.

 

Output directory

Enter the path to the directory in which you want to store the email messages you retrieve from the email server, or click the three-dot button next to the field to browse to the directory.

 

Filename pattern

Define the syntax of the names of the files that will hold each of the email messages retrieved from the email server, or press Ctrl+Space to display the list of predefined patterns.
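The following is a minimal sketch of how a filename pattern such as the one used in the scenario of this guide, TalendDate.getDate("yyyyMMdd-hhmmss") + "_" + (counter_tPOP_1 + 1) + ".txt", could evaluate to a file name. It runs outside the Studio, so java.text.SimpleDateFormat stands in for TalendDate.getDate, on the assumption that the Talend routine follows the same pattern letters. The class name, the formatDate and buildName helpers, and the counter parameter are hypothetical and exist only for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class FilenamePatternSketch {
    // Stand-in for TalendDate.getDate(pattern): formats a date with the
    // given pattern. Assumption: the Talend routine formats the current date.
    static String formatDate(String pattern, Date when) {
        return new SimpleDateFormat(pattern, Locale.ROOT).format(when);
    }

    // Mirrors the scenario's pattern: date-time + "_" + (counter + 1) + ".txt".
    // "counter" is a hypothetical zero-based index of the message being written.
    static String buildName(Date when, int counter) {
        return formatDate("yyyyMMdd-hhmmss", when) + "_" + (counter + 1) + ".txt";
    }

    public static void main(String[] args) {
        Date now = new Date();
        // Prints three names that share a time stamp and end in _1, _2, _3.
        for (int counter = 0; counter < 3; counter++) {
            System.out.println(buildName(now, counter));
        }
    }
}
```

In java.text.SimpleDateFormat, hh is the hour on a 12-hour clock, so a morning and an afternoon run can produce the same time stamp; HH gives the 24-hour form.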

 

Retrieve all emails?

Select this check box to retrieve all email messages present on the specified server.

 

Number of emails to retrieve

Enter the number of email messages you want to retrieve.

This field is available only when the Retrieve all emails? check box is cleared.

 

Newer email first

Select this check box to retrieve the most recent email messages, up to the number specified in the Number of emails to retrieve field. The retrieved messages are returned in chronological order.

This check box is available only when the Retrieve all emails? check box is cleared. It is selected by default.
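The interaction between Number of emails to retrieve and Newer email first can be sketched as a selection over message numbers. This is only an illustration under stated assumptions, not the component's actual implementation: it assumes that the server numbers messages from 1 (oldest) to the total count (newest), and that clearing Newer email first selects the oldest messages instead. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class MessageSelectionSketch {
    // Returns the 1-based message numbers to fetch, in chronological order.
    // Assumption: message 1 is the oldest and message `total` is the newest.
    static List<Integer> select(int total, int requested, boolean newerFirst) {
        // Never ask for more messages than the mailbox holds.
        int n = Math.max(0, Math.min(requested, total));
        // Newest n messages start at total - n + 1; oldest n start at 1.
        int start = newerFirst ? total - n + 1 : 1;
        List<Integer> numbers = new ArrayList<>();
        for (int i = start; i < start + n; i++) {
            numbers.add(i);
        }
        return numbers;
    }

    public static void main(String[] args) {
        // 25 messages on the server, 10 requested.
        System.out.println(select(25, 10, true));  // [16, 17, ..., 25]
        System.out.println(select(25, 10, false)); // [1, 2, ..., 10]
    }
}
```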

 

Delete emails from server

Select this check box if you do not want to keep the retrieved email messages on the server.

Note

For Gmail servers, this option does not work for the pop3 protocol. Select the imap protocol and ensure that the Gmail account is configured to use imap.

 

Choose the protocol

From the list, select the protocol to be used to retrieve the email messages from the server. This protocol is the one used by the email server. If you choose the imap protocol, you will be able to select the folder from which you want to retrieve your emails.

 

Use SSL

Select this check box if your email server uses SSL for authentication and communication confidentiality.

Note

This option is mandatory for Gmail users.
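The Port, Choose the protocol and Use SSL settings are related: each protocol has a conventional port, which changes when SSL is used. The sketch below lists the standard registered ports (110 and 995 for POP3, 143 and 993 for IMAP). A given email server may be configured differently, so treat these values as a starting point and check your provider's documentation. The class and method names are hypothetical.

```java
class DefaultPortSketch {
    // Conventional ports for each protocol; a given server may use others.
    static int defaultPort(String protocol, boolean useSsl) {
        switch (protocol) {
            case "pop3":
                return useSsl ? 995 : 110;
            case "imap":
                return useSsl ? 993 : 143;
            default:
                throw new IllegalArgumentException("Unknown protocol: " + protocol);
        }
    }

    public static void main(String[] args) {
        System.out.println(defaultPort("pop3", false)); // 110
        System.out.println(defaultPort("imap", true));  // 993
    }
}
```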

Advanced settings

tStatCatcher Statistics

Select this check box to gather the job processing metadata at a job level as well as at each component level.

 

Filter

Click the plus button to add as many lines as needed to filter email messages and retrieve only a specific selection:

 

 

Filter item: select one of the following filter types from the list:

From: email messages are filtered according to the sender email address.

To: email messages are filtered according to the recipient email address.

Subject: email messages are filtered according to the message subject.

Before date: email messages are filtered by the sending or receiving date. All messages before the set date are retrieved.

After date: email messages are filtered by the sending or receiving date. All messages after the set date are retrieved.

 

 

Pattern: press Ctrl+Space to display the list of available values. Select the value to use for each filter.

 

Filter condition relation

Select the type of logical relation you want to use to combine the specified filters:

and: all the conditions set by the filters must be met, so the search is more restrictive.

or: the conditions set by the filters apply independently, so the search is broader.
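The difference between the two relations can be sketched with plain Java predicates. This is an illustration only, not the component's code: the Mail class is a hypothetical minimal message model, and substring matching is an assumption about how a pattern is compared with a message field.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class FilterRelationSketch {
    // Hypothetical minimal message model; not a Talend or JavaMail class.
    static final class Mail {
        final String from;
        final String subject;

        Mail(String from, String subject) {
            this.from = from;
            this.subject = subject;
        }
    }

    // "and": a message is kept only if every filter matches.
    static Predicate<Mail> allOf(List<Predicate<Mail>> filters) {
        return filters.stream().reduce(m -> true, Predicate::and);
    }

    // "or": a message is kept if at least one filter matches.
    static Predicate<Mail> anyOf(List<Predicate<Mail>> filters) {
        return filters.stream().reduce(m -> false, Predicate::or);
    }

    public static void main(String[] args) {
        Mail mail = new Mail("alice@example.com", "Invoice March");
        // Assumed matching rule: the pattern is a substring of the field.
        Predicate<Mail> fromAlice = m -> m.from.contains("alice");
        Predicate<Mail> subjectReport = m -> m.subject.contains("Report");
        List<Predicate<Mail>> filters = Arrays.asList(fromAlice, subjectReport);

        System.out.println(allOf(filters).test(mail)); // false: subject does not match
        System.out.println(anyOf(filters).test(mail)); // true: sender matches
    }
}
```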

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

NB_EMAIL: the number of emails received. This is an After variable and it returns an integer.

CURRENT_FILE: the current file name. This is a Flow variable and it returns a string.

CURRENT_FILEPATH: the current file path. This is a Flow variable and it returns a string.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl+Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
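As a rough sketch of how these variables are read in Java code within a Job, the example below uses a plain HashMap as a stand-in for the globalMap that the Studio makes available to generated code. It assumes the component is labeled tPOP_1 and that keys follow the usual Talend form of component label, underscore, variable name. The sample values are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class GlobalVariablesSketch {
    // Reads the After variable NB_EMAIL for a component labeled tPOP_1.
    // Assumption: keys have the form "<component label>_<VARIABLE>".
    static int emailCount(Map<String, Object> globalMap) {
        Integer count = (Integer) globalMap.get("tPOP_1_NB_EMAIL");
        return count == null ? 0 : count;
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap available inside a Job; sample values only.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tPOP_1_NB_EMAIL", 10);
        globalMap.put("tPOP_1_CURRENT_FILE", "20240101-101500_1.txt");

        System.out.println("Emails retrieved: " + emailCount(globalMap));
        System.out.println("Current file: " + (String) globalMap.get("tPOP_1_CURRENT_FILE"));
    }
}
```

Because NB_EMAIL is an After variable, an expression like this is only meaningful in a component that runs after tPOP has finished.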

Usage

This component does not handle data flow; it can be used as a standalone component.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

When the Use SSL check box or the imap protocol is selected, tPOP cannot work with IBM Java 6.

Scenario: Retrieving a selection of email messages from an email server

This scenario is a one-component Job that retrieves a predefined number of email messages from an email server.

  • Drop the tPOP component from the Palette to the design workspace.

  • Double-click tPOP to display the Basic settings view and define the component properties.

  • Enter the email server IP address and port number in the corresponding fields.

  • Enter the username and password for your email account in the corresponding fields. In this example, the email server is called Free.

  • In the Output directory field, enter the path to the output directory manually, or click the three-dot button next to the field and browse to the output directory where the email messages retrieved from the email server are to be stored.

  • In the Filename pattern field, define the syntax you want to use to name the output files that will hold the messages retrieved from the email server, or press Ctrl+Space to display a list of predefined patterns. The syntax used in this example is the following: TalendDate.getDate("yyyyMMdd-hhmmss") + "_" + (counter_tPOP_1 + 1) + ".txt".

    The output files will be stored as .txt files and are named by date, time, and chronological order of arrival.

  • Clear the Retrieve all emails? check box and, in the Number of emails to retrieve field, enter the number of email messages you want to retrieve, 10 in this example.

  • Select the Delete emails from server check box to delete the email messages from the email server once they are retrieved and stored locally.

  • In the Choose the protocol field, select the protocol type you want to use. This depends on the protocol used by the email server. Some email providers, such as Gmail, support both protocols. In this example, the protocol used is pop3.

  • Save your Job and press F6 to execute it.

The tPOP component retrieves the 10 most recent messages from the specified email server.

In the local tPOP output directory, a .txt file is created for each retrieved message. Each file holds the message header metadata (sender's address, recipient's address, subject) in addition to the message content.