tLDAPInput - 6.3

Talend Open Studio for Big Data Components Reference Guide


Function

tLDAPInput reads a directory and extracts data based on the defined filter.

Purpose

tLDAPInput executes an LDAP query based on the given filter and the schema definition, then passes the resulting field list to the next component via a Main row link.

tLDAPInput Properties

Component family

Databases/LDAP

 

Basic settings

Property type

Either Built-in or Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

 

 

Built-in: No property data stored centrally.

 

 

Repository: Select the repository file in which the properties are stored. The fields that follow are completed automatically using the data retrieved.

 

Click this icon to open a database connection wizard and store the database connection parameters you set in the component Basic settings view.

For more information about setting up and storing database connection parameters, see Talend Studio User Guide.

 

Use an existing connection

Select this check box and in the Component List click the relevant connection component to reuse the connection details you already defined.

Note that when a Job contains a parent Job and a child Job, the Component List presents only the connection components at the same Job level.

 

Host

IP address or DNS name of the LDAP directory server.

 

Port

Listening port number of the LDAP server.

 

Base DN

Path to the user's authorised tree leaf.

Note

To retrieve the full DN information, enter a field named DN in the schema, in either upper case or lower case.

 

Protocol

Select the protocol type from the list.

LDAP: no encryption is used.

LDAPS: secured LDAP. When this option is chosen, the Advanced CA check box appears. Once selected, the advanced mode allows you to specify the directory and the keystore password of the certificate file for storing a specific CA. However, you can still deactivate this certificate validation by selecting the Trust all certs check box.

TLS: a certificate is used. When this option is chosen, the Advanced CA check box appears and is used in the same way as for the LDAPS type.

 

Authentication User and Password

Select the Authentication check box if an LDAP login is required. Note that the login must match the LDAP syntax requirement to be valid, for example: "cn=Directory Manager".

To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.
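The connection settings above (Host, Port, Base DN, Protocol, Authentication) map naturally onto a JNDI environment. The following is a minimal sketch, assuming a JNDI-based client; the class and method names (LdapEnvSketch, buildEnv) are hypothetical and are not taken from the component's generated code. It only builds the environment and does not open a connection.

```java
import java.util.Hashtable;
import javax.naming.Context;

public class LdapEnvSketch {
    // Hypothetical helper: turns the Basic settings into a JNDI environment.
    static Hashtable<String, String> buildEnv(String host, int port, String baseDn,
                                              boolean ldaps, String user, String password) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        // "ldaps" selects secured LDAP; plain "ldap" uses no encryption.
        String scheme = ldaps ? "ldaps" : "ldap";
        // Assumption: baseDn contains no characters that need URL encoding.
        env.put(Context.PROVIDER_URL, scheme + "://" + host + ":" + port + "/" + baseDn);
        if (user != null) {
            // Simple bind with a login such as "cn=Directory Manager".
            env.put(Context.SECURITY_AUTHENTICATION, "simple");
            env.put(Context.SECURITY_PRINCIPAL, user);
            env.put(Context.SECURITY_CREDENTIALS, password);
        } else {
            // Anonymous access when the Authentication check box is cleared.
            env.put(Context.SECURITY_AUTHENTICATION, "none");
        }
        return env;
    }
}
```

Passing the resulting table to new InitialDirContext(env) would then attempt the actual connection.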

 

Filter

Type in the filter as expected by the LDAP directory.
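When a filter is assembled from variable values, the special characters defined by the LDAP filter syntax (RFC 4515) need escaping. The following is a minimal sketch of such escaping; the class LdapFilterSketch, its methods, and the attribute names used are illustrative assumptions, not part of the component.

```java
public class LdapFilterSketch {
    // Escapes the characters that RFC 4515 reserves inside a filter value.
    static String escapeValue(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\5c"); break;
                case '*':  sb.append("\\2a"); break;
                case '(':  sb.append("\\28"); break;
                case ')':  sb.append("\\29"); break;
                case '\0': sb.append("\\00"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds an AND filter of two equality tests: (&(objectClass=...)(uid=...)).
    static String byClassAndUid(String objectClass, String uid) {
        return "(&(objectClass=" + escapeValue(objectClass) + ")"
             + "(uid=" + escapeValue(uid) + "))";
    }
}
```

The returned string can be typed (between double quotes) into the Filter field or built in an expression before the component runs.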

 

Multi valued field separator

Type in the value separator in multi-value fields.
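To illustrate what the separator does, the sketch below flattens a multi-valued LDAP attribute into a single string. It assumes JNDI attribute objects; the class MultiValueSketch and the method joinValues are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;

public class MultiValueSketch {
    // Joins every value of an attribute with the configured separator.
    static String joinValues(Attribute attr, String separator) {
        List<String> values = new ArrayList<>();
        try {
            NamingEnumeration<?> all = attr.getAll();
            while (all.hasMore()) {
                values.add(String.valueOf(all.next()));
            }
        } catch (NamingException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return String.join(separator, values);
    }
}
```

With a comma as the separator, an attribute holding two mail addresses would be emitted as one comma-separated String field.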

 

Alias dereferencing

Select the option from the list. Never improves search performance if you are sure that no alias needs to be dereferenced. Always is used by default:

Always: always dereferences aliases.

Never: never dereferences aliases.

Searching: dereferences aliases only after name resolution.

Finding: dereferences aliases only during name resolution.

 

Referral handling

Select the option from the list:

Ignore: does not handle request redirections.

Follow: handles request redirections.
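The Alias dereferencing and Referral handling options have direct counterparts among the standard JNDI environment properties. The following sketch shows that mapping under the assumption of a JNDI-based client; the class SearchBehaviourSketch and the method apply are hypothetical.

```java
import java.util.Hashtable;
import java.util.Locale;
import javax.naming.Context;

public class SearchBehaviourSketch {
    // Standard JNDI LDAP property; accepts always, never, finding, searching.
    static final String DEREF_ALIASES = "java.naming.ldap.derefAliases";

    // Copies the two list selections (e.g. "Always", "Ignore") into the environment.
    static Hashtable<String, String> apply(Hashtable<String, String> env,
                                           String aliasOption, String referralOption) {
        env.put(DEREF_ALIASES, aliasOption.toLowerCase(Locale.ROOT));
        // Context.REFERRAL accepts ignore, follow, or throw.
        env.put(Context.REFERRAL, referralOption.toLowerCase(Locale.ROOT));
        return env;
    }
}
```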

 

Limit

Fill in the maximum number of records to be read, if needed.

 

Time Limit

Fill in a timeout period for directory access.

 

Paging

Specify the number of entries returned at a time by the LDAP server.
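Limit, Time Limit, and Paging correspond to common LDAP search controls. The sketch below expresses them with JNDI classes, as an assumption about how such settings are usually applied rather than a description of the component's internals; SearchLimitsSketch and its methods are hypothetical names. Note that JNDI expresses the time limit in milliseconds, whereas the unit of the Time Limit field is not stated in this guide.

```java
import java.io.IOException;
import javax.naming.directory.SearchControls;
import javax.naming.ldap.Control;
import javax.naming.ldap.PagedResultsControl;

public class SearchLimitsSketch {
    // Caps the number of entries and the search duration for one query.
    static SearchControls limits(long maxEntries, int timeLimitMillis) {
        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
        controls.setCountLimit(maxEntries);      // 0 means no limit
        controls.setTimeLimit(timeLimitMillis);  // 0 means wait indefinitely
        return controls;
    }

    // Asks the server to return results in pages of pageSize entries.
    static Control paging(int pageSize) {
        try {
            return new PagedResultsControl(pageSize, Control.NONCRITICAL);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The paging control would be attached to an LdapContext with setRequestControls before the search is issued.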

 

Die on error

This check box is selected by default. Clear the check box to skip the row on error and complete the process for error-free rows. If needed, you can retrieve the rows on error via a Row > Rejects link.

 

Schema and Edit schema

A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

Warning

Only three data types are supported here: String, byte[], and List. tMap can be used for data type conversion if needed.

 

 

Built-in: The schema is created and stored locally for this component only. Related topic: see Talend Studio User Guide.

 

 

Repository: The schema already exists and is stored in the Repository, hence can be reused. Related topic: see Talend Studio User Guide.

Global Variables

NB_LINE: the number of rows processed. This is an After variable and it returns an integer.

RESULT_NAME: the name of the current LDAP entry satisfying the search filter. This is a Flow variable and it returns a string.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill up a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
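As an illustration of reading these variables in Java code, the sketch below looks them up in a map keyed by component name and variable name. It assumes the usual Talend convention of a globalMap with keys such as tLDAPInput_1_NB_LINE; the component label tLDAPInput_1 and the helper class GlobalVarSketch are hypothetical, and a plain java.util.Map stands in for the Job's globalMap.

```java
import java.util.Map;

public class GlobalVarSketch {
    // After variable: number of rows processed, or 0 if not yet set.
    static int rowsProcessed(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_NB_LINE");
        return value == null ? 0 : (Integer) value;
    }

    // Flow variable: name of the LDAP entry currently being read.
    static String currentEntry(Map<String, Object> globalMap, String componentName) {
        return (String) globalMap.get(componentName + "_RESULT_NAME");
    }
}
```

Inside a Job, the equivalent expression is typically written inline, for example ((Integer)globalMap.get("tLDAPInput_1_NB_LINE")).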

Usage

This component covers all possible LDAP queries.

Note: Press Ctrl + Space to access the global variable list, including the GetResultName variable, which automatically retrieves the relevant Base.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Scenario: Displaying LDAP directory's filtered content

The Job described below simply filters the LDAP directory and displays the result on the console.

  • Drop the tLDAPInput component along with a tLogRow from the Palette to the design workspace.

  • Set the tLDAPInput properties.

  • Set the Property type to Repository if you stored the LDAP connection details in the Metadata Manager in the Repository. Then select the relevant entry from the list.

  • In Built-In mode, fill in the Host and Port information manually. Host can be the IP address of the LDAP directory server or its DNS name.

  • No particular Base DN is to be set.

  • Then select the relevant Protocol from the list. In this example, the plain LDAP protocol is used.

  • Select the Authentication check box and fill in the login information if required to read the directory. In this use case, no authentication is needed.

  • In the Filter area, type in the filter on which the data selection is based. In this example, the filter is: (&(objectClass=inetorgperson)(uid=PIERRE DUPONT)).

  • Fill in Multi-valued field separator with a comma as some fields may hold more than one value, separated by a comma.

  • As we do not know whether aliases are used in the LDAP directory, select Always from the Alias dereferencing list.

  • Set Ignore as Referral handling.

  • Set the limit to 100 for this use case.

  • Set the Schema as required by your LDAP directory. In this example, the schema is made of 6 columns including the objectClass and uid columns which get filtered on.

  • In the tLogRow component, no particular setting is required.

Only one entry of the directory corresponds to the filter criteria given in the tLDAPInput component.