tFirstnameMatch - Cloud - 8.0

Name standardization

Version: Cloud 8.0
Language: English
Product: Talend Big Data Platform, Talend Data Fabric, Talend Data Management Platform, Talend Data Services Platform, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Last publication date: 2024-02-20

Matches first names against a reference index in order to standardize data.

Warning:

This component is available in Talend Data Management Platform, Talend Big Data Platform, Talend Real-Time Big Data Platform, Talend Data Services Platform, Talend MDM Platform and Talend Data Fabric.

tFirstnameMatch compares the first name column from the input flow with first names in an embedded reference index and outputs the matching first names. It does not support Chinese characters.

The index contains first names for about 162 countries, with more than 1000 reference first names for some of them.

This component is not shipped with your Talend Studio by default. You need to install it using the Feature Manager. For more information, see Installing features using the Feature Manager.

For more technologies supported by Talend, see Talend components.

tFirstnameMatch checks first names against an index file embedded in the component itself. The component searches for first names in the index file according to the input gender and input country you specify in the component settings. When you do not use gender and country as a search basis, first names are searched across the entire index, regardless of country.
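The lookup behavior described above can be sketched as follows. This is a minimal Python illustration, not the component's actual implementation: the in-memory `INDEX` list, its sample entries, the `Entry` type, the "M"/"F" gender encoding and the `find_first_name` function are all hypothetical, and exact case-insensitive comparison stands in for whatever matching the component performs internally.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    first_name: str
    country: str  # three-letter country code such as USA or FRA
    gender: str   # "M" or "F" (assumed encoding)


# Hypothetical sample data; the real index is embedded in the component.
INDEX = [
    Entry("John", "USA", "M"),
    Entry("Mary", "USA", "F"),
    Entry("Jean", "FRA", "M"),
    Entry("Jean", "USA", "F"),
]


def find_first_name(name, country=None, gender=None):
    """Return reference entries matching name.

    When country or gender is None, that criterion is ignored, so a call
    with neither searches the entire index.
    """
    wanted = name.strip().casefold()
    return [
        e for e in INDEX
        if e.first_name.casefold() == wanted
        and (country is None or e.country == country)
        and (gender is None or e.gender == gender)
    ]
```

With this sample data, `find_first_name("jean")` searches the whole index and returns both "Jean" entries, while `find_first_name("jean", country="FRA")` returns only the French one.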

The index file has reference first names for about 162 countries. Some of the countries listed in the index have more than 1000 reference first names. Such countries include USA, GBR, AUS, IRL, CAN, FRA, NZL, CHE and NLD. For example, the index file has more than 8000 American first names, more than 4000 British first names, more than 2000 Australian first names and so on.

Other countries have fewer than 1000 reference first names stored in the index file. For such countries, it is advisable not to select a country column, so that the input first name is checked against the reference first names of all countries in the index file.
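One way to apply this advice when preparing data is to decide per country whether a country-scoped search is worthwhile. The sketch below is a hypothetical Python helper, not part of the component: `REFERENCE_COUNTS` holds illustrative placeholder numbers (the real counts live inside the embedded index), "ZZZ" is a made-up country code, and treating exactly 1000 names as sufficient is an assumption, since the guidance above only distinguishes more than and fewer than 1000.

```python
# Threshold taken from the guidance above.
MIN_REFERENCE_NAMES = 1000

# Illustrative placeholder counts, NOT the real contents of the index.
REFERENCE_COUNTS = {"USA": 8500, "GBR": 4200, "ZZZ": 120}


def should_scope_by_country(country_code, counts=REFERENCE_COUNTS):
    """Return True when the country has enough reference names to
    justify a country-scoped search; unknown countries return False."""
    return counts.get(country_code, 0) >= MIN_REFERENCE_NAMES
```

For a country where this returns False, the country column would be left unselected so that the first name is checked against the whole index.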