Scenario: Extracting name, domain and TLD from e-mail addresses

Data extraction

author
Talend Documentation Team
EnrichVersion
6.5
EnrichProdName
Talend Open Studio for MDM
Talend Data Management Platform
Talend ESB
Talend Data Integration
Talend MDM Platform
Talend Real-Time Big Data Platform
Talend Open Studio for Data Integration
Talend Open Studio for ESB
Talend Data Services Platform
Talend Big Data
Talend Open Studio for Big Data
Talend Data Fabric
Talend Big Data Platform
EnrichPlatform
Talend Studio

This scenario describes a three-component Job in which tExtractRegexFields applies a regular expression to one column of the input data, email. The regular expression includes field identifiers for the user name, domain name and Top-Level Domain (TLD) portions of each e-mail address. If a given e-mail address is valid, its name, domain and TLD are extracted and displayed on the console in three separate columns. The data in the other two input columns, id and age, is routed to the destination as well.
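The matching described above can be sketched in plain Java, the language Talend Jobs are generated in. This is only an illustration, not the code the component produces: the class name EmailFieldExtractor, the extract helper, the sample rows and the regular expression itself are all assumptions made for this example. It treats each field identifier as one capturing group and skips rows whose email value does not match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailFieldExtractor {

    // Assumed pattern (not taken from the scenario): one capturing group each
    // for the user name, the domain name and the TLD.
    private static final Pattern EMAIL = Pattern.compile(
            "^([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+)\\.([A-Za-z]{2,})$");

    /** Returns {name, domain, tld}, or null when the address does not match. */
    public static String[] extract(String email) {
        Matcher m = EMAIL.matcher(email);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        // Hypothetical input rows with the columns id, email and age.
        String[][] rows = {
            { "1", "jane.doe@example.com", "34" },
            { "2", "not-an-email", "27" }
        };
        for (String[] row : rows) {
            String[] parts = extract(row[1]);
            if (parts == null) {
                continue; // invalid address: nothing is extracted for this row
            }
            // id and age pass through unchanged next to the extracted fields.
            System.out.println(String.join("|",
                    row[0], parts[0], parts[1], parts[2], row[2]));
        }
    }
}
```

Because the domain group is greedy, an address with several dots after the @ sign keeps everything up to the last dot as the domain and only the final label as the TLD. A stricter or looser pattern may suit real data better.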

For more technologies supported by Talend, see Talend components.