Scenario: Extracting name, domain and TLD from e-mail addresses

Data extraction

author
Talend Documentation Team
EnrichVersion
6.5
EnrichProdName
Talend Big Data Platform
Talend MDM Platform
Talend Data Management Platform
Talend Open Studio for MDM
Talend Data Integration
Talend Data Services Platform
Talend Real-Time Big Data Platform
Talend Open Studio for Data Integration
Talend Open Studio for ESB
Talend Big Data
Talend Data Fabric
Talend Open Studio for Big Data
Talend ESB
task
Data Governance > Third-party systems > Data Quality components > Data extraction components
Design and Development > Third-party systems > Data Quality components > Data extraction components
Data Quality and Preparation > Third-party systems > Data Quality components > Data extraction components
EnrichPlatform
Talend Studio

This scenario describes a three-component Job in which tExtractRegexFields applies a regular expression to one column of the input data, email. This regular expression includes field identifiers for the user name, domain name and Top-Level Domain (TLD) portions of each e-mail address. If a given e-mail address is valid, the name, domain and TLD are extracted and displayed on the console in three separate columns. Data in the other two input columns, id and age, is routed to the destination as well.
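The matching logic described above can be sketched in plain Java, the language whose regular expression syntax Talend Jobs rely on. This is only an illustrative sketch, not the code the Job generates: the class name EmailFieldExtractor, the extract method, the sample rows and the pattern itself are all assumptions made for this example, and the regular expression configured in the actual scenario may differ. The pattern uses three capturing groups, one each for the name, domain and TLD.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Talend.
class EmailFieldExtractor {

    // Assumed pattern: group 1 = user name, group 2 = domain, group 3 = TLD.
    // The regular expression used in the actual Job may be different.
    private static final Pattern EMAIL = Pattern.compile(
            "^([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+)\\.([A-Za-z]{2,})$");

    // Returns {name, domain, tld} when the whole address matches, else empty.
    static Optional<String[]> extract(String email) {
        Matcher m = EMAIL.matcher(email);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { m.group(1), m.group(2), m.group(3) });
    }

    public static void main(String[] args) {
        // Made-up sample rows in the order id, email, age.
        String[][] rows = {
            { "1", "jane.doe@example.com", "34" },
            { "2", "not-an-email", "27" }
        };
        for (String[] row : rows) {
            Optional<String[]> fields = extract(row[1]);
            if (fields.isPresent()) {
                String[] f = fields.get();
                // id and age pass through unchanged next to the extracted fields.
                System.out.println(String.join("|", row[0], f[0], f[1], f[2], row[2]));
            } else {
                System.out.println("rejected: " + row[1]);
            }
        }
    }
}
```

In this sketch an address that does not match the whole pattern yields no fields at all, which mirrors the scenario's condition that values are extracted only when the e-mail address is valid.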

For more technologies supported by Talend, see Talend components.