Extracting only the data that corresponds to a defined pattern from a delimited file

Data extraction

Version
Cloud
8.0
Language
English
Product
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Studio
Last publication date
2024-02-20

This scenario applies only to Talend Data Management Platform, Talend Big Data Platform, Talend Real-Time Big Data Platform, Talend MDM Platform, Talend Data Services Platform and Talend Data Fabric.

For more technologies supported by Talend, see Talend components.

This scenario describes a four-component Job in which the tExtractPattern component extracts only the customers' email addresses (the values that match the Email address pattern) from a delimited file that holds various customer data, and then writes the extracted data to another delimited file. A tFilterColumns component is used to adapt the output schema.
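
To make the logic of the Job easier to picture outside Talend Studio, the following is a minimal Python sketch of the same idea: read a single-column delimited file, keep only the substring that looks like an email address, and write the result to a new delimited output with a reduced schema. It is not how tExtractPattern is implemented. The regular expression is a simplified assumption rather than the Email address pattern shipped with Talend, the sample rows are invented for illustration, and the `;` delimiter, the `Email` output column and the function names are hypothetical.

```python
import csv
import io
import re

# Assumption: a simplified email regex, NOT the "Email address" pattern
# provided by Talend.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(source, column="Name_Telephone_Address", delimiter=";"):
    """Yield the first email-like substring of each row; skip rows without one."""
    reader = csv.DictReader(source, delimiter=delimiter)
    for row in reader:
        match = EMAIL_RE.search(row[column] or "")
        if match:
            yield match.group(0)


def write_emails(emails, target, delimiter=";"):
    """Write the extracted values under a single, hypothetical 'Email' column."""
    writer = csv.writer(target, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["Email"])
    for email in emails:
        writer.writerow([email])


# Hypothetical sample rows, invented for this sketch (not the scenario's file).
sample = io.StringIO(
    "Name_Telephone_Address\n"
    "Jane Doe 555-0100 jane.doe@example.com\n"
    "John Roe 555-0101\n"
    "Ann Lee ann_lee@example.org 555-0102\n"
)

emails = list(extract_emails(sample))
output = io.StringIO()
write_emails(emails, output)
print(output.getvalue())
```

In this sketch, the row without an email address is dropped, and the output keeps only the extracted value, which loosely mirrors the roles of tExtractPattern and tFilterColumns in the Job.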

In this scenario, the delimited file holds names, email addresses and telephone numbers, all in a single column: Name_Telephone_Address. The following shows an extract of the input file: