Adding a regular expression or an SQL pattern to a column analysis - Cloud - 8.0

Talend Studio User Guide

Version: Cloud 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development
Last publication date: 2024-02-29
Available in:

  • Big Data Platform
  • Cloud API Services Platform
  • Cloud Big Data Platform
  • Cloud Data Fabric
  • Cloud Data Management Platform
  • Data Fabric
  • Data Management Platform
  • Data Services Platform
  • MDM Platform
  • Real-Time Big Data Platform

You can add one or more regular expressions or SQL patterns to any column analysis and match the content of the analyzed column against them.

Warning:

If the database you are using does not support regular expressions, or if the query template is not defined in Talend Studio, you must first declare the user-defined function and define the query template. Only then can you add any of the specified patterns to the column analysis.

For more information, see Managing User-Defined Functions in databases.
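
The idea behind declaring a user-defined function can be sketched outside Talend Studio. The following is a minimal illustration that uses SQLite as a stand-in for a database without built-in regular expression support. It is not the Talend Studio mechanism for declaring functions or query templates, and the customers table, the sample values and the email pattern are all hypothetical.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Return 1 when value matches pattern, else 0 (NULL never matches)."""
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


conn = sqlite3.connect(":memory:")
# SQLite accepts the "X REGEXP Y" syntax but defines no regexp function
# by default; it evaluates the operator as regexp(Y, X), so the
# registered function receives the pattern first.
conn.create_function("REGEXP", 2, regexp)

# Hypothetical table and data, for illustration only.
conn.execute("CREATE TABLE customers (email TEXT)")
conn.executemany(
    "INSERT INTO customers (email) VALUES (?)",
    [("ann@example.com",), ("bob@example",), (None,), ("carol@example.org",)],
)

# Illustrative pattern, not one shipped with Talend Studio.
EMAIL_PATTERN = r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$"
matching = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE email REGEXP ?", (EMAIL_PATTERN,)
).fetchone()[0]
print(matching)
```

Once the function is registered, a query can use the REGEXP operator as if the database supported it natively, which is roughly the role the user-defined function and query template play for a pattern-matching analysis.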

Before you begin

  • You have selected the Profiling perspective.
  • A column analysis is open in the analysis editor.

Procedure

  1. In the Analyzed Columns section of the analysis editor, click the Add pattern icon next to the name of the column to which you want to add a regular expression or an SQL pattern (the email column in this example).
    The Pattern Selector dialog box opens.
  2. Expand Patterns and browse to the regular expressions or SQL patterns you want to add to the column analysis.
  3. Select the check boxes of the expressions or patterns you want to add to the selected column.
  4. Click OK to proceed to the next step.
    The added regular expressions or SQL patterns are displayed under the analyzed column in the Analyzed Columns list.
    You can also add a regular expression or an SQL pattern to a column by dragging it from the DQ Repository tree view and dropping it onto the analyzed column.
  5. Save the analysis and press F6 to execute it.
    The editor switches to the Analysis result view. The results of the column analysis include those for pattern matching.
    [Figure: matching and non-matching percentages against the SQL pattern or the regular expression.]
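
The matching and non-matching percentages reported in the results can be illustrated with a short sketch. It assumes a handful of hypothetical email values and an illustrative pattern, and it shows only the arithmetic, not how Talend Studio computes or renders its indicators.

```python
import re

# Hypothetical sample values standing in for the email column.
emails = ["ann@example.com", "bob@example", "carol@example.org", "", "dave@site.net"]

# Illustrative pattern, not one shipped with Talend Studio.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")


def match_percentages(values, pattern):
    """Return (matching %, non-matching %) of values against pattern."""
    if not values:
        return 0.0, 0.0
    # A value counts as matching only if the whole string fits the pattern.
    matching = sum(1 for v in values if pattern.fullmatch(v))
    pct = 100.0 * matching / len(values)
    return pct, 100.0 - pct


matching_pct, non_matching_pct = match_percentages(emails, EMAIL_PATTERN)
print(f"matching: {matching_pct:.1f}%  non-matching: {non_matching_pct:.1f}%")
# prints "matching: 60.0%  non-matching: 40.0%"
```

Three of the five sample values are well-formed addresses, so the sketch reports 60% matching and 40% non-matching.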

Results

If the regular expression you add to the column analysis is defined for a database, you can generate ELT Jobs to retrieve the valid and invalid rows.

If the regular expression you add to the column analysis is defined for the Java or the Default language, you can generate an ETL Job to handle the rows.
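
Conceptually, handling valid and invalid rows means partitioning the data by whether the analyzed column matches the pattern. The sketch below shows that partitioning under stated assumptions: the rows, the pattern and the split_rows helper are hypothetical, empty values are treated as invalid, and the code does not reproduce what a generated Talend Job actually contains.

```python
import re

# Hypothetical rows; a generated Job would read them from the analyzed source.
rows = [
    {"id": 1, "email": "ann@example.com"},
    {"id": 2, "email": "bob@example"},
    {"id": 3, "email": None},
]

# Illustrative pattern, not one shipped with Talend Studio.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")


def split_rows(rows, column, pattern):
    """Partition rows into (valid, invalid) by a full match on column."""
    valid, invalid = [], []
    for row in rows:
        value = row.get(column)
        # Assumption: missing values are routed to the invalid output.
        if value is not None and pattern.fullmatch(value):
            valid.append(row)
        else:
            invalid.append(row)
    return valid, invalid


valid_rows, invalid_rows = split_rows(rows, "email", EMAIL_PATTERN)
print(len(valid_rows), len(invalid_rows))
```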