Pattern types - Cloud - 8.0

Talend Studio User Guide

Version: Cloud 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development
Last publication date: 2024-02-29
Available in:
  • Big Data Platform
  • Cloud API Services Platform
  • Cloud Big Data Platform
  • Cloud Data Fabric
  • Cloud Data Management Platform
  • Data Fabric
  • Data Management Platform
  • Data Services Platform
  • MDM Platform
  • Real-Time Big Data Platform

Two types of patterns are listed under the Patterns folder in the DQ Repository tree view: regular expressions and SQL patterns.

Regular expressions (regex) are predefined patterns that you can use to search and manipulate text in the databases to which you connect. You can also create your own regular expressions and use them to analyze columns.

When you select a pattern in a Job, the regular expression defined for the current database type is used:
  • If no regular expression exists for this database type, the default regular expression in the selected pattern is used.
  • If you remove the regular expression for this database type from a pattern that is used in Jobs, the Jobs are updated to use the default regular expression in the selected pattern.
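
The fallback rule above can be pictured as a lookup with a default. The following is only a minimal sketch, not Talend Studio's implementation: the function `resolve_regex`, the `"Default"` key, the dictionary layout, and the sample expressions are all hypothetical names chosen for illustration.

```python
# Hypothetical label for the default expression stored in a pattern.
DEFAULT_KEY = "Default"


def resolve_regex(pattern_definitions, database_type):
    """Return the regex for database_type, or the pattern's default if none exists."""
    return pattern_definitions.get(database_type, pattern_definitions[DEFAULT_KEY])


# Hypothetical pattern holding a default expression and one database-specific variant.
email_pattern = {
    "Default": r"^\w+@\w+\.\w+$",
    "MySQL": r"^[[:alnum:]_]+@[[:alnum:]_]+\.[[:alnum:]_]+$",
}

mysql_regex = resolve_regex(email_pattern, "MySQL")    # database-specific expression
oracle_regex = resolve_regex(email_pattern, "Oracle")  # no Oracle entry: default is used

# Removing the database-specific expression makes later lookups fall back to the default.
del email_pattern["MySQL"]
mysql_after_removal = resolve_regex(email_pattern, "MySQL")
```

In this sketch, the last lookup mirrors the second bullet: once the database-specific expression is removed, anything that resolves the pattern for that database type receives the default expression instead.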

SQL patterns are personalized patterns that are used in SQL queries. These patterns usually contain the percent sign (%), a wildcard that stands for any sequence of characters. For more information on SQL wildcards, see SQL Wildcard.
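
To illustrate how a pattern containing the percent sign behaves in a query, here is a small sketch that uses Python's built-in `sqlite3` module and the standard SQL `LIKE` operator. The table `customers`, its `email` column, the sample rows, and the pattern `%@%.com` are assumptions made for this example; it does not show the query that Talend Studio generates.

```python
import sqlite3

# In-memory database with a hypothetical table and sample values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (email TEXT)")
conn.executemany(
    "INSERT INTO customers (email) VALUES (?)",
    [("ann@example.com",), ("bob@example.org",), ("not-an-email",)],
)

# '%' matches any sequence of characters, including an empty one.
sql_pattern = "%@%.com"

matching = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE email LIKE ?", (sql_pattern,)
).fetchone()[0]
total = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]

print(f"{matching} of {total} values match {sql_pattern!r}")
```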

You can use either of these two pattern types with column analyses or with analyses of a set of columns (simple table analyses). These pattern-based analyses show the frequencies of the data patterns found in the values of the analyzed columns. For more information, see Creating a basic analysis on a database column and Creating an analysis of a set of columns using patterns.

From Talend Studio, you can generate graphs that represent the results of analyses using patterns. You can also view tables in the Analysis Results view that describe the generated graphs in words. From these graphs and analysis results, you can easily determine the percentage of invalid values based on the listed patterns.
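
As a rough illustration of the figures behind such results, the sketch below counts how many values match a regular expression and derives the percentage that do not. The function `pattern_match_summary`, the sample values, and the five-digit pattern are hypothetical; the way Talend Studio computes and labels these figures may differ.

```python
import re


def pattern_match_summary(values, regex):
    """Count values that fully match regex and the share that do not."""
    compiled = re.compile(regex)
    matching = sum(1 for value in values if compiled.fullmatch(value))
    total = len(values)
    not_matching = total - matching
    percent_not_matching = 100.0 * not_matching / total if total else 0.0
    return {
        "matching": matching,
        "not_matching": not_matching,
        "percent_not_matching": percent_not_matching,
    }


# Hypothetical column values checked against a five-digit pattern.
zip_codes = ["75001", "9021", "10115", "ABCDE"]
summary = pattern_match_summary(zip_codes, r"\d{5}")
print(summary)
```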

Management processes for SQL patterns and regular expressions, including those for Java, are the same. For more information, see Managing regular expressions and SQL patterns.

Note: Some databases do not support regular expressions. Before you can use regular expressions with such databases, you must configure them. For more information, see Managing User-Defined Functions in databases.
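
As an analogy only, SQLite is a database where regular expression matching has to be supplied by a user-defined function: in a default build, the `REGEXP` operator works only after a function named `regexp` is registered. The sketch below does this with Python's `sqlite3.Connection.create_function`. The table, column, and sample values are assumptions, and the configuration steps for the databases that Talend Studio supports are the ones described in the referenced section, not the ones shown here.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Back the SQL REGEXP operator: 1 if value matches pattern, else 0."""
    if value is None:
        return None
    return 1 if re.search(pattern, value) else 0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (phone TEXT)")
conn.executemany(
    "INSERT INTO contacts (phone) VALUES (?)",
    [("555-0100",), ("5550100",), ("555-01AB",)],
)

# SQLite evaluates "X REGEXP Y" by calling regexp(Y, X), so the pattern comes first.
conn.create_function("REGEXP", 2, regexp)

rows = conn.execute(
    "SELECT phone FROM contacts WHERE phone REGEXP ? ORDER BY phone",
    (r"^\d{3}-\d{4}$",),
).fetchall()
matching_phones = [row[0] for row in rows]
print(matching_phones)
```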

The following table shows the patterns that you can select in any database:

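Indicator      | Supported data types with the Java analysis engine | Supported data types with the SQL analysis engine
---------------|----------------------------------------------------|--------------------------------------------------
SQL Patterns   | None                                               | All data types
Regex Patterns | All data types                                     | All data types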