Pattern types - 6.3

Talend Real-time Big Data Platform Studio User Guide

Two types of patterns are listed under the Patterns folder in the DQ Repository tree view in the Profiling perspective: regular expressions and SQL patterns.

Regular expressions (regex) are predefined patterns that you can use to search and manipulate text in the databases to which you connect. You can also create your own regular expressions and use them to analyze columns.
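As an illustration of what a regular expression does with column values, the following is a minimal sketch in Python. It runs outside the studio; the pattern, the sample values and the helper name split_by_pattern are all hypothetical and are not taken from the patterns shipped with the product.

```python
import re

# Hypothetical pattern: a simplified email shape, not a complete validator.
EMAIL_PATTERN = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")


def split_by_pattern(values, pattern):
    """Split column values into those that match the pattern and those that do not."""
    matching, non_matching = [], []
    for value in values:
        if pattern.fullmatch(value):
            matching.append(value)
        else:
            non_matching.append(value)
    return matching, non_matching


# Hypothetical column values.
emails = [
    "ann@example.com",
    "bob@example",
    "carol.d@mail.example.org",
    "not an email",
]
valid, invalid = split_by_pattern(emails, EMAIL_PATTERN)
```

Here valid holds the two well-formed addresses and invalid holds the other two values. An analysis that uses a regular expression applies the same kind of match and no-match test to each value of the analyzed column.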

SQL patterns are user-defined patterns that are used in SQL queries. These patterns usually contain the percent sign (%), a wildcard that stands for any sequence of characters. For more information on SQL wildcards, see http://www.w3schools.com/SQL/sql_wildcards.asp.
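The sketch below shows how a pattern containing the percent sign behaves in a query. It uses Python's built-in sqlite3 module with an in-memory database purely for illustration; the customers table, its values and the count_like helper are hypothetical, and the exact matching rules (case sensitivity, for example) vary from one database to another.

```python
import sqlite3

# In-memory database with a hypothetical table and values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (email TEXT)")
conn.executemany(
    "INSERT INTO customers (email) VALUES (?)",
    [("ann@example.com",), ("bob@example.org",), ("carol@example.com",), ("dave",)],
)


def count_like(connection, like_pattern):
    """Count the rows whose email value matches the given SQL LIKE pattern."""
    row = connection.execute(
        "SELECT COUNT(*) FROM customers WHERE email LIKE ?", (like_pattern,)
    ).fetchone()
    return row[0]


# % stands for any sequence of characters, _ for exactly one character.
com_count = count_like(conn, "%@%.com")
four_char_count = count_like(conn, "d___")
total = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
```

With these values, the pattern %@%.com matches two of the four rows, and d___ matches only the four-character value dave.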

You can use either of these two pattern types with column analyses or with analyses of a set of columns (simple table analyses). These pattern-based analyses show the frequencies of the various data patterns found in the values of the analyzed columns. For more information, see Creating a basic analysis on a database column and How to create an analysis of a set of columns using patterns.

From the studio, you can generate graphs that represent the results of analyses using patterns. You can also view tables in the Analysis Results view that describe the generated graphs in words. From these graphs and analysis results, you can easily determine the percentage of invalid values based on the listed patterns. For more information, see Tab panel of the analysis editors.
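The figures behind such results come down to simple arithmetic: for each pattern, the number of matching values divided by the total number of values. The following Python sketch shows that calculation under stated assumptions; the function name pattern_match_summary, the pattern names and the sample values are hypothetical, and the sketch does not reproduce how the studio computes or displays its results.

```python
import re


def pattern_match_summary(values, patterns):
    """For each named pattern, count matching and non-matching values and their percentages."""
    total = len(values)
    summary = {}
    for name, regex in patterns.items():
        compiled = re.compile(regex)
        matched = sum(1 for value in values if compiled.fullmatch(value))
        not_matched = total - matched
        summary[name] = {
            "matching": matched,
            "not_matching": not_matched,
            "pct_matching": 100.0 * matched / total if total else 0.0,
            "pct_not_matching": 100.0 * not_matched / total if total else 0.0,
        }
    return summary


# Hypothetical column values and patterns.
phone_numbers = ["555-0100", "555-0199", "5550123", "n/a"]
patterns = {"NNN-NNNN": r"\d{3}-\d{4}", "digits only": r"\d+"}
summary = pattern_match_summary(phone_numbers, patterns)
```

For these values, half of the column matches the NNN-NNNN pattern, so the remaining 50% would count as invalid with respect to that pattern.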

You manage SQL patterns and regular expressions, including regular expressions for Java, in the same way. For more information, see Managing regular expressions and SQL patterns.

Warning

Some databases do not support regular expressions. Before you can use regular expressions with such databases, you must configure them. For more information, see Managing User-Defined Functions in databases.