Open Studio for Data Quality
Two types of patterns are listed under the Patterns folder in the DQ Repository tree view: regular expressions and SQL patterns.
Regular expressions (regex) are predefined patterns that you can use to search and manipulate text in the databases to which you connect. You can also create your own regular expressions and use them to analyze columns.
A pattern can hold a regular expression that is specific to a database type. When you use a pattern on a given database type:
- If the regular expression does not exist for this database type, the default regular expression in the selected pattern is used.
- If you remove the regular expression for this database type in a pattern that is used in Jobs, the Jobs are updated with the default regular expression in the selected pattern.
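The fallback behavior described above can be pictured as a lookup keyed by database type. The following Python sketch is only an illustration: the `Pattern` class, its field names, and the sample expressions are hypothetical and do not reflect how the Studio stores patterns internally.

```python
from dataclasses import dataclass, field


@dataclass
class Pattern:
    """Hypothetical stand-in for a pattern with per-database expressions."""
    name: str
    default_regex: str
    # Maps a database type name to the expression defined for it.
    regex_by_db: dict = field(default_factory=dict)

    def regex_for(self, db_type: str) -> str:
        # Use the database-specific expression when one exists,
        # otherwise fall back to the pattern's default expression.
        return self.regex_by_db.get(db_type, self.default_regex)


# Sample expressions are assumptions chosen for illustration only.
email = Pattern(
    name="Email address",
    default_regex=r"^\w+@\w+\.\w+$",
    regex_by_db={"MySQL": r"^[[:alnum:]_]+@[[:alnum:]_]+\.[[:alnum:]_]+$"},
)

mysql_regex = email.regex_for("MySQL")    # database-specific expression
oracle_regex = email.regex_for("Oracle")  # no entry, so the default is used

# Removing the database-specific expression makes later lookups use the default.
del email.regex_by_db["MySQL"]
mysql_after_removal = email.regex_for("MySQL")
```

In this sketch, `oracle_regex` and `mysql_after_removal` both resolve to the default expression, which mirrors the two cases in the list above.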
SQL patterns are custom patterns that are used in SQL queries. They usually contain the percent sign (%), a wildcard that matches any sequence of characters. For more information on SQL wildcards, see http://www.w3schools.com/SQL/sql_wildcards.asp.
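To show how a pattern containing the percent sign behaves in a query, here is a minimal sketch that uses Python's built-in `sqlite3` module with an in-memory database. The `customers` table, its `email` column, and the sample rows are assumptions made up for this example, and wildcard behavior can differ slightly between database types.

```python
import sqlite3

# In-memory database with a hypothetical table and sample values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (email TEXT)")
conn.executemany(
    "INSERT INTO customers (email) VALUES (?)",
    [("ann@example.com",), ("bob@example.org",), ("not-an-email",)],
)

# '%' matches any sequence of characters, including an empty one.
sql_pattern = "%@%.com"
matching = [
    row[0]
    for row in conn.execute(
        "SELECT email FROM customers WHERE email LIKE ? ORDER BY email",
        (sql_pattern,),
    )
]
conn.close()
```

Here `matching` contains only the value that has an `@` somewhere before a trailing `.com`.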
You can use either of these two pattern types with column analyses or with analyses of a set of columns (simple table analyses). These pattern-based analyses show how frequently the various data patterns occur in the values of the analyzed columns. For more information, see Creating a basic analysis on a database column and Creating an analysis of a set of columns using patterns.
From Talend Studio, you can generate graphs that represent the results of pattern-based analyses. The Analysis Results view also provides tables that describe the generated graphs in words. From these graphs and analysis results, you can determine the percentage of invalid values based on the listed patterns.
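The percentage of matching and non-matching values reported by a pattern-based analysis can be approximated outside the Studio. The sketch below is an illustration only: the function name `match_percentages`, the sample postal codes, and the choice to count null values as non-matching are assumptions, not a description of how the Studio computes its results.

```python
import re


def match_percentages(values, regex):
    """Return (percent matching, percent not matching) for a list of values."""
    compiled = re.compile(regex)
    total = len(values)
    if total == 0:
        return 0.0, 0.0
    # A value counts as valid only if the whole string matches the pattern.
    # None is treated as non-matching in this sketch (an assumption).
    matching = sum(
        1 for value in values
        if value is not None and compiled.fullmatch(value)
    )
    pct_match = 100.0 * matching / total
    return pct_match, 100.0 - pct_match


# Hypothetical column values checked against a five-digit pattern.
zip_codes = ["75001", "69002", "7500", "ABCDE"]
pct_valid, pct_invalid = match_percentages(zip_codes, r"\d{5}")
```

With these sample values, two of the four entries match the pattern, so both percentages are 50.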
You manage SQL patterns and regular expressions, including Java regular expressions, in the same way. For more information, see Managing regular expressions and SQL patterns.
The following table shows the patterns that you can select in any database:
|Analysis engine type||Java||SQL||Java||SQL||Java||SQL||Java||SQL|