The Pattern tab - Cloud

Talend Cloud Data Preparation User Guide

The Pattern tab shows a graphical representation of the type and number of characters your data is made of.

In other words, you can see how the records are structured, at either word or character granularity. It is also a quick and easy way to apply a filter to your data.

When you select the content of a column, a horizontal bar chart displays the distribution of the different patterns that are used. The patterns displayed by default depend on the type of data that you select:

  • Word-based if the column type is text or boolean
  • Character-based if the column type is date or number

Whatever the type of data, you can switch between character-based and word-based patterns from the Pattern tab.

Analyzing word-based patterns is an efficient way to detect data quality issues in first names or last names, for example. Names that are not made exclusively of words, because they contain punctuation or numbers, immediately stand out. Character-based patterns, on the other hand, are better suited to structured data, such as client IDs or account numbers. You can tell from the chart whether a value has the wrong number of characters or digits.
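
As a rough illustration of the difference between the two granularities, the following Python sketch derives both kinds of pattern for a few sample values and counts how often each pattern occurs, which is the information the bar chart summarizes. It is not the product's implementation: the function names and the notation (`A`, `a`, `9` for single characters; `[Word]`, `[word]`, `[WORD]`, `[number]` for words) are assumptions chosen for this example.

```python
import re
from collections import Counter

# Hypothetical notation, chosen for this sketch only:
#   character-based: A = uppercase letter, a = lowercase letter, 9 = digit
#   word-based: [Word], [word], [WORD], [mixed] = run of letters, [number] = run of digits
# Any other character (punctuation, spaces) is kept as-is in both modes.

def char_pattern(value: str) -> str:
    """Replace every character with a symbol for its class."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A" if ch.isupper() else "a")
        else:
            out.append(ch)
    return "".join(out)

# A token is a run of letters, a run of digits, or any single other character.
_TOKEN = re.compile(r"[^\W\d_]+|\d+|.", re.DOTALL)

def word_pattern(value: str) -> str:
    """Replace every run of letters or digits with a symbol for its class."""
    out = []
    for tok in _TOKEN.findall(value):
        if tok.isdigit():
            out.append("[number]")
        elif tok.isalpha():
            if tok.islower():
                out.append("[word]")
            elif tok.isupper():
                out.append("[WORD]")
            elif tok[0].isupper() and tok[1:].islower():
                out.append("[Word]")
            else:
                out.append("[mixed]")
        else:
            out.append(tok)
    return "".join(out)

def pattern_counts(values, pattern_fn) -> Counter:
    """Count how many values share each pattern."""
    return Counter(pattern_fn(v) for v in values)

# Word-based patterns make the malformed name stand out.
print(pattern_counts(["Anna", "Maria", "Jean-Paul", "J0hn"], word_pattern))

# Character-based patterns expose the account number that is one digit short.
print(pattern_counts(["AB-1234", "CD-5678", "EF-901"], char_pattern))
```

In the first sample, `J0hn` yields a pattern that no other name shares, while in the second, `EF-901` maps to `AA-999` instead of the expected `AA-9999`.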

[Screenshot: Pattern tab opened.]

For more examples, see Filtering values using patterns.