Working with the quality bar - Cloud

Talend Cloud Data Preparation Getting Started Guide

The quickest way to identify incorrect data is to look at the quality bar.

In each column header, a quality bar displays the number of cells that contain valid data, invalid data, or no data. Each category is represented by a color:

  • Green for data that matches the cell format
  • Grey for empty cells
  • Red for data that does not match the cell format

Click a color to open a menu of actions for the corresponding cells, such as selecting, deleting, or clearing the cells with data in an invalid format. Hover over a color to display the exact number of rows in that category, as well as the percentage of the column it represents.
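To make the counts and percentages concrete, the following Python sketch reproduces the kind of figures the quality bar reports for one column. It is an illustration only, not how Talend Data Preparation computes them: `EMAIL_PATTERN`, `classify`, and `quality_bar` are hypothetical names, and the email pattern is a deliberately simple assumption.

```python
import re
from collections import Counter

# Hypothetical, deliberately simple email pattern (an assumption for this
# sketch; it is not the validation rule used by Talend Data Preparation).
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def classify(value):
    """Return 'empty', 'valid' or 'invalid' for a single cell."""
    if value is None or value.strip() == "":
        return "empty"
    return "valid" if EMAIL_PATTERN.match(value.strip()) else "invalid"


def quality_bar(cells):
    """Return {category: (count, percentage of the column)}."""
    counts = Counter(classify(cell) for cell in cells)
    total = len(cells)
    return {
        category: (
            counts[category],
            round(100 * counts[category] / total, 1) if total else 0.0,
        )
        for category in ("valid", "empty", "invalid")
    }


emails = ["ann@example.com", "", "bob(at)example.com", "carl@example.org", None]
print(quality_bar(emails))
```

Each category maps to the number of cells it contains and its share of the column, which corresponds to what hovering over a color displays.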

By looking at the quality bar in the Email column header, you can see that the data contains empty cells and invalid values. You are going to remove them.

To use the quality bar to remove the rows that contain those cells, proceed as follows:

Procedure

  1. Click the grey part of the quality bar, in the header of the Email column.
    A drop-down menu opens.
  2. Click Delete the rows with empty cells.
    The rows with empty cells in the Email column have been deleted, and only the invalid values, represented by the red part of the bar, remain.
  3. Repeat the last two steps, but this time click the red part of the quality bar and select Delete the rows with invalid cells.
    The Email column no longer contains invalid values or empty cells.
  4. Use the quality bar in the same way to remove the rows with invalid cells from the Zip and Phone columns.
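
For readers who think in code, the effect of the two delete actions in this procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the application's implementation: the function names, the sample rows, and the simple email pattern are all hypothetical.

```python
import re

# Hypothetical, simplified email pattern (an assumption for this sketch).
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_empty(value):
    """A cell counts as empty when it is missing or only whitespace."""
    return value is None or value.strip() == ""


def delete_rows_with_empty_cells(rows, column):
    """Keep only the rows whose cell in `column` is not empty."""
    return [row for row in rows if not is_empty(row.get(column))]


def delete_rows_with_invalid_cells(rows, column, pattern):
    """Keep rows whose cell in `column` is empty or matches `pattern`.

    Empty cells are a separate category, so this step leaves them alone.
    """
    return [
        row for row in rows
        if is_empty(row.get(column)) or pattern.match(row[column].strip())
    ]


rows = [
    {"Name": "Ann", "Email": "ann@example.com"},
    {"Name": "Bob", "Email": ""},
    {"Name": "Carl", "Email": "carl(at)example.org"},
    {"Name": "Dee", "Email": "dee@example.net"},
]

# Steps 1 and 2: remove the rows with empty Email cells.
rows = delete_rows_with_empty_cells(rows, "Email")
# Step 3: remove the rows with invalid Email cells.
rows = delete_rows_with_invalid_cells(rows, "Email", EMAIL_PATTERN)
print([row["Name"] for row in rows])
```

Note that whole rows are removed, not only the offending cells, so the values in the other columns of those rows are removed as well.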

Results

The only remaining column with invalid data is now State, but you are going to treat it in a different way.