Removing empty and invalid rows

Talend Data Preparation Quick Examples

author
Talend Documentation Team
EnrichVersion
6.5
2.3
EnrichProdName
Talend Data Services Platform
Talend Big Data
Talend Real-Time Big Data Platform
Talend Data Integration
Talend Data Fabric
Talend MDM Platform
Talend Big Data Platform
Talend ESB
Talend Data Management Platform
task
Data Quality and Preparation > Cleansing data
EnrichPlatform
Talend Data Preparation

You can remove all the empty and invalid entries from a dataset in one go.

As the quality bar under each column shows, the customer_contact_data.csv dataset contains several rows with empty or invalid cells. Using the quality bar is a quick way to remove empty and invalid records for a single column, but in this example you want to delete these rows across the whole dataset.

Procedure

  1. Click the white arrow on the top left of the grid.
  2. Select Display rows with invalid or empty values.

    This applies a filter to your data: only the rows that contain empty or invalid values are displayed.

  3. In the functions panel, type Delete these filtered rows and click the result to apply the associated function.

    Make sure that the Filtered Rows radio button is selected next to the Apply changes to field.

    The rows containing empty or invalid entries are removed from the dataset.

  4. Click the bin icon in the filter bar to clear the filter and display the whole dataset again.

Results

All the rows containing empty or invalid cells are removed from the dataset, and the quality bar under each column is now fully green.
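
For readers who want to reason about what the procedure above does to the data, the following is a minimal Python sketch of the same idea: keep only the rows in which every cell is both non-empty and valid. It is not Talend code and does not use any Talend API. The column names, the sample rows, and the validity rules in VALIDATORS are assumptions made for illustration; Talend Data Preparation decides validity from the type it detects for each column, which this sketch does not reproduce.

```python
import re

# Hypothetical validity rules, one per column. These are illustrative
# assumptions, not the rules Talend Data Preparation applies.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

VALIDATORS = {
    "email": lambda value: EMAIL_RE.match(value) is not None,
    "age": lambda value: value.isdigit(),
}


def is_empty(value):
    """Treat None and whitespace-only strings as empty cells."""
    return value is None or value.strip() == ""


def row_is_clean(row):
    """Return True only if no cell in the row is empty or invalid."""
    for column, value in row.items():
        if is_empty(value):
            return False
        check = VALIDATORS.get(column)  # columns without a rule are not validated
        if check is not None and not check(value):
            return False
    return True


def remove_empty_and_invalid(rows):
    """Drop every row that has at least one empty or invalid cell."""
    return [row for row in rows if row_is_clean(row)]


# Made-up sample data standing in for a small dataset.
rows = [
    {"name": "Ann", "email": "ann@example.com", "age": "34"},
    {"name": "", "email": "bob@example.com", "age": "41"},      # empty name
    {"name": "Cid", "email": "not-an-email", "age": "29"},      # invalid email
    {"name": "Dee", "email": "dee@example.com", "age": "n/a"},  # invalid age
]

cleaned = remove_empty_and_invalid(rows)
```

In this sketch a single bad cell is enough to discard the whole row, which mirrors filtering on rows with invalid or empty values and then deleting the filtered rows.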