Using charts to calculate absolute value - 2.3

Talend Data Preparation Getting Started Guide

author: Talend Documentation Team
EnrichVersion: 6.5, 2.3
EnrichProdName: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
task: Data Quality and Preparation > Cleansing data
EnrichPlatform: Talend Data Preparation

Calculating the absolute value of a number is one of the mathematical functions that you can apply to your data.

If you take a close look at the NUMBER_OF_RENTALS column, you will notice that some of the numbers have a negative value.

These cells are not marked as incorrect in the quality bar because they still fit the semantic type automatically set for the column, integer. Nevertheless, a negative number of rentals is unusable data. You are therefore going to apply a function that removes the negative sign from all these numbers.
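To illustrate why a type check alone does not catch these cells, here is a minimal Python sketch. It is not Talend code: the sample values and the `is_integer` helper are assumptions made for the example, and the check is only a rough stand-in for the validation shown in the quality bar.

```python
# Hypothetical sample of NUMBER_OF_RENTALS cells, read as raw text.
raw_cells = ["12", "-3", "7", "0", "-10", "25"]


def is_integer(cell: str) -> bool:
    """Return True if the cell parses as an integer, whatever its sign."""
    try:
        int(cell)
        return True
    except ValueError:
        return False


# Cells that would fail an integer type check.
invalid = [cell for cell in raw_cells if not is_integer(cell)]

# Cells that pass the type check but hold a negative number.
negative = [cell for cell in raw_cells if is_integer(cell) and int(cell) < 0]

print("Invalid cells:", invalid)
print("Valid but negative cells:", negative)
```

In this sample, every cell passes the integer check, including the negative ones, which is why they have to be found and corrected in another way.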

To calculate the absolute value of your data, proceed as follows:

Procedure

  1. Click the header of the NUMBER_OF_RENTALS column to select its content.

    In the statistics box, you can clearly see that some values range between -10 and 0.

  2. In the vertical bar chart at the bottom right of the screen, click the first bar from the left.

    This bar represents all the occurrences of the values that are equal to or below 0.

    A filter has now been applied to your data. Your preparation now only displays the lines where the number of rentals is equal to or below zero. You can now apply a function to those cells only.

  3. Under the functions list, next to Apply changes to:, select the Filtered rows radio button.
  4. In the functions list, click Calculate Absolute Value.

    All the negative values have been converted to their absolute values.

  5. To clear the filter, click the x icon to the right of the filter.

Results

Your preparation now displays all your data again. If you take another look at the statistics box for the NUMBER_OF_RENTALS column, you can see that the minimum value is now 0 instead of -10. You have thus improved the quality and usability of your data.
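For readers who want to see the logic of the procedure outside the interface, the steps can be sketched in plain Python. This is only an illustration under stated assumptions: the sample values are hypothetical, and the variable names (`rentals`, `filtered_rows`, `cleaned`) do not correspond to anything in Talend Data Preparation.

```python
# Hypothetical NUMBER_OF_RENTALS values, including some negative ones.
rentals = [12, -3, 7, 0, -10, 25]

# Step 2: clicking the first bar of the chart filters the rows
# whose value is equal to or below 0. Keep their positions.
filtered_rows = [i for i, value in enumerate(rentals) if value <= 0]

# Steps 3 and 4: apply the absolute value to the filtered rows only.
cleaned = list(rentals)
for i in filtered_rows:
    cleaned[i] = abs(cleaned[i])

# Step 5: clearing the filter displays all the rows again.
print("Before:", rentals, "minimum:", min(rentals))
print("After: ", cleaned, "minimum:", min(cleaned))
```

Rows outside the filter are left untouched, and the minimum of the sample moves from -10 to 0, which mirrors the change shown in the statistics box.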