The Chart tab shows a graphical representation of your data. It can also be used as a way of aggregating data and previewing some interesting statistics.
Data aggregation in Talend Cloud Data Preparation lets you combine the information from two columns to perform statistical analysis. You select a first column and compare it with the sum, max, min, or average of a second column that contains numerical values. The chart then displays more advanced statistics than the ones shown by default.
In this example, you work for an online retail company, and the dataset you are working on contains information about your customers, such as their age, gender, and number of purchases. You will use the Chart tab to quickly preview the average number of purchases for each age group of your customers.
Click the header of the column that will be used as the base for the aggregation, Age group in this example.
A chart showing the number of occurrences of each age group is displayed in the data profiling area.
- In the Chart tab, click the display options menu, set to Row count by default.
In the Column drop-down list, select the column that contains the number of purchases.
This column contains the information that we want to link to the age groups. The drop-down lists all the columns that are compatible for aggregation, in other words, all other columns that contain numerical data, with the
In the Aggregation drop-down list, select the average aggregation.
- Click Ok.
You have quickly gained some insight into your data with these statistics. You could perform other aggregation operations in the same way, for example comparing the total purchases by customer gender, or by any other data category in your dataset.
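For readers who want to check the chart values outside the application, the following is a minimal Python sketch of the same kind of aggregation. It is not how Talend Cloud Data Preparation computes its statistics. The sample rows, the column names (`age_group`, `gender`, `purchases`), and the `aggregate` helper are all hypothetical and only illustrate grouping one column and applying sum, max, min, or average to a second, numerical column.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; column names and values are illustrative only.
customers = [
    {"age_group": "18-25", "gender": "F", "purchases": 4},
    {"age_group": "18-25", "gender": "M", "purchases": 2},
    {"age_group": "26-35", "gender": "F", "purchases": 7},
    {"age_group": "26-35", "gender": "M", "purchases": 5},
    {"age_group": "36-50", "gender": "F", "purchases": 3},
]

# The four aggregations mentioned in the text, mapped to Python functions.
AGGREGATIONS = {"sum": sum, "max": max, "min": min, "average": mean}


def aggregate(rows, group_column, value_column, aggregation):
    """Group rows by group_column and aggregate value_column per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_column]].append(row[value_column])
    func = AGGREGATIONS[aggregation]
    return {key: func(values) for key, values in groups.items()}


# Average number of purchases for each age group (the example in this section).
average_by_age = aggregate(customers, "age_group", "purchases", "average")

# Total purchases by gender (the alternative aggregation mentioned above).
total_by_gender = aggregate(customers, "gender", "purchases", "sum")

print(average_by_age)
print(total_by_gender)
```

Each key of the returned dictionary corresponds to one category on the chart axis, and each value to the aggregated figure displayed for that category.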
To remove the aggregation information from the charts, click.