Introduction to aggregation

Understand how Data Studio groups and summarizes your data.

Aggregation is the process of reducing and summarizing tabular data. For example, consider the list of numbers below:

100, 200, 300, 400, 500
 

Using this example, you can state the following facts that illustrate the concept of aggregation:

Fact Aggregation
There are 5 numbers. Count
The smallest number is 100 Minimum
The largest number is 500 Maximum
The average of the numbers is 300 Average
The sum of the numbers is 1500 Sum

 

There are other ways you can aggregate data, including calculating the median, count distinct, quartiles, percentiles, etc.

Dimensions and aggregation

The example above is based on a single set of numbers, but that’s almost never what you see in the real world. In the real world, your data is typically organized into dimensions and metrics. Dimensions provide a way to categorize and group your data, while metrics measure that data.

In Data Studio, aggregation always takes place in the context of a set of dimensions. That set of dimensions can be:

  • Every dimension, which lets you see the raw data.
  • A subset of your dimensions, which lets you see data broken down (grouped) by the dimensions you select.
  • The empty set, which lets you see a summary of the entire data set.

For example, here are the same five numbers presented as daily stock prices. Date and Ticker are dimensions, Price is a metric.

Date Ticker Price
January 1 GOOG 100
January 1 AAPL 200
January 2 GOOG 300
January 2 AAPL 400
January 3 GOOG 500


You can now use the dimensions to group the data in different ways. For example:

By Ticker

Ticker Price
GOOG ?
AAPL ?

 

By Date

Date Price
January 1 ?
January 2 ?
January 3 ?

 

In the examples above, the metric values depend on what you want to know about the data. For example, to calculate the average price for each company, apply the Average aggregation in conjunction with the Ticker dimension:

Ticker Average of Price
GOOG (100 + 300 + 500) / 3 = 300
AAPL (200 + 400) / 2 = 300

 

To see how many companies had stock information on a given day, you would use the Date dimension and the Count aggregation:

Date Count of Price
January 1 2
January 2 2
January 3 1

 

Now, consider what happens when you use both the Date and Ticker dimensions in the stock table:

Date Ticker AVG(Price) SUM(Price)
January 1 GOOG 100 100
January 1 AAPL 200 200
January 2 GOOG 300 300
January 2 AAPL 400 400
January 3 GOOG 500 500

 

When a you create a group that includes all the available dimensions, the result is identical to the original data. This is still an aggregation, just not an interesting one, since every aggregation gives the same result. In Data Studio, if you make a table and show every dimension, you’re still able to set the aggregation type but it won’t do anything.

Auto aggregation

Some data sources, such as Analytics and Ads, show Auto as the only available aggregation type for metrics. These data sets are already aggregated, reducing a potentially massive list of numbers to single values. The aggregations used can be simple, like sum or count, or more complex, like percentile. You can't change Auto aggregated metrics to a different aggregation type, since doing so would require Data Studio to have access to the raw, unaggregated data.

You'll also see Auto aggregation applied to calculated fields based on other data sources, such as Sheets. In this case, the aggregation is a direct result of using an aggregation function in the formula, such as AVG(Price). Again, you can't change the Auto aggregation here, because that would conflict with the output of the AVG function.

Was this helpful?
How can we improve it?