Manage data freshness
This article explains how caching works, and how to choose between more frequent data updates and better report performance.
Data freshness vs. report performance
Data freshness refers to how up-to-date the data in a report is. Different types of reports have different requirements or expectations for data freshness. If you are measuring ad performance on your site or app, for example, daily updates might be sufficient. Reports based on social media analytics, on the other hand, may need their data updated multiple times a day.
Report performance is a measure of how quickly the report loads. Fetching data directly from the underlying data set can be slow, which in turn makes your reports sluggish to load and respond to viewer changes, like applying filters and date ranges. In addition, for some data sources, such as BigQuery, fetching data directly can incur financial costs.
To offset these issues, Data Studio may store some of the data provided by your data sources in a temporary storage system known as a cache. Retrieving data from a cache can be much faster than fetching it directly from the data set. It also reduces the number of queries sent, which lowers costs for paid data access.
How the cache works
Every component in a Data Studio report gets its data from the cache when possible. The cache system has two parts: the responsive cache and the predictive cache.
When a component in your report requests data, the responsive cache remembers the response returned by the underlying platform. If a person viewing the report later requests exactly the same data as an earlier query (for example, the same dimensions and metrics with the same filter conditions and date range), then the new request is served by the responsive cache.
If the request can't be served by the responsive cache, Data Studio next looks to the predictive cache.
The predictive cache analyzes the dimensions, metrics, and filter controls contained in the report, and predicts the possible queries. Data Studio then executes those queries in the background and stores the responses in the predictive cache. When a query can't be answered by the responsive cache, Data Studio tries to answer it using this predicted data. The predictive cache is limited in size, so your report may issue queries that it does not contain. If the query can't be answered by the predictive cache, Data Studio requests the data from the underlying data set.
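The lookup order described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Data Studio's actual implementation: the class name `TwoTierCache`, the `fetch_from_data_set` callback, and the tuple-shaped query key are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, Hashable, Tuple

# Hypothetical cache key: the parts of a request that must match exactly,
# e.g. (dimensions, metrics, filter conditions, date range).
Query = Tuple[Hashable, ...]


class TwoTierCache:
    """Toy model of the responsive -> predictive -> data set lookup order."""

    def __init__(self, fetch_from_data_set: Callable[[Query], object]) -> None:
        self.responsive: Dict[Query, object] = {}
        self.predictive: Dict[Query, object] = {}
        self.fetch_from_data_set = fetch_from_data_set

    def get(self, query: Query) -> Tuple[object, str]:
        """Return (response, name of the tier that served it)."""
        # 1. An identical earlier query is served by the responsive cache.
        if query in self.responsive:
            return self.responsive[query], "responsive"
        # 2. Otherwise, try the data that was predicted and fetched in advance.
        if query in self.predictive:
            return self.predictive[query], "predictive"
        # 3. Fall back to the underlying data set and remember the response.
        response = self.fetch_from_data_set(query)
        self.responsive[query] = response
        return response, "data set"
```

In this model, the first request for an unpredicted query reaches the data set, and an identical follow-up request is served by the responsive tier without a second fetch.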
Limits of the predictive cache
The predictive cache is only active for data sources that use the owner's credentials to access the underlying data.
In addition, the predictive cache is not available for the following connectors:
- Google Analytics
- Cloud SQL
- Community connectors
The cache automatically refreshes at certain intervals, and you can manually refresh reports that you can edit (see below).
When the cache refreshes, all the old cached data is discarded and the predictive cache is rebuilt with anticipated data. New queries generated by the report go directly to the underlying platform, and the responses are added to the responsive cache.
ALERT: If your report uses a BigQuery data source, please be aware that the usual query costs will apply whenever Data Studio queries the underlying project. This includes queries that bypass the cache, as well as manual and automatic cache refreshes.
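As a rough illustration of the refresh behavior and the cost note above, the sketch below clears both caches, rebuilds the predictive cache, and counts how many queries reach the underlying data set. The function `refresh_caches` and its parameters are hypothetical; how Data Studio actually predicts and schedules queries is not described here.

```python
from typing import Callable, Dict, Hashable, Iterable


def refresh_caches(
    responsive: Dict[Hashable, object],
    predictive: Dict[Hashable, object],
    predicted_queries: Iterable[Hashable],
    fetch: Callable[[Hashable], object],
) -> int:
    """Discard cached data, rebuild the predictive cache, and return the
    number of queries sent to the underlying data set."""
    # All old cached data is discarded.
    responsive.clear()
    predictive.clear()

    # The predictive cache is rebuilt with anticipated data. Each fetch is a
    # real query, so a paid source would bill for every one of them.
    queries_sent = 0
    for query in predicted_queries:
        predictive[query] = fetch(query)
        queries_sent += 1
    return queries_sent
```

The returned count makes the cost implication visible: a refresh is not free for paid data sources, because rebuilding the predictive cache sends queries of its own.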
Set data freshness for a data source
You can control how often some data source types refresh their cache. For example, a Sheets data source can check for fresh data every 15 minutes (the default), every 4 hours, or every 12 hours. Some data sources only support a single default setting (typically, every 12 hours).
To set the refresh time
- Edit the data source. (View access to the data source is not sufficient to change this setting.)
- In the toolbar at the top, click Data freshness.
- Select the desired time option, then click SET DATA FRESHNESS.
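To make the effect of the freshness setting concrete, here is a minimal sketch of a staleness check. The three intervals are the Sheets options quoted above; the function `needs_refresh` and the rule that a cache is refreshed once the interval has fully elapsed are assumptions made for illustration, not documented Data Studio behavior.

```python
from datetime import datetime, timedelta

# Freshness options quoted in the text for a Sheets data source.
SHEETS_FRESHNESS_OPTIONS = (
    timedelta(minutes=15),  # default
    timedelta(hours=4),
    timedelta(hours=12),
)


def needs_refresh(last_refresh: datetime, now: datetime, freshness: timedelta) -> bool:
    """Assumed rule: the cache is stale once the chosen interval has elapsed."""
    return now - last_refresh >= freshness
```

With the default 15-minute setting, a cache filled at 09:00 would be treated as stale from 09:15 onward under this assumed rule, while a 12-hour setting would keep serving cached data until 21:00.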
Refresh report data manually
You can refresh the cache at any time by viewing or editing the report and clicking Refresh data.
This refreshes the cache for every data source added to the report.
Turn off the predictive cache
As an editor, you can turn the predictive cache on or off for a given report. You might want to do this if:
- your data changes frequently and you want to prioritize freshness over performance.
- you are using a data source that incurs usage costs (e.g., BigQuery) and want to minimize those costs.
Turning the predictive cache on or off applies to the entire report; you can't do this for selected data sources only.
If a report has not been opened for 10 days, automatic refresh of the predictive cache is suspended until the report is viewed again.
You can't disable the responsive cache, as doing so could result in higher data usage costs for paid data sources, such as BigQuery.
To turn the predictive cache on or off
- Edit your report.
- Select the File > Report settings menu.
- In the Report Settings panel on the right, check or uncheck the Enable cache checkbox.
How to tell if report data is cached
You can see if data is coming from the cache by viewing the report and looking in the bottom left corner. When all the charts on the current page are being served from the cache, you'll see a lightning bolt icon along with the time and date of the last update.
Blending and cached data
For a blended data source, the cache uses the shortest refresh time among the data sources included in the blend, so that the desired freshness of every source is satisfied.
For example, if you blend a Sheets data source that has a refresh time of 15 minutes with a BigQuery data source that has a refresh time of 4 hours, the resulting blended data source will have a refresh time of 15 minutes.
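The example above suggests that a blend takes the shortest refresh time of its sources. A minimal sketch of that rule follows; the function name `blended_refresh_time` is hypothetical, and the "take the minimum" rule is inferred from the single example in the text rather than from a documented formula.

```python
from datetime import timedelta
from typing import Iterable


def blended_refresh_time(refresh_times: Iterable[timedelta]) -> timedelta:
    """Assumed rule: the shortest interval satisfies every source in the blend."""
    return min(refresh_times)


# The Sheets + BigQuery blend from the example above.
sheets = timedelta(minutes=15)
bigquery = timedelta(hours=4)
blend = blended_refresh_time([sheets, bigquery])
```

Here `blend` equals the 15-minute Sheets interval, matching the result stated in the example.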