
Understand Data Sets

Data Sets provide a structure to manage your uploaded data.

A Data Set is a container that holds the data you upload to Analytics. Data Sets control how uploaded data gets joined with existing data. You configure Data Sets at the Property level. Data Sets must be associated with at least one View, and can be associated with multiple Views.

To manage all the Data Sets for a given Property, select Admin > (Property) > Data Import.

Data Set types

A Data Set's type corresponds to the specific type of data you want to import. For example, there are Data Set types for User Data, Cost Data, Content Data, etc. Depending on the Data Set type, you'll have different options for the dimensions and metrics (the schema) you can use.

Data Set schema

When you create a Data Set, you define a schema, which is the structure that joins the data you upload with the existing data in your hits. A simple schema consists of a key dimension (the "key") and an import dimension or metric. To import data, Analytics looks for key values in hits that match key values in the uploaded data. When a match is found, the additional dimension and metric values associated with that key are added to the existing hit data. The import dimension (or metric) receives this additional data you are uploading. Some Data Set types let you use multiple dimensions to define the key, and most can use multiple dimensions/metrics for the import fields.
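
The key-matching join described above can be pictured with a minimal sketch. This is only an illustration, not how Analytics implements the join internally: the function name, the field names (page, author, pageviews), and the use of plain dictionaries for hits and uploaded rows are all assumptions made for the example.

```python
# Hypothetical illustration only: Analytics performs this join internally.
def join_uploaded_data(hits, uploaded_rows, key, import_fields):
    """Add import-field values to each hit whose key value matches an uploaded row."""
    # Index the uploaded rows by their key value for quick lookup.
    lookup = {row[key]: row for row in uploaded_rows}
    joined = []
    for hit in hits:
        enriched = dict(hit)  # copy, so the original hit is left untouched
        match = lookup.get(hit.get(key))
        if match is not None:
            # A matching key was found: attach the imported values to the hit.
            for field in import_fields:
                enriched[field] = match[field]
        joined.append(enriched)
    return joined


# Assumed sample data: "page" is the key, "author" is the import dimension.
hits = [
    {"page": "/articles/1", "pageviews": 10},
    {"page": "/articles/2", "pageviews": 4},
]
uploaded_rows = [{"page": "/articles/1", "author": "A. Writer"}]

result = join_uploaded_data(hits, uploaded_rows, key="page", import_fields=["author"])
```

In this sketch, only the first hit gains an author value, because it is the only hit whose page matches a key value in the uploaded rows.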

Key and import dimensions

You can use many of the available dimensions and metrics as your import key and targets, but not all. You’ll see the full list of the available dimensions and metrics in the user interface when you create your Data Set. Currently, you cannot use the following for key or target dimensions:
  • custom variables
  • time-based dimensions such as hour, minute, etc.
  • geographic dimensions like country, city, etc.

Using custom dimensions and metrics

If you have implemented Universal Analytics (analytics.js), many (but not all) import types support uploading to custom dimensions. Custom dimensions and metrics must be of the appropriate scope for the import type.

Learn more about custom dimensions.

If you’re still using the ga.js-based snippet on your site, and you want to import to dimensions and metrics that don’t exist as standard dimensions or metrics in Analytics (e.g., Author), you will have to overwrite an existing dimension (e.g., Page Title). It is strongly recommended that you set up a new view if you’re going to take that approach. Overwriting in this way will permanently replace the data in that dimension with your custom data for the view that you select.

More about keys and import dimensions
  • You may include up to a total of nine dimensions and/or metrics in your schema, including key and target dimensions and metrics.
  • The key is composed of at least one dimension or metric, and can be composed of up to three dimensions and/or metrics.

    Example: Page (URL) could be a key dimension, as could a custom dimension that you have defined.
  • Key dimensions can be default dimensions already defined by Analytics, as well as custom dimensions that you have defined, or a combination of Page (URL) and one or two custom dimensions.
  • The default key dimensions will vary based on the type of Data Set you choose. A list of available key dimensions is provided in the drop-down menu in the Data Set schema builder.
  • You can refine the key using a regular expression. You can also refine the Page dimension by query parameters. See the Content Data import example article for an example.
  • The available target metrics will vary based on what you choose as your key dimensions.

    Example: If you intend to provide author name and subject for each article ID, you would specify Author and Subject as custom import dimensions.

    Example: If the key is SKU, and you intend to provide a price for each value of SKU, you would specify Price as the metric.
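
The size limits listed above (up to nine fields in total, and one to three fields in the key) can be checked before you build a schema. The following is a minimal sketch under stated assumptions: the function and constant names are hypothetical, the schema is represented as two plain lists of field names, and the check does not cover the dimensions that cannot be used as key or target.

```python
# Limits taken from the list above; the names are hypothetical.
MAX_SCHEMA_FIELDS = 9  # key plus import dimensions/metrics combined
MAX_KEY_FIELDS = 3     # the key needs at least one field and at most three


def validate_schema(key_fields, import_fields):
    """Return a list of problems; an empty list means the size limits are met."""
    problems = []
    if not 1 <= len(key_fields) <= MAX_KEY_FIELDS:
        problems.append(
            f"key must have 1 to {MAX_KEY_FIELDS} fields, got {len(key_fields)}"
        )
    total = len(key_fields) + len(import_fields)
    if total > MAX_SCHEMA_FIELDS:
        problems.append(
            f"schema may have at most {MAX_SCHEMA_FIELDS} fields, got {total}"
        )
    return problems


# Assumed example: Page (URL) as the key, Author and Subject as import dimensions.
ok = validate_schema(["Page"], ["Author", "Subject"])
too_many_keys = validate_schema(["Page", "CD1", "CD2", "CD3"], ["Author"])
```

Here ok is an empty list, while too_many_keys reports that the key exceeds three fields.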

Overwrite behavior

All of the Data Set types except Refund Data allow you to specify how you want to handle uploading data that duplicates data already contained in the selected Analytics property. "Duplicate data" in this case means any record where the key dimensions match values previously imported into or collected for that property. Here are your options for handling duplicate data:

  • For Cost Data, you can choose whether duplicate data is added to (summed with) or replaces previously uploaded data.
  • For Campaign, Content, Custom, Product and User Data Set types, you can choose whether duplicate data overwrites previously collected or imported hits, or is discarded in favor of the existing data.
  • For Refund Data, once the data has been imported it is treated as regular hit traffic and cannot be modified (so make sure your refund data is correct before importing!).
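
The two configurable behaviors above can be sketched as follows. This is an illustration only, not the Analytics implementation: the function names, the mode strings "sum" and "replace", and the representation of data as dictionaries keyed by the key-dimension value are all assumptions.

```python
# Hypothetical helpers that mimic the overwrite options described above.
def merge_cost_data(existing, uploaded, mode):
    """Cost Data: duplicate keys are either summed with or replace existing values."""
    if mode not in ("sum", "replace"):
        raise ValueError("mode must be 'sum' or 'replace'")
    merged = dict(existing)
    for key, cost in uploaded.items():
        if mode == "sum" and key in merged:
            merged[key] += cost  # add the new cost to the previous upload
        else:
            merged[key] = cost   # new key, or replace the previous value
    return merged


def merge_dimension_data(existing, uploaded, overwrite):
    """Other Data Set types: duplicates either overwrite or are discarded."""
    merged = dict(existing)
    for key, value in uploaded.items():
        if overwrite or key not in merged:
            merged[key] = value
        # otherwise the duplicate is discarded in favor of the existing data
    return merged


summed = merge_cost_data({"campaign_a": 5.0}, {"campaign_a": 2.0, "campaign_b": 1.0}, "sum")
replaced = merge_cost_data({"campaign_a": 5.0}, {"campaign_a": 2.0}, "replace")
kept = merge_dimension_data({"/page": "Old Author"}, {"/page": "New Author"}, overwrite=False)
```

With these assumed inputs, summed holds 7.0 for campaign_a, replaced holds 2.0, and kept retains "Old Author" because the duplicate is discarded. Refund Data has no equivalent option, since it cannot be modified once imported.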

Learn more about how Data Import handles duplicate data.
