
Overview of Data Transfer reports

Access event-level data related to your DFP network

The Data Transfer feature offers an alternative to aggregated reporting: non-aggregated, event-level data from your ad campaigns. This data is essentially raw content from the DoubleClick ad server logs, with a separate file generated for each type of event. Data Transfer files contain event data that is accurate to the second, and you can choose to include other information in the files, such as user ID, event time, and click time.

To get started with Data Transfer, ask your Account Manager for a setup form, which you can use to customize which data fields are available in your reports. We’ll notify you once Data Transfer is fully enabled for your account; you can then begin retrieving your files from Google Cloud Storage. (The price of the storage solution is included in the monthly Data Transfer fee.)

DFP has transitioned to downloaded impressions
Starting October 2, 2017, Data Transfer impression files only include downloaded impressions. New files will soon be available for requests and code serves. Watch the release notes for more details.

Learn more about the transition to downloaded impressions.

How you can use Data Transfer

After the data is exported from the ad servers, you can use it to do the following:

  • Data warehousing: Build robust data warehouses that enable you to perform a variety of analyses outside of the types offered by DoubleClick.
  • Data analysis: Download Data Transfer files and perform additional data analysis.

Data Transfer files come in a raw text format that you can convert using a spreadsheet editor. Here are some examples of typical ways you may choose to apply Data Transfer information:

  • Aggregate user events and activities by dimension (for example, per creative)
  • Calculate unique conversions across multiple days
  • Match users against a customer database
  • Report on user geographic and demographic information

You can use match tables to provide a name-to-ID lookup for values contained within Data Transfer files, allowing you to match ad serving information (such as ad unit or line item) to pre-assigned values stored in the database.
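
As a minimal sketch of this kind of lookup, the snippet below resolves an ID in each event row to a name using a match table. The column names (`Id`, `Name`, `LineItemId`, `LineItemName`), the comma delimiter, and the sample rows are all hypothetical and chosen only for illustration; check your own match tables and Data Transfer files for the actual fields and format.

```python
import csv
import io

def load_match_table(text, id_column, name_column):
    """Build an ID-to-name lookup from delimited match-table text."""
    reader = csv.DictReader(io.StringIO(text))
    return {row[id_column]: row[name_column] for row in reader}

def add_names(events, lookup, id_column, name_column):
    """Yield a copy of each event with a name resolved through the lookup."""
    for event in events:
        named = dict(event)
        # IDs missing from the match table are flagged rather than dropped.
        named[name_column] = lookup.get(event[id_column], "(unknown)")
        yield named

# Hypothetical match table and event rows, for illustration only.
match_table_text = "Id,Name\n111,Homepage leaderboard\n222,Sports sidebar\n"
events = [
    {"LineItemId": "111"},
    {"LineItemId": "333"},
]

lookup = load_match_table(match_table_text, "Id", "Name")
named_events = list(add_names(events, lookup, "LineItemId", "LineItemName"))
```

In a real pipeline the same join would more likely happen inside the data warehouse, with the match table loaded as its own table.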

Decide whether Data Transfer is right for you

Because Data Transfer information is provided as non-aggregated, event-level data, it requires a degree of technical expertise to use. If your organization is not able to manage ETL processing, support large files, manipulate text files, design and administer a mid-sized database, and design and implement scripts, you might want to consider working with a third party, such as one of the approved partners in our partner directory.

If you're not sure whether Data Transfer is right for you, contact us.

Access Data Transfer files

Once you've set up Data Transfer, files are kept in DFP cloud storage buckets. You can access them on the web, with a command-line tool, or through an API. Learn more about how to access DFP cloud storage buckets.

Data Transfer files in detail

Files and file types
  • Differences among file types
    Different Data Transfer files offer you different information. Below is an overview of the file types; you can also see all the fields included in each Data Transfer file.
    If there's a corresponding Backfill file, the Network file doesn't include impressions served from Ad Exchange or AdSense via dynamic allocation. Use the Backfill file for information on dynamically allocated impressions.
    | File type | What it shows |
    |---|---|
    | NetworkImpressions, NetworkBackfillImpressions | Information about downloaded impressions. |
    | NetworkClicks, NetworkBackfillClicks | Information about clicks. |
    | NetworkVideoConversions, NetworkBackfillVideoConversions | Information about video-specific events, including actions (play, pause, etc.), content IDs, pod positioning, and more. See all video events. |
    | NetworkRichMediaConversions, NetworkBackfillRichMediaConversions | Information about DoubleClick Rich Media events, including both standard and custom actions (play, pause, etc.), action duration, and more. |
    | NetworkActiveViews, NetworkBackfillActiveViews | Information about Active View-eligible DFP-based impressions. |
    | NetworkActivities | A log entry is generated each time a user views or clicks a campaign in the publisher’s site that activates an activity pixel (formerly known as a Spotlight pixel) on an advertiser’s page. |

     

  • File delivery
    Data Transfer files are pushed to DFP cloud storage buckets on an hourly basis. We advise polling at regular intervals to check for updates. Data will be delivered and available between 5 and 15 hours after the recorded hour.

    The hour number (e.g., 01, 02, 03) in each file name is in Pacific Time, but the timestamps inside the Data Transfer files are in the publisher's own network time zone. Be aware of this difference when you calculate file delivery.

    DoubleClick does not deliver Data Transfer information to third-party servers.
     
  • Naming conventions for Data Transfer files
    Data Transfer file names follow a predictable convention:
    [Type]_[Network ID]_[YYYYMMDD]_[HH].gz

    YYYYMMDD is the year, month, and day. HH is the start hour in 24-hour format. The start hour in the file name is listed in Pacific Time, even if the network is based in a different time zone. For example, a NetworkClicks file for the 4 PM hour could take the name:
    NetworkClicks_123456_20140815_16.gz

    In some extremely rare cases, Data Transfer files are re-published to correct erroneous data. In these cases, the re-published files have the string "_corrected" inserted in the file name, and the previous versions of the files are not deleted from the bucket. The corrected file name appears as:
    [Type]_[Network ID]_[YYYYMMDD]_[HH]_corrected.gz
     
    Any time a corrected file is found in the storage bucket, we recommend you use that version of the file as the authoritative report for that hour. This means that you'll need to remove events contained in the previous (incorrect) versions of the file from your data warehouse/analysis.
  • Make large Data Transfer files easier to process
    Google Code has released an open-source toolkit called CRUSH (Custom Reporting Utilities for Shell) for processing delimited-text data from the command line or in shell scripts. The CRUSH tools have been extensively developed and tested, and they work best on Linux or Unix operating systems. Support for CRUSH is available through the open-source community.
     
    A non-open-source alternative is DMX, a data integration software developed by Syncsort.
  • Best practices for consuming Data Transfer files
    • Don't try to map events to files
      For reasons described elsewhere in this document (e.g., Daylight saving time cutovers, late data in reporting, etc.), the mapping between an event timestamp and a specific Data Transfer file is not always easy to predict. Thus, when you want to analyze data falling in a certain time range, you should not limit your data ingestion and analysis to a specific set of Data Transfer files based on the start hour in the filename. If you do that, you may be overlooking data that's provided in a subsequent file. A better approach is to read all the Data Transfer files into a separate system (e.g., a data warehouse or query engine), and then restrict your analysis based on the timestamp of the events.
    • Handle data corrections
      When corrected Data Transfer files are delivered to your bucket, you will need to scrub the data corresponding to the original files from your data warehouse. We leave the original files within your storage bucket, which allows for multiple approaches to dealing with this situation. One recommended approach is to identify the original files that contain incorrect data (by the presence of a "_corrected" file with an otherwise matching file name, as described above), remove the events loaded from those original files from your data warehouse, and load the corrected files in their place.
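
The naming convention and the correction handling described above can be sketched as follows. This is a minimal illustration, not an official tool: the regular expression assumes that the type is alphabetic and that the network ID is numeric (as in the example file name), and the helper names `parse_file_name` and `split_authoritative` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# [Type]_[Network ID]_[YYYYMMDD]_[HH].gz, optionally with "_corrected".
FILE_NAME = re.compile(
    r"^(?P<type>[A-Za-z]+)_(?P<network_id>\d+)_(?P<date>\d{8})_(?P<hour>\d{2})"
    r"(?P<corrected>_corrected)?\.gz$"
)

class TransferFile(NamedTuple):
    name: str
    file_type: str
    network_id: str
    date: str        # YYYYMMDD, Pacific Time
    hour: int        # start hour, Pacific Time
    corrected: bool

def parse_file_name(name: str) -> Optional[TransferFile]:
    """Parse a Data Transfer file name, or return None if it doesn't match."""
    m = FILE_NAME.match(name)
    if m is None:
        return None
    return TransferFile(name, m["type"], m["network_id"], m["date"],
                        int(m["hour"]), m["corrected"] is not None)

def split_authoritative(names):
    """Return (authoritative, superseded) lists of file names.

    An original file is superseded when a "_corrected" file exists with
    the same type, network ID, date, and hour.
    """
    parsed = [f for f in map(parse_file_name, names) if f is not None]
    corrected_keys = {f[1:5] for f in parsed if f.corrected}
    authoritative, superseded = [], []
    for f in parsed:
        if not f.corrected and f[1:5] in corrected_keys:
            superseded.append(f.name)
        else:
            authoritative.append(f.name)
    return authoritative, superseded

names = [
    "NetworkClicks_123456_20140815_16.gz",
    "NetworkClicks_123456_20140815_16_corrected.gz",
    "NetworkClicks_123456_20140815_17.gz",
]
authoritative, superseded = split_authoritative(names)
```

Events previously loaded from the files in `superseded` are the ones to remove from the data warehouse before loading the corrected versions.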
Storage
  • Required storage space
    A good rule of thumb is that each event uses between 25 and 35 bytes in a compressed file. As such, 10 million impressions would require about 300 MB of disk space in a compressed file.

    Keep in mind that these are estimates, and your file size could be somewhat larger. Also, because this is the size of the data in compressed form, you need additional space to decompress and use the files.
     
  • 60-day online storage period
    Data Transfer files older than 60 days are purged from DFP. If you would like to store your files for longer than the allotted 60 days, we recommend that you either store the files locally or move to a permanent cloud storage solution, which can include an independent Google Cloud Storage account over which you have full control.
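
The rule of thumb for required storage space above can be turned into a quick range estimate. The sketch below only restates the 25 to 35 bytes per event figure; the function name is hypothetical, and the result covers compressed size only, not the extra space needed to decompress and use the files.

```python
def estimate_compressed_bytes(events, bytes_per_event=(25, 35)):
    """Return a (low, high) estimate of compressed size in bytes."""
    low, high = bytes_per_event
    return events * low, events * high

low, high = estimate_compressed_bytes(10_000_000)
print(f"{low / 1e6:.0f} MB to {high / 1e6:.0f} MB")  # 250 MB to 350 MB
```

The midpoint of that range is the roughly 300 MB quoted above for 10 million impressions.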
Data
  • Daylight saving time
    Data Transfer file naming and delivery around Daylight saving time start/end transitions is a complex topic. We provide additional details in this section for reference purposes, but note that if you're following the "Best practices for consuming Data Transfer files" described above, the technical details of how the files are delivered shouldn't be relevant to your solution design.
    As previously mentioned, the file name convention includes the start hour for events in the file, which is always given in the US Pacific time zone (observing Daylight saving time). However, the timestamps present in that file are always given according to the DFP network time zone (which may or may not observe Daylight saving time). This may lead to empty/skipped files or files containing more than one hour's worth of data, depending on the interplay between these time zone settings. All impression data will still be delivered to you during the Daylight saving time cutover.
     
  • Late data
    From time to time, Data Transfer files take longer than usual to process. Delays of a few hours are normal.

    If data is late, it appears in the next hourly batched file with an accurate timestamp. This could mean, for example, that a file has mostly 8 a.m. to 9 a.m. timestamps with a scattering of earlier timestamps if processing was delayed.

    There aren't cases where data will be "early" (appearing in a previous hour's file).
     
  • Hours with no activity
    If there is no activity during a given hour, no Data Transfer file is posted. If the files for a given hour are missing, use reports in DFP to see whether there were any events during the missing hour. When checking for events during a given hour, keep in mind the date and day boundaries discussed below. If DFP reporting confirms that there were no relevant events during the hour you're checking, there's no need to contact support about the missing Data Transfer file.
     
  • Date and day boundaries
    The first hourly file for a given day typically contains events from midnight to 1 a.m. Pacific Time, but the event timestamps are in the publisher's network time zone. If, for example, the publisher is set to Eastern Time, they'd see events from 3 a.m. to 4 a.m. in the first hourly file. Events from the three hours before that would be in the last three hourly files of the previous day. You may therefore have timestamps from a different date than the one in the file name, so design accordingly. Always refer to the timestamp on events in the file, not the time the file is posted.
     
  • What's not included in Data Transfer files
    Data Transfer files include information about impression-level events. Some kinds of data, such as code serves that don't result in an impression, are not included. For example:
     
    • Out-of-page requests that are never rendered, or where the viewed impression macro wasn't properly implemented.
    • Video requests where the video was never played.
       
  • Master/companion reporting in Data Transfer
    Data Transfer files show both the master and companion creative impressions. IsCompanion will be “TRUE” for the Companion creative impression.

    The CreativeId field contains the individual creative IDs for the Master and Companion creatives and not the Creative Set ID. There isn’t an additional field in Data Transfer for Creative Set ID to associate companion impressions to master impressions.
     
  • Discrepancy between Data Transfer files and DFP Query Tool/API-generated reporting
    Bad traffic (spam data) is periodically removed from the Query Tool and API-generated reports. However, due to the publishing schedule of Data Transfer files, some of this cleanup may be missed. As a result, Data Transfer can show slightly more impressions, clicks, Active View impressions, and so on. When discrepancies occur, they tend to be around 1%.
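
The date and day boundaries described above can be checked with a short calculation. The sketch below converts the Pacific Time start hour in a file name to the nominal one-hour window in a network time zone. It assumes Python 3.9 or later with time zone data available, and the function name `nominal_window` is hypothetical. Because of late data and Daylight saving time cutovers, the window is only nominal; filter on the event timestamps themselves, as recommended in the best practices.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

PACIFIC = ZoneInfo("America/Los_Angeles")

def nominal_window(date_str, hour, network_tz):
    """Return the nominal (start, end) of a file's hour in the network time zone.

    date_str is the YYYYMMDD part of the file name and hour is the HH part,
    both expressed in Pacific Time.
    """
    start_pacific = datetime.strptime(date_str, "%Y%m%d").replace(
        hour=hour, tzinfo=PACIFIC
    )
    # Add the hour in UTC so the window is a true 60 minutes long.
    start_utc = start_pacific.astimezone(timezone.utc)
    end_utc = start_utc + timedelta(hours=1)
    tz = ZoneInfo(network_tz)
    return start_utc.astimezone(tz), end_utc.astimezone(tz)

# First hourly file of the day, for a network set to Eastern Time.
start, end = nominal_window("20140815", 0, "America/New_York")
print(start.strftime("%Y-%m-%d %H:%M"), "to", end.strftime("%H:%M"))
# 2014-08-15 03:00 to 04:00
```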