DFP Data Transfer reports
Data Transfer report files provide non-aggregated, event-level data from your ad campaigns. This data is essentially raw content from the DoubleClick ad server logs, with a separate file generated for each type of event. Data Transfer files contain event data that is accurate to the second, and you can choose to include additional fields in the files to see device, geography, and other details related to the event.
Not available in DFP Small Business.
Data Transfer report files come at an additional cost. Contact your account manager to obtain an order form and set up your report file configurations.
If your organization is not able to manage ETL processing, support large files, manipulate text files, design and administer a mid-sized database, and design and implement scripts, consider working with an approved DoubleClick partner.
Available Data Transfer report files
Each Data Transfer file contains information about different events. You can add fields to each file type to see contextual information related to those events.
If there's a corresponding Backfill file, the Network file doesn't include impressions served from Ad Exchange or AdSense via dynamic allocation. Use the Backfill file for information on dynamically allocated impressions.
|File type||What it shows||Sample file|
||Records every ad request received by DFP, whether filled or unfilled.||Download|
||Records every response from DFP, whether downloaded or not.||Download|
||Information about downloaded impressions.||Download|
||Information about clicks.|
||Information about video-specific events, including actions (play, pause, etc.), content IDs, pod positioning, and more. See all video events|
||Information about DoubleClick Rich Media events, including both standard and custom actions (play, pause, etc.), action duration, and more.|
||Information about Active View-eligible DFP-based impressions.|
||A log entry is generated each time a user views or clicks a campaign in the publisher’s site that activates an activity pixel (formerly known as a Spotlight pixel) on an advertiser’s page.|
How files are delivered
Data Transfer files are pushed to DFP cloud storage buckets on an hourly basis. We advise polling at regular intervals to check for updates. Data will be delivered and available between 5 and 15 hours after the recorded hour. DoubleClick does not deliver data transfer information to third-party servers.
File names include the start hour for events in the US Pacific time zone (which observes daylight saving time), but the timestamps in the file are always given in the DFP network time zone (which might not observe daylight saving time). Depending on how these two time zone settings interact, this can lead to empty or skipped files, or to files containing more than one hour's worth of data.
Data Transfer file names follow a predictable convention:
- YYYYMMDD is the year, month, and day.
- HH is the start hour in 24-hour format.
The hour number (01, 02, 03) specified in each file name is in the US Pacific time zone, but the timestamps inside each Data Transfer file use the publisher's network time zone. Account for this difference when you calculate file delivery.
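To make the time zone difference concrete, the following minimal Python sketch converts the date and hour from a file name (US Pacific time) into the matching one-hour window in a network time zone. The function name `file_hour_window` and the sample values are hypothetical, and the sketch takes the date and hour as separate strings rather than parsing a full file name. It assumes Python 3.9 or later with time zone data available.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# File names use US Pacific time, including daylight saving time.
PACIFIC = ZoneInfo("America/Los_Angeles")


def file_hour_window(yyyymmdd, hh, network_tz):
    """Return (start, end) of a file's nominal hour, expressed in network_tz.

    yyyymmdd and hh are the date and start hour taken from the file name.
    network_tz is an IANA time zone name for the publisher's network.
    """
    start_pacific = datetime.strptime(yyyymmdd + hh, "%Y%m%d%H").replace(tzinfo=PACIFIC)
    # Add the hour in UTC so the result stays correct across DST transitions.
    start_utc = start_pacific.astimezone(timezone.utc)
    end_utc = start_utc + timedelta(hours=1)
    tz = ZoneInfo(network_tz)
    return start_utc.astimezone(tz), end_utc.astimezone(tz)


# Hypothetical example: the midnight Pacific file for a network on Eastern Time.
start, end = file_hour_window("20170115", "00", "America/New_York")
print(start.isoformat(), end.isoformat())
# 2017-01-15T03:00:00-05:00 2017-01-15T04:00:00-05:00
```

The window is only the nominal range for a file. Late data can still place earlier timestamps in a later file.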
About the data contained in Data Transfer files
- Late data
Delays of a few hours are normal, but occasionally Data Transfer files take longer than usual to process. If data is late, it appears in the next hourly batched file with an accurate timestamp. For example, a file might contain mostly 8 a.m. to 9 a.m. timestamps with a scattering of earlier timestamps if processing was delayed. Data is never "early" (appearing in a previous hour's file).
- Hours with no activity
If there is no activity during a given hour, no Data Transfer file is posted. If the files for a given hour are missing, use reports in DFP to see whether there were any events during the missing hour. When checking for events during a given hour, keep in mind the date and day boundaries discussed below. If DFP reporting confirms that there were no relevant events during the hour you're checking, there's no need to contact support about the missing data transfer file.
- Date and day boundaries
The first hourly file for a given day typically contains events from midnight to 1 a.m. Pacific Time, but the event timestamps are in the publisher's network time zone. If, for example, the publisher is set to Eastern Time, they'd see events from 3 a.m. to 4 a.m. in the first hourly file. The three hours before that would actually be in the previous day's file. You might therefore have timestamps from a different date than is represented by the file name. Always refer to the timestamp on events in the file, not the time the file is posted or name of the file.
- Master/companion reporting in Data Transfer
Data Transfer files show both the master and companion creative impressions. IsCompanion will be “TRUE” for the Companion creative impression. The CreativeId field contains the individual creative IDs for the Master and Companion creatives and not the Creative Set ID. There isn’t an additional field in Data Transfer for Creative Set ID to associate companion impressions to master impressions.
- Discrepancy between Data Transfer files and DFP Query Tool/API-generated reporting
Bad traffic (spam data) is periodically removed from Query Tool and API-generated reports. Because of the publishing schedule of Data Transfer files, some of this cleanup may not be reflected in them. As a result, Data Transfer can show slightly more impressions, clicks, Active View impressions, and other events. When discrepancies occur, they tend to be about 1%.
Use Data Transfer report files
Once you've set up Data Transfer, files are kept in DFP cloud storage buckets. You can access them on the web, with a command line tool, or through an API. Learn more about how to access DFP cloud storage buckets
If you limit your data ingestion and analysis to a specific set of Data Transfer files based on the start hour in the file name, you might overlook data that's provided in a subsequent file because of daylight saving time, late data collection, or other similar scenarios. A better approach is to read all the Data Transfer files into a separate system (such as a data warehouse or query engine) and restrict your analysis based on the timestamp of the events.
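As a rough illustration of filtering by event timestamp rather than by file name, the Python sketch below reads rows from several files and keeps only those whose timestamp falls inside the requested window. The column name `Time`, the timestamp format, and the comma delimiter are assumptions made for this example, so check them against your own files. The in-memory strings stand in for real Data Transfer files.

```python
import csv
import io
from datetime import datetime

# Assumed timestamp layout; adjust to match the files you receive.
TIME_FORMAT = "%Y-%m-%d-%H:%M:%S"


def events_in_window(files, start, end, time_field="Time"):
    """Yield rows from every file whose event timestamp is in [start, end)."""
    for handle in files:
        for row in csv.DictReader(handle):
            timestamp = datetime.strptime(row[time_field], TIME_FORMAT)
            if start <= timestamp < end:
                yield row


# Stand-in files: the second one carries a late event from the previous hour.
file_08 = io.StringIO("Time,CreativeId\n2017-01-15-08:10:00,111\n")
file_09 = io.StringIO(
    "Time,CreativeId\n2017-01-15-08:55:00,222\n2017-01-15-09:05:00,333\n"
)

rows = list(
    events_in_window(
        [file_08, file_09],
        start=datetime(2017, 1, 15, 8),
        end=datetime(2017, 1, 15, 9),
    )
)
print([row["CreativeId"] for row in rows])  # ['111', '222']
```

Selecting only the first file by name would have missed creative 222, which arrived late in the following file.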
Data Transfer files come in a raw text format that you can convert using a spreadsheet editor. Here are some examples of typical ways you may choose to apply Data Transfer information:
- Aggregate user events and activities by dimension (for example, per creative)
- Calculate unique conversions across multiple days
- Match users against a customer database
- Report on user geographic and demographic information
You can use match tables to provide a name-to-ID lookup for values contained within Data Transfer files, allowing you to match ad serving information (such as ad unit or line item) to pre-assigned values stored in the database.
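A match table lookup can be sketched in Python as a dictionary keyed by ID. The column names (`Id`, `Name`, `LineItemId`), the output field `LineItemName`, and the sample values are hypothetical and only illustrate the idea. Real match tables and event files may use different columns.

```python
import csv
import io


def load_match_table(handle, id_field, name_field):
    """Build an ID-to-name dictionary from a delimited match table."""
    return {row[id_field]: row[name_field] for row in csv.DictReader(handle)}


def attach_names(events, lookup, id_field, name_field):
    """Return copies of the event rows with a resolved name added."""
    return [
        {**event, name_field: lookup.get(event[id_field], "(unknown)")}
        for event in events
    ]


# Stand-in data for a match table and a few event rows.
match_table = io.StringIO("Id,Name\n1001,Homepage leaderboard\n1002,Sidebar promo\n")
events = [{"LineItemId": "1001"}, {"LineItemId": "1002"}, {"LineItemId": "9999"}]

line_item_names = load_match_table(match_table, "Id", "Name")
named_events = attach_names(events, line_item_names, "LineItemId", "LineItemName")
print([event["LineItemName"] for event in named_events])
# ['Homepage leaderboard', 'Sidebar promo', '(unknown)']
```

IDs that are missing from the match table are kept and labeled `(unknown)` instead of being dropped, so event counts stay unchanged after the lookup.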
A good rule of thumb is that each event uses between 25 and 35 bytes in a compressed file. As such, 10 million impressions would require about 300 MB of disk space in a compressed file. Keep in mind that these are estimates, and your file size could be somewhat larger. Also, because this is the size of the data in compressed form, you need additional space to decompress and use the files.
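The arithmetic behind this estimate can be written as a short Python sketch. The function name `estimate_compressed_bytes` is hypothetical, and the 25 to 35 byte range is the rule of thumb stated above, not a guaranteed bound.

```python
def estimate_compressed_bytes(event_count, bytes_per_event=(25, 35)):
    """Return (low, high) compressed-size estimates in bytes."""
    low_rate, high_rate = bytes_per_event
    return event_count * low_rate, event_count * high_rate


low, high = estimate_compressed_bytes(10_000_000)
print(low / 1e6, high / 1e6)  # 250.0 350.0 (MB, midpoint about 300 MB)
```

Budget additional disk space on top of this range for the decompressed files.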
Data Transfer files older than 60 days are purged from DFP. If you would like to store your files for longer than the allotted 60 days, we recommend that you either store the files locally or move to a permanent cloud storage solution, which can include an independent Google Cloud Storage account over which you have full control.
Make large Data Transfer files easier to process
Google Code has released an open-source toolkit called CRUSH (Custom Reporting Utilities for Shell) for processing delimited-text data from the command line or in shell scripts. The CRUSH tools have been extensively developed and tested, and they work best on Linux or Unix operating systems. Support for CRUSH is available through the open-source community.
A non-open-source alternative is DMX, data integration software developed by Syncsort.