Enable Gmail logs in BigQuery

This feature is only available with G Suite Enterprise and G Suite for Education.

As a G Suite administrator, you can search your Gmail logs to analyze and report on your organization's incoming and outgoing emails. Gmail log search within a BigQuery project dataset lets you:

  • Analyze Gmail logs using sophisticated, high-performance, custom queries
  • Retain Gmail data as long as necessary with a user-configurable retention period
  • Utilize the BigQuery web UI, command line, or third-party tools to do deep analysis
  • Create custom reports and dashboards using analytics tools, such as Google Data Studio

Before you begin

  1. Create or identify an active BigQuery project to use. Follow these instructions to create a project if you don’t have one. You can set up a free trial to try it out.
  2. Locate the ID of the project you want to use for Gmail logs in https://console.cloud.google.com/    
  3. Go to the IAM & Admin page of your project. 
  4. Add the ID of the admin who will act as the project editor to set up the project in the Admin console. (You can also add the administrator as the project owner.)

Assign Gmail logs to a BigQuery dataset

To analyze the flow of email through Gmail's delivery pipeline, you assign your Gmail logs to a dataset in a BigQuery project. After the Gmail logs are assigned, the data becomes available for analysis and reporting.   

  1. Sign in to your Google Admin console.

    Sign in using your administrator account (does not end in @gmail.com).

  2. From the Admin console Home page, go to Appsand thenG Suiteand thenGmailand thenAdvanced settings.

    Tip: To see Advanced settings, scroll to the bottom of the Gmail page.

  3. Under General Settings and then Setup, go to Email Logs in BigQuery and click Configure.
  4. In the Add setting window, enter a description.
  5. Select the BigQuery project you want to use for Gmail logs. (Select a project with write access.)  
  6. Enter a dataset name where Gmail logs will be stored. 
  7. Click Add setting to return to the settings page, and then click Save

    Note: If an error occurs, try clicking Add setting again. You might need to go to the BigQuery console and delete the previously created dataset. 
     
  8. After saving settings, go back to your BigQuery project.  A dataset containing the following has been added to the project: 
    • The normal roles for project owners, project editors, and project viewers.
    • Four required service accounts, designated as dataset editors (not the Cloud project IAM). Do not remove these service accounts or change their roles.

      gmail-for-work-logs-writer1@system.gserviceaccount.com. Writes the logs. 
      gmail-for-work-logs-writer2@system.gserviceaccount.com. Writes the logs.
      gmail-for-work-logs-recoverer@system.gserviceaccount.com. Automatically restores the template table if it's accidentally deleted. 
      gmail-for-work-logs-schema-updater@system.gserviceaccount.com. Updates the schema in the future.
  9. To verify that the service accounts are correctly added, position the pointer over the new dataset and, next to the dataset name, click Down, and then click Share dataset.

A new table named “daily_” is added to the dataset, which serves as a template to provide a table schema for actual daily tables. It’s left empty and never expires. Don’t remove, modify, rename, or add data in this table.

It can take up to an hour for your changes to take effect. After that, daily tables are automatically created in your dataset. Use GMT time to determine the date change line. The logs are now available for use.

Gmail log contents

Gmail logs store records of different stages an email went through in Gmail’s email delivery pipeline. There are typically  multiple records for a single email message, which is uniquely identified by message ID. Before reading the complete schema of a Gmail log table to compose your custom query, try some example queries for the most common use cases.

BigQuery syntax and functions

BigQuery supports 2 SQL dialects for queries—standard SQL and legacy SQL. By default, BigQuery runs queries using legacy SQL. For more information about query syntax and functions, see the BigQuery Query Reference (legacy SQL) and BigQuery Query Reference (standard SQL).

 


 

Was this article helpful?
How can we improve it?