Migrate Gmail logs in BigQuery to Workspace logs in BigQuery

This article is for Google Workspace administrators who use Gmail logs in BigQuery only, or who use both Gmail and Google Workspace logs in BigQuery. This article has detailed steps for migrating your Gmail BigQuery views, queries, and scripts to Workspace logs and reports in BigQuery.

To streamline BigQuery tasks, and to let you get all your BigQuery data in one place, we’re merging Gmail logs in BigQuery with Workspace logs in BigQuery. 

With this change, you get:

  • One place to store and manage BigQuery exports for all your Google Workspace services audit events
  • More complete data about Gmail events in BigQuery
  • Partitioned data, so you can export smaller data sets and reduce your BigQuery data storage costs
  • Data recovery, so you can correct misconfigurations and permissions changes

What you need to do

After Gmail logs in BigQuery is merged with Workspace logs in BigQuery, you won’t be able to use Gmail logs in BigQuery in your Google Admin console.

To keep your Gmail BigQuery data available, migrate your views, queries, and scripts to Google Workspace logs and reports in BigQuery:

Step Description More information
1 Set up service log exports to Workspace BigQuery

If you’ve already set up your Workspace logs and reports in BigQuery, skip this step and go to Step 2.

If you skip this step, be sure to:

  • Get your current project ID and dataset name that you’ll use for exporting Gmail data.
  • Understand how the dual export period can affect your BigQuery storage requirements.
2 (Optional) Backfill Gmail data to the new table This optional step is recommended for large organizations that want to migrate to Workspace BigQuery as quickly as possible and avoid a dual export period.
3 Migrate Gmail BigQuery logs views, queries & scripts Do this step only if you have custom views or saved queries for Gmail logs in BigQuery that you want to migrate to Workspace logs in BigQuery. If you don’t have any custom views or saved queries to migrate, skip this step and go directly to Step 4.
4 Turn off Gmail log exports to your Gmail-only BigQuery project We recommend you verify that your saved queries run as expected before completing this step.

Step 1: Set up service log exports to Workspace BigQuery

Do this step in your Google Admin console.

If you’ve already set up your Workspace logs and reports in BigQuery, skip this step and go to Step 2: (Optional) Backfill Gmail data to the new table. If you skip this step, be sure to:

  • Get your current project ID and dataset name that you’ll use for Gmail data exports.
  • Understand how the dual export period affects your BigQuery storage requirements.

Get detailed information about BigQuery and Reports API data in Set up service log exports to BigQuery.

  1. Sign in to your Google Admin console.

    Sign in using your administrator account (does not end in @gmail.com).

  2. On the left, click Reporting.
  3. Scroll down and click BigQuery Export.
  4. Point to BigQuery Export and click Edit.
  5. To turn on BigQuery logs, check the Enable Google Workspace data export to Google BigQuery box. 
  6. Under BigQuery project ID, select the project where you want to store Gmail logs. Choose a project with write access. If you don’t see the project, you must set it up in BigQuery. For details, go to Quickstart using the Google Cloud console.
  7. Under New dataset within project, enter a name for the new dataset used to store logs in the project. You can’t use an existing dataset, and dataset names must be unique for each project.
  8. Click Save.
    Note: If you can’t save the project, go to the Google Cloud console, delete the new dataset, then save it again in the Admin console.

The dataset is automatically created. Activity log events are typically available about 10 minutes after this setting is turned on. Return later to verify that the dataset and the activity table appear.

Temporary dual export period and storage requirements

After you set up Gmail logs export to Workspace BigQuery, Gmail logs data is exported to 2 destinations: 

  • Your previous, Gmail-only BigQuery project 
  • Your new or existing Workspace BigQuery project, which stores data for all Google Workspace applications

Gmail logs data will be exported to both projects until you complete Step 4: Turn off Gmail log exports to your Gmail-only BigQuery project. Temporarily exporting your data to both locations helps ensure that your data is transferred as expected, and that no data is lost. 

Dual export period and BigQuery storage

While your data is being exported to both projects, you’ll need more BigQuery storage. When you complete Step 4: Turn off Gmail log exports to your Gmail-only BigQuery project, your storage requirements should return to normal. We recommend that you verify that your saved queries run as expected before you do Step 4.

(Optional) Backfill Gmail data to shorten the dual export period

To migrate as quickly as possible, and avoid a dual export period, you can use a script to backfill Gmail data. Understand the impact of backfilling data and get detailed steps in Step 2: (Optional) Backfill Gmail data to the new table

Step 2: (Optional) Backfill Gmail data to the new table

 Do this step in your Google Cloud console.

This step is optional and recommended only for large organizations that want to migrate to Workspace BigQuery as quickly as possible, and avoid a dual export period. This option lets you use a script to migrate 60 days of data to the table you created in Step 1. The script backfills all non-expired Gmail data that was saved to your daily_ tables before turning on the Workspace export.

Important: If you plan to backfill Gmail data using the script, keep in mind:

  • The length of time it takes to complete this step depends on how much data you migrate. A large amount of data can result in a long migration time.
  • After data is migrated with the backfill script, you can't remove it from the table.

To backfill Gmail data:

  1. In the Google Cloud console, open an active BigQuery project. Open the project that has the saved queries you want to migrate. Go to the IAM & Admin page for your project. 
  2. Select the project dataset that you want to migrate.
  3. Run this script:
    1. Copy the script text:
      DECLARE time_workspace_export_enabled INT64;
      SET time_workspace_export_enabled = (
        SELECT MIN(gmail.event_info.timestamp_usec)
        FROM `NEW_PROJECT_ID.NEW_DATASET_NAME.activity`
      );
      INSERT INTO `NEW_PROJECT_ID.NEW_DATASET_NAME.activity`
      (
        _PARTITIONTIME,
        time_usec,
        event_type,
        event_name,
        record_type,
        gmail
      )
      SELECT
        TIMESTAMP_TRUNC(TIMESTAMP_MICROS(event_info.timestamp_usec), DAY, "UTC"),
        event_info.timestamp_usec,
        "delivery_type",
        "delivery",
        "gmail",
        STRUCT(
          event_info,
          STRUCT(
            message_info.action_type,
            message_info.rfc2822_message_id,
            message_info.subject,
            message_info.payload_size,
            message_info.source,
            message_info.destination,
            message_info.flattened_destinations,
            message_info.description,
            message_info.is_spam,
            message_info.is_policy_check_for_sender,
            message_info.num_message_attachments,
            message_info.attachment,
            STRUCT(
              message_info.connection_info.client_ip,
              message_info.connection_info.smtp_in_connect_ip,
              message_info.connection_info.smtp_out_connect_ip,
              message_info.connection_info.failed_smtp_out_connect_ip,
              message_info.connection_info.smtp_tls_state,
              message_info.connection_info.smtp_tls_version,
              message_info.connection_info.smtp_tls_cipher,
              message_info.connection_info.smtp_reply_code,
              message_info.connection_info.tls_required_but_unavailable,
              message_info.connection_info.smtp_out_remote_host,
              message_info.connection_info.smtp_user_agent_ip,
              message_info.connection_info.is_intra_domain,
              message_info.connection_info.dmarc_pass,
              message_info.connection_info.dmarc_published_domain,
              message_info.connection_info.client_host_zone,
              message_info.connection_info.smtp_response_reason,
              message_info.connection_info.ip_geo_city,
              message_info.connection_info.ip_geo_country,
              message_info.connection_info.authenticated_domain,
              message_info.connection_info.is_internal,
              message_info.connection_info.dkim_pass,
              message_info.connection_info.spf_pass
            ),
            message_info.message_set,
            message_info.smtp_relay_error,
            message_info.upload_error_category,
            message_info.triggered_rule_info,
            message_info.flattened_triggered_rule_info,
            message_info.structured_policy_log_info,
            message_info.smime_sign_message,
            message_info.smime_encrypt_message,
            message_info.smime_packaging_success,
            message_info.smime_extraction_success,
            message_info.smime_content_type,
            message_info.link_domain,
            message_info.spam_info
          )
        )
      FROM `OLD_PROJECT_ID.OLD_DATASET_NAME.daily_*`
      WHERE event_info.timestamp_usec < time_workspace_export_enabled;
      
    2. Go to the Editor field. If the Editor field isn't visible, click Compose new query "".
    3. Paste the script text into the Editor field.
    4. In the Editor field, make these edits:
    5. (Optional) By default, the script backfills up to 60 days of data. To backfill to a specific month, update the script by adding the month and year to the end of .daily_ For example, to backfill to July 2022, update this line in the script to: FROM `OLD_PROJECT_ID.OLD_DATASET_NAME.daily_202207*`
    6. In the upper left, click the Run button.

      If you get an error that says field names smp_tls_version or smtp_tls_cipher don't exist, replace the following 2 fields in the script with STRING(NULL):

      message_info.connection_info.smtp_tls_version 

      message_info.connection_info.smtp_tls_cipher 

Continue to Step 3: Migrate Gmail BigQuery logs views, queries & scripts.

Step 3: Migrate Gmail BigQuery logs views, queries & scripts

Do this step in your Google Cloud console.

Do this step only if you have saved queries or custom data set views that you want to migrate to Workspace logs in BigQuery:

  • To check if you have saved queries, follow these steps.
  • To check your custom data set views, follow these steps.

If you don’t have any custom views or saved queries to migrate, skip this step and go to Step 4: Turn off Gmail log exports to your Gmail-only BigQuery project.

To store Gmail data with the same structure as your current Gmail daily_ tables, create a view for the Gmail activity table. When you migrate your existing queries to Workspace BigQuery, you’ll modify the table name. You won’t modify the table format or the column names. For detailed information about BigQuery views, visit Introduction to views.

  1. In the Google Cloud console, open an active BigQuery project. Open the project that has the saved queries you want to migrate.  
  2. Select the project dataset that you want to migrate.
  3. To create a single view for the activity table, follow these steps to run the script:
    1. Copy the script text:
      CREATE VIEW
       `PROJECT_ID.DATASET_NAME.gmail_view`(event_info, message_info, date)
      AS
       SELECT
        gmail.event_info,
        gmail.message_info,
        _PARTITIONTIME
       FROM
        `PROJECT_ID.DATASET_NAME.activity`
       WHERE
        gmail IS NOT NULL;
      
    2. Go to the Editor field. If the Editor field isn't visible, then click Compose new query "".
    3. Paste the script text into the Editor field.
    4. In the pasted script, replace PROJECT_ID.DATASET_NAME with the project ID and dataset name from Step 1: Set up service log exports to Workspace BigQuery.
    5. In the upper left, click the Run button.
  4. Review saved queries that select from the daily_ tables. For existing queries, update the table names and add a WHERE condition as needed. Make one of these changes:
    • Change FROM `daily_YYYYMMDD` selections to FROM `gmail_view` WHERE date = “YYYY-MM-DD”. For example, change FROM `daily_20220815` to FROM `gmail_view` WHERE date = “2022-08-15”
    • Change FROM `daily_*` selections to FROM `gmail_view`

Step 4: Turn off Gmail log exports to your Gmail-only BigQuery project

Do this step in your Google Admin console.

Gmail logs data will be exported to two projects until you complete this step. Exporting data to both projects helps ensure that projects and queries are transferred as expected, and that no data is missing. We recommend you verify that your saved queries run as expected before doing this step. 

To immediately stop exporting Gmail logs to your Gmail-only BigQuery project:

  1. Sign in to your Google Admin console.

    Sign in using your administrator account (does not end in @gmail.com).

  2. From the Admin console home page, go to Appsand thenGoogle Workspaceand thenGmail​and thenSetupand thenEmail Logs in BigQuery.
  3. Point to the setting and click Edit "".
  4. Uncheck the Enable box.
  5. Click Save.

Related topics

Was this helpful?
How can we improve it?

Need more help?

Sign in for additional support options to quickly solve your issue

Search
Clear search
Close search
Google apps
Main menu
Search Help Center
true
73010
false
false