批量数据导出查询的最佳实践
始终使用聚合函数
表中的数据不一定会按日期、网址、网站或任意组合键合并。因此,一定要汇总字段(“总和”“计数”等),以确保行值会合并。例如,一个表可能有多行关于“fat cats”的查询:
query date search_type impressions ... fat cats 10-10-2022 WEB 1000 fat cats 10-10-2022 WEB 200 fat cats 10-10-2022 IMAGE 5000
如果仅仅搜索关于“fat cats”的查询,那么即时选择了特定的日期和搜索类型,也可能会生成多行结果。因此,您应按以下方式进行汇总,以查看展示次数之和:
/* 查看最热门查询的错误方式:每次查询只会生成 1 行 */ SELECT query, impressions FROM searchconsole.searchdata_site_impression ORDER BY 2 DESC LIMIT 1; Result: fat cats 5000 /* 查看最热门查询的正确方式 */ SELECT query, SUM(impressions) FROM searchconsole.searchdata_url_impression WHERE query != '' /* 滤除匿名化查询 */ GROUP BY query ORDER BY 2 DESC LIMIT 1; Result: fat cats 6200
按日期限制查询以节省处理费用
请注意,在 BigQuery 上运行查询时,您需要为此支付费用,并且表可能会非常庞大。导出的表会按日期分区,因而能够有效地将输入扫描限制为仅关注某些天;对于大型网站,这样能真正节省查询费用。若要尽量减少查询费用,不妨使用 WHERE 子句来限制日期分区表中的日期范围。
滤除 null 查询字符串
匿名化查询通常构成网站的唯一最常见查询。匿名化查询在表中会报告为长度为零的字符串。若要查看符合特定条件的最热门查询,您可能需要从 SQL 查询中滤除长度为零的查询值。
示例查询
以下是一些示例查询,可帮助您快速上手处理数据。
按网站显示结果
按天显示网站的所有统计信息(过去两周)
SELECT data_date AS date, sum(impressions) AS impressions, sum(clicks) as clicks, sum(clicks) / sum(impressions) AS ctr, /* 由于排名从零开始,所以添加了以下代码 */ ((sum(sum_top_position) / sum(impressions)) + 1.0) AS avg_position FROM searchconsole.searchdata_site_impression WHERE search_type = 'WEB' AND data_date between DATE_SUB(CURRENT_DATE(), INTERVAL 14 day) and CURRENT_DATE() GROUP BY data_date ORDER BY date desc LIMIT 1000
按点击次数显示美国热门移动网络查询(过去两周)
SELECT query, device, sum(impressions) AS impressions, sum(clicks) AS clicks, sum(clicks) / sum(impressions) AS ctr, ((sum(sum_top_position) / sum(impressions)) + 1.0) AS avg_position FROM searchconsole.searchdata_site_impression WHERE search_type = 'WEB' AND country = 'usa' AND device = 'MOBILE' AND data_date between DATE_SUB(CURRENT_DATE(), INTERVAL 14 day) and CURRENT_DATE() GROUP BY 1,2 ORDER BY clicks LIMIT 1000
按网址显示结果
按点击次数显示热门探索网址(特定日期)
SELECT url, sum(impressions), sum(clicks) FROM searchconsole.searchdata_url_impression WHERE search_type = 'DISCOVER' AND data_date = DATE(2022, 02, 01) GROUP BY url ORDER BY 3 LIMIT 1000
关于常见问题解答的富媒体搜索结果,query+url 的总统计信息(过去两周)
SELECT url, query, sum(impressions) AS impressions, sum(clicks) AS clicks, sum(clicks) / sum(impressions) AS ctr, ((sum(sum_position) / sum(impressions)) + 1.0) AS avg_position FROM searchconsole.searchdata_url_impression WHERE search_type = 'WEB' AND is_tpf_faq = true AND data_date between DATE_SUB(CURRENT_DATE(), INTERVAL 14 day) and CURRENT_DATE() GROUP BY 1,2 ORDER BY clicks LIMIT 1000
按网址显示所有流量(过去两周)
SELECT url, search_type, sum(impressions) as impressions, sum(clicks) as clicks FROM searchconsole.searchdata_url_impression WHERE data_date between DATE_SUB(CURRENT_DATE(), INTERVAL 14 day) and CURRENT_DATE() GROUP BY 1,2 ORDER BY 3 desc LIMIT 1000
包含“您的_品牌”的查询的展示次数(过去两周)
SELECT regexp_contains(query, 'your_brand') AS contains_brand, sum(impressions) AS impressions, sum(clicks) AS clicks FROM searchconsole.searchdata_url_impression WHERE search_type = 'WEB' AND data_date between DATE_SUB(CURRENT_DATE(), INTERVAL 14 day) and CURRENT_DATE() GROUP BY 1 ORDER BY 2 desc LIMIT 1000