gpt4 book ai didi

google-analytics - GA BigQuery导出-COUNT(DISTINCT(fullVisitorId)),带有源/媒介超额计数

转载 作者:行者123 更新时间:2023-12-03 17:25:25 25 4
gpt4 key购买 nike

我在GA BigQuery导出中计算唯一身份用户时遇到问题。我已经使用示例数据重现了相同的错误。

SELECT sum(users) as users, sum(sessions) as sessions FROM (
SELECT
h.page.pagePath as page_path,
trafficSource.source,
trafficSource.medium,
COUNT(DISTINCT(fullVisitorId)) AS users,
COUNT(*) as sessions
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_20170101`, UNNEST(hits) h
WHERE h.page.pagePath = "/home"
GROUP BY page_path, source, medium
)
UNION ALL
SELECT sum(users) as users, sum(sessions) as sessions FROM (
SELECT
h.page.pagePath as page_path,
COUNT(DISTINCT(fullVisitorId)) AS users,
COUNT(*) as sessions
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_20170101`, UNNEST(hits) h
WHERE h.page.pagePath = "/home"
GROUP BY page_path
)


当我包括 sourcemedium列时,与众不同的 fullVisitorId计数比没有它们时高10。包括这些列如何导致 fullVisitorId的数量增加?这对我来说没有意义。

是什么原因造成的,我如何获得准确的计数?

最佳答案

包括这些列如何导致增加fullVisitorId的数量?这对我来说没有意义。


您可以查看为什么要像这样运行内部查询:

SELECT
MAX(fullVisitorId) AS fullVisitorId,
h.page.pagePath as page_path,
trafficSource.source,
trafficSource.medium,
COUNT(DISTINCT(TRIM(fullVisitorId))) AS users,
COUNT(*) as sessions
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_20170101`, UNNEST(hits) h
WHERE h.page.pagePath = "/home"
and fullVisitorId = '9902321252073939460'
GROUP BY page_path, source, medium


返回结果:

enter image description here

如您所见,由于某个用户来自2个不同的来源/媒介,因此您对同一用户进行了两次计数,这导致了增加。

解决此问题的一种方法是在源/介质上使用聚合函数,并从 GROUP BY删除它们,如下所示:


SELECT sum(users) as users, sum(sessions) as sessions FROM (
SELECT
h.page.pagePath as page_path,
MAX(trafficSource.source) as source,
MAX(trafficSource.medium) as medium,

COUNT(DISTINCT(TRIM(fullVisitorId))) AS users,
COUNT(*) as sessions
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_20170101`, UNNEST(hits) h
WHERE h.page.pagePath = "/home"
GROUP BY page_path
)
UNION ALL
SELECT sum(users) as users, sum(sessions) as sessions FROM (
SELECT
h.page.pagePath as page_path,
COUNT(DISTINCT(TRIM(fullVisitorId))) AS users,
COUNT(*) as sessions
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_20170101`, UNNEST(hits) h
WHERE h.page.pagePath = "/home"
GROUP BY page_path
)


现在用户数是相同的:

enter image description here

关于google-analytics - GA BigQuery导出-COUNT(DISTINCT(fullVisitorId)),带有源/媒介超额计数,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/55909044/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com