sql - How to find the number of purchases over time intervals in SQL

Reposted. Author: 行者123. Updated: 2023-11-29 13:28:52

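I'm using Redshift (Postgres) and Pandas for my work. I'm trying to count user actions; let's say purchases, to make it easier to understand. I have a table, purchases, that contains the following data: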

user_id, timestamp, price
1, 2015-02-01, 200
1, 2015-02-02, 50
1, 2015-02-10, 75

Ultimately I want the number of purchases within certain time windows, for example:

userid, 28-14_days, 14-7_days, 7

This is what I have so far. I know I don't have an upper date limit yet:

SELECT DISTINCT x_days.user_id,
       SUM(x_days.purchases) AS x_num,
       SUM(y_days.purchases) AS y_num,
       x_days.x_date,
       y_days.y_date
FROM (
    SELECT purchases.user_id,
           COUNT(purchases.user_id) AS purchases,
           DATE(purchases.timestamp) AS x_date
    FROM purchases
    WHERE purchases.timestamp > (current_date - INTERVAL '%(x_days_ago)s day')
      AND purchases.max_value > 200
    GROUP BY DATE(purchases.timestamp), purchases.user_id
) AS x_days
JOIN (
    SELECT purchases.user_id,
           COUNT(purchases.user_id) AS purchases,
           DATE(purchases.timestamp) AS y_date
    FROM purchases
    WHERE purchases.timestamp > (current_date - INTERVAL '%(y_days_ago)s day')
      AND purchases.max_value > 200
    GROUP BY DATE(purchases.timestamp), purchases.user_id
) AS y_days
  ON x_days.user_id = y_days.user_id
GROUP BY x_days.user_id, x_days.x_date, y_days.y_date

params = {'x_days_ago': x_days_ago, 'y_days_ago': y_days_ago}
where these are set in Python/pandas:

x_days_ago = 14
y_days_ago = 7

But this doesn't work quite as planned:

   user_id  x_num  y_num      x_date      y_date
0  5451772      1      1  2015-02-10  2015-02-10
1  5026678      1      1  2015-02-09  2015-02-09
2  6337993      2      1  2015-02-14  2015-02-13
3  6204432      1      3  2015-02-10  2015-02-11
4  3417539      1      1  2015-02-11  2015-02-11

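Even though I don't have an upper date limit (so x effectively covers 14 days ago to now and y covers 7 days ago to now, which means they overlap), y is higher in some cases.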
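
One plausible reading of that output: because the outer query groups by x_date and y_date, each result row pairs a single day from the 14-day side with a single day from the 7-day side, so x_num and y_num are per-day counts rather than window totals. The sketch below reproduces the effect under several assumptions: it uses SQLite from Python's standard library rather than Redshift, a fixed reference date instead of current_date, a column named ts instead of timestamp, made-up rows, and it drops the max_value filter.

```python
import sqlite3

# In-memory SQLite database standing in for Redshift (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (user_id INTEGER, ts TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        (1, "2015-02-05", 200),  # 10 days before the reference date
        (1, "2015-02-12", 50),   # 3 days before the reference date
        (1, "2015-02-12", 75),
        (1, "2015-02-12", 20),
    ],
)

# Same shape as the query in the question: two per-day aggregates
# joined on user_id only, then grouped by user and both dates.
QUERY = """
SELECT x_days.user_id,
       SUM(x_days.purchases) AS x_num,
       SUM(y_days.purchases) AS y_num,
       x_days.x_date,
       y_days.y_date
FROM (SELECT user_id, COUNT(*) AS purchases, date(ts) AS x_date
      FROM purchases
      WHERE ts > date(:today, :x_offset)
      GROUP BY user_id, date(ts)) AS x_days
JOIN (SELECT user_id, COUNT(*) AS purchases, date(ts) AS y_date
      FROM purchases
      WHERE ts > date(:today, :y_offset)
      GROUP BY user_id, date(ts)) AS y_days
  ON x_days.user_id = y_days.user_id
GROUP BY x_days.user_id, x_days.x_date, y_days.y_date
ORDER BY x_days.x_date, y_days.y_date
"""

# Hypothetical reference date; the question uses current_date instead.
params = {"today": "2015-02-15", "x_offset": "-14 days", "y_offset": "-7 days"}
rows = conn.execute(QUERY, params).fetchall()

for row in rows:
    print(row)
```

With this data the first row pairs 2015-02-05 (one purchase) with 2015-02-12 (three purchases), so y_num exceeds x_num even though the 7-day window lies inside the 14-day window.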

Can anyone help me fix this, or suggest a better approach?

Thanks!

Best answer

This may not be the most efficient answer, but you can generate each sum with a subselect:

WITH
summed AS (
    SELECT user_id, day, COUNT(1) AS purchases
    FROM (SELECT user_id, DATE(timestamp) AS day FROM purchases) AS _
    GROUP BY user_id, day
),
users AS (SELECT DISTINCT user_id FROM purchases)
SELECT user_id,
       (SELECT SUM(purchases) FROM summed
        WHERE summed.user_id = users.user_id
          AND day >= DATE(NOW() - interval ' 7 days')) AS days_7,
       (SELECT SUM(purchases) FROM summed
        WHERE summed.user_id = users.user_id
          AND day >= DATE(NOW() - interval '14 days')) AS days_14
FROM users;

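(This was tested in Postgres, not in Redshift, but the Redshift documentation suggests that both WITH and DISTINCT are supported.) I would have liked to do this with a window, to get rolling sums, but that is a little cumbersome without generate_series().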
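
If the goal is the non-overlapping buckets from the question (28-14 days, 14-7 days, last 7 days), conditional aggregation in a single pass is another option. This is a different technique from the subselects above, not the answer's method. The sketch runs against SQLite from Python's standard library rather than Redshift, and assumes a fixed reference date instead of NOW(), a column named ts instead of timestamp, and a made-up second user; the result column names are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for Redshift (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (user_id INTEGER, ts TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        (1, "2015-02-01", 200),  # rows for user 1 come from the question
        (1, "2015-02-02", 50),
        (1, "2015-02-10", 75),
        (2, "2015-01-20", 30),   # user 2 is made up for illustration
        (2, "2015-02-14", 10),
    ],
)

# Each CASE counts a purchase in exactly one bucket, so the buckets
# do not overlap. Lower bounds are inclusive, upper bounds exclusive.
QUERY = """
SELECT user_id,
       SUM(CASE WHEN ts >= date(:today, '-28 days')
                 AND ts <  date(:today, '-14 days') THEN 1 ELSE 0 END) AS days_28_14,
       SUM(CASE WHEN ts >= date(:today, '-14 days')
                 AND ts <  date(:today, '-7 days')  THEN 1 ELSE 0 END) AS days_14_7,
       SUM(CASE WHEN ts >= date(:today, '-7 days')  THEN 1 ELSE 0 END) AS days_7
FROM purchases
WHERE ts >= date(:today, '-28 days')
GROUP BY user_id
ORDER BY user_id
"""

# Hypothetical reference date; a live query would use the current date.
rows = conn.execute(QUERY, {"today": "2015-02-15"}).fetchall()

for row in rows:
    print(row)
```

Unlike the query above, which lists every user through the users CTE, this version omits users with no purchases in the last 28 days, because the WHERE clause filters them out before grouping.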

Regarding sql - How to find the number of purchases over time intervals in SQL, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/28530908/
