gpt4 book ai didi

sql - 查询滚动日期范围内不同值的计数

转载 作者:行者123 更新时间:2023-11-29 11:14:05 25 4
gpt4 key购买 nike

我有一组电子邮件地址和将这些电子邮件地址添加到表中的日期。对于各种不同的日期,可以有多个电子邮件地址条目。例如,如果我有下面的数据集。我想了解我们在上述日期和 3 天前之间收到的不同电子邮件的日期和数量。

Date   | email  
-------+----------------
1/1/12 | test@test.com
1/1/12 | test1@test.com
1/1/12 | test2@test.com
1/2/12 | test1@test.com
1/2/12 | test2@test.com
1/3/12 | test@test.com
1/4/12 | test@test.com
1/5/12 | test@test.com
1/5/12 | test@test.com
1/6/12 | test@test.com
1/6/12 | test@test.com
1/6/12 | test1@test.com

如果我们使用 3 的日期周期,结果集将看起来像这样

date   | count(distinct email)
-------+------
1/1/12 | 3
1/2/12 | 3
1/3/12 | 3
1/4/12 | 3
1/5/12 | 1
1/6/12 | 2

我可以使用下面的查询获得一个日期范围的不同计数,但希望按天获得一个范围的计数,这样我就不必手动更新数百个日期的范围。

select test.date, count(distinct test.email)  
from test_table as test
where test.date between '2012-01-01' and '2012-05-08'
group by test.date;

最佳答案

测试用例:

CREATE TABLE tbl (date date, email text);
INSERT INTO tbl VALUES
('2012-01-01', 'test@test.com')
, ('2012-01-01', 'test1@test.com')
, ('2012-01-01', 'test2@test.com')
, ('2012-01-02', 'test1@test.com')
, ('2012-01-02', 'test2@test.com')
, ('2012-01-03', 'test@test.com')
, ('2012-01-04', 'test@test.com')
, ('2012-01-05', 'test@test.com')
, ('2012-01-05', 'test@test.com')
, ('2012-01-06', 'test@test.com')
, ('2012-01-06', 'test@test.com')
, ('2012-01-06', 'test1@test.com`')
;

查询 - 仅返回 tbl 中存在条目的日期:

SELECT date
,(SELECT count(DISTINCT email)
FROM tbl
WHERE date BETWEEN t.date - 2 AND t.date -- period of 3 days
) AS dist_emails
FROM tbl t
WHERE date BETWEEN '2012-01-01' AND '2012-01-06'
GROUP BY 1
ORDER BY 1;

或者 - 返回指定范围内的所有天,即使当天没有行:

SELECT date
,(SELECT count(DISTINCT email)
FROM tbl
WHERE date BETWEEN g.date - 2 AND g.date
) AS dist_emails
FROM (SELECT generate_series(timestamp '2012-01-01'
, timestamp '2012-01-06'
, interval '1 day')::date) AS g(date);

db<> fiddle here

结果:

day        | dist_emails
-----------+------------
2012-01-01 | 3
2012-01-02 | 3
2012-01-03 | 3
2012-01-04 | 3
2012-01-05 | 1
2012-01-06 | 2

听起来像是window functions的工作起初,但我没有找到一种方法来定义合适的窗口框架。另外,per documentation :

Aggregate window functions, unlike normal aggregate functions, do notallow DISTINCT or ORDER BY to be used within the function argument list.

所以我用相关的子查询来解决它。我想这是最聪明的方法。

顺便说一句,“在所述日期和 3 天前之间”将是 4 天的时间段。你的定义在那里是矛盾的。

稍微短一些,但几天会慢一些:

SELECT g.date, count(DISTINCT email) AS dist_emails
FROM (SELECT generate_series(timestamp '2012-01-01'
, timestamp '2012-01-06'
, interval '1 day')::date) AS g(date)
LEFT JOIN tbl t ON t.date BETWEEN g.date - 2 AND g.date
GROUP BY 1
ORDER BY 1;

相关:

关于sql - 查询滚动日期范围内不同值的计数,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/10544182/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com