gpt4 book ai didi

postgresql - 计算组内的行数,但也来自全局结果集 : performance issue

转载 作者:行者123 更新时间:2023-11-29 12:42:28 25 4
gpt4 key购买 nike

我有一个包含日志记录的表。每条日志记录都由状态(打开关闭)和日期表示:

CREATE TABLE logs (
id BIGSERIAL PRIMARY KEY,
status VARCHAR NOT NULL,
inserted_at DATE NOT NULL
);

我需要获取包含以下信息的每日报告:

  1. 创建了多少条status = open 的日志记录,
  2. 创建了多少条status = closed 的日志记录,
  3. 包括这一天在内,至今有多少条status = open的日志记录。

这是一个示例报告输出:

    day     | created | closed | total_open
------------+---------+--------+------------
2017-01-01 | 2 | 0 | 2
2017-01-02 | 2 | 1 | 3
2017-01-03 | 1 | 1 | 3
2017-01-04 | 1 | 0 | 4
2017-01-05 | 1 | 0 | 5
2017-01-06 | 1 | 0 | 6
2017-01-07 | 1 | 0 | 7
2017-01-08 | 0 | 1 | 6
2017-01-09 | 0 | 0 | 6
2017-01-10 | 0 | 0 | 6
(10 rows)

我以一种非常“肮脏”的方式解决了这个问题:

INSERT INTO logs (status, inserted_at) VALUES
('created', '2017-01-01'),
('created', '2017-01-01'),
('closed', '2017-01-02'),
('created', '2017-01-02'),
('created', '2017-01-02'),
('created', '2017-01-03'),
('closed', '2017-01-03'),
('created', '2017-01-04'),
('created', '2017-01-05'),
('created', '2017-01-06'),
('created', '2017-01-07'),
('closed', '2017-01-08');

SELECT days.day,
count(case when logs.inserted_at = days.day AND logs.status = 'created' then 1 end) as created,
count(case when logs.inserted_at = days.day AND logs.status = 'closed' then 1 end) as closed,
count(case when logs.inserted_at <= days.day AND logs.status = 'created' then 1 end) -
count(case when logs.inserted_at <= days.day AND logs.status = 'closed' then 1 end) as total
FROM (SELECT day::date FROM generate_series('2017-01-01'::date, '2017-01-10'::date, '1 day'::interval) day) days,
logs
GROUP BY days.day
ORDER BY days.day;

另外(为简洁起见将其发布在 gist 上),并希望改进解决方案。

现在 explain 我的查询揭示了一些我想最小化的荒谬成本数字(我还没有索引)。

实现上述报告的高效查询会是什么样子?

最佳答案

一个可能的解决方案是使用 window functions :

select s.*, sum(created - closed) over (order by inserted_at)
from (select inserted_at,
count(status) filter (where status = 'created') created,
count(status) filter (where status = 'closed') closed
from (select d::date inserted_at
from generate_series('2017-01-01'::date, '2017-01-10'::date, '1 day'::interval) d) d
left join logs using (inserted_at)
group by inserted_at) s

http://rextester.com/GFRRP71067

此外,(inserted_at, status) 上的索引可以帮助您完成此查询。

注意:count(...) filter (where ...) 实际上只是一种编写count(case when ...然后... [else null] end).

关于postgresql - 计算组内的行数,但也来自全局结果集 : performance issue,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/43122152/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com