gpt4 book ai didi

sql - 使用计数不同的 postgres 中的慢查询

转载 作者:行者123 更新时间:2023-11-29 12:05:20 25 4
gpt4 key购买 nike

我的目标是创建一个查询,该查询将返回在 365 天窗口内购买过的唯一客户的计数。我在 postgres 中创建了下面的查询,结果查询非常慢。我的表格只有 812,024 行订单日期和客户 ID。当我删除 distinct 语句时,我可以让查询在大约 60 秒内返回结果,有了它,我还没有完成。我在 (order_date, id) 上创建了一个索引。我是 SQL 的完全新手,这真的是我第一次用它做任何事情,在整天试图找到解决这个问题的方法之后,我找不到任何可以开始工作的东西,即使我已经看到很多关于 distinct 的缓慢性能。

SELECT
(d1.Ordered) AS Ordered,
COUNT(distinct d2.ID) Users
FROM
(
SELECT order_date AS Ordered
FROM orders
GROUP BY order_date
) d1
INNER JOIN
(
SELECT order_date AS Ordered, id
FROM orders
) d2
ON d2.Ordered BETWEEN d1.Ordered - 364 AND d1.Ordered
GROUP BY d1.Ordered
ORDER BY d1.Ordered

"Sort (cost=3541596.30..3541596.80 rows=200 width=29)"
" Sort Key: orders_1.order_date"
" -> HashAggregate (cost=3541586.66..3541588.66 rows=200 width=29)"
" -> Nested Loop (cost=16121.73..3040838.52 rows=100149627 width=29)"
" -> HashAggregate (cost=16121.30..16132.40 rows=1110 width=4)"
" -> Seq Scan on orders orders_1 (cost=0.00..14091.24 rows=812024 width=4)"
" -> Index Only Scan using x on orders (cost=0.43..1822.70 rows=90225 width=29)"
" Index Cond: ((order_date >= (orders_1.order_date - 364)) AND (order_date <= orders_1.order_date))"

最佳答案

不需要自连接,使用generate_series

select
g.order_date as "Ordered",
count(distinct o.id) as "Users"
from
generate_series(
(select min(order_date) from orders),
(select max(order_date) from orders),
'1 day'
) g (order_date)
left join
orders o on o.order_date between g.order_date - 364 and g.order_date
group by 1
order by 1

关于sql - 使用计数不同的 postgres 中的慢查询,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/25026079/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com