gpt4 book ai didi

PostgreSQL 子查询案例顺序扫描

转载 作者:行者123 更新时间:2023-11-29 13:08:24 25 4
gpt4 key购买 nike

我有 2 个表:

  1. requests:内容6000万条记录(用作网站日志)

  2. requests_hours:内容几百行(从 requests 表中每分钟更新一次)

我有以下简单的查询,但是当我执行它时,大约需要 5 分钟才能完成,因为 Postgres 不使用 request_time_utc 列的索引,而只是进行顺序扫描。

SELECT COUNT(request_id)
FROM requests
WHERE request_time_utc >= (SELECT MAX(request_hour_utc) FROM requests_hours)

enter image description here

但是如果我只是删除子查询(它本身在 0.003 秒内执行)并将其替换为如下所示的静态值,我将在仅 0.008 秒内执行以下查询:

SELECT COUNT(request_id)
FROM requests
WHERE request_time_utc >= '2019-09-30 17:00:00'

enter image description here

查询应该每分钟只计算几行,从 1000 到 7000,所以对 request_time_utc 列的索引扫描肯定比顺序扫描好得多。

我不明白如何强制 PostgreSQL 对第一个查询进行索引扫描。

以上查询是为了简化问题;这是原始的:

SELECT 
customer_id,
DATE_TRUNC('hour', request_time_utc) AS request_hour_utc,
COUNT(request_id) AS total_requests,
SUM(data_in_size) AS total_data_in_size,
SUM(data_out_size) AS total_data_out_size,
SUM(process_long) AS total_process_long
FROM requests
WHERE request_time_utc >= (SELECT MAX(request_hour_utc) FROM requests_hours)
AND customer_id IS NOT NULL
GROUP BY request_hour_utc , customer_id
ORDER BY request_hour_utc DESC;

最佳答案

将您的子查询移动到 CTE,就像这样(我是在火车上用手机写的,所以您需要找到正确的查询:-)):

WITH your_max AS (SELECT MAX(request_hour_utc) as foo FROM requests_hours)
SELECT COUNT(request_id)
FROM requests CROSS JOIN your_max
WHERE request_time_utc >= your_max.foo

关于PostgreSQL 子查询案例顺序扫描,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/58231385/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com