gpt4 book ai didi

sql - PostgreSQL array_agg 非常慢

转载 作者:行者123 更新时间:2023-11-29 13:24:05 24 4
gpt4 key购买 nike

我有一个表 callscalls_statisticscalls 有一个主键calls_id,它是calls_statistics 中的一个外键。

调用当前包含 16k 个条目。

当我运行时

SELECT c.*,
array_agg(cs.mean) AS statistics_means
FROM calls AS c
LEFT JOIN calls_statistics AS cs ON c.calls_id = cs.calls_id
GROUP BY c.calls_id
order by caller_id ASC, call_time ASC LIMIT 100;

查询耗时约622毫秒

 Limit  (cost=11947.99..11948.24 rows=100 width=551) (actual time=518.921..518.941 rows=100 loops=1)
-> Sort (cost=11947.99..11989.07 rows=16429 width=551) (actual time=518.918..518.928 rows=100 loops=1)
Sort Key: c.caller_id, c.call_time
Sort Method: top-N heapsort Memory: 126kB
-> HashAggregate (cost=11114.73..11320.09 rows=16429 width=551) (actual time=461.869..494.761 rows=16429 loops=1)
-> Hash Right Join (cost=6234.65..10705.12 rows=81922 width=551) (actual time=79.171..257.498 rows=81922 loops=1)
Hash Cond: (cs.calls_id = c.calls_id)
-> Seq Scan on calls_statistics cs (cost=0.00..2627.22 rows=81922 width=12) (actual time=3.534..26.778 rows=81922 loops=1)
-> Hash (cost=6029.29..6029.29 rows=16429 width=547) (actual time=75.578..75.578 rows=16429 loops=1)
Buckets: 2048 Batches: 1 Memory Usage: 9370kB
-> Seq Scan on calls c (cost=0.00..6029.29 rows=16429 width=547) (actual time=13.806..42.446 rows=16429 loops=1)
Total runtime: 622.537 ms

但是,当我禁用 array_agg 并运行查询时,它会使用我的索引:

SELECT c.*,
cs.mean
FROM calls AS c
LEFT JOIN calls_statistics AS cs ON c.calls_id = cs.calls_id
order by caller_id ASC, call_time ASC LIMIT 100;

查询仅需 0.565 毫秒!

 Limit  (cost=0.70..52.93 rows=100 width=551) (actual time=0.077..0.320 rows=100 loops=1)
-> Nested Loop Left Join (cost=0.70..42784.95 rows=81922 width=551) (actual time=0.075..0.304 rows=100 loops=1)
-> Index Scan using calls_caller_id_call_time_calls_id_idx on calls c (cost=0.29..22395.06 rows=16429 width=547) (actual time=0.042..0.091 rows=25 loops=1)
-> Index Scan using calls_stats_calls_idx on calls_statistics cs (cost=0.42..1.18 rows=6 width=12) (actual time=0.003..0.005 rows=4 loops=25)
Index Cond: (c.calls_id = calls_id)
Total runtime: 0.565 ms

不会只是聚合成数组就花那么多时间吧?我做错了什么?

我正在使用 Postgres 9.3。

最佳答案

一个选项是从表 calls 中选择 100 行,然后加入并聚合 calls_statistics

类似于:

WITH top_calls as (SELECT c.*
FROM calls AS c
ORDER BY caller_id ASC, call_time ASC
LIMIT 100)
SELECT c.*,
array_agg(cs.mean) AS statistics_means
FROM top_calls AS c
LEFT JOIN calls_statistics AS cs ON c.calls_id = cs.calls_id
GROUP BY c.calls_id
order by caller_id ASC, call_time ASC;

它将为您提供与第一个查询完全相同的输出。

关于sql - PostgreSQL array_agg 非常慢,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/36889518/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com