gpt4 book ai didi

sql - 使用 NOT IN 性能低下

转载 作者:行者123 更新时间:2023-11-29 11:39:04 26 4
gpt4 key购买 nike

我正在尝试进行查询,以计算每年有多少部 Actor 不是男性的电影(对于每一年,计算那一年没有男性的电影数量).

这些是表格:

ACTOR (id, fname, lname, gender)
MOVIE (id, name, year)
CASTS (pid, mid, role) -- pid refers to actor id, mid refers to movie id

这是我编入索引的内容(表的 id 是主键,所以它们已经编入索引,或者我假设如此):

CREATE INDEX gender_index on actor(gender);
CREATE INDEX movie_name_index on movie(name);
CREATE INDEX movie_year_index on movie(year);
CREATE INDEX casts_index on casts(pid, mid, role);
CREATE INDEX casts_pid_index on casts(pid);
CREATE INDEX casts_mid_index on casts(mid);
CREATE INDEX casts_role_index on casts(role);

这是我的查询:

SELECT m.year, count(m.id)
FROM movie as m
WHERE m.id NOT IN (
SELECT DISTINCT m.id
FROM movie as m, casts as c, actor as a
WHERE m.id = c.mid and a.id = c.pid and a.gender = 'M'
)
GROUP BY m.year
ORDER BY m.year

查询需要很长时间(并且永远不会完成),那么我怎样才能让它更快呢?使用 NOT EXISTS 有帮助吗,尽管我认为优化器会处理这个问题?我还需要索引其他东西吗?还有其他更好的查询吗?如果这有任何不同,我正在使用 PostgreSQL。

这是EXPLAIN:

"GroupAggregate  (cost=1512539.61..171886457832.52 rows=61 width=8)"
" Group Key: m.year"
" -> Index Scan using movie_year_index on movie m (cost=1512539.61..171886453988.38 rows=768706 width=8)"
" Filter: (NOT (SubPlan 1))"
" SubPlan 1"
" -> Materialize (cost=1512539.18..1732298.66 rows=1537411 width=4)"
" -> Unique (cost=1512539.18..1718605.60 rows=1537411 width=4)"
" -> Merge Join (cost=1512539.18..1700559.32 rows=7218511 width=4)"
" Merge Cond: (m_1.id = c.mid)"
" -> Index Only Scan using movie_pkey on movie m_1 (cost=0.43..57863.94 rows=1537411 width=4)"
" -> Materialize (cost=1512531.37..1548623.92 rows=7218511 width=4)"
" -> Sort (cost=1512531.37..1530577.65 rows=7218511 width=4)"
" Sort Key: c.mid"
" -> Hash Join (cost=54546.59..492838.95 rows=7218511 width=4)"
" Hash Cond: (c.pid = a.id)"
" -> Seq Scan on casts c (cost=0.00..186246.43 rows=11445843 width=8)"
" -> Hash (cost=35248.91..35248.91 rows=1176214 width=4)"
" -> Seq Scan on actor a (cost=0.00..35248.91 rows=1176214 width=4)"
" Filter: ((gender)::text = 'M'::text)"

最佳答案

我会努力

SELECT m.year, count(m.id)
FROM movie m
WHERE NOT EXISTS (
SELECT NULL
FROM casts c, actor a
WHERE m.id = c.mid and a.id = c.pid and a.gender = 'M'
)
GROUP BY m.year
ORDER BY m.year

关于sql - 使用 NOT IN 性能低下,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/42569560/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com