gpt4 book ai didi

sql - 从 PostgreSQL 上的 UNION ALL(不使用 UNION)中删除重复项的最快方法?

转载 作者:行者123 更新时间:2023-12-04 03:28:22 25 4
gpt4 key购买 nike

我有一个包含数亿行的表,我想在其中获取来自 2 的唯一值的单个列表。索引 栏目 同表 (没有唯一的行 ID)。
为了说明这一点,假设我们有一张带有 fruits 的表。列和 veggies专栏,我想建一个 healthy_foods列出两列中的唯一值。
我尝试了以下查询:
与联合

WITH cte as (
SELECT fruit, veggie
FROM recipes
)
SELECT fruit as healthy_food
FROM cte
UNION -- <---
SELECT veggie as healthy_food
FROM cte;
与 UNION ALL 然后 DISTINCT ON
WITH cte as (...)
SELECT DISTINCT ON (healthy_food) healthy_food FROM -- <---
(SELECT fruit as healthy_food
FROM cte
UNION ALL -- <---
SELECT veggie as healthy_food
FROM cte) tb;
使用 UNION ALL 然后 GROUP BY
WITH cte as (...)
SELECT fruit as healthy_food
FROM cte
UNION ALL -- <---
SELECT veggie as healthy_food
FROM cte
GROUP BY healthy_food; -- <---
(并在来自 UNION 的每个 SELECT 上添加 HAVING COUNT(*) = 1GROUP BY)
UNION ALL 的执行速度非常快,但我尝试过的所有重复删除组合都需要 +15 分钟。
考虑到 2 个字段/列来自同一个表并已编入索引,我该如何优化此查询?
(或者,跟踪所有唯一值的最便宜的方法是什么?可能是在 UNIQUE 表或 View 上插入触发器?)

最佳答案

如果水果和/或蔬菜之间有很多重复项,但水果和蔬菜之间没有那么多重复项(如您示例中的名称所示),并且由于您有 索引对于他们两个,模拟索引跳过扫描(又名松散索引扫描)会产生奇迹:

WITH RECURSIVE fruit AS (
(
SELECT fruit
FROM recipes
ORDER BY 1
LIMIT 1
)
UNION ALL
SELECT (SELECT fruit
FROM recipes
WHERE fruit > t.fruit
ORDER BY 1
LIMIT 1)
FROM fruit t
WHERE t.fruit IS NOT NULL
)
, veggie AS (
(
SELECT veggie
FROM recipes
ORDER BY 1
LIMIT 1
)
UNION ALL
SELECT (SELECT veggie
FROM recipes
WHERE veggie > t.veggie
ORDER BY 1
LIMIT 1)
FROM veggie t
WHERE t.veggie IS NOT NULL
)
SELECT DISTINCT healthy_food
FROM (
SELECT fruit AS healthy_food FROM fruit
UNION ALL
SELECT veggie AS healthy_food FROM veggie
) sub
WHERE healthy_food IS NOT NULL;
刚刚 DISTINCT而不是 DISTINCT ON (就像您尝试过的)在外部 SELECT ,因为我们正在处理单个列。
看:
  • SELECT DISTINCT is slower than expected on my table in PostgreSQL
  • Optimize GROUP BY query to retrieve latest row per user

  • 您不妨使用 UNION而不是 UNION ALL + DISTINCT在外 SELECT .只是避免了,因为你明确要求它。但我看不出重点。

    关于sql - 从 PostgreSQL 上的 UNION ALL(不使用 UNION)中删除重复项的最快方法?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/67267659/

    25 4 0
    Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
    广告合作:1813099741@qq.com 6ren.com