gpt4 book ai didi

postgresql - 如何删除后续匹配的行?

转载 作者:行者123 更新时间:2023-12-04 09:51:17 24 4
gpt4 key购买 nike

我有 jsonb 数据类型,其中每一行都有一个名称,last_updated 和其他键。我将如何创建一个查询,每天每个名称只留下 1 行?

即这个:

id | data
1 | {"name": "foo1", "last_updated": "2019-10-06T09:29:30.000Z"}
2 | {"name": "foo1", "last_updated": "2019-10-06T01:29:30.000Z"}
3 | {"name": "foo1", "last_updated": "2019-10-07T01:29:30.000Z"}
4 | {"name": "foo2", "last_updated": "2019-10-06T09:29:30.000Z"}
5 | {"name": "foo2", "last_updated": "2019-10-06T01:29:30.000Z"}
6 | {"name": "foo2", "last_updated": "2019-10-06T02:29:30.000Z"}

变成:
id | data
1 | {"name": "foo1", "last_updated": "2019-10-06T09:29:30.000Z"}
3 | {"name": "foo1", "last_updated": "2019-10-07T01:29:30.000Z"}
4 | {"name": "foo2", "last_updated": "2019-10-06T09:29:30.000Z"}

这个查询将在大约 900 万行上运行,大约有 300 个名字。

最佳答案

尝试这样的事情:


create table test (
id serial,
data jsonb
);

数据
insert into test (data) values
('{"name": "foo1", "last_updated": "2019-10-06T09:29:30.000Z"}'),
('{"name": "foo1", "last_updated": "2019-10-06T01:29:30.000Z"}'),
('{"name": "foo1", "last_updated": "2019-10-07T01:29:30.000Z"}'),
('{"name": "foo2", "last_updated": "2019-10-06T09:29:30.000Z"}'),
('{"name": "foo2", "last_updated": "2019-10-06T01:29:30.000Z"}'),
('{"name": "foo2", "last_updated": "2019-10-06T02:29:30.000Z"}');

查询
with latest as (
select data->>'name' as name, max(data->>'last_updated') as last_updated
from test
group by data->>'name'
)
delete from test t
where not exists (
select 1 from latest
where t.data->>'name' = name
and t.data->>'last_updated' = last_updated
);

select * from test;

示例

https://dbfiddle.uk/?rdbms=postgres_10&fiddle=2415e6f2c9c7980e69d178a331120dcd

您可能必须像 create index on test((data->>'name')); 一样索引您的 jsonb 列;你也可以为 last_updated 这样做。

我假设用户没有相同的 last_updated。

如果这个假设不成立,你可以试试这个:
with ranking as (
select
row_number() over (partition by data->>'name' order by data->>'last_updated' desc) as sr,
x.*
from test x
)
delete from test t
where not exists (
select 1 from ranking
where sr = 1
and id = t.id
);

在这种情况下,我们首先为用户的记录提供一个序列号。每个用户的 latest_updated 时间获得 sr 1。

然后,我们要求数据库删除所有与 sr 1 的 id 不匹配的记录。

示例: https://dbfiddle.uk/?rdbms=postgres_10&fiddle=dba1879a755ed0ec90580352f82554ee

关于postgresql - 如何删除后续匹配的行?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/62008510/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com