
sql - Aggregate text based on a condition

Reposted. Author: 行者123. Updated: 2023-11-29 12:57:24

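In a previous question I asked something similar, which relied on an auxiliary table as part of the criteria for splitting the data. My current goal seems simpler, but I can't figure it out.

Given the table: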

CREATE TABLE conversations (id int, record_id int, is_response bool, text text);
INSERT INTO conversations VALUES
(1, 1, false, 'in text 1')
, (2, 1, true , 'response text 1')
, (3, 1, false, 'in text 2')
, (4, 1, true , 'response text 2')
, (5, 1, true , 'response text 3')
, (6, 2, false, 'in text 1')
, (7, 2, true , 'response text 1')
, (8, 2, false, 'in text 2')
, (9, 2, true , 'response text 2')
, (10, 2, true , 'response text 3');

I want to aggregate the text based on the is_response value and output the following:

 record_id | aggregated_text
-----------+-------------------------------------------
         1 | in text 1 response text 1
         1 | in text 2 response text 2 response text 3
         2 | in text 1 response text 1
         2 | in text 2 response text 2 response text 3

I tried the following query, but it fails to aggregate two consecutive responses, i.e. when is_response is true for several rows in a row.

SELECT
    record_id,
    string_agg(text, ' ' ORDER BY id) AS aggregated_text
FROM (
    SELECT
        *,
        coalesce(sum(incl::integer) OVER (ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS grp
    FROM (
        SELECT *, is_response AS incl
        FROM conversations
    ) c
) c1
GROUP BY record_id, grp
HAVING bool_or(incl)
ORDER BY max(id);

My query just emits an extra row for each following is_response row, like this:

 record_id | aggregated_text
-----------+---------------------------
         1 | in text 1 response text 1
         1 | in text 2 response text 2
         1 | response text 3
         2 | in text 1 response text 1
         2 | in text 2 response text 2
         2 | response text 3

How can I fix this?

Best answer

This is a variation of the answer I gave to your previous question:

SELECT record_id, string_agg(text, ' ' ORDER BY id) AS aggregated_text  -- ORDER BY id keeps the texts in conversation order
FROM (
    SELECT *, coalesce(sum(incl::integer) OVER w, 0) AS subgrp
    FROM (
        -- incl marks the last response of each exchange
        SELECT *, is_response AND NOT coalesce(lead(is_response) OVER w, false) AS incl
        FROM conversations
        WINDOW w AS (PARTITION BY record_id ORDER BY id)
    ) t
    WINDOW w AS (PARTITION BY record_id ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
) t1
GROUP BY record_id, subgrp
HAVING bool_or(incl)
ORDER BY min(id);

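The idea is to look, for each row, at the next row of the same record with the lead window function. A row is selected (incl is true) when its own is_response is true and either there is no next row or the next row's is_response is false. Each selected row closes a group, and the text values of all rows not yet used by an earlier group are aggregated into it.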
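To make the grouping visible, the following sketch lists the intermediate incl and subgrp values instead of aggregating. It assumes PostgreSQL and, to stay self-contained, inlines only the rows of record 1 from the question as a CTE that shadows the conversations table.

```sql
-- Sketch only: rows of record_id = 1 from the question, inlined for illustration.
WITH conversations (id, record_id, is_response, text) AS (
    VALUES (1, 1, false, 'in text 1')
         , (2, 1, true , 'response text 1')
         , (3, 1, false, 'in text 2')
         , (4, 1, true , 'response text 2')
         , (5, 1, true , 'response text 3')
)
SELECT id, is_response, incl,
       -- number of group-closing rows strictly before the current row
       coalesce(sum(incl::integer) OVER (
           PARTITION BY record_id ORDER BY id
           ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS subgrp
FROM (
    SELECT *,
           -- true only for a response that is not followed by another response
           is_response AND NOT coalesce(lead(is_response) OVER (
               PARTITION BY record_id ORDER BY id), false) AS incl
    FROM conversations
) t
ORDER BY id;

--  id | is_response | incl | subgrp
-- ----+-------------+------+--------
--   1 | f           | f    |      0
--   2 | t           | t    |      0
--   3 | f           | f    |      1
--   4 | t           | f    |      1
--   5 | t           | t    |      1
```

Rows 1 and 2 share subgrp 0 and rows 3 to 5 share subgrp 1, which is why grouping by record_id and subgrp yields one output row per exchange.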

This query also ensures that if the last conversation is incomplete (which does not happen in your sample data), it is ignored.
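The incomplete case can be checked with a small sketch. It assumes PostgreSQL and uses hypothetical data that is not part of the question: one answered message followed by a message that never gets a response, again inlined as a CTE.

```sql
-- Sketch only: the third row is a hypothetical unanswered message.
WITH conversations (id, record_id, is_response, text) AS (
    VALUES (1, 1, false, 'in text 1')
         , (2, 1, true , 'response text 1')
         , (3, 1, false, 'in text 2')
)
SELECT record_id, string_agg(text, ' ' ORDER BY id) AS aggregated_text
FROM (
    SELECT *, coalesce(sum(incl::integer) OVER w, 0) AS subgrp
    FROM (
        SELECT *, is_response AND NOT coalesce(lead(is_response) OVER w, false) AS incl
        FROM conversations
        WINDOW w AS (PARTITION BY record_id ORDER BY id)
    ) t
    WINDOW w AS (PARTITION BY record_id ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
) t1
GROUP BY record_id, subgrp
HAVING bool_or(incl)   -- drops groups that contain no closing response
ORDER BY min(id);

--  record_id | aggregated_text
-- -----------+---------------------------
--          1 | in text 1 response text 1
```

The unanswered row lands in a subgroup of its own where incl is never true, so the HAVING clause filters that group out.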

For sql - Aggregate text based on a condition, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39706923/
