gpt4 book ai didi

sql - 计算记录分组的所有现有组合

转载 作者:行者123 更新时间:2023-11-29 11:39:23 25 4
gpt4 key购买 nike

我有这些数据库表

  • 问题:id,文本
  • 答案:id、text、question_id
  • answer_tags:id、answer_id、tag_id
  • 标签:id、文本

  • 问题有很多答案
  • answer通过answer_tags有很多标签,属于question
  • tag通过answer_tags有很多答案
  • 一个答案有无限数量的标签

我想显示按计数排序的标签分组的所有组合

示例数据

Question 1, Answer 1, tag1, tag2, tag3, tag4
Question 2, Answer 2, tag2, tag3, tag4
Question 3, Answer 3, tag3, tag4
Question 4, Answer 4, tag4
Question 5, Answer 5, tag3, tag4, tag5
Question 1, Answer 6, <no tags>

如何使用 SQL 解决此问题?

我不确定这对 SQL 是否可行,但如果可行,我认为它需要 RECURSIVE 方法。

预期结果:

tag3, tag4 occur 4 times
tag2, tag3, tag4 occur 2 times
tag2, tag3 occur 2 times

我们只会返回分组数大于 1 的结果。不会返回任何单个标签,必须至少有 2 个标签在一起才能计算在内。

最佳答案

以@filiprem 的回答为基础,并使用来自答案 here 的稍微修改的函数你得到:

--test data
create table questions (id int, text varchar(100));
create table answers (id int, text varchar(100), question_id int);
create table answer_tags (id int, answer_id int, tag_id int);
create table tags (id int, text varchar(100));

insert into questions values (1, 'question1'), (2, 'question2'), (3, 'question3'), (4, 'question4'), (5, 'question5');
insert into answers values (1, 'answer1', 1), (2, 'answer2', 2), (3, 'answer3', 3), (4, 'answer4', 4), (5, 'answer5', 5), (6, 'answer6', 1);
insert into tags values (1, 'tag1'), (2, 'tag2'), (3, 'tag3'), (4, 'tag4'), (5, 'tag5');
insert into answer_tags values
(1,1,1), (2,1,2), (3,1,3), (4,1,4),
(5,2,2), (6,2,3), (7,2,4),
(8,3,3), (9,3,4),
(10,4,4),
(11,5,3), (12,5,4), (13,5,5);
--end test data

--function to get all possible combinations from an array with at least 2 elements
create or replace function get_combinations(source anyarray) returns setof anyarray as $$
with recursive combinations(combination, indices) as (
select source[i:i], array[i] from generate_subscripts(source, 1) i
union all
select c.combination || source[j], c.indices || j
from combinations c, generate_subscripts(source, 1) j
where j > all(c.indices) and
array_length(c.combination, 1) <= 2
)
select combination from combinations
where array_length(combination, 1) >= 2
$$ language sql;

--expected results
SELECT tags, count(*) FROM (
SELECT q.id, get_combinations(array_agg(DISTINCT t.text)) AS tags
FROM questions q
JOIN answers a ON a.question_id = q.id
JOIN answer_tags at ON at.answer_id = a.id
JOIN tags t ON t.id = at.tag_id
GROUP BY q.id
) t1
GROUP BY tags
HAVING count(*)>1;

注意:这给出了 tag2,tag4 出现了 2 次,这在预期结果中被遗漏了(来自问题 1 和 2)

关于sql - 计算记录分组的所有现有组合,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/55534740/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com