gpt4 book ai didi

mysql - 生成组合并确定出现次数最多的组合

转载 作者:行者123 更新时间:2023-11-29 07:18:29 25 4
gpt4 key购买 nike

我有一个包含 3 列的表格。我需要生成基于特定“诊断”执行的表的值(在“测试”列中)的组合(始终为 3)。但是,特定诊断可能有 2 个或更少的测试,在这种情况下,逻辑仍会输出该组合,尽管有 2 个值。引用下表,对于每个 cust_id,都有一个“诊断”列,“测试”基于该列执行。现在对于每个诊断值组,我需要在“测试”列中生成相应值的唯一组合。请注意,组合应始终具有 3 个值(其中值 >= 3),但对于少于 3 个值(1 或 2)的诊断,仍应输出相应的组合(具有可用的 1 或 2 个值并替换Null 代替不可用的值)。

患者:

pat_id | diagnosis | tests

1001 | Thyroid | CAT
1001 | Thyroid | MRI
1001 | Thyroid | Blood

1001 | Tonsil | CAT
1001 | Tonsil | MRI
1001 | Tonsil | Blood
1001 | Tonsil | RAPID

1002 | Pneumonia | MRI
1002 | Pneumonia | Eliza

1003 | Bronchitis | X-Ray

因此,对于 pat_id = '1001'diagnosis = 'Thyroid'我们看到“测试”有 3 个不同的值。因此,只有 1 个唯一组合是可能的,即 {CAT, MRI, Blood} .同样,对于 pat_id = '1001' and diagnosis = 'Tonsil' ,我们看到“测试”列中有 4 个不同的值。因此,将有 4 种组合,即 {CAT, MRI, Blood}, {CAT, MRI, RAPID}, {MRI, Blood, RAPID} & {CAT, blood, RAPID} .对于 pat_id = '1002'只有两个唯一值存在。因此组合将只有 1 即 {MRI, Eliza} .同样,pat_id = '1003'只有 1 个值,即 X-Ray,因此输出应为 {X-Ray}对于 '1003' .

像这样,我需要为一组中的所有诊断值生成相似的组合,最后确定在该表中出现次数最多的唯一组合。输出应该是表中出现次数最多的组合。

到目前为止,下面的 sql 返回所有具有 3 个或更多值的组合。但它无法输出小于 3 个值的值。意思是,1002 & 1003未输出,因为它们的值少于 3 个但需要输出。该解决方案还需要处理此类情况。

select p1.pat_id, p1.diagnosis, p1.tests, p2.tests, p3.tests
from patient p1 join
patient p2
on p1.pat_id = p2.pat_id and p1.diagnosis = p2.diagnosis and
p1.tests < p2.tests join
patient p3
on p2.pat_id = p3.pat_id and p2.diagnosis = p3.diagnosis and
p2.tests < p3.tests ;

另请说明我如何识别最常出现的组合。谢谢。

最佳答案

您可以将查询与左连接一起使用,以允许第二个和第三个测试为 NULL。但是随后您将需要删除具有两个或多个测试的组的 NULL 行。您可以使用相关(依赖)COUNT(*) 子查询来实现:

select
p1.pat_id,
p1.diagnosis,
p1.tests as test1,
p2.tests as test2,
p3.tests as test3
from patient p1
left join patient p2
on p2.diagnosis = p1.diagnosis
and p2.pat_id = p1.pat_id
and p2.tests > p1.tests
left join patient p3
on p3.diagnosis = p1.diagnosis
and p3.pat_id = p1.pat_id
and p3.tests > p2.tests
where
case (
select count(*)
from patient p
where p.diagnosis = p1.diagnosis
and p.pat_id = p1.pat_id
)
when 1 then true
when 2 then p2.tests is not null
else p3.tests is not null
end
order by p1.pat_id, p1.diagnosis

结果:

| pat_id | diagnosis  | test1 | test2 | test3 |
| ------ | ---------- | ----- | ----- | ----- |
| 1001 | Thyroid | Blood | CAT | MRI |
| 1001 | Tonsil | CAT | MRI | RAPID |
| 1001 | Tonsil | Blood | MRI | RAPID |
| 1001 | Tonsil | Blood | CAT | MRI |
| 1001 | Tonsil | Blood | CAT | RAPID |
| 1002 | Pneumonia | Eliza | MRI | |
| 1003 | Bronchitis | X-Ray | | |

View on DB Fiddle

要按出现次数对不同的组合进行排序,您只需将其修改为 GROUP BY ... ORDER BY COUNT(*) 查询:

select 
p1.tests as test1,
p2.tests as test2,
p3.tests as test3,
count(*) as cnt
from patient p1
left join patient p2
on p2.diagnosis = p1.diagnosis
and p2.pat_id = p1.pat_id
and p2.tests > p1.tests
left join patient p3
on p3.diagnosis = p1.diagnosis
and p3.pat_id = p1.pat_id
and p3.tests > p2.tests
where
case (
select count(*)
from patient p
where p.diagnosis = p1.diagnosis
and p.pat_id = p1.pat_id
)
when 1 then true
when 2 then p2.tests is not null
else p3.tests is not null
end
group by p1.tests, p2.tests, p3.tests
order by cnt desc

结果:

| test1 | test2 | test3 | cnt |
| ----- | ----- | ----- | --- |
| Blood | CAT | MRI | 2 |
| CAT | MRI | RAPID | 1 |
| Blood | MRI | RAPID | 1 |
| Eliza | MRI | | 1 |
| X-Ray | | | 1 |
| Blood | CAT | RAPID | 1 |

View on DB Fiddle

关于mysql - 生成组合并确定出现次数最多的组合,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/57855939/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com