gpt4 book ai didi

mysql - SQL统计一对多关系中不匹配的记录

转载 作者:行者123 更新时间:2023-11-29 07:02:40 27 4
gpt4 key购买 nike

我有两个 MySQL 表:

调查(日期、地点、公里数)主键:日期+地点 (每个调查一条记录)

标本(日期、地点、物种)(每个调查日期和地点零个或多个记录)

我想查找样本表中不包含特定物种记录的调查计数和调查公里数总和。换句话说,未发现特定物种的调查数量。

调查总数为:

select count(date) as surveys, sum(kilometers) as KM_surveyed 
from surveys;

+---------+-------------+
| surveys | KM_surveyed |
+---------+-------------+
| 20141 | 40673.59 |
+---------+-------------+

查找未发现样本的调查数量很容易:

select count(s.date) as surveys, sum(s.kilometers) as KM_surveyed 
from surveys=s left join specimens=p
on (s.date=p.date and s.location=p.location)
where p.date is null;

+---------+-------------+
| surveys | KM_surveyed |
+---------+-------------+
| 8820 | 15848.26 |
+---------+-------------+

样本中的记录总数为:

select count(*) from specimens;

+-----------+
| count(*) |
+-----------+
| 51566 |
+-----------+

所有调查中发现的斑海豹 (HASE) 的正确数量是:

select count(*) from specimens where species = 'HASE';

+-----------+
| count(*) |
+-----------+
| 662 |
+-----------+

查找发现斑海豹 (HASE) 的调查数量并不容易。
由于样本表通常包含每个调查的多个记录,因此此查询返回的不是调查数量,而是找到的 HASE 样本数量:

select count(s.date), sum(s.kilometers) 
from surveys=s
left join specimens=p on (s.date=p.date and s.location=p.location)
where p.species = 'HASE';

+---------+-------------+
| surveys | KM_surveyed |
+---------+-------------+
| 662 | 2030.70 | WRONG! that is number of specimens not surveys
+---------+-------------+

查找未发现斑海豹 (HASE) 的调查数量也并非易事。此查询返回的不是调查数量,而是发现的不是斑海豹的样本数量:

select count(s.date), sum(s.kilometers) 
from surveys=s
left join specimens=p on (s.date=p.date and s.location=p.location)
where p.species <> 'HASE' or p.date is null;`

+---------+-------------+
| surveys | KM_surveyed |
+---------+-------------+
| 50904 | 151310.49 |
+---------+-------------+

错误! 50904 = 非 HASE 样本数量

如何构建查询来正确计算发现斑海豹的调查数量和未发现斑海豹的调查数量?

最佳答案

您可以在 WHERE 子句中使用 EXISTS/NOT EXISTS 子查询。

specimen表中找到HASE的调查:

select count(*), sum(s.kilometers)
from surveys s
where exists (
select *
from specimens p
where s.date=p.date
and s.location=p.location
and p.species = 'HASE'
)

specimen表中找不到HASE的调查:

select count(*), sum(s.kilometers)
from surveys s
where not exists (
select *
from specimens p
where s.date=p.date
and s.location=p.location
and p.species = 'HASE'
)

第一个查询的替代方案可以是:

select count(*), sum(s.kilometers)
from (
select distinct date, location
from specimens
where species = 'HASE'
) p
join surveys s using (date, location)

根据数据(如果“HASE”是稀有“物种”),它可能会更快。

Barmar 已经发布了第二个查询的可能最佳替代方案。

关于mysql - SQL统计一对多关系中不匹配的记录,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/42821004/

27 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com