sql - count(distinct) in joined tables returns duplicate/incorrect values

Reposted. Author: 搜寻专家. Updated: 2023-10-30 20:25:58

SQL:

SELECT COUNT(DISTINCT person.p_id) AS numberOfPeople, 
location.l_id AS location
FROM job
INNER JOIN person ON job.j_person = person.p_id
INNER JOIN (location INNER JOIN area ON location.l_area = area.a_id) ON job.j_location = location.l_id
GROUP BY area.a_name, location.l_name

Database: the "job" table is linked to "person" (on j_person = p_id) and to "location" (on j_location = l_id).

Table: person (list of all people in the company, PK = p_id)
+------+--------+--
| p_id | p_name | etc.
+------+--------+--
| 01 | John | ...
+------+--------+--
| 02 | Suzy | ...
+------+--------+--
| 03 | Mike | ...
+------+--------+--
| 04 | Kim | ...
+------+--------+--


Table: job (list of all jobs, PK = j_id)
+------+----------+------------+--------+
| j_id | j_person | j_location | j_type |
+------+----------+------------+--------+
| AB | 02 | cityB | type2 |
+------+----------+------------+--------+
| CD | 02 | cityA | type3 |
+------+----------+------------+--------+
| EF | 01 | cityC | type2 |
+------+----------+------------+--------+
| GH | 03 | cityB | type1 |
+------+----------+------------+--------+
| IJ | 04 | cityA | type1 |
+------+----------+------------+--------+
| KL | 04 | cityA | type2 |
+------+----------+------------+--------+


Table: location (list of all locations, PK = l_id)
+-------+----------+--------+
| l_id | l_name | l_area |
+-------+----------+----
| cityA | London | ...
+-------+----------+----
| cityB | New York | ...
+-------+----------+----
| cityC | Brussels | ...
+-------+----------+----

What I need:

A count of people per city. The following is the result of this SQL statement:

  • Area 1:
    • London: 2
    • New York: 2
  • Area 2:
    • Brussels: 1

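But... now for my problem

The result must not count any person more than once. For example: Suzy (p_id = 02) has jobs in both London and New York, but for the final numbers to be correct she may be counted in only 1 of those 2 cities.

I think I am looking for a solution that excludes anyone who has already been counted, so that they cannot be counted again in another/the next city. When the people per city are summed, the result must equal the total number of records in the table "person".

It is not a problem if, for example, Suzy is not included in, say, New York, because the locations/cities are part of a bigger area. A person always works within one area only.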
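
The mismatch described above can be reproduced with the sample data. This is a minimal sketch, assuming SQLite through Python's sqlite3 module instead of the asker's database, and leaving out the area table because its rows are not shown in the question. It groups by city only and compares the sum of the per-city counts with the number of rows in person.

```python
import sqlite3

# In-memory copy of the sample tables (the area table is omitted: its rows are not shown).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (p_id INTEGER PRIMARY KEY, p_name TEXT);
    CREATE TABLE job (j_id TEXT PRIMARY KEY, j_person INTEGER,
                      j_location TEXT, j_type TEXT);
    CREATE TABLE location (l_id TEXT PRIMARY KEY, l_name TEXT);
    INSERT INTO person VALUES (1,'John'),(2,'Suzy'),(3,'Mike'),(4,'Kim');
    INSERT INTO job VALUES
        ('AB',2,'cityB','type2'), ('CD',2,'cityA','type3'),
        ('EF',1,'cityC','type2'), ('GH',3,'cityB','type1'),
        ('IJ',4,'cityA','type1'), ('KL',4,'cityA','type2');
    INSERT INTO location VALUES
        ('cityA','London'),('cityB','New York'),('cityC','Brussels');
""")

# COUNT(DISTINCT ...) removes duplicates only within each group, not across groups.
per_city = dict(conn.execute("""
    SELECT l.l_name, COUNT(DISTINCT p.p_id)
    FROM job j
    INNER JOIN person p ON j.j_person = p.p_id
    INNER JOIN location l ON j.j_location = l.l_id
    GROUP BY l.l_name
""").fetchall())

total_people = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]

print(per_city)
print(sum(per_city.values()), "counted across cities vs", total_people, "people")
```

Kim's two London jobs collapse into one inside the London group, but Suzy still appears once in London and once in New York, so the per-city sum is 5 for 4 people.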


I am having some trouble explaining what I want to achieve, and I am not a native English speaker, so please let me know if anything is unclear.

Best answer

To do this, you first have to limit the number of jobs per person to 1 before grouping. Here is one way:

with person as (select 1 p_id, 'John' p_name from dual union all
                select 2 p_id, 'Suzy' p_name from dual union all
                select 3 p_id, 'Mike' p_name from dual union all
                select 4 p_id, 'Kim' p_name from dual),
     jobs as (select 'AB' j_id, 2 j_person, 'cityB' j_location, 'type2' j_type from dual union all
              select 'CD' j_id, 2 j_person, 'cityA' j_location, 'type3' j_type from dual union all
              select 'EF' j_id, 1 j_person, 'cityC' j_location, 'type2' j_type from dual union all
              select 'GH' j_id, 3 j_person, 'cityB' j_location, 'type1' j_type from dual union all
              select 'IJ' j_id, 4 j_person, 'cityA' j_location, 'type1' j_type from dual union all
              select 'KL' j_id, 4 j_person, 'cityA' j_location, 'type2' j_type from dual),
     location as (select 'cityA' l_id, 'London' l_name from dual union all
                  select 'cityB' l_id, 'New York' l_name from dual union all
                  select 'cityC' l_id, 'Brussels' l_name from dual)
-- end of setting up some subqueries to mimic your tables with data in them. See SQL below:
select location_name,
       count(distinct person_id) number_of_people
from   (select p.p_id person_id,
               p.p_name person_name,
               l.l_name location_name,
               j.j_type job_type,
               row_number() over (partition by p.p_id order by j.j_type, l.l_name) rn
        from   jobs j
               inner join person p on j.j_person = p.p_id
               inner join location l on j.j_location = l.l_id)
where  rn = 1
group by location_name;

LOCATION_NAME NUMBER_OF_PEOPLE
------------- ----------------
London                       1
Brussels                     1
New York                     2

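You can see that I have used the row_number() analytic function to assign a number to each p_id's rows, ordered by job type and location name. If the logic that decides which location should be the row_number = 1 row is different from this, you will need to change the order by clause accordingly.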
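
To illustrate how the order by clause decides which city a person lands in, here is a small sketch. It assumes SQLite 3.25 or later (for window functions) through Python's sqlite3 module rather than Oracle, and the helper first_location is a hypothetical name introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (p_id INTEGER PRIMARY KEY, p_name TEXT);
    CREATE TABLE job (j_id TEXT PRIMARY KEY, j_person INTEGER,
                      j_location TEXT, j_type TEXT);
    CREATE TABLE location (l_id TEXT PRIMARY KEY, l_name TEXT);
    INSERT INTO person VALUES (1,'John'),(2,'Suzy'),(3,'Mike'),(4,'Kim');
    INSERT INTO job VALUES
        ('AB',2,'cityB','type2'), ('CD',2,'cityA','type3'),
        ('EF',1,'cityC','type2'), ('GH',3,'cityB','type1'),
        ('IJ',4,'cityA','type1'), ('KL',4,'cityA','type2');
    INSERT INTO location VALUES
        ('cityA','London'),('cityB','New York'),('cityC','Brussels');
""")

def first_location(order_by):
    """Return {person name: city of that person's rn = 1 row} (hypothetical helper).

    order_by is pasted into the SQL text, so pass trusted literals only.
    """
    rows = conn.execute(f"""
        SELECT person_name, location_name
        FROM (SELECT p.p_name AS person_name,
                     l.l_name AS location_name,
                     ROW_NUMBER() OVER (PARTITION BY p.p_id
                                        ORDER BY {order_by}) AS rn
              FROM job j
              INNER JOIN person p ON j.j_person = p.p_id
              INNER JOIN location l ON j.j_location = l.l_id)
        WHERE rn = 1
    """).fetchall()
    return dict(rows)

by_type = first_location("j.j_type, l.l_name")  # ordering used in the answer
by_name = first_location("l.l_name")            # alternative: alphabetical city

print(by_type["Suzy"], "->", by_name["Suzy"])
```

With the answer's ordering Suzy's type2 job in New York sorts first; ordering by city name alone moves her to London. Everyone else stays where they were, since their jobs are all in a single city.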

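From there, it is just a matter of filtering the results to keep only the first row for each p_id, and then grouping them to get the distinct number of people.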
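
The filter-then-group step can be checked against the asker's requirement that the per-city counts add up to the number of people. This is a rough sketch, again assuming SQLite 3.25 or later through Python's sqlite3 module instead of Oracle, with real tables standing in for the with-clause subqueries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (p_id INTEGER PRIMARY KEY, p_name TEXT);
    CREATE TABLE job (j_id TEXT PRIMARY KEY, j_person INTEGER,
                      j_location TEXT, j_type TEXT);
    CREATE TABLE location (l_id TEXT PRIMARY KEY, l_name TEXT);
    INSERT INTO person VALUES (1,'John'),(2,'Suzy'),(3,'Mike'),(4,'Kim');
    INSERT INTO job VALUES
        ('AB',2,'cityB','type2'), ('CD',2,'cityA','type3'),
        ('EF',1,'cityC','type2'), ('GH',3,'cityB','type1'),
        ('IJ',4,'cityA','type1'), ('KL',4,'cityA','type2');
    INSERT INTO location VALUES
        ('cityA','London'),('cityB','New York'),('cityC','Brussels');
""")

# Keep one row per person (rn = 1), then count people per city.
counts = dict(conn.execute("""
    SELECT location_name, COUNT(DISTINCT person_id) AS number_of_people
    FROM (SELECT p.p_id AS person_id,
                 l.l_name AS location_name,
                 ROW_NUMBER() OVER (PARTITION BY p.p_id
                                    ORDER BY j.j_type, l.l_name) AS rn
          FROM job j
          INNER JOIN person p ON j.j_person = p.p_id
          INNER JOIN location l ON j.j_location = l.l_id)
    WHERE rn = 1
    GROUP BY location_name
""").fetchall())

total_people = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]

print(counts)
print(sum(counts.values()) == total_people)
```

The totals match here because every person in the sample data has at least one job. A person without any job would be dropped by the inner joins, so the sum would then fall short of the row count in person.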

Regarding sql - count(distinct) in joined tables returns duplicate/incorrect values, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/34023502/
