gpt4 book ai didi

tsql - 使用 "UNION ALL"和 "GROUP BY"实现 "Intersect"

转载 作者:行者123 更新时间:2023-12-01 06:45:46 27 4
gpt4 key购买 nike

我提供了以下查询来查找 2 个数据集中的公共(public)记录,但我很难确定查询的正确性,因为我的数据库中有很多数据记录。

是否可以使用 UNION ALL 在“Customers”和“Employees”表之间实现 Intersect 并在结果上应用 GROUP BY,例如下面?

SELECT D.Country, D.Region, D.City
FROM (SELECT DISTINCT Country, Region, City
FROM Customers
UNION ALL
SELECT DISTINCT Country, Region, City
FROM Employees) AS D
GROUP BY D.Country, D.Region, D.City
HAVING COUNT(*) = 2;

那么我们是否可以说存在于此查询结果中的任何记录也存在在“Customers & Employees”表之间的 Intersect 集中 ANDIntersect 中存在于“Customers & Employees”表之间的任何记录也会在此查询的结果中吗?

最佳答案

So is it right to say any record in result of this query is in "Intersect" set between "Customers & Employees" "AND" any record that exist in "Intersect" set between "Customers & Employees" is in result of this query too?

是的。

... 是的,但效率不高,因为您要过滤掉重复项 3 次而不是一次。在您的查询中,您是

  1. 使用 DISTINCT 从员工中提取唯一记录
  2. 使用 DISTINCT 从客户那里提取唯一记录
  3. 使用 UNION ALL 组合两个查询
  4. 在外部查询中使用 GROUP BY 来过滤您在步骤 1、2 和 3 中检索到的记录。

使用 INTERSECT 将返回相同的结果,但效率更高。要亲自查看,您可以创建下面的示例数据并运行两个查询:

use tempdb
go
if object_id('dbo.customers') is not null drop table dbo.customers;
if object_id('dbo.employees') is not null drop table dbo.employees;

create table dbo.customers
(
customerId int identity,
country varchar(50),
region varchar(50),
city varchar(100)
);

create table dbo.employees
(
employeeId int identity,
country varchar(50),
region varchar(50),
city varchar(100)
);

insert dbo.customers(country, region, city)
values ('us', 'N/E', 'New York'), ('us', 'N/W', 'Seattle'),('us', 'Midwest', 'Chicago');
insert dbo.employees
values ('us', 'S/E', 'Miami'), ('us', 'N/W', 'Portland'),('us', 'Midwest', 'Chicago');

运行这些查询:

SELECT D.Country, D.Region, D.City
FROM
(
SELECT DISTINCT Country, Region, City
FROM Customers
UNION ALL
SELECT DISTINCT Country, Region, City
FROM Employees
) AS D
GROUP BY D.Country, D.Region, D.City
HAVING COUNT(*) = 2;

SELECT Country, Region, City
FROM dbo.customers
INTERSECT
SELECT Country, Region, City
FROM dbo.employees;

结果:

Country     Region     City
----------- ---------- ----------
us Midwest Chicago

Country Region City
----------- ---------- ----------
us Midwest Chicago

如果不能使用 INTERSECT 或者您想要更快的查询,您可以通过几种不同的方式改进您发布的查询,例如:

选项 1:让 GROUP BY 像这样处理所有重复数据删除:

这与您发布的内容相同,但没有区别

SELECT D.Country, D.Region, D.City
FROM
(
SELECT Country, Region, City
FROM Customers
UNION ALL
SELECT Country, Region, City
FROM Employees
) AS D
GROUP BY D.Country, D.Region, D.City
HAVING COUNT(*) = 2;

选项 2:使用 ROW_NUMBER

这是我的偏好并且可能是最有效的

SELECT Country, Region, City
FROM
(
SELECT
rn = row_number() over (partition by D.Country, D.Region, D.City order by (SELECT null)),
D.Country, D.Region, D.City
FROM
(
SELECT Country, Region, City
FROM Customers
UNION ALL
SELECT Country, Region, City
FROM Employees
) AS D
) uniquify
WHERE rn = 2;

关于tsql - 使用 "UNION ALL"和 "GROUP BY"实现 "Intersect",我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/46771656/

27 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com