gpt4 book ai didi

sql - 如何在 SQL 中为具有多个关系的相关地址创建新的标识符?

转载 作者:行者123 更新时间:2023-12-03 03:57:05 24 4
gpt4 key购买 nike

我有一个如下所示的数据集,其中包括地址和 customer_id。在此示例中,多个客户可以运送到同一地址,并且一个客户可以运送到多个地址。我想利用它们之间的关系来创建一个新 ID,将所有相关地址和 customer_id 与新标识符连接起来。

原始表格

+-------+-----+
|address| id |
+-------+-----+
| 11 rd | aa |
+-------+-----+
| 11 rd | ab |
+-------+-----+
| 21 dr | ac |
+-------+-----+
| 21 dr | ab |
+-------+-----+
| 31 rd | ad |
+-------+-----+
| 21 dr | abb |
+-------+-----+
| 41 dr | abb |
+-------+-----+

期望输出表

+-------+-----+--------+
|address| id | new_id |
+-------+-----+--------+
| 11 rd | aa | 1 |
+-------+-----+--------+
| 11 rd | ab | 1 |
+-------+-----+--------+
| 21 dr | ac | 1 |
+-------+-----+--------+
| 21 dr | ab | 1 |
+-------+-----+--------+
| 31 rd | ad | 2 |
+-------+-----+--------+
| 21 dr | abb | 1 |
+-------+-----+--------+
| 41 dr | abb | 1 |
+-------+-----+--------+

挑战在于相关关联的数量可能是无限的,并且我下面的 SQL 只能跨两个连接工作。

这是我的 SQL,它适用于这个小数据集,但当扩展到更多关系时,它将需要更多联接。任何有关构建此结构的正确方法的想法将不胜感激!

CREATE TABLE #temp (address char(40), id char(40))
INSERT INTO #temp VALUES ('11 rd', 'aa');
INSERT INTO #temp VALUES ('11 rd', 'ab');
INSERT INTO #temp VALUES ('21 dr', 'ac');
INSERT INTO #temp VALUES ('21 dr', 'ab');
INSERT INTO #temp VALUES ('31 rd', 'ad');
INSERT INTO #temp VALUES ('21 dr', 'abb');
INSERT INTO #temp VALUES ('41 dr', 'abb');

SELECT
*
,DENSE_RANK() OVER(PARTITION BY id ORDER BY address ASC)as address_rank
,DENSE_RANK() OVER(PARTITION BY address ORDER BY id ASC)as id_rank
INTO #temp2
FROM #temp

SELECT a.address,a.id_rank,a.id,b.address as combined_address
INTO #temp3
FROM #temp2 a
LEFT JOIN #temp2 b ON a.id=b.id AND b.address_rank = 1

SELECT
a.address
,a.id
,DENSE_RANK() OVER(ORDER BY b.combined_address ASC)as new_id
FROM #temp3 a
LEFT JOIN #temp3 b ON a.combined_address=b.address and b.id_rank = 1

最佳答案

这是一个图行走问题。我首先稍微重新表述一下数据。它有助于每一行都有一个唯一的标识符,适本地称为id。因此,我将 id 重命名为 name

然后,构造边并使用递归 CTE 遍历图。在 SQL Server 中,这看起来像:

with edges as (
select distinct t1.id as id1, t2.id as id2
from temp t1 join
temp t2
on (t1.address = t2.address or t1.name = t2.name) and (t1.id <> t2.id)
),
cte as (
select id, id as next_id, convert(varchar(max), concat(',', id, ',')) as visited, 1 as lev
from temp
union all
select cte.id, e.id2, concat(visited, e.id2, ','), lev + 1
from cte join
edges e
on cte.next_id = e.id1
where visited not like concat('%,', e.id2, ',%') and lev < 10
)
select id, min(next_id)
from cte
group by id;

Here是一个数据库<> fiddle 。

关于sql - 如何在 SQL 中为具有多个关系的相关地址创建新的标识符?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/64145880/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com