gpt4 book ai didi

tsql - T-SQL-连接多个表导致重复的行

转载 作者:行者123 更新时间:2023-12-04 13:36:31 25 4
gpt4 key购买 nike

我正在努力从下面的示例Fiddle中获得以下结果。

结果:

enter image description here

源表:

enter image description here

Fiddle

这个想法是通过查找预算和预测ID的最大数量来创建每个列,从而为每个RecordID创建一行,以创建其他列(如果该预算或预测ID的值不存在,则将列单元格留空)。

我尝试了PIVOT函数,但甚至无法获得令人满意的结果。

UPDATE :
请参见下图,我试图解释预期的输出:

用词:属于记录ID的每个预算ID的分别为 BDate 结果(百分比*状态记录的BAmount表)和创建单独的列。

在我的示例中,RecordID 55 在“预算”表中有两个条目-因此,需要 2x3 列才能在此RecordID的单行的自己列中显示每个日期,结果和状态。

由于RecordID 77 在“预算”表中具有大多数(三个)条目,因此可用于在所有行上创建3x3列。

预测也是如此。

output

我希望你能帮助我。

谢谢你。

最佳答案

再会,

Note! My goal is to help the reader to learn and not to provide final query as a solution for his homework. Therefore I will present the solution in two steps, and I will add several "PRINT" command in the dynamic solution, so the reader will have option to check the intermediate step in the work.



OP提供的DDL + DML:
DROP TABLE IF EXISTS Budget;
CREATE TABLE Budget
(BudgetID int, RecordID int, BDate date,Percentage int, [Status] varchar(50));
INSERT INTO Budget
(BudgetID, RecordID, BDate,Percentage,Status)
VALUES
(1, 55, '2017-01-01', 60, 'ordered'),
(2, 55, '2017-03-24', 40, 'ordered'),
(3, 66, '2018-08-15', 100, 'invoiced'),
(4, 77, '2018-12-02', 25, 'paid'),
(5, 77, '2018-09-10', 35, 'ordered'),
(6, 77, '2019-07-13', 40, 'ordered')
GO

DROP TABLE IF EXISTS Forecast;
CREATE TABLE Forecast
(ForecastID int, RecordID int, FDate date, Percentage int);
INSERT INTO Forecast
(ForecastID, RecordID, FDate,Percentage)
VALUES
(1, 55, '2020-12-01', 100),
(2, 77, '2023-05-17', 25),
(3, 77, '2024-11-28', 75)
GO

DROP TABLE IF EXISTS Records;
CREATE TABLE Records
(RecordID int, BAmount int, FAmount int, Name varchar(40), Description varchar(40) )
;
INSERT INTO Records
(RecordID, BAmount,FAmount,Name, Description)
VALUES
(55, 15000, 33000, 'Prod1', 'Desc1' ),
(66, 22000, 17500, 'Prod2', 'Desc2' ),
(77, 40000, 44000, 'Prod3', 'Desc3' )
GO

select * from Budget
select * from Forecast
select * from Records

让我们首先展示一个简单的静态解决方案

这是基于以下知识:预算表中的每个RecordID最多具有三行,而预测表中的每个RecordID最多具有两行。这将有助于理解我接下来将展示的动态解决方案
;With CteBudget as (
select
b.BDate, b.BudgetID, b.Percentage, b.RecordID, b.Status
,RN = ROW_NUMBER() OVER (partition by b.RecordID order by b.BudgetID)
from Budget b
),
CteForecast as (
select
f.FDate, f.ForecastID, f.Percentage, f.RecordID
,RN = ROW_NUMBER() OVER (partition by f.RecordID order by f.ForecastID)
from Forecast f
)
select
r.RecordID, r.Name, r.Description,

b1.BDate BDate1, (b1.Percentage * r.BAmount)/100 BResult1, b1.Status BStatus1,
b2.BDate BDate2, (b2.Percentage * r.BAmount)/100 BResult2, b2.Status BStatus2,
b3.BDate BDate3, (b3.Percentage * r.BAmount)/100 BResult3, b3.Status BStatus3,

f1.FDate FDate1, (f1.Percentage * r.BAmount)/100 FResult1,
f2.FDate FDate2, (f2.Percentage * r.BAmount)/100 FResult2

from Records r
left join CteBudget b1 on r.RecordID = b1.RecordID and b1.RN = 1
left join CteBudget b2 on r.RecordID = b2.RecordID and b2.RN = 2
left join CteBudget b3 on r.RecordID = b3.RecordID and b3.RN = 3
left join CteForecast f1 on r.RecordID = f1.RecordID and f1.RN = 1
left join CteForecast f2 on r.RecordID = f2.RecordID and f2.RN = 2
--where r.RecordID = 77
GO

Note! For static solution and without indexes (as I will add in the end), the above solution is VERY bad regarding performance, but once we add the right index and as a base for a dynamic solution this option should fit well.



现在我们可以提出动态解决方案。
-- Get number of columns
Declare @NumBudget tinyint
Declare @NumForecast tinyint
SELECT @NumBudget = MAX(C) FROM (
select COUNT(RecordID) C
from Budget
GROUP BY RecordID
) t
SELECT @NumForecast = MAX(C) FROM (
select COUNT(RecordID) C
from Forecast
GROUP BY RecordID
) t
---------------------------------------------
DECLARE @SQLString1 nvarchar(MAX) = '';
DECLARE @SQLString2 nvarchar(MAX) = '';
DECLARE @loop int = 1;
WHILE @loop <= @NumBudget BEGIN
SET @SQLString1 = @SQLString1 + N'
b' + CONVERT(VARCHAR(2),@loop) + '.BDate BDate' + CONVERT(VARCHAR(2),@loop) + ', (b' + CONVERT(VARCHAR(2),@loop) + '.Percentage * r.BAmount)/100 BResult' + CONVERT(VARCHAR(2),@loop) + ', b' + CONVERT(VARCHAR(2),@loop) + '.Status BStatus' + CONVERT(VARCHAR(2),@loop) + ', '

SET @SQLString2 = @SQLString2 + N'
left join CteBudget b' + CONVERT(VARCHAR(2),@loop) + ' on r.RecordID = b' + CONVERT(VARCHAR(2),@loop) + '.RecordID and b' + CONVERT(VARCHAR(2),@loop) + '.RN = 1'

SET @loop = @loop + 1
END
SET @loop = 1

WHILE @loop <= @NumForecast BEGIN
SET @SQLString1 = @SQLString1 + N'
f' + CONVERT(VARCHAR(2),@loop) + '.FDate FDate' + CONVERT(VARCHAR(2),@loop) + ', (f' + CONVERT(VARCHAR(2),@loop) + '.Percentage * r.BAmount)/100 FResult' + CONVERT(VARCHAR(2),@loop) + ','

SET @SQLString2 = @SQLString2 + N'
left join CteForecast f' + CONVERT(VARCHAR(2),@loop) + ' on r.RecordID = f' + CONVERT(VARCHAR(2),@loop) + '.RecordID and f' + CONVERT(VARCHAR(2),@loop) + '.RN = 1'

SET @loop = @loop + 1
END
SET @SQLString1 = STUFF (@SQLString1, LEN(@SQLString1) , 1 , '')
PRINT '/************************************************/'
PRINT @SQLString1
PRINT @SQLString2
PRINT '/************************************************/'

DECLARE @SQLString nvarchar(MAX);
SET @SQLString = N'
;With CteBudget as (
select
b.BDate, b.BudgetID, b.Percentage, b.RecordID, b.Status
,RN = ROW_NUMBER() OVER (partition by b.RecordID order by b.BudgetID)
from Budget b
),
CteForecast as (
select
f.FDate, f.ForecastID, f.Percentage, f.RecordID
,RN = ROW_NUMBER() OVER (partition by f.RecordID order by f.ForecastID)
from Forecast f
)
select
r.RecordID, r.Name, r.Description,
'
+ @SQLString1
+ N'
from Records r'
+ @SQLString2

print @SQLString

EXECUTE sp_executesql @SQLString
GO

IMPORTANT! This solution is not necessarily the one that gives the best performance, but the one that is probably the simplest to follow and understand. On production once we will have the real DDL+DML and the server will have the statistics, then we will be able to improve the performance and chose best solution for our specific case.



指标

Note! The above solution might lead to a lot of sorting of the data, and having the right indexes is HIGHLY important here! It is important to test several different options and select the best.



为了论坛(或任何您称为stackoverflow的原因,我认为这不是讨论论坛界面而是Q&A界面),我添加了查询以创建假定在生产中使用的CLUSTERED INDEX,以及一个可选的NONCLUSTERED INDEX您应该进行测试(我没有测试其他选项,这意味着我想到了第一个选项,因此建议您使用真实的DDL + DML继续检查正确的索引)。
-- CLUSTERED INDEX
CREATE CLUSTERED INDEX IX_Budget_BudgetID
ON dbo.Budget (BudgetID);
GO
CREATE CLUSTERED INDEX IX_Forecast_ForecastID
ON dbo.Forecast (ForecastID);
GO
CREATE CLUSTERED INDEX IX_Records_RecordID
ON dbo.Records (RecordID);
GO

-- NONCLUSTERED INDEX
CREATE NONCLUSTERED INDEX NX_Budget_RecordID_BudgetID
ON dbo.Budget (RecordID,BudgetID);
GO
CREATE NONCLUSTERED INDEX NX_Forecast_RecordID_ForecastID
ON dbo.Forecast (RecordID,ForecastID);
GO
CREATE NONCLUSTERED INDEX NX_Records_RecordID_RecordID
ON dbo.Records (RecordID);
GO

关于tsql - T-SQL-连接多个表导致重复的行,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/51174706/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com