gpt4 book ai didi

Sql Server 2008 - 不带聚合功能的 PIVOT

转载 作者:行者123 更新时间:2023-12-03 01:36:40 25 4
gpt4 key购买 nike

我知道您有多个主题涉及此问题。但是,我还没有找到一个可以满足我的需求的。我需要(按需)将选择的深层表数据转换为宽输出表。其中的问题是我无法将聚合与 Pivot 一起使用,因为它会消耗输出中所需的响应。我已经找到了一个解决方案,但我认为这不是最好的,因为它需要无数的左连接才能工作。我已将所有尝试和注释包含在内,如下所示:

-- Sql Server 2008 db.-- Deep table structure (not subject to modification) contains name/value pairs with a userId as-- foreign key.  In many cases there can be MORE THAN ONE itemValue given by the user for the-- itemName such as if asked their race, can answer White + Hispanic, etc.  Each response is stored-- as a seperate record - this cannot currently be changed.-- Goal: pivot deep data to wide while also compressing result -- set down. Account for all items per userId, and duplicating-- column values (rather than show nulls) as applicable-- Sample table to store some data of both single and multiple responsesDECLARE @testTable AS TABLE(userId int, itemName varchar(50), itemValue varchar(255))INSERT INTO @testTableSELECT 1, 'q01', '1-q01 Answer'UNION SELECT 1, 'q02', '1-q02 Answer'UNION SELECT 1, 'q03', '1-q03 Answer 1'UNION SELECT 1, 'q03', '1-q03 Answer 2'UNION SELECT 1, 'q03', '1-q03 Answer 3'UNION SELECT 1, 'q04', '1-q04 Answer'UNION SELECT 1, 'q05', '1-q05 Answer'UNION SELECT 2, 'q01', '2-q01 Answer'UNION SELECT 2, 'q02', '2-q02 Answer'UNION SELECT 2, 'q03', '2-q03 Answer 1'UNION SELECT 2, 'q03', '2-q03 Answer 2'UNION SELECT 2, 'q04', '2-q04 Answer'UNION SELECT 2, 'q05', '2-q05 Answer'SELECT 'Raw Data'SELECT * FROM @TestTableSELECT 'Using Pivot - shows aggregate result of itemValue per itemName - eats others'; WITH Data AS (    SELECT        [userId]        , [itemName]        , [itemValue]    FROM         @testTable)SELECT    [userId]    , [q02]    , [q03]    , [q05]FROM    DataPIVOT(    MIN(itemValue)  -- Aggregate function eats needed values.    FOR itemName in ([q02], [q03], [q05])) AS PivotTableSELECT 'Aggregate with Grouping - Causes Null Values'SELECT    DISTINCT userId     ,[q02] = Max(CASE WHEN itemName = 'q02' THEN itemValue END)    ,[q03] = Max(CASE WHEN itemName = 'q03' THEN itemValue END)    ,[q05] = Max(CASE WHEN itemName = 'q05' THEN itemValue END)FROM    @testTableWHERE    itemName in ('q02', 'q03', 'q05')   -- Makes it a hair quickerGROUP BY    userId  -- If by userId only, it only gives 1 row PERIOD = BAD!!    , [itemName]    , [itemValue]SELECT 'Multiple Left Joins - works properly but bad if pivoting 175 columns or so'; WITH Data AS (    SELECT        userId         ,[itemName]        ,[itemValue]    FROM        @testTable    WHERE        itemName in ('q02', 'q03', 'q05')   -- Makes it a hair quicker)SELECT    DISTINCT s1.userId    ,[q02] = s2.[itemValue]    ,[q03] = s3.[itemValue]    ,[q05] = s5.[itemValue]FROM    Data s1    LEFT JOIN Data s2         ON s2.userId = s1.userId             AND s2.[itemName] = 'q02'    LEFT JOIN Data s3         ON s3.userId = s1.userId             AND s3.[itemName] = 'q03'    LEFT JOIN Data s5         ON s5.userId = s1.userId             AND s5.[itemName] = 'q05'

因此,底部查询是唯一一个(到目前为止)可以完成我需要它做的事情,但是当我使用实际项目名称进行透视时,LEFT JOIN 将失控并导致性能问题。如有任何建议,我们将不胜感激。

最佳答案

我认为您必须坚持使用联接,因为联接正是产生您想要的结果的方式。连接的目的是将行集组合在一起(有条件或无条件),并且您的目标输出只不过是行子集的组合。

但是,如果大多数问题始终只有单一答案,则可以大大减少必要的联接数量。这个想法是仅将多个响应组作为单独的行集连接起来。对于单响应项目,它们仅作为目标项目整个数据集的一部分进行连接。

一个例子应该可以更好地说明我可能无法口头描述的内容。假设源数据中有两个潜在的多重响应组,'q03''q06'(实际上,这是源表:

DECLARE @testTable AS TABLE(
userId int,
itemName varchar(50),
itemValue varchar(255)
);

INSERT INTO @testTable
SELECT 1, 'q01', '1-q01 Answer'
UNION SELECT 1, 'q02', '1-q02 Answer'
UNION SELECT 1, 'q03', '1-q03 Answer 1'
UNION SELECT 1, 'q03', '1-q03 Answer 2'
UNION SELECT 1, 'q03', '1-q03 Answer 3'
UNION SELECT 1, 'q04', '1-q04 Answer'
UNION SELECT 1, 'q05', '1-q05 Answer'
UNION SELECT 1, 'q06', '1-q06 Answer 1'
UNION SELECT 1, 'q06', '1-q06 Answer 2'
UNION SELECT 1, 'q06', '1-q06 Answer 3'
UNION SELECT 2, 'q01', '2-q01 Answer'
UNION SELECT 2, 'q02', '2-q02 Answer'
UNION SELECT 2, 'q03', '2-q03 Answer 1'
UNION SELECT 2, 'q03', '2-q03 Answer 2'
UNION SELECT 2, 'q04', '2-q04 Answer'
UNION SELECT 2, 'q05', '2-q05 Answer'
UNION SELECT 2, 'q06', '2-q06 Answer 1'
UNION SELECT 2, 'q06', '2-q06 Answer 2'
;

与原始帖子中的表格相同,但添加了 'q06' 项),生成的脚本可能如下所示:

WITH ranked AS (
SELECT
*,
rn = ROW_NUMBER() OVER (PARTITION BY userId, itemName ORDER BY itemValue)
FROM @testTable
),
multiplied AS (
SELECT
r.userId,
r.itemName,
r.itemValue,
rn03 = r03.rn,
rn06 = r06.rn
FROM ranked r03
INNER JOIN ranked r06 ON r03.userId = r06.userId AND r06.itemName = 'q06'
INNER JOIN ranked r ON r03.userId = r.userId AND (
r.itemName = 'q03' AND r.rn = r03.rn OR
r.itemName = 'q06' AND r.rn = r06.rn OR
r.itemName NOT IN ('q03', 'q06')
)
WHERE r03.itemName = 'q03'
AND r.itemName IN ('q02', 'q03', 'q05', 'q06')
)
SELECT userId, rn03, rn06, q02, q03, q05, q06
FROM multiplied
PIVOT (
MIN(itemValue)
FOR itemName in (q02, q03, q05, q06)
) AS PivotTable

关于Sql Server 2008 - 不带聚合功能的 PIVOT,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/7854553/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com