sql - CASE in UPDATE produces unexpected results. Fixed when moved to the WHERE clause. Why?

Reposted. Author: 行者123. Updated: 2023-12-02 00:54:32

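I created a query to update a flag, and I use a CASE statement to determine the value. However, when I run the query as an UPDATE statement, only about half of the expected rows get updated. More interesting still, I previously ran the exact same UPDATE query against the same data and it worked as expected (comparing the old results with the new ones is what prompted me to investigate).

I tried a SELECT query with the same CASE statement and got the correct results, but switching it back to an UPDATE only updated about half of the records.

Moving the conditions to the WHERE clause solved the problem. It seems the CASE statement in the SET part is what goes wrong, and I don't understand why. I would like to know so that I can avoid the mistake in the future.

Original code: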

UPDATE D
SET PUBLISH_FLAG =
    CASE WHEN
        MAPPED_CAT NOT IN (1,2,3)
        AND SRC != '999'
        AND RECEIVED_DATE IS NOT NULL
        AND RECEIVED_DATE <= D.CENSUS_DATE
        AND SCHEDULED_FLAG = 'N'
    THEN 'Y'
    ELSE 'N'
    END
FROM TBL_DATA D
INNER JOIN TBL_PUBLISH V
    ON D.ID = V.ID
    AND D.CENSUS_DATE = V.CENSUS_DATE
    AND D.VERSION_NUMBER = V.VERSION_NUMBER
LEFT JOIN TBL_CAT_MAP C
    ON D.SRC_CATEGORY = C.SOURCE_CAT

Working code:

UPDATE D
SET PUBLISH_FLAG = 'Y'
FROM TBL_DATA D
INNER JOIN TBL_PUBLISH V
    ON D.ID = V.ID
    AND D.CENSUS_DATE = V.CENSUS_DATE
    AND D.VERSION_NUMBER = V.VERSION_NUMBER
LEFT JOIN TBL_CAT_MAP C
    ON D.SRC_CATEGORY = C.SOURCE_CAT
WHERE
    MAPPED_CAT NOT IN (1,2,3)
    AND SRC != '999'
    AND RECEIVED_DATE IS NOT NULL
    AND RECEIVED_DATE <= D.CENSUS_DATE
    AND SCHEDULED_FLAG = 'N'

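I thought both should produce exactly the same result. What am I missing?

To help clarify, the query below returns two columns that show the difference: the "PUBLISH_FLAG" column (updated using my original code or PSK's answer) has 10162 'Y' values (the rest are 'N'), while the pub_2 column has the correct 18917 'Y' values.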

SELECT
    PUBLISH_FLAG,
    CASE WHEN
        MAPPED_CAT NOT IN (1,2,3)
        AND SRC != '999'
        AND RECEIVED_DATE IS NOT NULL
        AND RECEIVED_DATE <= D.CENSUS_DATE
        AND SCHEDULED_FLAG = 'N'
    THEN 'Y'
    ELSE 'N'
    END AS pub_2
FROM TBL_DATA D
INNER JOIN TBL_PUBLISH V
    ON D.ID = V.ID
    AND D.CENSUS_DATE = V.CENSUS_DATE
    AND D.VERSION_NUMBER = V.VERSION_NUMBER
LEFT JOIN TBL_CAT_MAP C
    ON D.SRC_CATEGORY = C.SOURCE_CAT
WHERE
    CASE WHEN
        MAPPED_CAT NOT IN (1,2,3)
        AND SRC != '999'
        AND RECEIVED_DATE IS NOT NULL
        AND RECEIVED_DATE <= D.CENSUS_DATE
        AND SCHEDULED_FLAG = 'N'
    THEN 'Y'
    ELSE 'N'
    END = 'Y'

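Best answer

Your first query is definitely not the same as the second. In fact, from what I can see here, I would say that your update with CASE is the correct one, because it updates both sides of the flag. The other query, with WHERE, does not set the flag to 'N' where it should. How exactly did you determine the expected "correct" number of updates? I think you expect the UPDATE statement to update as many rows as the SELECT statement returns, but that is not always the case. Depending on your filters, the JOIN you are building can produce a cartesian product, meaning that a single row of TBL_DATA is matched by several joined rows.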
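The row-count point can be checked with a small sketch. It is only an illustration under stated assumptions: it uses Python's sqlite3 module as a stand-in for SQL Server, plain tables named table1 and table2 instead of temp tables, and the same sample rows as the answer's script. It counts how many rows the JOIN returns compared with how many rows the target table holds, and lists the target rows for which the CASE expression yields both 'Y' and 'N'.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; table1/table2 mirror the
# #table1/#table2 sample data from the answer's script.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (Field_1 INTEGER, Field_2 TEXT);
INSERT INTO table1 VALUES
    (1, 'Item A'), (2, 'Item B'), (3, 'Item C'), (4, 'Item D'), (5, 'Item E');
CREATE TABLE table2 (Field_1 INTEGER, Field_2 TEXT);
INSERT INTO table2 VALUES
    (1, 'Item A'), (1, 'Item B'), (2, 'Item B'), (2, 'Item C'), (3, NULL);
""")

join_sql = """
FROM table1
LEFT JOIN table2 ON table1.Field_1 = table2.Field_1
"""

# Rows returned by the JOIN versus rows that an UPDATE can actually touch.
joined_rows = conn.execute("SELECT COUNT(*) " + join_sql).fetchone()[0]
target_rows = conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]

# For each target row, collect every distinct value the CASE expression yields.
candidates = {
    key: sorted(flags.split(","))
    for key, flags in conn.execute(
        "SELECT table1.Field_1, "
        "GROUP_CONCAT(DISTINCT CASE WHEN table2.Field_2 = 'Item C' "
        "THEN 'Y' ELSE 'N' END) "
        + join_sql
        + " GROUP BY table1.Field_1"
    )
}

# Target rows that could legitimately end up as either 'Y' or 'N'.
ambiguous = [key for key, flags in candidates.items() if len(flags) > 1]

print(joined_rows, target_rows, ambiguous)
```

With this data the JOIN returns 7 rows for 5 target rows, and the row with Field_1 = 2 has both 'Y' and 'N' as candidates. A WHERE filter keeps only the 'Y' candidate for such a row, whereas a CASE in the SET clause leaves the choice between the two to the database engine.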

Consider the script below.

CREATE TABLE #table1 (Field_1 INT, Field_2 VARCHAR(MAX))
INSERT INTO
#table1
VALUES
(1, 'Item A'),
(2, 'Item B'),
(3, 'Item C'),
(4, 'Item D'),
(5, 'Item E')

CREATE TABLE #table2 (Field_1 INT, Field_2 VARCHAR(MAX))
INSERT INTO
#table2
VALUES
(1, 'Item A'),
(1, 'Item B'),
(2, 'Item B'),
(2, 'Item C'),
(3, NULL)

-- This produces 7 rows:
SELECT
*
FROM
#table1
LEFT JOIN
#table2 ON #table1.[Field_1] = #table2.[Field_1]

-- This updates 1 row. This is akin to your second query. Only one flag value is changed.
-- You would still have to write an UPDATE statement for the 'N' flag update.
UPDATE
#table1
SET
#table1.[Field_2] = 'Y'
FROM
#table1
LEFT JOIN
#table2 ON #table1.[Field_1] = #table2.[Field_1]
WHERE
#table2.[Field_2] = 'Item C'

-- Because your UPDATE statement only updates the values to 'Y' where a condition matches, only one record is changed here.
-- The others are left untouched.
SELECT
*
FROM
#table1

-- Now what happens if we perform the reverse UPDATE?
UPDATE
#table1
SET
#table1.[Field_2] = 'N'
FROM
#table1
LEFT JOIN
#table2 ON #table1.[Field_1] = #table2.[Field_1]
WHERE
NOT (#table2.[Field_2] = 'Item C')

-- First of all, notice that this filter never matches NULL values (NOT (NULL = 'Item C') is UNKNOWN, not true), so only two records get changed to 'N'.
-- The first record is changed because none of its matches is 'Item C'.
-- The second record is also changed because at least one of its matches is not 'Item C', even though another one is. This overwrites the 'Y' set by the previous UPDATE.
-- The last three records either have no match in the JOIN or have a NULL in #table2, so they are not updated.
-- This is why I'm more a fan of your CASE query: in theory it should set everything to the correct value.
SELECT
*
FROM
#table1

-- Let's see what would happen with a CASE expression.
-- Because the JOIN returns several #table2 rows for one #table1 row, there are multiple candidates for #table1.Field_1 = 2: it can be updated to either 'N' or 'Y'.
-- SQL Server picked 'N' here, as you will see after the UPDATE, but which candidate wins is not guaranteed.
SELECT
*, CASE WHEN #table2.[Field_2] = 'Item C' THEN 'Y' ELSE 'N' END
FROM
#table1
LEFT JOIN
#table2 ON #table1.[Field_1] = #table2.[Field_1]

-- The UPDATE with CASE at the end of this script updates 5 rows. Maybe you would have expected 7 based on the SELECT statement above?
-- Notice also that it sets everything to 'N'. That is because the CASE deals with both sides:
-- the value is either 'Y' or 'N', and every record the UPDATE can reach is touched.
-- This is in contrast with an UPDATE filtered by WHERE, which only touches one side. Because of JOIN clauses and NULL values,
-- it is entirely possible that two such UPDATE statements together still do not touch the entire table if written incorrectly.

-- With the WHERE approach you would have to write a second UPDATE statement like this one, to be run after the first:
--UPDATE
-- #table1
--SET
-- #table1.[Field_2] = 'N'
--FROM
-- #table1
--LEFT JOIN
-- #table2 ON #table1.[Field_1] = #table2.[Field_1]
--WHERE
-- #table1.[Field_2] <> 'Y' OR #table1.[Field_2] IS NULL

-- In conclusion: if you want to be absolutely sure that all values are updated to their correct setting, use CASE.
-- But if you only care about setting 'Y' where it applies, don't use CASE.
-- If you do use CASE, make sure that your JOIN is correct and that you calculate the correct value for both sides.
UPDATE
#table1
SET
#table1.[Field_2] = CASE WHEN #table2.[Field_2] = 'Item C' THEN 'Y' ELSE 'N' END
FROM
#table1
LEFT JOIN
#table2 ON #table1.[Field_1] = #table2.[Field_1]

SELECT
*
FROM
#table1

DROP TABLE #table1
DROP TABLE #table2
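
One way to take the ambiguity out of the CASE update is to decide the flag once per target row instead of once per joined row. The sketch below is a hedged illustration, not the answerer's method: it again uses Python's sqlite3 module as a stand-in for SQL Server, with plain tables table1 and table2 holding the same sample rows, and it assumes the intended rule is "set 'Y' if at least one matching row qualifies, otherwise 'N'". That rule is an assumption about what the asker wants.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; table1/table2 mirror the
# #table1/#table2 sample data from the script above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (Field_1 INTEGER, Field_2 TEXT);
INSERT INTO table1 VALUES
    (1, 'Item A'), (2, 'Item B'), (3, 'Item C'), (4, 'Item D'), (5, 'Item E');
CREATE TABLE table2 (Field_1 INTEGER, Field_2 TEXT);
INSERT INTO table2 VALUES
    (1, 'Item A'), (1, 'Item B'), (2, 'Item B'), (2, 'Item C'), (3, NULL);
""")

# The correlated EXISTS is evaluated once per table1 row, so each row gets
# exactly one candidate value no matter how many table2 rows match it.
# Rows with no match, or with only NULL matches, fall through to 'N'.
conn.execute("""
UPDATE table1
SET Field_2 = CASE
    WHEN EXISTS (
        SELECT 1
        FROM table2
        WHERE table2.Field_1 = table1.Field_1
          AND table2.Field_2 = 'Item C'
    ) THEN 'Y'
    ELSE 'N'
END
""")

flags = dict(conn.execute("SELECT Field_1, Field_2 FROM table1 ORDER BY Field_1"))
print(flags)
```

Here only the row with Field_1 = 2 ends up as 'Y' and every other row becomes 'N', so the 'Y' rows agree with the WHERE version while the 'N' side is still written. If the rule should instead be "all matching rows must qualify", the EXISTS condition would need to be inverted accordingly.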

Regarding "sql - CASE in UPDATE produces unexpected results. Fixed when moved to the WHERE clause. Why?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55349925/
