gpt4 book ai didi

sql - Hive查询使用 “not column=value” where子句删除空值

转载 作者:行者123 更新时间:2023-12-02 19:48:21 24 4
gpt4 key购买 nike

table1数据样本:

year month day utmsource
2017 03 26 NULL
2017 03 27 NULL
2017 03 27 facebook
2017 03 27 newsletter
2017 03 27 banner
2017 03 27 facebook

预期选择:
year month day utmsource
2017 03 27 NULL
2017 03 27 newsletter
2017 03 27 banner

我的Hive查询:
-- result = 0, it did not include the NULL utmsource record
SELECT SUM(CASE WHEN utmsource IS NULL THEN 1 ELSE 0 END) as amountnull
FROM table1
WHERE year=2017 AND month=03 AND day=27 AND NOT utmsource="facebook"

-- result = 1 the NULL utmsource record is included
SELECT SUM(CASE WHEN utmsource IS NULL THEN 1 ELSE 0 END) as amountnull
FROM table1
WHERE year=2017 AND month=03 AND day=27 AND (utmsource IS NULL OR NOT utmsource="facebook")

-- also returns 0, the NULL utmsource record is not included
SELECT SUM(CASE WHEN utmsource IS NULL THEN 1 ELSE 0 END) as amountnull
FROM table1
WHERE year=2017 AND month=03 AND day=27 AND NOT utmsource <=> 'facebook';

问题:
  • 有人可以解释这种行为吗?
  • 我可以将设置更改为
    检索查询2的结果而不添加额外的OR
    查询功能? =>不等于在结果
  • 中包含空值

    最佳答案

    您想要的是一个NULL-安全相等(或不相等)运算符。在ANSI SQL中,有一个名为is distinct from的运算符。 Hive似乎使用MySQL版本,即<=>。因此,您可以执行以下操作:

    SELECT SUM(CASE WHEN utmsource IS NULL THEN 1 ELSE 0 END) as amountnull
    FROM tablename
    WHERE year=2017 AND month=03 AND day=27 AND NOT utmsource <=> 'facebook';

    该操作符在 documentation中进行了描述。

    我还应该指出,您可能会发现这是 SELECT的更简单公式:
    SELECT (COUNT(*) - COUNT(utmsource)) as amountnull
    FROM tablename
    WHERE year=2017 AND month=03 AND day=27 AND NOT utmsource <=> 'facebook';

    尽管总的来说,这似乎是最简单的:
    SELECT COUNT(*)as amountnull
    FROM tablename
    WHERE year=2017 AND month=03 AND day=27 AND utmsource IS NULL;

    'Facebook'的比较是不必要的。

    关于sql - Hive查询使用 “not column=value” where子句删除空值,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/43120560/

    24 4 0
    Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
    广告合作:1813099741@qq.com 6ren.com