gpt4 book ai didi

sql - Impala 分析函数在 where 子句中

转载 作者:可可西里 更新时间:2023-11-01 15:27:15 26 4
gpt4 key购买 nike

所以这个问题的基本前提是我在 hadoop 中有一些巨大的表,我需要每个月从中获取一些样本。我模拟了下面的内容以显示我想要的东西,但显然这不是真实数据......

--Create the table
CREATE TABLE exp_dqss_team.testranking (
Name STRING,
Age INT,
Favourite_Cheese STRING
) STORED AS PARQUET;

--Put some data in
INSERT INTO TABLE exp_dqss_team.testranking
VALUES (
('Tim', 33, 'Cheddar'),
('Martin', 49, 'Gorgonzola'),
('Will', 39, 'Brie'),
('Bob', 63, 'Cheddar'),
('Bill', 35, 'Brie'),
('Ben', 42, 'Gorgonzola'),
('Duncan', 55, 'Brie'),
('Dudley', 28, 'Cheddar'),
('Edmund', 27, 'Brie'),
('Baldrick', 29, 'Gorgonzola'));

我想得到的是每个奶酪类别中最年轻的 2 个人。下面给出了每个奶酪类别的年龄排名,但不会将其限制在前两个:

SELECT RANK() OVER(PARTITION BY favourite_cheese ORDER BY age asc) AS rank_my_cheese, favourite_cheese, name, age
FROM exp_dqss_team.testranking;

如果我添加一个 WHERE 子句,它会给我以下错误:

WHERE clause must not contain analytic expressions

SELECT RANK() OVER(PARTITION BY favourite_cheese ORDER BY age asc) AS rank_my_cheese, favourite_cheese, name, age
FROM exp_dqss_team.testranking
WHERE RANK() OVER(PARTITION BY favourite_cheese ORDER BY age asc) <3;

有没有比创建一个包含所有排名的表格然后使用排名中的 WHERE 子句从中选择更好的方法?

最佳答案

你能试试这个吗?

select * from (
SELECT RANK() OVER(PARTITION BY favourite_cheese ORDER BY age asc) AS rank_my_cheese, favourite_cheese, name, age
FROM exp_dqss_team.testranking
) as temp
where rank_my_cheese <= 2;

关于sql - Impala 分析函数在 where 子句中,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/42621222/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com