gpt4 book ai didi

hadoop - 如何在Hive中使用2 for循环

转载 作者:行者123 更新时间:2023-12-02 20:32:13 31 4
gpt4 key购买 nike

如何在Hive中使用2 for循环?

我有如下输入数据:

1  a 3

15 b 4

1 b 2

25 a 5

15 c 3

1 a 3

15 c 2

25 b 4

中间输出:1个总数a和b中的一个,对于15和25类似
1 a 6

1 b 2

15 b 4

15 c 5

25 a 5

25 b 4

最终输出:需要1个最大计数
1 a 6

15 c 5

25 a 5

最佳答案

您可以使用窗口函数并获取结果。请检查以下内容:

> select * from shailesh;
INFO : OK
+----------------+----------------+----------------+--+
| shailesh.col1 | shailesh.col2 | shailesh.col3 |
+----------------+----------------+----------------+--+
| 1 | a | 3 |
| 15 | b | 4 |
| 1 | b | 2 |
| 25 | a | 5 |
| 15 | c | 3 |
| 1 | a | 3 |
| 15 | c | 2 |
| 25 | b | 4 |
+----------------+----------------+----------------+--+
8 rows selected (0.359 seconds)

> create table shailesh2 as select col1, col2, max(col3s) col3s2 from (select col1,col2,sum(col3) over(partition by col1,col2) col3s from shailesh ) t group by col1, col2;
INFO : OK
+-----------------+-----------------+-------------------+--+
| shailesh2.col1 | shailesh2.col2 | shailesh2.col3s2 |
+-----------------+-----------------+-------------------+--+
| 1 | a | 6 |
| 1 | b | 2 |
| 15 | b | 4 |
| 15 | c | 5 |
| 25 | a | 5 |
| 25 | b | 4 |
+-----------------+-----------------+-------------------+--+
6 rows selected (0.36 seconds)

> select col1, col2, col3s2 from (select col1,col2,col3s2, rank() over(partition by col1 order by col3s2 desc) as rk from shailesh2) t2 where rk=1;
INFO : OK
+-------+-------+---------+--+
| col1 | col2 | col3s2 |
+-------+-------+---------+--+
| 1 | a | 6 |
| 15 | c | 5 |
| 25 | a | 5 |
+-------+-------+---------+--+
3 rows selected (37.224 seconds)

关于hadoop - 如何在Hive中使用2 for循环,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/52850210/

31 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com