gpt4 book ai didi

hadoop - 基于 2 个字段的滚动总计的 Hive 查询

转载 作者:行者123 更新时间:2023-12-02 21:19:56 24 4
gpt4 key购买 nike

我有一张 table 在下面显示

    Date | Customer | Count | Daily_Count | ITD_Count  
d1 | A | 3 | 3 |
d2 | B | 4 | 4 |
d3 | A | 7 | 16 |
d3 | B | 9 | 16 |
d4 | A | 8 | 9 |
d4 | B | 1 | 9 |

字段说明:

Date : date
customer : name of customer
Count : # of customers
daily_Count : # of customers on daily basis calculated as


SUM(count) OVER (partition BY date )as Daily_Count  

问题 :
如何计算 ITD_Count 中的 Running Total 或 Rolling Total?
输出应该看起来像
Date | Customer | Count | Daily_Count | ITD_Count  
d1 | A | 3 | 3 | 3
d2 | B | 4 | 4 | 7
d3 | A | 7 | 16 | 23
d3 | B | 9 | 16 | 23
d4 | A | 8 | 9 | 31
d4 | B | 1 | 9 | 31

我尝试了几种使用 Window 功能的变体。但在我所有的尝试中都遇到了障碍。

尝试1;
  SUM(daily_COunt) OVER (partition BY date order by date rows between unbounded preceding and current row ) as ITD_account_linking

尝试 2:
  SUM(daily_COunt) OVER (partition BY date, daily_count order by date rows between unbounded preceding and current row ) as ITD_account_linking

并在此之后进行了几次尝试。 :(

欢迎任何可能的建议来指导我正确的方向。

如果您需要更多详细信息,请告诉我。

最佳答案

使用 Hive 窗口化和分析功能。

SELECT Date, Customer, Count, Daily_Count, 
SUM(Daily_Count) OVER (ORDER BY Date ROWS UNBOUNDED PRECEDING) AS ITD_Count
FROM table;

关于hadoop - 基于 2 个字段的滚动总计的 Hive 查询,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/37688123/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com