
sql - Hive percentile() over() requires GROUP BY

Reposted. Author: 行者123. Updated: 2023-12-02 20:52:01

I want to compute a percentile over all rows while computing the other values grouped by room_id, as follows:

select
    distinct room_id,
    count(user_id) over (partition by room_id) as user_cnt,
    sum(price) over (partition by room_id) as price,
    percentile(cast(price as bigint), 0.5) over () as price_median
from
    ods.ods_trade
where day = '2017-08-08' and trade_status = 1

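The code above runs correctly in SparkSQL, but Hive reports:
At least 1 group must only depend on input columns ... Expression not in GROUP BY key 'price'
percentile() over () also returns a single value, so why does this error occur, and how can I fix it? Any help would be greatly appreciated.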

For example, given this data:
room  user  price (consume)
a     u1    1
a     u1    5
a     u2    3
b     u1    2
b     u3    4
c     u4    6
c     u4    7

Expected result:
room_id  user_cnt  price  price_median
a        2         9      4
b        2         6      4
c        1         13     4

Best answer

The error says that price is not in the GROUP BY. The query below should work:

select room, count(distinct user_id), sum(price), price_median
from (
    select room, user_id, price,
           percentile(cast(price as bigint), 0.5) over () as price_median
    from ods.ods_trade
    group by room, user_id, price
) k1
group by room, price_median

Note: the column names may differ slightly.
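
As a cross-check on the sample data from the question, here is a minimal sketch of an alternative query shape: compute the overall median once in a one-row subquery and cross join it onto the per-room aggregates. It runs against an in-memory SQLite database rather than Hive, so everything around the query is an assumption made for illustration: the `Percentile` class is a hypothetical stand-in registered under the name `percentile` (SQLite has no such built-in), the table is called `ods_trade` without the `ods.` schema, and the `day` / `trade_status` filter is left out. Whether the same cross-join form behaves identically in a given Hive version has not been verified here.

```python
import sqlite3


class Percentile:
    """Hypothetical stand-in for a percentile(col, p) aggregate.

    Uses linear interpolation between the two nearest sorted values.
    This is an illustration only, not Hive's implementation.
    """

    def __init__(self):
        self.values = []
        self.p = 0.5

    def step(self, value, p):
        # Called once per input row; NULLs are skipped.
        if value is not None:
            self.values.append(value)
            self.p = p

    def finalize(self):
        if not self.values:
            return None
        vals = sorted(self.values)
        pos = self.p * (len(vals) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(vals) - 1)
        return vals[lo] + (vals[hi] - vals[lo]) * (pos - lo)


# Sample rows from the question: (room_id, user_id, price)
rows = [
    ("a", "u1", 1), ("a", "u1", 5), ("a", "u2", 3),
    ("b", "u1", 2), ("b", "u3", 4),
    ("c", "u4", 6), ("c", "u4", 7),
]

conn = sqlite3.connect(":memory:")
conn.create_aggregate("percentile", 2, Percentile)
conn.execute("CREATE TABLE ods_trade (room_id TEXT, user_id TEXT, price INTEGER)")
conn.executemany("INSERT INTO ods_trade VALUES (?, ?, ?)", rows)

# The median is computed once over all rows (subquery m), then attached
# to every per-room group through the cross join.
query = """
SELECT t.room_id,
       COUNT(DISTINCT t.user_id) AS user_cnt,
       SUM(t.price)              AS price,
       m.price_median
FROM ods_trade t
CROSS JOIN (
    SELECT percentile(price, 0.5) AS price_median
    FROM ods_trade
) m
GROUP BY t.room_id, m.price_median
ORDER BY t.room_id
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

On this data the per-room sums are 9 (1 + 5 + 3), 6 and 13, and the overall median of the seven prices is 4. One design difference from the query in the answer: because this form has no inner `group by room, user_id, price`, rows that repeat the same room, user and price are not collapsed before the sum and the median are computed.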

For more on "sql - Hive percentile() over() requires GROUP BY", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45738102/
