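I want to split rows in Hive into intervals aligned to the nearest half hour, based on the time difference between start and end.
Below is an example of what I am trying to achieve.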
Input:
User  Start_time           End_time             Duration
A     2020-05-05 06:45:00  2020-05-05 07:15:00  30
B     2020-05-05 10:15:00  2020-05-05 11:15:00  60
Desired output:
User  Start_time           End_time             Duration
A     2020-05-05 06:45:00  2020-05-05 07:00:00  15
A     2020-05-05 07:00:00  2020-05-05 07:15:00  15
B     2020-05-05 10:15:00  2020-05-05 10:30:00  15
B     2020-05-05 10:30:00  2020-05-05 11:00:00  30
B     2020-05-05 11:00:00  2020-05-05 11:15:00  15
Best answer
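Calculate the number of intervals needed, floor(duration/30), then use lateral view posexplode(split(space(s.num_intervals),' ')) to generate num_intervals + 1 rows per input row. The remaining logic is in the query below.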
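The row-generation step can be tried on its own. This is a minimal sketch, assuming Hive 0.13 or later (for posexplode and for a select without a from clause); the literal 3 and the aliases t, n, pos and x are made up for illustration and are not part of the answer.

```sql
-- space(n) returns a string of n spaces; splitting it on ' ' gives n+1 empty strings,
-- so posexplode emits one row per position 0..n.
select t.n, i.pos
from (select 3 as n) t
lateral view posexplode(split(space(t.n), ' ')) i as pos, x;
```

With n = 3 this should return four rows, with pos running from 0 to 3, which is why the answer's query produces num_intervals + 1 rows for each input row. The full query from the answer follows.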
with mytable as ( --test dataset
    select stack(3,
        'A', '2020-05-05 06:45:00', '2020-05-05 07:15:00', 30,
        'B', '2020-05-05 10:15:00', '2020-05-05 11:15:00', 60,
        'C', '2020-05-05 10:00:00', '2020-05-05 12:15:00', 135
    ) as (Usr, Start_time, End_time, Duration)
)
select Usr, start_time, end_time,
       (unix_timestamp(end_time)-unix_timestamp(start_time))/60 Duration
from
(
    select Usr,
           case when i.i=0 then start_time --first record
                when i.i=1 then from_unixtime(unix_timestamp(start_time)+shift*60)
                else from_unixtime(unix_timestamp(start_time)+shift*60+(i.i-1)*30*60)
           end start_time,
           case when i.i=num_intervals then end_time --end of range
                when i.i=0 then from_unixtime(unix_timestamp(start_time)+shift*60) --first record
                else from_unixtime(unix_timestamp(start_time)+shift*60+(i.i)*30*60)
           end end_time
    from
    ( --calculate required intervals and shift (minutes to the next half-hour boundary)
        SELECT Usr, Start_time, End_time,
               cast(floor(duration/30) as int) num_intervals,
               case when (minute(Start_time) between 1 and 29) then 30-minute(Start_time)
                    when minute(Start_time) > 30 then 60-minute(Start_time)
                    else 30
               end shift
        FROM mytable
    )s
    lateral view posexplode(split(space(s.num_intervals),' ')) i as i,x
)s
;
Result:
usr  start_time           end_time             duration
A    2020-05-05 06:45:00  2020-05-05 07:00:00  15
A    2020-05-05 07:00:00  2020-05-05 07:15:00  15
B    2020-05-05 10:15:00  2020-05-05 10:30:00  15
B    2020-05-05 10:30:00  2020-05-05 11:00:00  30
B    2020-05-05 11:00:00  2020-05-05 11:15:00  15
C    2020-05-05 10:00:00  2020-05-05 10:30:00  30
C    2020-05-05 10:30:00  2020-05-05 11:00:00  30
C    2020-05-05 11:00:00  2020-05-05 11:30:00  30
C    2020-05-05 11:30:00  2020-05-05 12:00:00  30
C    2020-05-05 12:00:00  2020-05-05 12:15:00  15
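One caveat, as far as I can tell from tracing the query by hand: floor(duration/30) matches the sample rows, but it does not always equal the number of half-hour boundaries a range crosses. A range from 10:00:00 to 11:00:00 would get a trailing zero-length row, and a range from 10:25:00 to 11:20:00 would be cut only at 10:30. The following is a sketch of a different approach, not the answer's method: it floors the start to its half-hour slot and clips each slot to the range with greatest and least. The rows D and E and the names start_ts, end_ts, slot0, num_slots, seg_start and seg_end are assumptions added for illustration, and it assumes Hive 1.1 or later and timestamps in the yyyy-MM-dd HH:mm:ss format.

```sql
with mytable as ( -- rows A and B from the question, plus two hypothetical edge cases D and E
    select stack(4,
        'A', '2020-05-05 06:45:00', '2020-05-05 07:15:00',
        'B', '2020-05-05 10:15:00', '2020-05-05 11:15:00',
        'D', '2020-05-05 10:00:00', '2020-05-05 11:00:00',
        'E', '2020-05-05 10:25:00', '2020-05-05 11:20:00'
    ) as (Usr, Start_time, End_time)
)
select Usr,
       from_unixtime(seg_start)   as start_time,
       from_unixtime(seg_end)     as end_time,
       (seg_end - seg_start) / 60 as Duration
from
(
    select s.Usr,
           -- clip each 30-minute slot to the original range
           greatest(s.start_ts, s.slot0 + i.pos * 1800)       as seg_start,
           least(s.end_ts,      s.slot0 + (i.pos + 1) * 1800) as seg_end
    from
    (
        select Usr, start_ts, end_ts, slot0,
               -- number of half-hour slots touched by the range
               cast(ceil((end_ts - slot0) / 1800) as int) as num_slots
        from
        (
            select Usr,
                   unix_timestamp(Start_time) as start_ts,
                   unix_timestamp(End_time)   as end_ts,
                   -- start of the half-hour slot containing Start_time
                   unix_timestamp(Start_time)
                     - (minute(Start_time) % 30) * 60
                     - second(Start_time)     as slot0
            from mytable
            where End_time > Start_time -- empty or reversed ranges are skipped
        ) t
    ) s
    -- space(n - 1) yields n rows, pos = 0..n-1
    lateral view posexplode(split(space(s.num_slots - 1), ' ')) i as pos, x
) r
;
```

Under these assumptions, A and B should split the same way as in the result above, D should give two 30-minute rows with no empty trailing row, and E should give 10:25 to 10:30, 10:30 to 11:00 and 11:00 to 11:20. The Duration column is no longer needed as an input because the slot count is derived from the timestamps themselves.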
Source: a similar question on Stack Overflow, "hadoop - Split rows into the nearest half hour by time difference in Hive": https://stackoverflow.com/questions/61885658/