sql - Populate the session ID, generating a new session ID when the timestamps of two consecutive events differ by more than 30 minutes

Reposted. Author: 行者123. Updated: 2023-12-05 05:52:24

Input, read from an existing Hive or Redshift table:

user   |    Timestamp    |  SessionId
---------------------------------------
u1 | 10:00AM | ?
u1 | 10:05AM | ?
u1 | 10:10AM | ?
u1 | 10:15AM | ?
u1 | 11:40AM | ?
u1 | 11:50AM | ?
u1 | 12:15PM | ?

Expected output:

user   |    Timestamp    |  SessionId
---------------------------------------
u1 | 10:00AM | s1
u1 | 10:05AM | s1
u1 | 10:10AM | s1
u1 | 10:15AM | s1
u1 | 11:40AM | s2
u1 | 11:50AM | s2
u1 | 12:15PM | s3

How can we solve this so that the existing table is updated using Hive or Redshift?

Best answer

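Convert the timestamp to a unix_timestamp (seconds), use the lag() function to get the previous timestamp, and compute the difference. If the difference exceeds 30 minutes, or there is no previous event, set new_session=1. The running sum of new_session then gives the session ID.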

with mydata as (
  select 'u1' as `user`, '10:00AM' as `timestamp` union all
  select 'u1', '10:05AM' union all
  select 'u1', '10:10AM' union all
  select 'u1', '10:15AM' union all
  select 'u1', '11:40AM' union all
  select 'u1', '11:50AM' union all
  select 'u1', '12:15AM'  -- 15 minutes after midnight
)

select `user`, `timestamp`,
       -- the running sum of the new_session flag is the session number
       -- note: this window orders by the raw `timestamp` string, while lag() below orders by ts;
       -- with full date-time values, order both windows by ts so they stay consistent
       concat('s', sum(new_session) over (partition by `user` order by `timestamp`)) as session_id
from
(
  select -- set the new_session flag based on the difference between ts and prev_ts
         `user`, `timestamp`, ts, prev_ts,
         case when ((ts - prev_ts) / 60 > 30) or prev_ts is NULL then 1 end as new_session
  from
  (
    select `user`, `timestamp`, ts,
           -- get the previous event's time for the same user
           lag(ts) over (partition by `user` order by ts) as prev_ts
    from
    (
      -- convert the time string to seconds
      select `user`, `timestamp`, unix_timestamp(`timestamp`, 'hh:mma') as ts from mydata
    ) s -- ts conversion
  ) s -- prev_ts
) s -- new_session

Result:

user    timestamp   session_id
u1 10:00AM s1
u1 10:05AM s1
u1 10:10AM s1
u1 10:15AM s1
u1 11:40AM s2
u1 11:50AM s2
u1 12:15AM s3
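
As a cross-check outside the database, the same lag-and-running-sum logic can be sketched in plain Python. This is only an illustration, not part of the original answer: the function name `assign_sessions` and the calendar dates are assumptions. In particular, the 12:15AM event is placed on the following day, which is one way to read "15 minutes after midnight".

```python
from datetime import datetime, timedelta
from itertools import groupby

SESSION_GAP = timedelta(minutes=30)

def assign_sessions(events, gap=SESSION_GAP):
    """Return (user, timestamp, session_id) rows.

    A new session starts at a user's first event and whenever the gap
    to that user's previous event is strictly greater than `gap`.
    """
    rows = []
    # Equivalent of PARTITION BY user ORDER BY ts.
    ordered = sorted(events, key=lambda e: (e[0], e[1]))
    for user, group in groupby(ordered, key=lambda e: e[0]):
        prev_ts = None   # plays the role of lag(ts)
        session_no = 0   # running sum of the new_session flag
        for _, ts in group:
            if prev_ts is None or ts - prev_ts > gap:
                session_no += 1
            rows.append((user, ts, f"s{session_no}"))
            prev_ts = ts
    return rows

# Hypothetical dates: the last event is 12:15AM on the next day.
events = [
    ("u1", datetime(2021, 11, 29, 10, 0)),
    ("u1", datetime(2021, 11, 29, 10, 5)),
    ("u1", datetime(2021, 11, 29, 10, 10)),
    ("u1", datetime(2021, 11, 29, 10, 15)),
    ("u1", datetime(2021, 11, 29, 11, 40)),
    ("u1", datetime(2021, 11, 29, 11, 50)),
    ("u1", datetime(2021, 11, 30, 0, 15)),
]

session_ids = [sid for _, _, sid in assign_sessions(events)]
print(session_ids)  # ['s1', 's1', 's1', 's1', 's2', 's2', 's3']
```

With full date-time values there is a single unambiguous ordering, so the previous-event lookup and the running sum see the events in the same sequence.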

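Note that I changed 12:15PM to 12:15AM to get the s3 session. In your sample data, 12:15PM is 15 minutes after noon, so the difference between 11:50AM and 12:15PM is 25 minutes and does not trigger a new session. To get the s3 session as in your question, the value has to be 12:15AM, which is 15 minutes after midnight; see the 12-hour_clock wiki.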
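
The AM/PM arithmetic can be checked quickly in Python. This is a small sketch, assuming that Python's `%I:%M%p` format parses these strings the same way as the `hh:mma` pattern used in the query; the helper names `parse_12h` and `gap_minutes` are made up for illustration.

```python
from datetime import datetime

def parse_12h(text):
    # %I is the 12-hour clock hour and %p the AM/PM marker.
    # With no date in the string, strptime puts every value on the same default day.
    return datetime.strptime(text, "%I:%M%p")

def gap_minutes(earlier, later):
    # Signed difference in minutes between two time-only strings.
    return (parse_12h(later) - parse_12h(earlier)).total_seconds() / 60

noon_gap = gap_minutes("11:50AM", "12:15PM")
midnight = parse_12h("12:15AM")
same_day_gap = gap_minutes("11:50AM", "12:15AM")

print(noon_gap)                       # 25.0
print(midnight.hour, midnight.minute) # 0 15
print(same_day_gap)                   # -695.0
```

The negative last value shows that a time-only 12:15AM falls before 11:50AM on the same day, so real data should carry a full date and time for the gaps to be meaningful.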

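Regarding "sql - Populate the session ID, generating a new session ID when the timestamps of two consecutive events differ by more than 30 minutes", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/70155950/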
