sql - Flattening intersecting timespans per user in PostgreSQL

Reposted by: 行者123, updated: 2023-11-29 11:48:05

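I am trying to merge overlapping start/end timestamps into single timespans. A similar question is available here on SO. I want to merge the timestamps separately for each user in the data.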

SQLFiddle

Sample data:

-- drop table if exists app_log;

create table app_log (
  user_id int,
  login_time timestamp,
  logout_time timestamp
);

insert into app_log values
  (1, '2014-01-01 08:00', '2014-01-01 10:00'), /* here we start */
  (1, '2014-01-01 09:10', '2014-01-01 09:59'), /* fully included in previous interval */
  (1, '2014-01-01 10:00', '2014-01-01 10:48'), /* continuing first interval */
  (1, '2014-01-01 10:40', '2014-01-01 10:49'), /* continuing previous interval */
  (1, '2014-01-01 10:55', '2014-01-01 11:00'), /* isolated interval */
  (2, '2014-01-01 09:00', '2014-01-01 11:00'), /* 2nd user is shifted by one hour */
  (2, '2014-01-01 10:10', '2014-01-01 10:59'), /* to simulate overlaps with 1st user */
  (2, '2014-01-01 11:00', '2014-01-01 11:48'), 
  (2, '2014-01-01 11:40', '2014-01-01 11:49'), 
  (2, '2014-01-01 11:55', '2014-01-01 12:00')  
;

Required result:

  user_id  login_time       logout_time
  1        2014-01-01 08:00 2014-01-01 10:49 /* Merging first 4 lines */
  1        2014-01-01 10:55 2014-01-01 11:00 /* 5 th line is isolated */
  2        2014-01-01 09:00 2014-01-01 11:49 /* Merging lines 6-9 */
  2        2014-01-01 11:55 2014-01-01 12:00 /* last line is isolated */

I tried one of the solutions provided in the mentioned question, but it does not return the correct answer even for a single user:

with recursive

in_data as (select login_time as d1, logout_time as d2 from app_log where user_id = 1)

, dateRanges (ancestorD1, parentD1, d2, iter) as
(
--anchor is first level of collapse
    select
        d1 as ancestorD1,
        d1 as parentD1,
        d2,
        cast(0 as int) as iter
    from in_data

--recurse as long as there is another range to fold in
    union all

    select
        tLeft.ancestorD1,
        tRight.d1 as parentD1,
        tRight.d2,
        iter + 1  as iter
    from dateRanges as tLeft join in_data as tRight
        --join condition is that the t1 row can be consumed by the recursive row
        on tLeft.d2 between tRight.d1 and tRight.d2
            --exclude identical rows
            and not (tLeft.parentD1 = tRight.d1 and tLeft.d2 = tRight.d2)
)
select
    ranges1.*
from dateRanges as ranges1
where not exists (
    select 1
    from dateRanges as ranges2
    where ranges1.ancestorD1 between ranges2.ancestorD1 and ranges2.d2
        and ranges1.d2 between ranges2.ancestorD1 and ranges2.d2
        and ranges2.iter > ranges1.iter
);

Result:

ancestord1;parentd1;d2;iter
2014-01-01 10:55:00;2014-01-01 10:55:00;2014-01-01 11:00:00;0
2014-01-01 08:00:00;2014-01-01 10:40:00;2014-01-01 10:49:00;2
2014-01-01 09:10:00;2014-01-01 10:40:00;2014-01-01 10:49:00;3

What is wrong with the query above, and how can I extend it to get results per user? Is there a better solution in PostgreSQL?

Best answer

I found this example of how to make a 'range aggregate' using window functions and a lot of nested subqueries. I just adapted it to partition and group by user_id, and it seems to do what you want:

SELECT user_id, min(login_time) as login_time, max(logout_time) as logout_time
FROM (
    SELECT user_id, login_time, logout_time,
        max(new_start) OVER (PARTITION BY user_id ORDER BY login_time, logout_time) AS left_edge
    FROM (
        SELECT user_id, login_time, logout_time,
            CASE 
                WHEN login_time <= max(lag_logout_time) OVER (
                    PARTITION BY user_id ORDER BY login_time, logout_time 
                ) THEN NULL 
                ELSE login_time 
            END AS new_start
        FROM (
            SELECT 
                user_id, 
                login_time, 
                logout_time,
                lag(logout_time) OVER (PARTITION BY user_id ORDER BY login_time, logout_time) AS lag_logout_time
            FROM app_log
        ) AS s1
    ) AS s2
) AS s3
GROUP BY user_id, left_edge
ORDER BY user_id, min(login_time)

Result:

 user_id |     login_time      |     logout_time
---------+---------------------+---------------------
       1 | 2014-01-01 08:00:00 | 2014-01-01 10:49:00
       1 | 2014-01-01 10:55:00 | 2014-01-01 11:00:00
       2 | 2014-01-01 09:00:00 | 2014-01-01 11:49:00
       2 | 2014-01-01 11:55:00 | 2014-01-01 12:00:00
(4 rows)

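It works by first detecting the start of each new range (partitioned by user_id), then carrying that start forward across the rows of the range and grouping by it. I found I had to read that article very carefully to understand it!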
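To see the two steps on the sample data, one can stop before the final GROUP BY and look at the intermediate columns. The following is a sketch that reuses the subqueries from the query above; the extra output columns and the `WHERE user_id = 1` filter are additions for illustration only.

```sql
-- Show the intermediate columns for one user instead of grouping them away.
SELECT user_id, login_time, logout_time, lag_logout_time, new_start,
       -- Step 2: carry the most recent range start forward to every row.
       max(new_start) OVER (PARTITION BY user_id ORDER BY login_time, logout_time) AS left_edge
FROM (
    SELECT user_id, login_time, logout_time, lag_logout_time,
           -- Step 1: a row starts a new range unless it begins at or before
           -- the latest logout seen so far among the earlier rows.
           CASE
               WHEN login_time <= max(lag_logout_time) OVER (
                   PARTITION BY user_id ORDER BY login_time, logout_time
               ) THEN NULL
               ELSE login_time
           END AS new_start
    FROM (
        SELECT user_id, login_time, logout_time,
               lag(logout_time) OVER (PARTITION BY user_id ORDER BY login_time, logout_time) AS lag_logout_time
        FROM app_log
    ) AS s1
) AS s2
WHERE user_id = 1
ORDER BY login_time, logout_time;
```

On the sample data, new_start should be non-NULL only on the 08:00 and 10:55 rows (the first row of each merged range), so left_edge is 08:00 for the first four rows and 10:55 for the last one. That left_edge value is what the outer query groups by.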

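The article suggests that with PostgreSQL >= 9.0 it can be simplified by removing the innermost subquery and changing the window range, but I could not get that to work.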
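One way that simplification might look is sketched below. This is an assumption about what the article means, not the article's own code: the lag() subquery is dropped, and the running maximum is taken directly over logout_time with a frame that ends one row before the current row (`ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING`, a frame form available from PostgreSQL 9.0 on).

```sql
SELECT user_id, min(login_time) AS login_time, max(logout_time) AS logout_time
FROM (
    SELECT user_id, login_time, logout_time,
           max(new_start) OVER (PARTITION BY user_id ORDER BY login_time, logout_time) AS left_edge
    FROM (
        SELECT user_id, login_time, logout_time,
               CASE
                   -- Latest logout among all earlier rows of the same user.
                   -- For the first row the frame is empty, max() yields NULL,
                   -- the comparison is NULL, and the ELSE branch marks a new range.
                   WHEN login_time <= max(logout_time) OVER (
                       PARTITION BY user_id ORDER BY login_time, logout_time
                       ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
                   ) THEN NULL
                   ELSE login_time
               END AS new_start
        FROM app_log
    ) AS s1
) AS s2
GROUP BY user_id, left_edge
ORDER BY user_id, min(login_time);
```

On the sample data this should produce the same four rows as the query above. The explicit frame replaces the lag() step because "the previous rows' logout times" is exactly what lag() was feeding into the running maximum.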

Regarding sql - Flattening intersecting timespans per user in PostgreSQL, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/21928848/
