
sql - Populating a time series for each origin-destination pair in PostgreSQL using cross joins

Reposted. Author: 行者123. Updated: 2023-11-29 13:09:05

SQL Fiddle: http://sqlfiddle.com/#!17/0ba48b

Suppose I have a table like this:

CREATE TABLE trips (origin int, destination int, t TIMESTAMP);

INSERT INTO trips VALUES
(1, 1, '2016-06-22 08:02:00'),
(1, 3, '2016-06-22 08:10:00'),
(1, 3, '2016-06-22 08:12:00'),
(2, 1, '2016-06-22 08:49:00');

The set of origins/destinations is {1, 2, 3}. I want to generate a table like this:

origin | destination |    t     | count
-------+-------------+----------+------
     1 |           1 | 08:00:00 |     1
     1 |           1 | 08:15:00 |     0
     1 |           1 | 08:30:00 |     0
     1 |           1 | 08:45:00 |     0
     1 |           2 | 08:00:00 |     0
   ... |             |          |
     1 |           3 | 08:00:00 |     2
     1 |           3 | 08:15:00 |     0

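Basically, for each origin-destination pair, I want to generate the whole time series (at 15-minute intervals), aggregate the trips, and insert 0/null where there are none.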
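As a small illustration of the 15-minute binning described above, here is a minimal sketch using `date_bin`. It assumes PostgreSQL 14 or later, which may not match the version behind the fiddle; on older versions an epoch-based floor to 900 seconds does the same job. The anchor timestamp `2000-01-01` is an arbitrary choice that happens to be aligned to the hour, and the inline values simply repeat the sample trips.

```sql
-- Assumption: PostgreSQL 14+ (date_bin is not available before that).
-- The third argument is an arbitrary anchor the bins are aligned to.
SELECT v.t,
       date_bin('15 minutes', v.t, TIMESTAMP '2000-01-01') AS bin
FROM (VALUES (TIMESTAMP '2016-06-22 08:02:00'),
             (TIMESTAMP '2016-06-22 08:10:00'),
             (TIMESTAMP '2016-06-22 08:12:00'),
             (TIMESTAMP '2016-06-22 08:49:00')) AS v (t)
ORDER BY v.t;
```

The first three timestamps fall into the 08:00 bin and the last into the 08:45 bin. Binning alone never produces the empty 08:15 and 08:30 bins, which is why a generated series is needed as well.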

My attempt so far:

WITH
-- generate bins
bins AS (
    SELECT bin
    FROM (SELECT generate_series(
                     -- start at the earliest trip, truncated to the hour
                     date_trunc('hour', (SELECT min(t) FROM trips)),
                     -- stop one day after the latest trip, truncated to the hour
                     date_trunc('hour', (SELECT max(t) FROM trips)) + '1 day'::interval,
                     -- 15-minute intervals
                     '15 minutes'::interval
                 ) AS bin) intervals
    WHERE extract('ISODOW' FROM bin) < 6 -- exclude weekends
),
-- bin the od table https://gis.stackexchange.com/a/127874/13965
od AS (
    SELECT count(*) cnt, origin, destination,
           -- this bins it to 15 minutes (900 s), but has many missing values
           to_timestamp(floor((extract('epoch' FROM t) / 900)) * 900)
               AT TIME ZONE 'UTC' AS interval_alias_exit
    FROM trips
    GROUP BY interval_alias_exit, origin, destination
)
-- merge the two
SELECT *
FROM bins
LEFT JOIN od
       ON bins.bin = od.interval_alias_exit
WHERE origin = '1'

But this does not create a time series for each OD pair. For example, when origin is 1, it returns only two rows. See the fiddle for an example.
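One likely reason for the two rows is that the `WHERE` clause is applied after the `LEFT JOIN`: bins with no matching trip have a NULL `origin`, so the filter discards them. The sketch below reproduces the effect with hypothetical inline data (the CTE names and values are made up for illustration and are not the fiddle's tables).

```sql
-- Filter in WHERE: unmatched bins carry origin = NULL and are dropped,
-- so only the 08:00 row survives.
WITH bins (bin) AS (
    VALUES (TIMESTAMP '2016-06-22 08:00:00'),
           (TIMESTAMP '2016-06-22 08:15:00'),
           (TIMESTAMP '2016-06-22 08:30:00')
),
od (origin, bin, cnt) AS (
    VALUES (1, TIMESTAMP '2016-06-22 08:00:00', 2)
)
SELECT bins.bin, od.origin, od.cnt
FROM bins
LEFT JOIN od ON od.bin = bins.bin
WHERE od.origin = 1;

-- Filter in ON: all three bins are kept; the empty ones show NULL
-- for origin and cnt.
WITH bins (bin) AS (
    VALUES (TIMESTAMP '2016-06-22 08:00:00'),
           (TIMESTAMP '2016-06-22 08:15:00'),
           (TIMESTAMP '2016-06-22 08:30:00')
),
od (origin, bin, cnt) AS (
    VALUES (1, TIMESTAMP '2016-06-22 08:00:00', 2)
)
SELECT bins.bin, od.origin, od.cnt
FROM bins
LEFT JOIN od ON od.bin = bins.bin
         AND od.origin = 1
ORDER BY bins.bin;
```

Moving the filter into `ON` keeps the bins, but the origin and destination are still NULL in the empty rows, so it does not yet give one series per OD pair. For that, the pairs themselves have to be generated.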

Best answer

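You need to cross join all points with all points to get every origin-destination pair, then cross join that with the time series. Then you can left join the trips whose origin and destination match the pair and whose timestamp is greater than or equal to the series timestamp but less than the series timestamp plus 15 minutes. Finally, aggregate by origin, destination, and series timestamp, and count the trips' timestamps.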

SELECT o.origin,
       d.destination,
       gs.t,
       count(t.t) AS count  -- counts only matched trips, so empty bins give 0
-- every point that appears as an origin or a destination
FROM (SELECT origin
      FROM trips
      UNION
      SELECT destination
      FROM trips) o
CROSS JOIN (SELECT destination
            FROM trips
            UNION
            SELECT origin
            FROM trips) d
-- 15-minute series from the earliest to the latest trip, both floored to 15 minutes
CROSS JOIN generate_series((SELECT date_trunc('hour', min(t))
                                   + concat(floor(extract('minutes' FROM min(t)) / 15) * 15,
                                            ' minutes')::interval
                            FROM trips),
                           (SELECT date_trunc('hour', max(t))
                                   + concat(floor(extract('minutes' FROM max(t)) / 15) * 15,
                                            ' minutes')::interval
                            FROM trips),
                           '15 minutes'::interval) gs (t)
-- attach the trips that fall into each pair's bin
LEFT JOIN trips t
       ON t.origin = o.origin
      AND t.destination = d.destination
      AND t.t >= gs.t
      AND t.t < gs.t + '15 minutes'::interval
GROUP BY o.origin,
         d.destination,
         gs.t
ORDER BY o.origin,
         d.destination,
         gs.t;

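(The time series boundaries in your desired sample result seem to follow a different logic from the query you posted. I tried to match the former. But I expect you will work out how to change that to suit your actual needs.)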
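As one possible way to change the boundaries, the sketch below runs the series over whole hours instead of flooring both ends to 15 minutes. It assumes the `trips` table from the question. The CTE names `points` and `bounds`, the column `p`, and the alias `trip_count` are hypothetical names introduced here, and the choice of "last hour plus 45 minutes" as the end is an assumption about what is wanted.

```sql
-- Sketch: same joins as the answer, but with whole-hour series bounds.
-- Assumes the trips(origin, destination, t) table from the question.
WITH points AS (
    -- every point that appears as an origin or a destination
    SELECT origin AS p FROM trips
    UNION
    SELECT destination FROM trips
),
bounds AS (
    -- first bin: hour of the earliest trip; last bin: final quarter
    -- of the hour that contains the latest trip
    SELECT date_trunc('hour', min(t)) AS first_bin,
           date_trunc('hour', max(t)) + interval '45 minutes' AS last_bin
    FROM trips
)
SELECT o.p AS origin,
       d.p AS destination,
       gs.bin,
       count(t.t) AS trip_count
FROM points o
CROSS JOIN points d
CROSS JOIN bounds b
CROSS JOIN LATERAL generate_series(b.first_bin,
                                   b.last_bin,
                                   interval '15 minutes') AS gs (bin)
LEFT JOIN trips t
       ON t.origin = o.p
      AND t.destination = d.p
      AND t.t >= gs.bin
      AND t.t <  gs.bin + interval '15 minutes'
GROUP BY o.p, d.p, gs.bin
ORDER BY o.p, d.p, gs.bin;
```

With the four sample trips this yields 9 pairs times 4 bins (08:00 through 08:45). Changing the two expressions in `bounds` is enough to switch to other boundaries, for example adding `interval '1 day'` to the end as in the original attempt.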


Regarding sql - Populating a time series for each origin-destination pair in PostgreSQL using cross joins, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57153442/
