sql - Splitting a time series by field value with Google BigQuery

Reposted. Author: 行者123. Updated: 2023-12-04 15:40:55

I have a dataset in Google BigQuery that contains vehicle positions over time, along with each vehicle's direction of travel relative to its base, e.g.

time |  x  |  y  | direction | vehicle_id
-----|-----|-----|-----------|-----------
0:00 | ... | ... | returning | 100
0:00 | ... | ... | returning | 200
0:00 | ... | ... | exploring | 300
0:05 | ... | ... | returning | 100
0:05 | ... | ... | exploring | 200
0:05 | ... | ... | exploring | 300
0:10 | ... | ... | exploring | 100
0:10 | ... | ... | exploring | 200
0:10 | ... | ... | exploring | 300
0:15 | ... | ... | exploring | 100
0:15 | ... | ... | exploring | 200
0:15 | ... | ... | returning | 300

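I can easily aggregate by vehicle, but I can't come up with a query that breaks each vehicle's series into "trips", each consisting of consecutive occurrences of "returning" or "exploring". I have read about analytic functions, but none of them seems to fit.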

SELECT
  vehicle_id,
  ARRAY_AGG(
    STRUCT(direction, time, x, y)
    ORDER BY time) AS series
FROM t
GROUP BY vehicle_id;

[
  {
    "vehicle_id": 100,
    "series":
    [
      {"direction": "returning", "time": "0:00", "x": ..., "y": ...},
      {"direction": "returning", "time": "0:05", "x": ..., "y": ...},
      {"direction": "exploring", "time": "0:10", "x": ..., "y": ...},
      {"direction": "exploring", "time": "0:15", "x": ..., "y": ...}
    ]
  },
  {
    "vehicle_id": 200,
    "series":
    [
      {"direction": "returning", "time": "0:00", "x": ..., "y": ...},
      {"direction": "exploring", "time": "0:05", "x": ..., "y": ...},
      {"direction": "exploring", "time": "0:10", "x": ..., "y": ...},
      {"direction": "exploring", "time": "0:15", "x": ..., "y": ...}
    ]
  },
  {
    "vehicle_id": 300,
    "series":
    [
      {"direction": "exploring", "time": "0:00", "x": ..., "y": ...},
      {"direction": "exploring", "time": "0:05", "x": ..., "y": ...},
      {"direction": "exploring", "time": "0:10", "x": ..., "y": ...},
      {"direction": "returning", "time": "0:15", "x": ..., "y": ...}
    ]
  }
]

What I really want is, for each vehicle, a series of trips, where each trip has a direction and a sequence of (t, x, y) positions. Is this possible?

Best answer

The following is for BigQuery Standard SQL and uses pure SQL to get exactly the result you describe:

#standardSQL
SELECT vehicle_id, ARRAY_AGG(STRUCT(direction, trip)) AS trips
FROM (
  SELECT vehicle_id, direction, ARRAY_AGG(STRUCT(time, x, y) ORDER BY time) AS trip
  FROM dataset
  GROUP BY vehicle_id, direction
)
GROUP BY vehicle_id

Applied to the sample data from your question, it looks like this:

#standardSQL
WITH dataset AS (
  SELECT
    TIMESTAMP '2019-09-07 00:00:00' AS time,
    0.1 AS x, 0.1 AS y, 'returning' AS direction,
    100 AS vehicle_id
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:00', 0.2, 0.2, 'returning', 200
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:00', 0.3, 0.3, 'exploring', 300
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:05', 1.1, 1.1, 'returning', 100
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:05', 1.2, 1.2, 'exploring', 200
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:05', 1.3, 1.3, 'exploring', 300
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:10', 2.1, 2.1, 'exploring', 100
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:10', 2.2, 2.2, 'exploring', 200
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:10', 2.3, 2.3, 'exploring', 300
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:15', 3.1, 3.1, 'exploring', 100
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:15', 3.2, 3.2, 'exploring', 200
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:15', 3.3, 3.3, 'returning', 300
)
SELECT vehicle_id, ARRAY_AGG(STRUCT(direction, trip)) AS trips
FROM (
  SELECT vehicle_id, direction, ARRAY_AGG(STRUCT(time, x, y) ORDER BY time) AS trip
  FROM dataset
  GROUP BY vehicle_id, direction
)
GROUP BY vehicle_id

The result is:

[Image: screenshot of the query result, not preserved]
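
One thing to keep in mind: the query above groups by (vehicle_id, direction), so if a vehicle comes back to a direction it has travelled before (for example returning, exploring, returning), both "returning" runs would end up in the same trip. The sample data does not show this because each vehicle has at most one run per direction. A minimal sketch of a variant that splits on consecutive runs, using LAG and a running SUM (the usual "gaps and islands" pattern), might look like the following. The CTE names flagged and numbered, the columns is_new_trip and trip_no, and the four sample rows are hypothetical and chosen only for illustration.

#standardSQL
WITH dataset AS (
  -- Hypothetical rows: vehicle 100 returns, explores twice, then returns again
  SELECT TIMESTAMP '2019-09-07 00:00:00' AS time,
         0.1 AS x, 0.1 AS y, 'returning' AS direction, 100 AS vehicle_id
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:05', 1.1, 1.1, 'exploring', 100
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:10', 2.1, 2.1, 'exploring', 100
  UNION ALL SELECT TIMESTAMP '2019-09-07 00:00:15', 3.1, 3.1, 'returning', 100
),
flagged AS (
  -- 1 when the direction differs from the vehicle's previous row (or there is no previous row)
  SELECT *,
    IF(direction = LAG(direction) OVER (PARTITION BY vehicle_id ORDER BY time), 0, 1) AS is_new_trip
  FROM dataset
),
numbered AS (
  -- The running total of the flags gives every consecutive run its own number
  SELECT *,
    SUM(is_new_trip) OVER (PARTITION BY vehicle_id ORDER BY time) AS trip_no
  FROM flagged
)
SELECT vehicle_id, ARRAY_AGG(STRUCT(direction, trip) ORDER BY trip_no) AS trips
FROM (
  SELECT vehicle_id, trip_no, direction,
    ARRAY_AGG(STRUCT(time, x, y) ORDER BY time) AS trip
  FROM numbered
  GROUP BY vehicle_id, trip_no, direction
)
GROUP BY vehicle_id

With these rows, vehicle 100 should come out with three trips (returning, exploring, returning) rather than two. This assumes time is unique per vehicle; with ties the window ordering, and therefore the trip boundaries, would not be deterministic.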

Regarding "sql - Splitting a time series by field value with Google BigQuery", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57830681/
