gpt4 book ai didi

clickhouse - 使用 ClickHouse 折叠重叠时间间隔

转载 作者:行者123 更新时间:2023-12-05 02:11:34 25 4
gpt4 key购买 nike

我阅读了类似的问题,可以通过使用窗口函数使其工作,但是,由于 ClickHouse 似乎不支持它们,我正在寻找替代解决方案。

给定像 (1, 5), (2, 3), (3, 8), (10, 15) 这样的时间间隔,我想将重叠的间隔“合并”为单个间隔。在我的示例中,它将是:

(1, 8) 和 (10, 15)。

感谢任何指点!谢谢!

最佳答案

这个任务很容易被 arrayReduce 解决如果此函数已与任意 lambda 一起使用。虽然这不存在,但请尝试通过可用的方式解决问题。

SELECT
intervals,

arraySort(x -> x, intervals) sortedIntervals,

/* try to merge each interval with precede ones */
arrayMap((x, index) -> index != 1
? (arrayReduce(
'min',
arrayMap(
i -> sortedIntervals[i + 1].1,
/* get indexes of intervals that can be merged with the current one (index is zero-based) */
arrayFilter(
i -> x.1 <= sortedIntervals[i + 1].2 AND x.2 >= sortedIntervals[i + 1].1,
range(index)))),
arrayReduce(
'max',
arrayMap(
i -> sortedIntervals[i + 1].2,
/* get indexes of intervals that can be merged with the current one (index is zero-based) */
arrayFilter(
i -> x.1 <= sortedIntervals[i + 1].2 AND x.2 >= sortedIntervals[i + 1].1,
range(index)))))
: x,
sortedIntervals,
arrayEnumerate(sortedIntervals)) rawResult,

/* filter out intervals nested to other ones */
arrayFilter(
(x, index) -> index == length(rawResult) OR x.1 != rawResult[index + 1].1,
rawResult,
arrayEnumerate(rawResult)) result
FROM
(
SELECT [(1, 5), (2, 3), (3, 8), (10, 15)] intervals
UNION ALL
SELECT [(2, 4), (1, 3), (3, 6), (12, 14), (7, 7), (13, 16), (9, 9), (8, 9), (10, 15)]
UNION ALL
SELECT [(20, 22), (18, 18), (16, 21), (1, 8), (2, 9), (3, 5), (10, 12), (11, 13), (14, 15)]
UNION ALL
SELECT []
UNION ALL
SELECT [(1, 11)]
)
FORMAT Vertical;

/*
Row 1:
──────
intervals: [(2,4),(1,3),(3,6),(12,14),(7,7),(13,16),(9,9),(8,9),(10,15)]
sortedIntervals: [(1,3),(2,4),(3,6),(7,7),(8,9),(9,9),(10,15),(12,14),(13,16)]
rawResult: [(1,3),(1,4),(1,6),(7,7),(8,9),(8,9),(10,15),(10,15),(10,16)]
result: [(1,6),(7,7),(8,9),(10,16)]

Row 2:
──────
intervals: [(1,5),(2,3),(3,8),(10,15)]
sortedIntervals: [(1,5),(2,3),(3,8),(10,15)]
rawResult: [(1,5),(1,5),(1,8),(10,15)]
result: [(1,8),(10,15)]

Row 3:
──────
intervals: [(20,22),(18,18),(16,21),(1,8),(2,9),(3,5),(10,12),(11,13),(14,15)]
sortedIntervals: [(1,8),(2,9),(3,5),(10,12),(11,13),(14,15),(16,21),(18,18),(20,22)]
rawResult: [(1,8),(1,9),(1,9),(10,12),(10,13),(14,15),(16,21),(16,21),(16,22)]
result: [(1,9),(10,13),(14,15),(16,22)]

Row 4:
──────
intervals: []
sortedIntervals: []
rawResult: []
result: []

Row 5:
──────
intervals: [(1,11)]
sortedIntervals: [(1,11)]
rawResult: [(1,11)]
result: [(1,11)]
*/

关于clickhouse - 使用 ClickHouse 折叠重叠时间间隔,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/57347444/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com