
python - Is it possible to group data by time interval?

Reposted · Author: 太空宇宙 · Updated: 2023-11-03 20:22:55

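I am trying to analyze satellite telemetry data. Our satellite has 178 channels, and we want to group them by update interval. For example, channel 1 sends a message every 10 seconds (there are 100 such channels), channel 2 every 20 seconds, channel 3 every 30 seconds, and channel 4 every 60 seconds. So each channel sends its data according to its own update time. Is it possible to sort the channels by time interval? For example, a 10-second group containing all channels that update every 10 seconds, and so on.

Data: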

channel 1:  3:25:15 (update time 10 secs)
channel 1: 3:25:25
channel 2: 3:25:35 (update time 20 secs)
channel 1: 3:25:35
channel 1: 3:25:45
channel 3: 3:25:45 (update time 30 secs)
channel 1: 3:25:55
channel 2: 3:25:55
channel 1: 3:26:05
channel 1: 3:26:15
channel 2: 3:26:15
channel 3: 3:26:15
channel 4: 3:26:15 (update time 60 secs)

The result I want:

group by 10 secs:

channel 1: 3:25:15
channel 1: 3:25:25
channel 1: 3:25:35
channel 1: 3:25:45
channel 1: 3:25:55
channel 1: 3:26:05

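And so on for each time interval.

Note: there are 178 channels in total. I do not know in advance which channels update every 10 seconds, every 20 seconds, and so on, so I have to sort them by their observed update time.

Best answer

OK, the following may need some tweaking, but you should get the idea :) Let's go step by step.

First, my data is stored in /tmp/data. I had to add another measurement for channel 4, otherwise it would be excluded (see below):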

$ cat /tmp/data
channel 1: 3:25:15 (update time 10 secs)
channel 1: 3:25:25
channel 2: 3:25:35 (update time 20 secs)
channel 1: 3:25:35
channel 1: 3:25:45
channel 3: 3:25:45 (update time 30 secs)
channel 1: 3:25:55
channel 2: 3:25:55
channel 1: 3:26:05
channel 1: 3:26:15
channel 2: 3:26:15
channel 3: 3:26:15
channel 4: 3:26:15 (update time 60 secs)
channel 4: 3:27:15 (update time 60 secs)

Now I create a loader function that builds a dictionary whose keys are channel numbers (int) and whose values are lists of update timestamps:

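from collections import defaultdict
from datetime import datetime

import re

def read_data(fpath):
    # Format: {channel_number: [timestamp1, timestamp2, ...]}
    data = defaultdict(list)
    with open(fpath) as f:
        for line in f:
            # Tokens look like ['channel', '1:', '3:25:15', ...]
            parts = re.findall(r'[:\w]+', line)
            if len(parts) < 3:
                continue  # skip blank or malformed lines
            # parts[1] is e.g. '1:', so drop the trailing colon
            data[int(parts[1][:-1])].append(parts[2])

    return data


data = read_data("/tmp/data")

# Sort timestamps by parsed time; a plain string sort would
# misorder hours of different width (e.g. '10:00:00' vs '3:25:15')
for channel in data:
    data[channel].sort(key=lambda t: datetime.strptime(t, '%H:%M:%S'))
print(data)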

This gives you (output slightly reformatted for readability):

defaultdict(<type 'list'>, {
    1: ['3:25:15', '3:25:25', '3:25:35', '3:25:45', '3:25:55', '3:26:05', '3:26:15'],
    2: ['3:25:35', '3:25:55', '3:26:15'],
    3: ['3:25:45', '3:26:15'],
    4: ['3:26:15', '3:27:15']
})
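
To make the parsing step easier to follow, here is a minimal sketch of what the regular expression picks out of a single line. The helper name parse_line is hypothetical (it is not part of the answer's code), and the sketch assumes every line starts with the word "channel" as in the sample data.

```python
import re

def parse_line(line):
    """Return (channel_number, timestamp) for a line such as
    'channel 2: 3:25:35 (update time 20 secs)', or None if it does not match."""
    # Tokens are runs of word characters and colons, e.g.
    # ['channel', '2:', '3:25:35', 'update', 'time', '20', 'secs']
    parts = re.findall(r'[:\w]+', line)
    if len(parts) < 3 or parts[0] != 'channel':
        return None
    channel = int(parts[1].rstrip(':'))  # '2:' -> 2
    return channel, parts[2]

print(parse_line("channel 2: 3:25:35 (update time 20 secs)"))  # (2, '3:25:35')
```

Returning None for lines that do not match lets a caller skip blank or malformed lines instead of failing with an IndexError.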

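Finally, the interesting code! We loop over this data and, for each channel, we:

  • compute the list of update intervals (the time differences between consecutive timestamps)
  • average that list and reduce it to an integer (the code uses integer division); this part may need more work with real data

We store the result in another dictionary whose keys are intervals and whose values are the lists of channels that appear to have that update interval: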
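
As a worked example of these two steps, the sketch below applies them to the three timestamps of channel 2 from the sample data. The function name update_diffs is hypothetical, and the sketch assumes the timestamps are already in ascending order and fall on the same day.

```python
from datetime import datetime

FMT = '%H:%M:%S'

def update_diffs(timestamps):
    """Seconds between consecutive timestamps (assumed ascending, same day)."""
    times = [datetime.strptime(t, FMT) for t in timestamps]
    # Pair each time with its successor and take the difference
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]

diffs = update_diffs(['3:25:35', '3:25:55', '3:26:15'])
print(diffs)                     # [20, 20]
print(sum(diffs) // len(diffs))  # 20
```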

# Identify intervals
channel_interval = defaultdict(list)
FMT = '%H:%M:%S'

for channel, report_times in data.items():
    # We need at least 2 samples to determine the interval -
    # channel 4 needed another entry for this to work
    if len(report_times) < 2:
        continue

    # Collect all time differences between reports for this channel
    diffs = []
    # Convert the first timestamp string to a datetime
    prev_time = datetime.strptime(report_times[0], FMT)

    for rt in report_times[1:]:
        cur_time = datetime.strptime(rt, FMT)
        diffs.append((cur_time - prev_time).seconds)
        prev_time = cur_time

    # Average the report time differences (integer division).
    # With real data you might need to be smarter here and round
    # to the nearest expected interval if needed
    interval = sum(diffs) // len(diffs)
    channel_interval[interval].append(channel)
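
The comment about being "smarter with real data" could be addressed in several ways. One possible sketch, which is not the answer's method, takes the median of the differences (so a single dropped message does not skew the result) and snaps it to the nearest multiple of a step. The function name estimate_interval and the default step of 10 seconds are assumptions based on the intervals mentioned in the question.

```python
from statistics import median

def estimate_interval(diffs, step=10):
    """Median of the time differences, rounded to the nearest multiple of step.

    step=10 is an assumption: the question only mentions 10/20/30/60 s intervals.
    """
    if not diffs:
        raise ValueError("need at least one time difference")
    mid = median(diffs)
    # int(x + 0.5) rounds positive values half up; never return less than one step
    return max(step, int(mid / step + 0.5) * step)

print(estimate_interval([10, 11, 9, 10, 31]))  # 10 (the 31 s gap is ignored)
print(estimate_interval([58, 61, 60, 122]))    # 60 (a plain average would give 75)
```

To try it, the line computing interval in the loop above could be swapped for interval = estimate_interval(diffs).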

Reporting: just loop over each entry in channel_interval and print the timestamps of every channel that falls in that interval:

# report
for interval, channels in channel_interval.items():
    print("Updating every {} seconds (channels={})".format(interval, channels))
    for channel in channels:
        hdr = '\nchannel {}: '.format(channel)
        print(hdr + hdr.join(data[channel]))
    print("\n")

The final output is:

Updating every 60 seconds (channels=[4])

channel 4: 3:26:15
channel 4: 3:27:15


Updating every 10 seconds (channels=[1])

channel 1: 3:25:15
channel 1: 3:25:25
channel 1: 3:25:35
channel 1: 3:25:45
channel 1: 3:25:55
channel 1: 3:26:05
channel 1: 3:26:15


Updating every 20 seconds (channels=[2])

channel 2: 3:25:35
channel 2: 3:25:55
channel 2: 3:26:15


Updating every 30 seconds (channels=[3])

channel 3: 3:25:45
channel 3: 3:26:15


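As I said, the above may need some small changes to handle real data, but it should be a good start. Let me know if you have questions.

Update 1: if you want the report printed in order of interval, you can loop like this: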

for interval, channels in sorted(channel_interval.items(), key=lambda x: x[0]):
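
As a quick check of the sorted loop, the sketch below rebuilds a small channel_interval mapping by hand (the values mirror the sample output and are entered in the same unsorted order) and prints the report headers in ascending interval order.

```python
from collections import defaultdict

# Hand-built stand-in for the interval -> channels mapping from the answer
channel_interval = defaultdict(list)
channel_interval[60].append(4)
channel_interval[10].append(1)
channel_interval[20].append(2)
channel_interval[30].append(3)

# Each item is an (interval, channels) tuple, so sorting on x[0] sorts by interval
ordered = sorted(channel_interval.items(), key=lambda x: x[0])
for interval, channels in ordered:
    print("Updating every {} seconds (channels={})".format(interval, channels))
```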

Regarding "python - Is it possible to group data by time interval?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/58042826/
