
python - Loop through a CSV file and get data according to a condition

Repost. Author: 太空宇宙. Updated: 2023-11-03 21:47:31

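I've run into a problem that none of the CSV posts I've looked at solve. I have a CSV file with thousands of rows whose first column holds a date and time stamp. A new timestamp appears roughly every 2 seconds.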

Note 1: a very important point (and the cause of my problem) is that each date and time appears several times.

Note 2: the dates are sorted.

My first rows (about 40):

30/07/2018 22:52:52,4,50,26

30/07/2018 22:52:52,7,49,26

30/07/2018 22:52:52,6,50,26

30/07/2018 22:52:52,5,51,26

30/07/2018 22:52:52,2,50,26

30/07/2018 22:52:52,3,49,26

30/07/2018 22:52:55,4,50,26

30/07/2018 22:52:55,7,49,26

30/07/2018 22:52:55,6,50,26

30/07/2018 22:52:55,5,51,26

30/07/2018 22:52:55,2,50,26

30/07/2018 22:52:55,3,49,26

30/07/2018 22:52:57,4,50,26

30/07/2018 22:52:57,7,49,26

30/07/2018 22:52:57,6,50,26

30/07/2018 22:52:57,5,51,26

30/07/2018 22:52:57,2,50,26

30/07/2018 22:52:57,3,49,26

30/07/2018 22:52:59,4,50,26

30/07/2018 22:52:59,7,49,26

30/07/2018 22:52:59,6,50,26

30/07/2018 22:52:59,5,51,26

30/07/2018 22:52:59,2,50,26

30/07/2018 22:52:59,3,49,26

30/07/2018 22:53:02,4,50,26

30/07/2018 22:53:02,7,49,26

30/07/2018 22:53:02,6,50,26

30/07/2018 22:53:02,5,51,26

30/07/2018 22:53:02,2,50,26

30/07/2018 22:53:02,3,49,26

30/07/2018 22:53:04,4,50,26

30/07/2018 22:53:04,7,49,26

30/07/2018 22:53:04,6,50,26

30/07/2018 22:53:04,5,51,26

30/07/2018 22:53:04,2,50,26

30/07/2018 22:53:04,3,49,26

30/07/2018 22:53:07,4,50,26

30/07/2018 22:53:07,7,49,26

30/07/2018 22:53:07,6,50,26

30/07/2018 22:53:07,5,51,26

30/07/2018 22:53:07,2,50,26

30/07/2018 22:53:07,3,49,26

30/07/2018 22:53:09,4,50,26

30/07/2018 22:53:09,7,49,26

30/07/2018 22:53:09,6,50,26

30/07/2018 22:53:09,5,50,26

30/07/2018 22:53:09,2,50,26

30/07/2018 22:53:09,3,49,26

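I need to take an input from the user (for example 5), then get the last timestamp every 5 seconds and build a dictionary from columns 2 and 3. So for an input of 5, I have to take these rows: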

30/07/2018 22:52:59,4,50,26

30/07/2018 22:52:59,7,49,26

30/07/2018 22:52:59,6,50,26

30/07/2018 22:52:59,5,51,26

30/07/2018 22:52:59,2,50,26

30/07/2018 22:52:59,3,49,26

30/07/2018 22:53:09,7,49,26

30/07/2018 22:53:09,6,50,26

30/07/2018 22:53:09,5,50,26

30/07/2018 22:53:09,2,50,26

30/07/2018 22:53:09,3,49,26

The dictionary should look like this:

{timestamp : {2nd column : 3rd column}}

{30/07/2018 22:52:59: {4:50,7:49,6:50,5:51,2:50,3:49}}
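Because the file is sorted (Note 2) and each timestamp repeats on adjacent rows (Note 1), that nested shape can be built by grouping consecutive rows. The following is only a minimal sketch under those assumptions; the `lines` list and the name `nested` are made up for illustration, and the values stay strings because `csv.reader` does not convert them.

```python
import csv
from itertools import groupby

# A few rows in the same layout as the question's file (illustrative subset).
lines = [
    "30/07/2018 22:52:59,4,50,26",
    "30/07/2018 22:52:59,7,49,26",
    "30/07/2018 22:53:02,4,50,26",
]

# groupby only merges adjacent rows with the same key,
# which is enough when the file is sorted by timestamp.
nested = {
    timestamp: {row[1]: row[2] for row in rows}
    for timestamp, rows in groupby(csv.reader(lines), key=lambda row: row[0])
}

print(nested)
```

This groups every timestamp; selecting only some of them by elapsed time is a separate step.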

What I have so far only picks up one row per timestamp, so I end up with this dictionary:

{30/07/2018 22:52:59: {4:50}, 30/07/2018 22:53:09:{4:50}}

Here is my code:

import csv
import os
from datetime import datetime as dt

# inputPath, filename, scenariotimeStamps and dicComplete are assumed to be defined earlier
with open(os.path.join(inputPath, filename), "r") as f:
    dictTemp = {}
    r = csv.reader(f)
    # Gets first date from node file
    minTime = dt.strptime(next(r)[0], "%d/%m/%Y %H:%M:%S")
    # Loop through the remaining rows
    for line in r:
        currentTime = dt.strptime(line[0], "%d/%m/%Y %H:%M:%S")
        if (currentTime - minTime).total_seconds() > 5:
            minTime = currentTime
            scenariotimeStamps.append(currentTime.strftime("%Y%m%d%H%M%S"))
            dictTemp[line[1]] = line[2]
            dicComplete[str(currentTime.strftime("%Y%m%d%H%M%S"))] = dictTemp

Best answer

With:

dictTemp[line[1]] = line[2]
dicComplete[str(currentTime.strftime("%Y%m%d%H%M%S"))] = dictTemp

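you overwrite the entry dicComplete[str(currentTime.strftime("%Y%m%d%H%M%S"))] on every iteration, and every entry ends up referring to the same dictTemp object. Replace the two lines with: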

dicComplete.setdefault(str(currentTime.strftime("%Y%m%d%H%M%S")), {})[line[1]] = line[2]
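To make the difference concrete, here is a small side-by-side sketch. The `rows` tuples and the names `shared` and `separate` are hypothetical and exist only for this illustration; `dictTemp` mirrors the variable in the question.

```python
# (timestamp key, 2nd column, 3rd column), illustrative values only
rows = [
    ("20180730225259", "4", "50"),
    ("20180730225259", "7", "49"),
    ("20180730225309", "4", "50"),
]

# Pattern from the question: the single dictTemp object is stored under every key.
dictTemp = {}
shared = {}
for stamp, key, value in rows:
    dictTemp[key] = value
    shared[stamp] = dictTemp

# setdefault pattern: each timestamp gets its own inner dict on first sight.
separate = {}
for stamp, key, value in rows:
    separate.setdefault(stamp, {})[key] = value

# Both keys of `shared` point at one and the same dict object.
print(shared["20180730225259"] is shared["20180730225309"])
print(separate)
```

With the shared object, anything written for one timestamp shows up under all of them, whereas `setdefault` keeps the inner dictionaries independent.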

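Also, since you want to pick up all rows that share a timestamp once that timestamp is confirmed to be more than 5 seconds after the last one, instead of: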

if((currentTime-minTime).total_seconds() > 5):

you should also allow the case where currentTime equals minTime:

if currentTime == minTime or (currentTime-minTime).total_seconds() > 5:
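Putting both changes together, one possible sketch looks like the following. It is not the answer's exact code: the function name `collect_by_interval`, the constant `TIME_FORMAT`, and the separate `kept` variable are assumptions made for this example. Tracking `kept` apart from the reference time means that equality with the very first timestamp does not by itself select that timestamp's remaining rows. The strict `>` comparison from the question is kept unchanged.

```python
import csv
from datetime import datetime as dt

TIME_FORMAT = "%d/%m/%Y %H:%M:%S"


def collect_by_interval(lines, interval):
    """Return {timestamp: {column 2: column 3}} for every timestamp that is
    more than `interval` seconds after the previously kept one."""
    result = {}
    reference = None  # time the interval is measured from
    kept = None       # timestamp whose rows are currently being collected
    for line in csv.reader(lines):
        if not line:
            continue  # skip blank lines
        current = dt.strptime(line[0], TIME_FORMAT)
        if reference is None:
            reference = current  # the first row only sets the starting point
            continue
        if current == kept or (current - reference).total_seconds() > interval:
            reference = kept = current
            key = current.strftime("%Y%m%d%H%M%S")
            result.setdefault(key, {})[line[1]] = line[2]
    return result


# Illustrative subset of the question's rows.
sample = [
    "30/07/2018 22:52:52,4,50,26",
    "30/07/2018 22:52:52,7,49,26",
    "30/07/2018 22:52:55,4,50,26",
    "30/07/2018 22:52:59,4,50,26",
    "30/07/2018 22:52:59,7,49,26",
    "30/07/2018 22:53:02,4,50,26",
]
print(collect_by_interval(sample, 5))
```

Since `csv.reader` accepts any iterable of strings, an open file object can be passed in place of `sample`, and the interval could come from something like `int(input())`.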

Regarding "python - Loop through a CSV file and get data according to a condition", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52381242/
