python-2.7 - How to handle this complex logic in python pandas?

Reposted. Author: 行者123. Updated: 2023-12-01 00:18:37

I have some data with the following structure. It is held in a python pandas DataFrame, which I have named df.

Data1,Data2,Flag
2016-04-29,00:40:15,1
2016-04-29,00:40:24,2
2016-04-29,00:40:35,2
2015-04-29,00:40:36,2
2015-04-29,00:40:43,2
2015-04-29,00:40:45,2
2015-04-29,00:40:55,1
2015-04-29,00:41:05,1
2015-04-29,00:41:16,1
2015-04-29,00:41:17,2
.....................
.....................
2016-11-29,11:52:36,2
2016-11-29,11:52:43,2
2016-11-29,11:52:45,2
2016-11-29,11:52:55,1

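I want to select the rows that meet the following requirements.

  1. The first row's timestamp is 2016-04-29,00:40:15. From there I want the next row in this DataFrame whose time is more than 18 seconds after the previous row I took. The second row I get would be 2016-04-29,00:40:35,2 and the third would be 2015-04-29,00:40:55,1.
  2. If the next row's Flag differs from the Flag of the previous row I took, I take that row whether or not more than 18 seconds have passed.

Applying both requirements together, I would get the following data: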

Data1,Data2,Flag
2016-04-29,00:40:15,1
2016-04-29,00:40:24,2
2015-04-29,00:40:43,2
2015-04-29,00:40:55,1
2015-04-29,00:41:16,1
2015-04-29,00:41:17,2
.....................

Best answer

Here, try this:

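import pandas as pd

# Parse the time-of-day strings so they can be compared and offset.
df['Data2'] = pd.to_timedelta(df['Data2'])

tdf = df.copy()  # working copy that shrinks on every pass
sel_idx = []     # index labels of the rows to keep
while len(tdf) > 0:
    # The first remaining row is always kept.
    sel_idx.append(tdf.index[0])
    # Rule 1: more than 18 seconds after the row just kept.
    cond1 = tdf['Data2'] > tdf.loc[sel_idx[-1], 'Data2'] + pd.to_timedelta(18, 's')
    # Rule 2: later than the row just kept and with a different Flag.
    cond2 = (tdf['Flag'] != tdf.loc[sel_idx[-1], 'Flag']) & (tdf['Data2'] > tdf.loc[sel_idx[-1], 'Data2'])
    tdf = tdf[cond1 | cond2]
df.loc[sel_idx, :]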
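
To make the two rules easier to follow, here is a minimal sketch in plain Python that walks the rows once and compares each row only with the most recently kept one. It is not the answer's method: `select_rows` and `to_seconds` are hypothetical helper names, the rows are assumed to be sorted by time of day, and only the Data2 times from the sample are used. In the pandas loop a row filtered out against an earlier kept row never comes back, so the two approaches agree on the sample rows but may disagree on other inputs.

```python
def select_rows(rows, gap_seconds=18):
    """Return the positions of the rows kept by the two rules.

    rows: list of (seconds, flag) tuples, assumed sorted by time.
    """
    kept = []
    for pos, (t, flag) in enumerate(rows):
        if not kept:
            kept.append(pos)  # the first row is always kept
            continue
        last_t, last_flag = rows[kept[-1]]
        # Rule 1: more than gap_seconds after the last kept row.
        later_than_gap = t > last_t + gap_seconds
        # Rule 2: Flag differs from the last kept row (and time moved on).
        flag_changed = flag != last_flag and t > last_t
        if later_than_gap or flag_changed:
            kept.append(pos)
    return kept


def to_seconds(hms):
    """Convert an 'HH:MM:SS' string to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


# Data2 and Flag values of the sample rows from the question.
sample = [
    ("00:40:15", 1), ("00:40:24", 2), ("00:40:35", 2), ("00:40:36", 2),
    ("00:40:43", 2), ("00:40:45", 2), ("00:40:55", 1), ("00:41:05", 1),
    ("00:41:16", 1), ("00:41:17", 2), ("11:52:36", 2), ("11:52:43", 2),
    ("11:52:45", 2), ("11:52:55", 1),
]
rows = [(to_seconds(t), flag) for t, flag in sample]
print(select_rows(rows))  # [0, 1, 4, 6, 8, 9, 10, 13]
```

The printed positions are the same index labels that the pandas loop selects for this sample.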

Test

Code:

import pandas as pd
from io import StringIO

data = StringIO("""Data1,Data2,Flag
2016-04-29,00:40:15,1
2016-04-29,00:40:24,2
2016-04-29,00:40:35,2
2015-04-29,00:40:36,2
2015-04-29,00:40:43,2
2015-04-29,00:40:45,2
2015-04-29,00:40:55,1
2015-04-29,00:41:05,1
2015-04-29,00:41:16,1
2015-04-29,00:41:17,2
2016-11-29,11:52:36,2
2016-11-29,11:52:43,2
2016-11-29,11:52:45,2
2016-11-29,11:52:55,1""")

df = pd.read_csv(data)
df['Data2'] = pd.to_timedelta(df['Data2'])
print("Input\n", df)

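tdf = df.copy()  # working copy that shrinks on every pass
sel_idx = []     # index labels of the rows to keep
while len(tdf) > 0:
    # The first remaining row is always kept.
    sel_idx.append(tdf.index[0])
    # Rule 1: more than 18 seconds after the row just kept.
    cond1 = tdf['Data2'] > tdf.loc[sel_idx[-1], 'Data2'] + pd.to_timedelta(18, 's')
    # Rule 2: later than the row just kept and with a different Flag.
    cond2 = (tdf['Flag'] != tdf.loc[sel_idx[-1], 'Flag']) & (tdf['Data2'] > tdf.loc[sel_idx[-1], 'Data2'])
    tdf = tdf[cond1 | cond2]
print("Output\n", df.loc[sel_idx, :])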

Output:

Input
         Data1     Data2  Flag
0   2016-04-29  00:40:15     1
1   2016-04-29  00:40:24     2
2   2016-04-29  00:40:35     2
3   2015-04-29  00:40:36     2
4   2015-04-29  00:40:43     2
5   2015-04-29  00:40:45     2
6   2015-04-29  00:40:55     1
7   2015-04-29  00:41:05     1
8   2015-04-29  00:41:16     1
9   2015-04-29  00:41:17     2
10  2016-11-29  11:52:36     2
11  2016-11-29  11:52:43     2
12  2016-11-29  11:52:45     2
13  2016-11-29  11:52:55     1

Output
         Data1     Data2  Flag
0   2016-04-29  00:40:15     1
1   2016-04-29  00:40:24     2
4   2015-04-29  00:40:43     2
6   2015-04-29  00:40:55     1
8   2015-04-29  00:41:16     1
9   2015-04-29  00:41:17     2
10  2016-11-29  11:52:36     2
13  2016-11-29  11:52:55     1
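
Note that the loop compares only the Data2 time of day, so the date in Data1 plays no part in the selection. If the date should count as well, one possible approach (an assumption, not part of the answer) is to join the two columns into a single timestamp before comparing. The sketch below uses the first three sample rows; `ts` is a hypothetical column name.

```python
import pandas as pd

# First three rows of the sample data from the question.
df = pd.DataFrame({
    "Data1": ["2016-04-29", "2016-04-29", "2016-04-29"],
    "Data2": ["00:40:15", "00:40:24", "00:40:35"],
    "Flag": [1, 2, 2],
})

# Join the date and time strings, then parse them as one timestamp column.
df["ts"] = pd.to_datetime(df["Data1"] + " " + df["Data2"])

# Gap to the first row, compared against the 18-second threshold.
gap = df["ts"] - df["ts"].iloc[0]
over_18s = gap > pd.Timedelta(seconds=18)
print(over_18s.tolist())  # [False, False, True]
```

With full timestamps the rows would also need to be in chronological order for a "next row" rule to make sense, which the sample data (2016 dates followed by 2015 dates) does not guarantee.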

Regarding "python-2.7 - How to handle this complex logic in python pandas?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38910438/