
pandas - Select rows based on column values

Reposted. Author: 行者123. Last updated: 2023-12-04 02:23:16

I have a data frame like this:

import pandas as pd

data = {'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9],
        'Doc': ['Order', 'Order', 'Inv', 'Order', 'Order', 'Shp', 'Order', 'Order', 'Inv'],
        'Rep': [101, 101, 101, 102, 102, 102, 103, 103, 103]}
frame = pd.DataFrame(data)


     Doc  ID  Rep
0  Order   1  101
1  Order   2  101
2    Inv   3  101
3  Order   4  102
4  Order   5  102
5    Shp   6  102
6  Order   7  103
7  Order   8  103
8    Inv   9  103

Now I want to select all rows, but only for those Reps that have at least one row with Doc type Inv.

The data frame I want is:

     Doc  ID  Rep
0  Order   1  101
1  Order   2  101
2    Inv   3  101
6  Order   7  103
7  Order   8  103
8    Inv   9  103

Every Rep will have rows with Doc type Order, so I tried something like this:

frame[frame.Rep == frame.Rep[frame.Doc == 'Inv']] 

But I get an error:

ValueError: Can only compare identically-labeled Series objects
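
The error can be reproduced in isolation. The sketch below assumes the same frame as in the question; the names inv_reps and raised are only illustrative. It suggests that the comparison fails because the right-hand side is a Series with 2 rows (index labels 2 and 8), while frame.Rep has 9 rows, and == between two Series requires identical labels.

```python
import pandas as pd

data = {'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9],
        'Doc': ['Order', 'Order', 'Inv', 'Order', 'Order', 'Shp', 'Order', 'Order', 'Inv'],
        'Rep': [101, 101, 101, 102, 102, 102, 103, 103, 103]}
frame = pd.DataFrame(data)

# Right-hand side of the failing comparison: Rep values on the 'Inv' rows only.
inv_reps = frame.Rep[frame.Doc == 'Inv']
print(len(frame.Rep), len(inv_reps))  # 9 2

# Comparing two Series whose index labels differ raises ValueError.
try:
    frame.Rep == inv_reps
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```

What is needed here is a membership test ("is this Rep one of the Inv Reps?") rather than an element-wise equality test.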

Best answer

You can use boolean indexing twice: first get all Rep values that match the condition, then get all rows for those Reps with isin:

a = frame.loc[frame['Doc'] == 'Inv', 'Rep']
print(a)
2    101
8    103
Name: Rep, dtype: int64

df = frame[frame['Rep'].isin(a)]
print(df)
     Doc  ID  Rep
0  Order   1  101
1  Order   2  101
2    Inv   3  101
6  Order   7  103
7  Order   8  103
8    Inv   9  103
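
As an alternative that is not part of the original answer, the same selection can probably be written as a single mask with groupby and transform('any'). This is a minimal sketch assuming the small frame from the question; has_inv and result are illustrative names.

```python
import pandas as pd

data = {'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9],
        'Doc': ['Order', 'Order', 'Inv', 'Order', 'Order', 'Shp', 'Order', 'Order', 'Inv'],
        'Rep': [101, 101, 101, 102, 102, 102, 103, 103, 103]}
frame = pd.DataFrame(data)

# True for every row whose Rep group contains at least one 'Inv' row.
has_inv = frame['Doc'].eq('Inv').groupby(frame['Rep']).transform('any')

result = frame[has_inv]
print(result)
```

The mask has the same index as frame, so it can be used for boolean indexing directly, without building the intermediate Series of Rep values.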

A solution with query:

a = frame.query("Doc == 'Inv'")['Rep']
df = frame.query("Rep in @a")
print(df)
     Doc  ID  Rep
0  Order   1  101
1  Order   2  101
2    Inv   3  101
6  Order   7  103
7  Order   8  103
8    Inv   9  103
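
In a query expression, the @ prefix refers to a Python variable from the surrounding scope rather than a column. The sketch below, again assuming the small frame from the question, checks that the query version and the isin version select the same rows; by_query and by_isin are illustrative names.

```python
import pandas as pd

data = {'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9],
        'Doc': ['Order', 'Order', 'Inv', 'Order', 'Order', 'Shp', 'Order', 'Order', 'Inv'],
        'Rep': [101, 101, 101, 102, 102, 102, 103, 103, 103]}
frame = pd.DataFrame(data)

# Rep values on the 'Inv' rows; referenced inside the query string as @a.
a = frame.query("Doc == 'Inv'")['Rep']

by_query = frame.query("Rep in @a")
by_isin = frame[frame['Rep'].isin(a)]

# Both approaches should return the same rows in the same order.
print(by_query.equals(by_isin))
```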

Timings:

import numpy as np
import pandas as pd

np.random.seed(123)
N = 1000000
L = ['Order', 'Shp', 'Inv']
frame = pd.DataFrame({'Doc': np.random.choice(L, N, p=[0.49, 0.5, 0.01]),
                      'ID': np.arange(1, N + 1),
                      'Rep': np.random.randint(1000, size=N)})
print(frame.head())

     Doc  ID  Rep
0    Shp   1   95
1  Order   2  147
2  Order   3  282
3    Shp   4   82
4    Shp   5  746

In [204]: %timeit (frame.groupby('Rep').filter(lambda x: 'Inv' in x['Doc'].values))
1 loop, best of 3: 250 ms per loop

In [205]: %timeit (frame[frame['Rep'].isin(frame.loc[frame['Doc'] == 'Inv', 'Rep'])])
100 loops, best of 3: 17.3 ms per loop

In [206]: %%timeit
...: a = frame.query("Doc == 'Inv'")['Rep']
...: frame.query("Rep in @a")
...:
100 loops, best of 3: 14.5 ms per loop
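
The %timeit commands above only work in IPython. Outside IPython, a comparison along the same lines could be sketched with the standard timeit module. The function names below are illustrative, and N is reduced to 100,000 rows here (an assumption, to keep the run short), so the numbers will not match the ones above.

```python
import timeit

import numpy as np
import pandas as pd

np.random.seed(123)
N = 100_000  # assumption: smaller than the 1,000,000 rows used in the timings above
L = ['Order', 'Shp', 'Inv']
frame = pd.DataFrame({'Doc': np.random.choice(L, N, p=[0.49, 0.5, 0.01]),
                      'ID': np.arange(1, N + 1),
                      'Rep': np.random.randint(1000, size=N)})

def with_filter():
    # Calls the lambda once per Rep group.
    return frame.groupby('Rep').filter(lambda x: 'Inv' in x['Doc'].values)

def with_isin():
    # Two vectorized passes: boolean mask, then isin.
    return frame[frame['Rep'].isin(frame.loc[frame['Doc'] == 'Inv', 'Rep'])]

def with_query():
    a = frame.query("Doc == 'Inv'")['Rep']
    return frame.query("Rep in @a")

for func in (with_filter, with_isin, with_query):
    # Best of 3 single runs, reported in milliseconds.
    best = min(timeit.repeat(func, number=1, repeat=3))
    print(f"{func.__name__}: {best * 1000:.1f} ms")
```

A plausible reason for the gap in the timings above is that groupby().filter runs a Python function for each of the 1000 groups, while isin and query work on whole columns at once.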

Edit:

Thanks to John Galt for the nice suggestion:

df = frame.query("Rep in %s" % frame.query("Doc == 'Inv'")['Rep'].tolist())
print(df)
     Doc  ID  Rep
0  Order   1  101
1  Order   2  101
2    Inv   3  101
6  Order   7  103
7  Order   8  103
8    Inv   9  103
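
To see what this one-liner passes to query, the string can be built separately. A minimal sketch, assuming the small frame from the question; inv_reps and expr are illustrative names.

```python
import pandas as pd

data = {'ID': [1, 2, 3, 4, 5, 6, 7, 8, 9],
        'Doc': ['Order', 'Order', 'Inv', 'Order', 'Order', 'Shp', 'Order', 'Order', 'Inv'],
        'Rep': [101, 101, 101, 102, 102, 102, 103, 103, 103]}
frame = pd.DataFrame(data)

# The inner query yields the Rep values of the 'Inv' rows as a plain list.
inv_reps = frame.query("Doc == 'Inv'")['Rep'].tolist()

# %-formatting embeds the list literal directly in the expression.
expr = "Rep in %s" % inv_reps
print(expr)  # Rep in [101, 103]

df = frame.query(expr)
```

Because the values are written into the expression itself, no @ variable is needed. With a very large number of matching Reps the string grows accordingly, so the @a form may be preferable in that case.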

Regarding pandas - selecting rows based on column values, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45642845/
