
Python pandas DataFrame: need to speed up a process that calculates across 3 rows of data

Reposted. Author: 太空宇宙. Updated: 2023-11-03 12:55:44

I have data like the following:

Tran|Type|Amount|comment
1212|A|12|Buy
1212|AA|13|Buy
1212|CC|25|S
1213|AA|1112|B
1213|A|78|B
1213|CC|1190|SEllding
1214|AA|1112|B
1214|A|78|B
1214|CC|1190|SEllding
1215|AA|1112|B
1215|A|78|B
1216|AA|1112|B


....

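I need to select all transactions that have all three types A, AA, and CC, and for which A.Amount + AA.Amount = CC.Amount.

The data set is huge (100 million records).

My code is below, but it runs very slowly: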

import pandas

# Keep only transactions that have exactly 3 rows
df1=df.groupby("Tran").filter(lambda x: len(x) == 3)
listrefn=df1.Tran.tolist()
df1=df[df.Tran.isin(listrefn)]
# Split the amounts out by type
df2=df1[df1.Type=='A']
dfA=df2[['Tran','Amount']]
df2=df1[df1.Type=='AA']
dfAA=df2[['Tran','Amount']]
df2=df1[df1.Type=='CC']
dfCC=df2[['Tran','Amount']]

dfA=dfA.rename(columns={'Amount':'A'})
dfAA=dfAA.rename(columns={'Amount':'AA'})
dfCC=dfCC.rename(columns={'Amount':'CC'})

# Join the three amounts on Tran and keep rows where A + AA == CC
dftmp=pandas.merge(dfA,dfAA,how='left')
dftmp1=pandas.merge(dftmp,dfCC,how='left')
dftmp1['diff']=dftmp1.A+dftmp1.AA-dftmp1.CC
dftmp=dftmp1[['Tran','diff']]
dftmp1=dftmp[dftmp['diff']==0]

Any advice would be appreciated.

Best answer

You can use pivot with query:

# Optional pre-filter: keep only transactions with exactly 3 rows
#df = df[df.groupby("Tran")['Type'].transform('size') == 3]

idx = df.pivot(index='Tran', columns='Type', values='Amount').query('A + AA == CC').index
print (idx)
Int64Index([1212, 1213, 1214], dtype='int64', name='Tran')

df = df[df.Tran.isin(idx)]
# equivalent to:
#df = df.query('Tran in @idx')
print (df)
   Tran Type  Amount   comment
0  1212    A      12       Buy
1  1212   AA      13       Buy
2  1212   CC      25         S
3  1213   AA    1112         B
4  1213    A      78         B
5  1213   CC    1190  SEllding
6  1214   AA    1112         B
7  1214    A      78         B
8  1214   CC    1190  SEllding
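
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the pivot and query approach. It rebuilds the sample rows from the question in memory; the names `RAW` and `matching_transactions` are made up for this illustration and are not part of the original answer. It also assumes each (Tran, Type) pair occurs at most once, since `pivot` raises an error on duplicates.

```python
import io

import pandas as pd

# Sample rows copied from the question (pipe-separated).
RAW = """Tran|Type|Amount|comment
1212|A|12|Buy
1212|AA|13|Buy
1212|CC|25|S
1213|AA|1112|B
1213|A|78|B
1213|CC|1190|SEllding
1214|AA|1112|B
1214|A|78|B
1214|CC|1190|SEllding
1215|AA|1112|B
1215|A|78|B
1216|AA|1112|B
"""


def matching_transactions(df):
    """Return the Tran ids that have A, AA and CC rows with A + AA == CC."""
    # One row per Tran, one column per Type; a missing type becomes NaN.
    wide = df.pivot(index="Tran", columns="Type", values="Amount")
    # Comparisons involving NaN are False, so incomplete transactions drop out.
    return wide.query("A + AA == CC").index


df = pd.read_csv(io.StringIO(RAW), sep="|")
idx = matching_transactions(df)
result = df[df["Tran"].isin(idx)]
print(list(idx))
print(result)
```

Because the comparison is done on whole columns of the pivoted table, there is no per-group Python lambda, which is likely where most of the time goes in the original groupby/filter version.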

Another solution for filtering:

df = df.set_index('Tran').loc[idx].reset_index()
print (df)
   Tran Type  Amount   comment
0  1212    A      12       Buy
1  1212   AA      13       Buy
2  1212   CC      25         S
3  1213   AA    1112         B
4  1213    A      78         B
5  1213   CC    1190  SEllding
6  1214   AA    1112         B
7  1214    A      78         B
8  1214   CC    1190  SEllding
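
To check that the two filtering styles select the same rows, a small sketch like the following can be used. It is only an illustration on a cut-down version of the sample data; the variable names `by_isin` and `by_loc` are hypothetical.

```python
import io

import pandas as pd

# A cut-down version of the sample data from the question.
RAW = """Tran|Type|Amount|comment
1212|A|12|Buy
1212|AA|13|Buy
1212|CC|25|S
1213|AA|1112|B
1213|A|78|B
1213|CC|1190|SEllding
1215|AA|1112|B
1215|A|78|B
"""

df = pd.read_csv(io.StringIO(RAW), sep="|")

# Transactions where A + AA == CC, found via pivot + query.
idx = df.pivot(index="Tran", columns="Type", values="Amount").query("A + AA == CC").index

# Style 1: boolean mask with isin (keeps the original row labels).
by_isin = df[df["Tran"].isin(idx)]

# Style 2: label lookup on a Tran index, then restore Tran as a column.
by_loc = df.set_index("Tran").loc[idx].reset_index()

print(by_isin.shape, by_loc.shape)
```

One difference to keep in mind: the `isin` version preserves the original row order and index labels, while the `loc` version returns rows grouped in the order of `idx` with a fresh index.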

This question about speeding up a pandas DataFrame calculation across 3 rows of data is based on a similar question on Stack Overflow: https://stackoverflow.com/questions/43226050/
