
Python Pandas: Fill a DataFrame with Missing Values

Reposted. Author: 太空宇宙. Updated: 2023-11-03 13:10:29

I'll use this DataFrame as an example:

import pandas as pd

# Create the DataFrame
df = pd.DataFrame(
    [['DE', 'Table', 201705, 201705, 1000], ['DE', 'Table', 201705, 201704, 1000],
     ['DE', 'Table', 201705, 201702, 1000], ['DE', 'Table', 201705, 201701, 1000],
     ['AT', 'Table', 201708, 201708, 1000], ['AT', 'Table', 201708, 201706, 1000],
     ['AT', 'Table', 201708, 201705, 1000], ['AT', 'Table', 201708, 201704, 1000]],
    columns=['ISO', 'Product', 'Billed Week', 'Created Week', 'Billings'])
print(df)

  ISO Product  Billed Week  Created Week  Billings
0  DE   Table       201705        201705      1000
1  DE   Table       201705        201704      1000
2  DE   Table       201705        201702      1000
3  DE   Table       201705        201701      1000
4  AT   Table       201708        201708      1000
5  AT   Table       201708        201706      1000
6  AT   Table       201708        201705      1000
7  AT   Table       201708        201704      1000

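What I need is to fill in the missing data for each groupby ['ISO','Product'] group. Each group has a break in its sequence: no bill was created in a particular week, so that week's row is missing. The fill should be based on the maximum Billed Week and the minimum Created Week, so that every group contains the complete run of created weeks with no gaps.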

So, for the data above, the missing records that I need to append to the database programmatically look like this:

  ISO Product  Billed Week  Created Week  Billings
0  DE   Table       201705        201703         0
1  AT   Table       201708        201707         0
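To make the requirement concrete, here is a minimal sketch that produces only the missing rows. It is not part of the original question: the helper name `missing_rows` is hypothetical, and it assumes the week codes are consecutive integers within a single year, so a range that crosses a year boundary would generate invalid weeks.

```python
import pandas as pd

df = pd.DataFrame(
    [['DE', 'Table', 201705, 201705, 1000], ['DE', 'Table', 201705, 201704, 1000],
     ['DE', 'Table', 201705, 201702, 1000], ['DE', 'Table', 201705, 201701, 1000],
     ['AT', 'Table', 201708, 201708, 1000], ['AT', 'Table', 201708, 201706, 1000],
     ['AT', 'Table', 201708, 201705, 1000], ['AT', 'Table', 201708, 201704, 1000]],
    columns=['ISO', 'Product', 'Billed Week', 'Created Week', 'Billings'])


def missing_rows(frame):
    """Return zero-billing rows for created weeks absent from each (ISO, Product) group."""
    rows = []
    # sort=False keeps the groups in their order of first appearance
    for (iso, product), grp in frame.groupby(['ISO', 'Product'], sort=False):
        billed = int(grp['Billed Week'].max())
        first = int(grp['Created Week'].min())
        present = set(grp['Created Week'])
        # Assumption: weeks are plain consecutive integers (no year rollover)
        for week in range(first, billed + 1):
            if week not in present:
                rows.append([iso, product, billed, week, 0])
    return pd.DataFrame(rows, columns=frame.columns)


missing = missing_rows(df)
print(missing)
```

The result can then be appended to the original data with `pd.concat([df, missing], ignore_index=True)`.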

Best answer

Here is my solution. I'm sure someone more experienced will post a better one, so let's wait and see.

# Per-ISO bounds: earliest created week and latest billed week
df1 = df.groupby('ISO').agg({'Created Week': 'min', 'Billed Week': 'max'})
df1['ISO'] = df1.index

     Created Week  Billed Week ISO
ISO
AT         201704       201708  AT
DE         201701       201705  DE

ISO = []
BilledWeek = []
CreateWeek = []
# For each ISO, list every week from the earliest created week to the latest billed week
for iso, first, last in zip(df1['ISO'], df1['Created Week'], df1['Billed Week']):
    n_weeks = int(last) - int(first) + 1
    BilledWeek.extend([last] * n_weeks)
    CreateWeek.extend(range(first, last + 1))
    ISO.extend([iso] * n_weeks)
DF = pd.DataFrame({'BilledWeek': BilledWeek, 'CreateWeek': CreateWeek, 'ISO': ISO})
# Left merge onto the original rows; weeks with no bill come back as NaN
Target = DF.merge(df, left_on=['BilledWeek', 'CreateWeek', 'ISO'],
                  right_on=['Billed Week', 'Created Week', 'ISO'], how='left')
Target['Billings'] = Target['Billings'].fillna(0)
Target = Target.drop(['Billed Week', 'Created Week'], axis=1)
# Carry the product name forward into the newly added rows
Target['Product'] = Target.groupby('ISO')['Product'].ffill()

Out[75]:
   BilledWeek  CreateWeek ISO Product  Billings
0      201708      201704  AT   Table    1000.0
1      201708      201705  AT   Table    1000.0
2      201708      201706  AT   Table    1000.0
3      201708      201707  AT   Table       0.0
4      201708      201708  AT   Table    1000.0
5      201705      201701  DE   Table    1000.0
6      201705      201702  DE   Table    1000.0
7      201705      201703  DE   Table       0.0
8      201705      201704  DE   Table    1000.0
9      201705      201705  DE   Table    1000.0
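As a possible alternative, the same full grid can be built without an explicit loop by grouping on both ISO and Product and expanding each group's week range with `DataFrame.explode`. This is only a sketch, not the answerer's method: the names `bounds`, `grid`, `full`, `lo`, and `hi` are illustrative, it needs pandas 0.25 or later for named aggregation and `explode`, and it still treats week codes as plain integers within one year.

```python
import pandas as pd

df = pd.DataFrame(
    [['DE', 'Table', 201705, 201705, 1000], ['DE', 'Table', 201705, 201704, 1000],
     ['DE', 'Table', 201705, 201702, 1000], ['DE', 'Table', 201705, 201701, 1000],
     ['AT', 'Table', 201708, 201708, 1000], ['AT', 'Table', 201708, 201706, 1000],
     ['AT', 'Table', 201708, 201705, 1000], ['AT', 'Table', 201708, 201704, 1000]],
    columns=['ISO', 'Product', 'Billed Week', 'Created Week', 'Billings'])

keys = ['ISO', 'Product']

# One row per group: latest billed week (hi) and earliest created week (lo)
bounds = df.groupby(keys, as_index=False).agg(
    hi=('Billed Week', 'max'), lo=('Created Week', 'min'))

# Store each group's full week range as a list, then expand to one row per week
bounds['Created Week'] = [list(range(lo, hi + 1))
                          for lo, hi in zip(bounds['lo'], bounds['hi'])]
grid = (bounds.explode('Created Week')
              .rename(columns={'hi': 'Billed Week'})
              .drop(columns='lo'))
# explode leaves an object column; restore the integer dtype so the merge keys match
grid['Created Week'] = grid['Created Week'].astype('int64')

# Left merge brings in existing billings; missing weeks become NaN, then 0
full = grid.merge(df, on=keys + ['Billed Week', 'Created Week'], how='left')
full['Billings'] = full['Billings'].fillna(0).astype('int64')
full = full[df.columns].reset_index(drop=True)
print(full)
```

Grouping on both keys means the Product column never needs forward-filling, because every generated row already carries its group's product.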

For more on filling a DataFrame with missing values in Python Pandas, see the similar question on Stack Overflow: https://stackoverflow.com/questions/44711471/
