
python - Manipulating a DataFrame with Pandas

Reposted. Author: 行者123. Updated: 2023-11-28 22:35:07

OK, it took me a long time to get my DataFrame to look like this so that I could plot it:

data_mth  GROUP
201504 499 and below 0.001806
201505 499 and below 0.007375
201506 499 and below 0.000509
201507 499 and below 0.007344
201504 500 - 599 0.016672
201505 500 - 599 0.011473
201506 500 - 599 0.017733
201507 500 - 599 0.017651
201504 800 - 899 0.472784
201505 800 - 899 0.516837
201506 800 - 899 0.169811
201507 800 - 899 0.293966
201504 900 and above 0.065144
201505 900 and above 0.226626
201506 900 and above 0.251585
201507 900 and above 0.299850

Because of how much space that approach took up, I had to change my code, and now I have this DataFrame:

ptnr_cur_vntg_scor_band  499 and below  500 - 599  800 - 899  900 and above
data_mth
201504                        0.001806   0.016672   0.472784       0.065144
201505                        0.007375   0.011473   0.516837       0.226626
201506                        0.000509   0.017733   0.169811       0.251585
201507                        0.007344   0.017651   0.293966       0.299850

What is a good way to reshape the second DataFrame so that it looks like the first one?

My current code looks like this:

df = self.bunch['occ_data.all_data']
df = cpr.filter(df, 'ccm_acct_status', 'Open', 'Open-Inactive', 'Open-Active', 'OpenFraud', 'New')
df = df.groupby(['ptnr_cur_vntg_scor_band', 'data_mth']).sum()['ccm_curr_vntg_cnt']

df = df.unstack(0).fillna(0)

df.loc[:,"499andbelow":"NoVantageScore"] = df.loc[:, "499andbelow":"NoVantageScore"].div(df.sum(axis=1), axis=0)
df = df.fillna(0)

Its output is the second DataFrame above.
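
The snippet above depends on `self.bunch` and `cpr.filter`, which are not shown, so it cannot be run as posted. As a rough sketch of the same group / unstack / normalize steps, assuming made-up counts in a hypothetical `raw` frame (the names `raw`, `counts`, `wide` and `shares`, and all the numbers, are illustrative and not from the original code):

```python
import pandas as pd

# Made-up rows standing in for the real account data (illustrative values only).
raw = pd.DataFrame({
    "data_mth": [201504, 201504, 201505, 201505],
    "ptnr_cur_vntg_scor_band": ["499 and below", "500 - 599",
                                "499 and below", "500 - 599"],
    "ccm_curr_vntg_cnt": [1, 3, 2, 2],
})

# Sum the counts per (band, month).
counts = raw.groupby(["ptnr_cur_vntg_scor_band", "data_mth"])["ccm_curr_vntg_cnt"].sum()

# Move the band level (level 0) into the columns: one row per month.
wide = counts.unstack(0).fillna(0)

# Divide each row by its own total so that every month sums to 1.
shares = wide.div(wide.sum(axis=1), axis=0)
print(shares)
```

Dividing the whole frame by the row totals avoids slicing columns by label, which only helps when every column is a score band.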

Best answer

import io
import pandas as pd

data = io.StringIO('''\
499 and below,500 - 599,800 - 899,900 and above
201504,0.001806,0.016672,0.472784,0.065144
201505,0.007375,0.011473,0.516837,0.226626
201506,0.000509,0.017733,0.169811,0.251585
201507,0.007344,0.017651,0.293966,0.299850
''')

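# The header has one fewer field than the data rows, so read_csv
# uses the first column (the month) as the index.
df = pd.read_csv(data)
df.index.name = 'data_mth'
df.columns.name = 'ptnr_cur_vntg_scor_band'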
print(df)

# ptnr_cur_vntg_scor_band  499 and below  500 - 599  800 - 899  900 and above
# data_mth
# 201504                        0.001806   0.016672   0.472784       0.065144
# 201505                        0.007375   0.011473   0.516837       0.226626
# 201506                        0.000509   0.017733   0.169811       0.251585
# 201507                        0.007344   0.017651   0.293966       0.299850

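# unstack() moves the column labels into the index, giving a Series keyed
# by (band, month); swaplevel() reorders that key to (month, band).
s = df.unstack().swaplevel()
s.index.names = ['data_mth', 'GROUP']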
print(s)

Output:

data_mth  GROUP   
201504 499 and below 0.001806
201505 499 and below 0.007375
201506 499 and below 0.000509
201507 499 and below 0.007344
201504 500 - 599 0.016672
201505 500 - 599 0.011473
201506 500 - 599 0.017733
201507 500 - 599 0.017651
201504 800 - 899 0.472784
201505 800 - 899 0.516837
201506 800 - 899 0.169811
201507 800 - 899 0.293966
201504 900 and above 0.065144
201505 900 and above 0.226626
201506 900 and above 0.251585
201507 900 and above 0.299850
dtype: float64
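
The result above is a Series with a two-level index. If a flat DataFrame with ordinary columns is easier to plot from, one possible follow-up is sketched below. It uses a two-band, two-month subset of the table from the question; the names `wide`, `long_df`, `melted` and the column name `value` are my own choices, not part of the original answer.

```python
import pandas as pd

# Two bands and two months taken from the table in the question.
wide = pd.DataFrame(
    {
        "499 and below": [0.001806, 0.007375],
        "500 - 599": [0.016672, 0.011473],
    },
    index=pd.Index([201504, 201505], name="data_mth"),
)
wide.columns.name = "ptnr_cur_vntg_scor_band"

# Same reshaping as in the answer: a Series keyed by (month, band).
s = wide.unstack().swaplevel()
s.index.names = ["data_mth", "GROUP"]

# Turn the two index levels into regular columns.
long_df = s.rename("value").reset_index()
print(long_df)

# melt() reaches the same flat layout in one step from the wide frame.
melted = wide.reset_index().melt(
    id_vars="data_mth", var_name="GROUP", value_name="value"
)
print(melted)
```

Both `long_df` and `melted` keep the row order of the answer's output, with all months for one band listed before the next band.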

Regarding "python - Manipulating a DataFrame with Pandas", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/38445311/
