
python - Pandas: avoiding a hierarchy in a pivot table

Reposted. Author: 行者123. Updated: 2023-11-28 17:28:13

I have a pandas DataFrame df, and I generate a pivot table from it with the following function:

import pandas as pd

def objective2(excel_file):
    df = pd.read_excel(excel_file)

    # WBC cut-offs
    df['WBC_groups'] = pd.cut(df.WBC, [0, 4, 12, 100],
                              labels=['WBC < 4', 'WBC Normal', 'WBC > 12'])

    # Helper column: summing it counts the rows in each cell
    df['count'] = 1

    table = df.pivot_table('count', index=['Sex'],
                           columns=['WBC_groups', 'Outcome_at_24'],
                           aggfunc='sum',
                           margins=True, margins_name='Total')

    return table

This produces the following table:

WBC_groups     WBC < 4     WBC Normal  WBC > 12    Total
Outcome_at_24  Alive  Died Alive  Died Alive  Died
Sex
Female          10.0   2.0  20.0   6.0  14.0   NaN  86.0
Male             3.0   NaN  28.0   3.0  26.0   4.0 111.0
Total           13.0   2.0  48.0   9.0  40.0   4.0 197.0

How can I avoid the hierarchy in the columns, so that the table looks like this:

WBC_groups        WBC < 4 WBC Normal   WBC > 12      Alive       Died      Total
Sex
Female               10.0        2.0       20.0        6.0       14.0       86.0
Male                  3.0        NaN       28.0        3.0       26.0      111.0
Total                13.0        2.0       48.0        9.0       40.0      197.0

Note: the data in the table is not accurate; it is just dummy data.

Best answer

I don't think you can avoid the hierarchy, because the columns parameter of pivot_table is given two columns: WBC_groups and Outcome_at_24.
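To see where the second level comes from, here is a minimal sketch. It assumes a small made-up data set (the rows below are not from the question, and the group labels are typed in directly instead of being produced by pd.cut). It compares a pivot built with one column key against one built with two.

```python
import pandas as pd

# Hypothetical dummy rows standing in for the Excel sheet.
df = pd.DataFrame({
    "Sex": ["Female", "Female", "Male", "Male", "Male", "Female"],
    "WBC_groups": ["WBC < 4", "WBC Normal", "WBC > 12",
                   "WBC Normal", "WBC < 4", "WBC > 12"],
    "Outcome_at_24": ["Alive", "Died", "Alive", "Alive", "Died", "Alive"],
})
df["count"] = 1

# One key in `columns` gives a flat column index.
one_key = df.pivot_table(values="count", index="Sex",
                         columns="WBC_groups", aggfunc="sum")

# Two keys in `columns` give one column level per key (a MultiIndex).
two_keys = df.pivot_table(values="count", index="Sex",
                          columns=["WBC_groups", "Outcome_at_24"],
                          aggfunc="sum")

print(one_key.columns.nlevels)
print(two_keys.columns.nlevels)
print(type(two_keys.columns).__name__)
```

Each key passed in columns becomes its own level, so the hierarchy can only be removed after the pivot, by replacing the column labels.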

The simplest solution is to assign new column names and then drop the rem column:

# df here is the pivot table; it has 7 columns, so 7 new names are needed
df.columns = ['WBC < 4', 'WBC Normal', 'WBC > 12', 'Alive', 'Died', 'rem', 'Total']
df = df.drop('rem', axis=1)
print(df)
          WBC < 4 WBC Normal   WBC > 12      Alive       Died      Total
Sex
Female       10.0        2.0       20.0        6.0       14.0       86.0
Male          3.0        NaN       28.0        3.0       26.0      111.0
Total        13.0        2.0       48.0        9.0       40.0      197.0

But if you need a more general solution:

print(df)
WBC_groups     WBC < 4     WBC Normal  WBC > 12    Total
Outcome_at_24  Alive  Died Alive  Died Alive  Died
Sex
Female          10.0   2.0  20.0   6.0  14.0   NaN  86.0
Male             3.0   NaN  28.0   3.0  26.0   4.0 111.0
Total           13.0   2.0  48.0   9.0  40.0   4.0 197.0

# Unique labels of the first column level, in order of appearance
cols1 = df.columns.get_level_values('WBC_groups').to_series().drop_duplicates().tolist()
print(cols1)
['WBC < 4', 'WBC Normal', 'WBC > 12', 'Total']

# Unique labels of the second column level, in order of appearance
cols2 = df.columns.get_level_values('Outcome_at_24').to_series().drop_duplicates().tolist()
print(cols2)
['Alive', 'Died', ' ']

# Group names, then outcome names, a placeholder 'rem', and 'Total' last
cols = cols1[:-1] + cols2[:2] + ['rem'] + cols1[-1:]
print(cols)
['WBC < 4', 'WBC Normal', 'WBC > 12', 'Alive', 'Died', 'rem', 'Total']

df.columns = cols

df = df.drop('rem', axis=1)
print(df)
          WBC < 4 WBC Normal   WBC > 12      Alive       Died      Total
Sex
Female       10.0        2.0       20.0        6.0       14.0       86.0
Male          3.0        NaN       28.0        3.0       26.0      111.0
Total        13.0        2.0       48.0        9.0       40.0      197.0
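A different way to get single-level columns, which is not what the answer above does, is to join the two levels of each column label into one string. The sketch below assumes made-up rows and a " / " separator chosen only for illustration. It keeps every column and gives names such as "WBC < 4 / Alive" instead of the layout asked for in the question.

```python
import pandas as pd

# Hypothetical dummy rows; the group labels are typed in directly.
df = pd.DataFrame({
    "Sex": ["Female", "Female", "Male", "Male", "Male", "Female"],
    "WBC_groups": ["WBC < 4", "WBC Normal", "WBC > 12",
                   "WBC Normal", "WBC < 4", "WBC > 12"],
    "Outcome_at_24": ["Alive", "Died", "Alive", "Alive", "Died", "Alive"],
})
df["count"] = 1

table = df.pivot_table(values="count", index="Sex",
                       columns=["WBC_groups", "Outcome_at_24"],
                       aggfunc="sum",
                       margins=True, margins_name="Total")

# Each column label is a tuple such as ('WBC < 4', 'Alive').
# Join the non-blank parts, so the margins column collapses to 'Total'.
table.columns = [
    " / ".join(str(part) for part in col if str(part).strip())
    for col in table.columns
]

print(table.columns.tolist())
print(table)
```

Because the names are derived from the existing labels, this does not depend on knowing the number or order of the columns in advance.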

This post on python - Pandas: avoiding a hierarchy in a pivot table is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/36820136/
