
python - DataFrame to a user-defined format

Reposted. Author: 太空宇宙. Updated: 2023-11-04 00:55:14

I have a DataFrame:

name  salary  department  position
a     25000   x           normal employee
b     50000   y           normal employee
c     10000   y           experienced employee
d     20000   x           experienced employee

I want to get a result in the following format:

dept  total salary  salary_percentage  count_normal_employee  count_experienced_employee
x     45000         45000/105000       1                      1
y     60000         60000/105000       1                      1

Best answer

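You can build df1 with pivot_table and fillna, which counts the employees in each position per department. For df2, use groupby with sum to get the new column total salary, then divide it by the sum of the original column salary to get salary_percentage. Finally, merge the two frames on department: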

import pandas as pd

# df is the DataFrame from the question
# pivot df: count names per department and position, fill NaN with 0
df1 = df.pivot_table(index='department', columns='position', values='name', aggfunc='count').fillna(0).reset_index()
# reset the columns name, for a nicer df
df1.columns.name = None
print(df1)
  department  experienced employee  normal employee
0          x                     1                1
1          y                     1                1

# sum salary per department and rename the column salary to total salary
df2 = df.groupby('department')['salary'].sum().reset_index().rename(columns={'salary':'total salary'})

# each department's share of the overall salary sum
df2['salary_percentage'] = df2['total salary'] / df['salary'].sum()
print(df2)
  department  total salary  salary_percentage
0          x         45000           0.428571
1          y         60000           0.571429

# join the counts and the salary totals on department
print(pd.merge(df1, df2, on=['department']))
  department  experienced employee  normal employee  total salary  \
0          x                     1                1         45000
1          y                     1                1         60000

   salary_percentage
0           0.428571
1           0.571429
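
For reference, the steps above can be put together into one self-contained Python 3 script. This is a minimal sketch rather than the answer's exact code: it rebuilds the sample DataFrame from the question, passes fill_value=0 to pivot_table in place of the separate .fillna(0) call (an assumption made here to keep the counts as integers), and renames and reorders the columns to match the layout requested in the question. The variable names counts, totals and result are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'name': ['a', 'b', 'c', 'd'],
    'salary': [25000, 50000, 10000, 20000],
    'department': ['x', 'y', 'y', 'x'],
    'position': ['normal employee', 'normal employee',
                 'experienced employee', 'experienced employee'],
})

# Count employees per department and position.
# fill_value=0 is used here instead of .fillna(0) so the counts stay integers.
counts = df.pivot_table(index='department', columns='position',
                        values='name', aggfunc='count', fill_value=0)
counts.columns.name = None
# Rename 'normal employee' -> 'count_normal_employee', and so on,
# before reset_index so that 'department' is left untouched.
counts = counts.rename(columns=lambda c: 'count_' + c.replace(' ', '_')).reset_index()

# Total salary per department and its share of the overall salary sum.
totals = (df.groupby('department')['salary'].sum()
            .reset_index()
            .rename(columns={'salary': 'total salary'}))
totals['salary_percentage'] = totals['total salary'] / df['salary'].sum()

# Merge both parts and put the columns in the order asked for in the question.
result = totals.merge(counts, on='department').rename(columns={'department': 'dept'})
result = result[['dept', 'total salary', 'salary_percentage',
                 'count_normal_employee', 'count_experienced_employee']]
print(result)
```

Because salary_percentage is stored as a float, it can be formatted as a percentage or as a fraction string at display time without changing the computation.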

For "python - DataFrame to a user-defined format", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35439726/
