
python-3.x - Pandas groupby and calculate the ratio of two columns

Reposted. Author: 行者123. Updated: 2023-12-02 19:41:00

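I am trying to use Pandas and groupby to calculate the ratio of two columns. In the example below, I want the ratio of each staff Status per Department (number of staff with a given Status in the department / total number of staff in the department). For example, the Sales department has 3 staff in total, and 2 of them have the Employee status, so the ratio is 2/3, or 66.67%. I managed to hack together a solution, but there must be a more elegant and simpler way to do this. How can I get the desired output below more efficiently?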

Original dataframe:

  Department    Name      Status
0      Sales    John    Employee
1      Sales   Steve    Employee
2      Sales    Sara  Contractor
3    Finance   Allen  Contractor
4  Marketing  Robert    Employee
5  Marketing    Lacy  Contractor

Code:

import pandas as pd

mydict = {
    'Name': ['John', 'Steve', 'Sara', 'Allen', 'Robert', 'Lacy'],
    'Department': ['Sales', 'Sales', 'Sales', 'Finance', 'Marketing', 'Marketing'],
    'Status': ['Employee', 'Employee', 'Contractor', 'Contractor', 'Employee', 'Contractor']
}

df = pd.DataFrame(mydict)

# Create a column with the total number of staff per Department
df['total_dept'] = df.groupby(['Department'])['Name'].transform('count')
print(df)
print('\n')

# Create a column with the Status ratio per Department
# (this re-runs the groupby for every row, which is the inefficient part)
for k, v in df.iterrows():
    df.loc[k, 'Status_Ratio'] = (
        df.groupby(['Department', 'Status']).count()
        .xs(v['Status'], level=1)['total_dept'][v['Department']]
        / v['total_dept']
    ) * 100
print(df)
print('\n')

# Final groupby with Status_Ratio. The size values are NOT needed
print(df.groupby(['Department', 'Status', 'Status_Ratio']).size())
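
The row-by-row loop above recomputes the same groupby on every iteration. If the per-row Status_Ratio column itself is wanted, one possible vectorized sketch is to divide two transform('count') results. The variable names dept_total and status_total are illustrative and not part of the original question:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Steve', 'Sara', 'Allen', 'Robert', 'Lacy'],
    'Department': ['Sales', 'Sales', 'Sales', 'Finance', 'Marketing', 'Marketing'],
    'Status': ['Employee', 'Employee', 'Contractor', 'Contractor', 'Employee', 'Contractor'],
})

# Staff count per Department, broadcast back to every row
dept_total = df.groupby('Department')['Name'].transform('count')

# Staff count per (Department, Status) pair, broadcast back to every row
status_total = df.groupby(['Department', 'Status'])['Name'].transform('count')

# Row-aligned ratio in percent; no Python-level loop needed
df['Status_Ratio'] = status_total / dept_total * 100
print(df)
```

Both transform calls return a Series aligned with the original index, so the division works element by element.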

Desired output:

Department  Status      Status_Ratio
Finance     Contractor        100.00
Marketing   Contractor         50.00
            Employee           50.00
Sales       Contractor         33.33
            Employee           66.67

Best answer

Try this (using the original df):

df.groupby("Department")["Status"].value_counts(normalize=True).mul(100)

Output:

Department  Status
Finance     Contractor    100.000000
Marketing   Contractor     50.000000
            Employee       50.000000
Sales       Employee       66.666667
            Contractor     33.333333
Name: Status, dtype: float64
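
To get closer to the desired output (two decimals and a column named Status_Ratio), the one-liner above can be extended as in the following self-contained sketch. The rounding, the rename, and the variable name status_ratio are additions for illustration, not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Steve', 'Sara', 'Allen', 'Robert', 'Lacy'],
    'Department': ['Sales', 'Sales', 'Sales', 'Finance', 'Marketing', 'Marketing'],
    'Status': ['Employee', 'Employee', 'Contractor', 'Contractor', 'Employee', 'Contractor'],
})

status_ratio = (
    df.groupby('Department')['Status']
    .value_counts(normalize=True)   # share of each Status within its Department
    .mul(100)                       # convert the fraction to a percentage
    .round(2)
    .rename('Status_Ratio')         # avoids a name clash with the 'Status' index level
    .reset_index()                  # flat DataFrame: Department, Status, Status_Ratio
)
print(status_ratio)
```

Renaming before reset_index matters on pandas versions where the resulting Series is itself named Status, since the column would otherwise collide with the Status index level.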

For more on "python-3.x - Pandas groupby and calculate the ratio of two columns", see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/60140400/
