
python - Data analysis with Python Pandas

Reposted. Author: 行者123. Updated: 2023-11-30 22:48:05

I'm new to the Pandas library and need some help. I have two columns like this:

Test Result    Risk Rating
Fail           Low
Pass           Medium
Skip           High
Pass           Low
Fail           Medium
Pass           High
Skip           Low
Fail           Medium
Fail           High

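Now, I need the total number of Fail, Pass, and Skip values in the "Test Result" column, and I was able to get that. However, I also need the number of rows where "Test Result" is "Fail" and "Risk Rating" is "Low", and likewise the number for "Fail" with "Medium", and so on for every combination. My final result should look like this: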

Fail (Low Risk Rating) = 1
Fail (Medium Risk Rating) = 2
Fail (High Risk Rating) = 1
Pass (Low Risk Rating) = 1
Pass (Medium Risk Rating) = 1
Pass (High Risk Rating) = 1
Skip (Low Risk Rating) = 1
Skip (Medium Risk Rating) = 0
Skip (High Risk Rating) = 1

How can I do this? Any help would be appreciated.

Best answer

I think you need groupby on both columns, aggregated with size:

df = df.groupby(['Test Result', 'Risk Rating']).size().reset_index(name='counts')
print(df)
  Test Result Risk Rating  counts
0        Fail        High       1
1        Fail         Low       1
2        Fail      Medium       2
3        Pass        High       1
4        Pass         Low       1
5        Pass      Medium       1
6        Skip        High       1
7        Skip         Low       1
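
The snippet above assumes `df` already exists. As a minimal, self-contained sketch, the sample data from the question can be rebuilt and counted as follows. The variable name `counts` is an assumption made for illustration; storing the result under a separate name, rather than overwriting `df`, keeps the original frame available for the other approaches.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Test Result": ["Fail", "Pass", "Skip", "Pass", "Fail", "Pass", "Skip", "Fail", "Fail"],
    "Risk Rating": ["Low", "Medium", "High", "Low", "Medium", "High", "Low", "Medium", "High"],
})

# Count the rows for each (Test Result, Risk Rating) pair that occurs in the data.
# `counts` is a hypothetical name; `df` itself is left unchanged.
counts = (
    df.groupby(["Test Result", "Risk Rating"])
      .size()
      .reset_index(name="counts")
)
print(counts)
```

Note that the Skip/Medium pair is absent from this result, because `size` only reports combinations that actually occur.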

Or, perhaps better, pivot the counts into a table with unstack:

df = df.groupby(['Test Result', 'Risk Rating']).size().unstack(fill_value=0)
print(df)
Risk Rating  High  Low  Medium
Test Result
Fail            1    1       2
Pass            1    1       1
Skip            1    1       0
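
The question asks for lines of the form `Fail (Low Risk Rating) = 1`. One possible way to get from the unstacked table to that text is sketched below. The names `table`, `risk_order`, and `lines` are assumptions introduced for illustration, and the Low/Medium/High ordering is taken from the question's expected output rather than from pandas, which sorts the columns alphabetically.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Test Result": ["Fail", "Pass", "Skip", "Pass", "Fail", "Pass", "Skip", "Fail", "Fail"],
    "Risk Rating": ["Low", "Medium", "High", "Low", "Medium", "High", "Low", "Medium", "High"],
})

# Wide table of counts, with 0 for combinations that do not occur.
table = df.groupby(["Test Result", "Risk Rating"]).size().unstack(fill_value=0)

# Reorder the columns to match the order used in the question (hypothetical helper list).
risk_order = ["Low", "Medium", "High"]
table = table.reindex(columns=risk_order, fill_value=0)

# Build one text line per (Test Result, Risk Rating) cell.
lines = [
    f"{result} ({risk} Risk Rating) = {table.loc[result, risk]}"
    for result in table.index
    for risk in risk_order
]
print("\n".join(lines))
```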

Or a slower solution, crosstab:

df = pd.crosstab(df['Test Result'], df['Risk Rating'])
print(df)
Risk Rating  High  Low  Medium
Test Result
Fail            1    1       2
Pass            1    1       1
Skip            1    1       0
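
Since the question also mentions the overall Fail, Pass, and Skip totals, it may be worth noting that `pd.crosstab` accepts `margins=True` to append row and column totals. The sketch below is one way to use it; the label `"Total"` and the variable name `with_totals` are assumptions chosen for illustration (the default margin label is `"All"`).

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Test Result": ["Fail", "Pass", "Skip", "Pass", "Fail", "Pass", "Skip", "Fail", "Fail"],
    "Risk Rating": ["Low", "Medium", "High", "Low", "Medium", "High", "Low", "Medium", "High"],
})

# Cross-tabulate the two columns and add a "Total" row and column.
with_totals = pd.crosstab(
    df["Test Result"],
    df["Risk Rating"],
    margins=True,
    margins_name="Total",
)
print(with_totals)

# Per-result totals (Fail, Pass, Skip) can then be read from the "Total" column.
print(with_totals["Total"])
```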

If you need 0 for the missing combinations in the long format, add stack:

# The chain is wrapped in parentheses so it can span several lines.
df = (df.groupby(['Test Result', 'Risk Rating'])
        .size()
        .unstack(fill_value=0)
        .stack()
        .reset_index(name='counts'))
print(df)
  Test Result Risk Rating  counts
0        Fail        High       1
1        Fail         Low       1
2        Fail      Medium       2
3        Pass        High       1
4        Pass         Low       1
5        Pass      Medium       1
6        Skip        High       1
7        Skip         Low       1
8        Skip      Medium       0
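
The unstack/stack round trip only fills in combinations whose values each appear somewhere in the data. As an alternative sketch (not part of the original answer), the counts can be reindexed against an explicit list of every expected combination built with `pd.MultiIndex.from_product`. The names `full_index` and `complete`, and the hard-coded level lists, are assumptions for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Test Result": ["Fail", "Pass", "Skip", "Pass", "Fail", "Pass", "Skip", "Fail", "Fail"],
    "Risk Rating": ["Low", "Medium", "High", "Low", "Medium", "High", "Low", "Medium", "High"],
})

# Counts for the combinations that occur (a Series with a two-level index).
counts = df.groupby(["Test Result", "Risk Rating"]).size()

# Every combination we expect to report, listed explicitly (assumed levels).
full_index = pd.MultiIndex.from_product(
    [["Fail", "Pass", "Skip"], ["Low", "Medium", "High"]],
    names=["Test Result", "Risk Rating"],
)

# Reindex against the full grid; combinations not in the data get 0.
complete = counts.reindex(full_index, fill_value=0).reset_index(name="counts")
print(complete)
```

Because the levels are listed by hand, the rows come out in the order given, and a level that never occurs in the data would still be reported with a count of 0.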

Regarding python - Data analysis with Python Pandas, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/40303957/
