
python - Exclude items from value_counts across multiple columns

Reposted. Author: 行者123. Updated: 2023-12-05 08:37:32

I have the following DataFrame:

      ae264e3637204a6fb9bb56bc8210ddfd  ... 2906b810c7d4411798c6938adc9daaa5
1 not received ... not received
3 completed ... not received
5 not received ... viewed
8 not received ... completed
12 not received ... not received
... ... ...
16995 not received ... not received
16996 not received ... not received
16997 not received ... not received
16998 completed ... not received
16999 not received ... not received

I apply value_counts() to 10 columns and convert the counts to percentages.

This is how I do it:

overall = profile[relevant_columns].apply(lambda x: round(pd.Series.value_counts(x) / len(x), 4)* 100)
overall

Result:

              ae264e3637204a6fb9bb56bc8210ddfd  ...  2906b810c7d4411798c6938adc9daaa5
completed 21.22 ... 22.82
not received 62.47 ... 63.04
unresponsive 1.59 ... 9.29
viewed 14.73 ... 4.86

Expected output:

              ae264e3637204a6fb9bb56bc8210ddfd  ...  2906b810c7d4411798c6938adc9daaa5
completed 56.52 ... 61.82
unresponsive 4.23 ... 25.12
viewed 39.23 ... 13.14

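However, I do not want the percentage of "not received" in my result, and the remaining percentages should be computed over only the rows that are not "not received". I know I could drop those values from each column in a loop and then apply value_counts() to each column, but it would be nicer to keep the apply workflow over multiple columns in one line. Does anyone know how to achieve this?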

Best answer

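Let's mask the "not received" values in relevant_columns, then apply pd.value_counts with normalize=True to compute the proportion of each unique value per column. mask replaces the matching cells with NaN, and value_counts ignores NaN by default, so each proportion is taken over the remaining rows only: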

profile[relevant_columns].mask(lambda x: x.eq('not received'))\
.apply(pd.value_counts, normalize=True).mul(100).round(4)

           ae264e3637204a6fb9bb56bc8210ddfd  2906b810c7d4411798c6938adc9daaa5
completed 100.0 50.0
viewed NaN 50.0
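
The approach above can be tried end to end with a minimal sketch. The sample data and the column names offer_a and offer_b are hypothetical placeholders, not the columns from the question. The sketch calls the Series.value_counts method through a lambda instead of the top-level pd.value_counts, which is deprecated in recent pandas releases; the behavior should otherwise match.

```python
import pandas as pd

# Hypothetical sample data; "offer_a" and "offer_b" are placeholder names,
# not the hashed column names from the question.
profile = pd.DataFrame({
    "offer_a": ["not received", "completed", "viewed", "completed", "not received"],
    "offer_b": ["viewed", "not received", "not received", "unresponsive", "viewed"],
})
relevant_columns = ["offer_a", "offer_b"]

subset = profile[relevant_columns]

# Replace every "not received" cell with NaN. value_counts drops NaN by
# default (dropna=True), so proportions are computed over the remaining rows.
masked = subset.mask(subset.eq("not received"))

overall = (
    masked.apply(lambda col: col.value_counts(normalize=True))
    .mul(100)   # proportions -> percentages
    .round(2)
)
print(overall)
```

A category that never occurs in a column shows up as NaN in that column, as in the answer's output above. Appending .fillna(0) would turn those cells into 0 if that reads better.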

For more on python - Exclude items from value_counts across multiple columns, see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/65474059/
