Python Pandas filter and groupby

Reposted. Author: 太空宇宙. Updated: 2023-11-04 02:54:54

I am working with this CSV in pandas. Here are the first ten rows:

print(frame1.head(10))

alert Subject filetype type country status
0 33965790 44676 aba Attachment doc RU,RU,RU,RU deleted
1 33965786 44676 rcrump Attachment zip NaN deleted
2 33965771 3aba Attachment zip NaN deleted
3 33965770 NaN Attachment js ,, deleted
4 33965766 NaN Attachment js ,, deleted
5 33965761 NaN Attachment zip NaN deleted
6 33965760 NaN Attachment zip NaN deleted
7 33965757 NaN Attachment zip NaN deleted
8 33965751 35200 3aba Attachment doc RU,RU,RU deleted
9 33965747 35200 INVaba Attachment zip NaN deleted

I need to take the Subject column and count all rows that contain "aba" as a substring.

Occurrences of aba- 512

Or even a result like this:

aba    12
3aba 5
INVaba 2

Here is my code:

targeted = frame1[frame1['Subject'].str.contains('aba', case=False , na=False)].groupby('Subject')
print (targeted.to_string(header=False))

I get this error: AttributeError: Cannot access callable attribute 'to_string' of 'DataFrameGroupBy' objects, try using the 'apply' method

Note: earlier I got this working for counting the different file types. This works:

filetype = frame1.groupby('filetype').size()
# clean up the printing
print("Delivered in Email")
print (filetype.to_string(header=False))

which gives me:

Delivered in Email
Attachment 32647
Header 131
URL 9236

Best answer

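To get the total count, use str.contains followed by sum. The mask is boolean, so sum adds up the True values; count would instead return the number of non-null entries, which is every row (10 for the sample above).

>>> df.Subject.str.contains('aba', case=False, na=False).sum()
3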
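
On a boolean mask, count and sum answer different questions, which is easy to miss. The following is a minimal sketch on a small made-up frame; the sample values and the names df, mask, matches and non_null are assumptions for illustration, not the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, loosely modeled on the question's Subject column.
df = pd.DataFrame({"Subject": ["aba", "rcrump", "3aba", np.nan, "INVaba", "ABA"]})

# na=False turns missing subjects into False instead of NaN.
mask = df["Subject"].str.contains("aba", case=False, na=False)

matches = int(mask.sum())     # True counts as 1, so this is the number of matching rows
non_null = int(mask.count())  # counts every non-null entry in the mask, i.e. all rows

print("Occurrences of aba-", matches)
```

Here matches is 4 while non_null is 6, the length of the frame, so only sum reflects how many subjects actually contain the substring.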

Then, to get the count of each unique string that contains 'aba', select the values matched by contains and call value_counts.

>>> df.loc[df.Subject.str.contains('aba', case=False, na=False), 'Subject'].value_counts()

3aba 1
INVaba 1
aba 1
Name: Subject, dtype: int64
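
The asker's groupby attempt can also be made to work: groupby on its own returns a DataFrameGroupBy, and it needs an aggregation such as size() before the result is a Series with a to_string method. A sketch under that assumption, again with made-up data standing in for the CSV:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; in the question, frame1 is read from the CSV.
frame1 = pd.DataFrame({
    "Subject": ["aba", "aba", "3aba", "INVaba", "rcrump", np.nan],
})

mask = frame1["Subject"].str.contains("aba", case=False, na=False)

# Aggregate the groups with size() so the result is a Series, which has to_string().
targeted = frame1[mask].groupby("Subject").size()

# value_counts() yields the same counts, ordered by frequency rather than by label.
counts = frame1.loc[mask, "Subject"].value_counts()

print(targeted.to_string(header=False))
```

Both targeted and counts hold the same per-subject totals; the choice between them mostly comes down to whether the output should be sorted by label or by frequency.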

This post on Python Pandas filter and groupby is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/42704927/
