
python - Pandas: Groupby all combinations of column subsets

Reposted. Author: 太空宇宙. Updated: 2023-11-04 01:58:49

My dummy dataframe looks like this:

+--------+------+------+------+------+
| item | p1 | p2 | p3 | p4 |
|--------+------+------+------+------|
| a | 1 | 0 | 1 | 1 |
| b | 0 | 1 | 1 | 0 |
| c | 1 | 0 | 1 | 1 |
| d | 0 | 0 | 0 | 1 |
| e | 1 | 0 | 1 | 1 |
| f | 1 | 1 | 1 | 1 |
| g | 1 | 0 | 0 | 0 |
+--------+------+------+------+------+

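I want to count how many items use each combination of the parameters p1, p2, p3, p4, whether alone or together. The expected result is as follows: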

+--------+---------------+--------+---------+
| Length | P-group(s)    | Count  | Items   |
+--------+---------------+--------+---------+
| 1      | p1            | 1      | g       |
|        | p4            | 1      | d       |
|        |               |        |         |
| 2      | p2,p3         | 1      | b       |
|        |               |        |         |
| 3      | p1,p3,p4      | 3      | [a,c,e] |
|        |               |        |         |
| 4      | p1,p2,p3,p4   | 1      | f       |
+--------+---------------+--------+---------+

So my rough code is as follows:

import pandas as pd
from itertools import chain, combinations

df = pd.DataFrame({'item': ['a','b','c','d','e','f','g'],
                   'p1': [1,0,1,0,1,1,1],
                   'p2': [0,1,0,0,0,1,0],
                   'p3': [1,1,1,0,1,1,0],
                   'p4': [1,0,1,1,1,1,0]})


def all_subsets(ss):
    # Power set of ss: every combination of every size, including the empty one.
    return chain(*map(lambda x: combinations(ss, x), range(0, len(ss)+1)))


subsets = []

for subset in all_subsets(list(df)[1:]):
    subsets.append(list(subset))

for grp in subsets[1:]:  # subsets[1:] excludes the empty set
    print(df.groupby(grp).size().reset_index().rename(columns={0: 'count'}))
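For context on why the loop above prints so many tables: all_subsets enumerates the power set of the column names, so with four parameter columns it triggers one groupby per non-empty subset. A minimal sketch using only the standard library (the names `params` and `non_empty` are illustrative and not part of the question):

```python
from itertools import chain, combinations


def all_subsets(ss):
    """Yield every subset of ss as a tuple, from size 0 up to len(ss)."""
    return chain.from_iterable(combinations(ss, k) for k in range(len(ss) + 1))


params = ['p1', 'p2', 'p3', 'p4']  # illustrative stand-in for list(df)[1:]
non_empty = [list(s) for s in all_subsets(params) if s]  # drop the empty tuple

print(len(non_empty))  # 2**4 - 1 = 15 subsets, hence 15 separate groupby calls
print(non_empty[:5])   # [['p1'], ['p2'], ['p3'], ['p4'], ['p1', 'p2']]
```

Each of those 15 groupby calls produces its own table, which is why this approach does not yield the single summary shown in the expected result.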

Is there a pandas way to get the expected result?

Best answer

Use DataFrame.groupby together with DataFrame.filter:

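import pandas as pd

# Keep only the parameter columns (names containing 'p').
tmp = df.filter(like='p')
# Replace each 1 with its own column name; zeros stay 0 and are dropped below.
new = tmp.replace(1, pd.Series(tmp.columns, tmp.columns)).copy(deep=True)
df['length'] = tmp.sum(axis=1)
df['groups'] = new.apply(lambda x: ','.join(s for s in x if s), axis=1)

gdf = df.groupby(['length', 'groups'])['item'].agg(['count', list])
print(gdf)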

Output:

                    count       list
length groups
1      p1               1        [g]
       p4               1        [d]
2      p2,p3            1        [b]
3      p1,p3,p4         3  [a, c, e]
4      p1,p2,p3,p4      1        [f]
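The core of this approach is turning each row of 0/1 flags into a label that lists the columns set to 1, then grouping on that label. As a rough sketch of the same idea without `replace`, assuming every column other than `item` is a flag column (the names `flag_cols`, `labels`, and `out` are made up for this example):

```python
import pandas as pd

df = pd.DataFrame({'item': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
                   'p1': [1, 0, 1, 0, 1, 1, 1],
                   'p2': [0, 1, 0, 0, 0, 1, 0],
                   'p3': [1, 1, 1, 0, 1, 1, 0],
                   'p4': [1, 0, 1, 1, 1, 1, 0]})

# Assumption: all non-item columns are 0/1 flags.
flag_cols = [c for c in df.columns if c != 'item']

# For each row, keep the names of the columns whose flag is 1.
labels = [','.join(c for c, flag in zip(flag_cols, row) if flag)
          for row in df[flag_cols].to_numpy()]

# Group on (number of flags set, label) and collect the matching items.
out = (df.assign(length=df[flag_cols].sum(axis=1), groups=labels)
         .groupby(['length', 'groups'])['item']
         .agg(['count', list]))
print(out)
```

Because the label is built in plain Python, this variant does not depend on how `replace` treats a Series passed as the replacement value.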

If you want to unpack gdf['list'], add the following line:

gdf['list'] = [l[0] if len(l)==1 else l for l in gdf['list']]

This gives something close to the desired output:

                    count       list
length groups
1      p1               1          g
       p4               1          d
2      p2,p3            1          b
3      p1,p3,p4         3  [a, c, e]
4      p1,p2,p3,p4      1          f
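If the column headings should also match the expected table (Count and Items rather than count and list), one option is pandas named aggregation. The sketch below is a variation on the answer rather than part of it, and the names `flags` and `res` are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'item': ['a', 'b', 'c', 'd', 'e', 'f', 'g'],
                   'p1': [1, 0, 1, 0, 1, 1, 1],
                   'p2': [0, 1, 0, 0, 0, 1, 0],
                   'p3': [1, 1, 1, 0, 1, 1, 0],
                   'p4': [1, 0, 1, 1, 1, 1, 0]})

flags = df.drop(columns='item')  # assumption: everything except item is a flag
df['length'] = flags.sum(axis=1)
df['groups'] = [','.join(c for c, f in zip(flags.columns, row) if f)
                for row in flags.to_numpy()]

# Named aggregation: keyword = output column name, value = aggregation.
res = df.groupby(['length', 'groups'])['item'].agg(Count='count', Items=list)

# Unwrap single-item lists, mirroring the extra line from the answer.
res['Items'] = [v[0] if len(v) == 1 else v for v in res['Items']]
print(res)
```

Note that after unwrapping, the Items column mixes strings and lists, which is convenient for display but awkward for further processing; keeping every entry as a list is usually easier to work with.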

For more on python - Pandas: Groupby all combinations of column subsets, see the similar question on Stack Overflow: https://stackoverflow.com/questions/56180912/
