
python - Pandas groupby: finding common strings

Reposted. Author: 行者123. Updated: 2023-12-01 09:15:19

My DataFrame:

    Name                fav_fruit
0   justin              apple
1   bieber justin       apple
2   Kris Justin bieber  apple
3   Kim Lee             orange
4   lee kim             orange
5   mary barnet         orange
6   tom hawkins         pears
7   Sr Tom Hawkins      pears
8   Jose Hawkins        pears
9   Shanita             pineapple
10  Joe                 pineapple

import pandas as pd

df1 = pd.DataFrame({
    'Name': ['justin', 'bieber justin', 'Kris Justin bieber', 'Kim Lee',
             'lee kim', 'mary barnet', 'tom hawkins', 'Sr Tom Hawkins',
             'Jose Hawkins', 'Shanita', 'Joe'],
    'fav_fruit': ['apple', 'apple', 'apple', 'orange', 'orange', 'orange',
                  'pears', 'pears', 'pears', 'pineapple', 'pineapple'],
})

I want to count the common words in the Name column after a groupby on the fav_fruit column. So for apple the count is 2 (justin, bieber), for orange it is 2 (kim, lee), and for pineapple it is 0.

Expected output:

    Name                fav_fruit  count
0   justin              apple      2
1   bieber justin       apple      2
2   Kris Justin bieber  apple      2
3   Kim Lee             orange     2
4   lee kim             orange     2
5   mary barnet         orange     2
6   tom hawkins         pears      2
7   Sr Tom Hawkins      pears      2
8   Jose Hawkins        pears      2
9   Shanita             pineapple  0
10  Joe                 pineapple  0

Best answer

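I think you need transform with a custom function: first join all the values of a group into one big string, convert it to lowercase and split it into words, then use collections.Counter to keep only the words that occur more than once: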

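from collections import Counter

def f(x):
    # Join every name in the group, lowercase the result and split it into words
    a = ' '.join(x).lower().split()
    # Count the distinct words that occur more than once in the group
    return len([k for k, v in Counter(a).items() if v != 1])

# The frame built in the question is named df1
df1['count'] = df1.groupby('fav_fruit')['Name'].transform(f)
print(df1)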
    Name                fav_fruit  count
0   justin              apple      2
1   bieber justin       apple      2
2   Kris Justin bieber  apple      2
3   Kim Lee             orange     2
4   lee kim             orange     2
5   mary barnet         orange     2
6   tom hawkins         pears      2
7   Sr Tom Hawkins      pears      2
8   Jose Hawkins        pears      2
9   Shanita             pineapple  0
10  Joe                 pineapple  0
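
To see where each count comes from, it may help to list the repeated words per group instead of only their number. The following is a minimal sketch using the same join, lowercase and split steps; the helper name repeated_words and the dictionary common are hypothetical names introduced here for illustration and are not part of the original answer.

```python
from collections import Counter

import pandas as pd

# Same data as in the question
df1 = pd.DataFrame({
    'Name': ['justin', 'bieber justin', 'Kris Justin bieber', 'Kim Lee',
             'lee kim', 'mary barnet', 'tom hawkins', 'Sr Tom Hawkins',
             'Jose Hawkins', 'Shanita', 'Joe'],
    'fav_fruit': ['apple', 'apple', 'apple', 'orange', 'orange', 'orange',
                  'pears', 'pears', 'pears', 'pineapple', 'pineapple'],
})


def repeated_words(names):
    """Hypothetical helper: sorted words that occur more than once in a group."""
    words = ' '.join(names).lower().split()
    return sorted(word for word, n in Counter(words).items() if n > 1)


# Iterating over the grouped column yields (group key, Series of names) pairs
common = {fruit: repeated_words(names)
          for fruit, names in df1.groupby('fav_fruit')['Name']}

for fruit, words in common.items():
    print(fruit, words, len(words))
```

Note that this counts a word as repeated whenever it occurs more than once in the joined string, so a word that appears twice inside a single name would also be counted. If that matters for your data, one option would be to deduplicate the words of each name before counting.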

Regarding python - Pandas groupby: finding common strings, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51325444/
