
python - df groupby set comparison

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:14:00

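I have a list of words that I want to test for anagrams. I want to use pandas so that I don't have to use computationally wasteful for loops. Given a .txt word list: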

"acb" "bca" "foo" "oof" "spaniel"

I want to put them into a df and then group them by lists of their anagrams. I can drop the duplicate rows later.

So far I have this code:

import pandas as pd

wordlist = pd.read_csv('data/example.txt', sep='\r', header=None, index_col=None, names=['word'])
wordlist = wordlist.drop_duplicates(keep='first')
wordlist['split'] = ''
wordlist['anagrams'] = ''

for index, row in wordlist.iterrows():
    row['split'] = list(row['word'])

wordlist = wordlist.groupby('word')[('split')].apply(list)
print(wordlist)

How can I group on a set so that it knows that

[[a, b, c]]
[[b, a, c]]

are the same?

Best answer

I think you can use sorted lists:

df['a'] = df['word'].apply(lambda x: sorted(list(x)))
print (df)

      word                      a
0      acb              [a, b, c]
1      bca              [a, b, c]
2      foo              [f, o, o]
3      oof              [f, o, o]
4  spaniel  [a, e, i, l, n, p, s]
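
To actually group the words, the sorted letters need to be hashable. Python lists are not, so grouping directly on column a is likely to fail. A minimal sketch, assuming the same five example words and a hypothetical column name key, joins the sorted letters into a string and groups on that:

```python
import pandas as pd

# Example words taken from the question; the column name "key" is an assumption.
df = pd.DataFrame({"word": ["acb", "bca", "foo", "oof", "spaniel"]})

# Join the sorted letters into a string so the value is hashable
# and can be used as a groupby key.
df["key"] = df["word"].apply(lambda w: "".join(sorted(w)))

# Collect all words that share the same sorted-letter key.
groups = df.groupby("key")["word"].apply(list)
print(groups)
```

Words that are anagrams of each other end up in the same list. A frozenset of the letters would also be hashable, but it drops repeated letters, so "foo" and "of" would wrongly share a key.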

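Another solution, which finds words whose reversed spelling also appears in the list (a special case of anagrams):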

#reverse strings
df['reversed'] = df['word'].str[::-1]

#reshape
s = df.stack()
#get all dupes - anagrams
s1 = s[s.duplicated(keep=False)]
print (s1)
0  word        acb
   reversed    bca
1  word        bca
   reversed    acb
2  word        foo
   reversed    oof
3  word        oof
   reversed    foo
dtype: object

#select only the values under the second index level 'word'
s2 = s1.loc[pd.IndexSlice[:, 'word']]
print (s2)
0    acb
1    bca
2    foo
3    oof
dtype: object
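
The reversal check only matches words that are exact mirror images of each other. As a rough sketch of a more general check, the sorted-letter key can be combined with duplicated(keep=False) to flag every word that has at least one anagram in the list. The extra word "cab" is a hypothetical addition, chosen because its reversal "bac" is not in the list:

```python
import pandas as pd

# "cab" is a hypothetical extra word, not part of the original example data.
df = pd.DataFrame({"word": ["acb", "bca", "cab", "foo", "oof", "spaniel"]})

# Sorted-letter key: identical for any two words that are anagrams.
key = df["word"].apply(lambda w: "".join(sorted(w)))

# keep=False marks every member of a duplicated key, not just the repeats.
has_anagram = key.duplicated(keep=False)

anagrams = df.loc[has_anagram, "word"].tolist()
print(anagrams)
```

Here "cab" is reported alongside "acb" and "bca", whereas the reversal approach would skip it.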

Regarding python - df groupby set comparison, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/48323981/
