python - How to store synonyms as a column in a DataFrame?

Reposted. Author: 行者123. Updated: 2023-12-04 03:35:54
I want to store the results produced by the code below in a DataFrame.

It should have two columns: one with the original word, the other with its synonyms, one synonym per row.

import nltk
from nltk.corpus import wordnet
import pandas as pd

# Requires the WordNet corpus: nltk.download('wordnet')
words = ['protest', 'riot', 'conflict']

def process_genre(genres):
    for genre in genres:
        result = []
        # Collect every lemma name from every synset of the word
        for syn in wordnet.synsets(genre):
            for lemma in syn.lemmas():
                result.append(lemma.name())
        print(set(result))

process_genre(words)

output:
-------
{'resist', 'objection', 'dissent', 'protestation', 'protest'}
{'bacchanalia', 'riot', 'saturnalia', 'belly_laugh', 'scream', 'wow', 'bacchanal', 'thigh-slapper', 'sidesplitter', 'drunken_revelry', 'carouse', 'rioting', 'roister', 'debauchery', 'orgy', 'public_violence', 'howler', 'debauch'}
{'fight', 'battle', 'difference', 'dispute', 'conflict', 'infringe', 'engagement', 'struggle', 'difference_of_opinion', 'contravene', 'run_afoul'}

I want to store the results in a DataFrame like this:

# Expected Result:

Col1        Col2
--------------------
protest     resist
protest     objection
protest     dissent
...         ...
riot        scream
riot        carouse
riot        saturnalia
...         ...
conflict    fight
conflict    battle
...         ...

Best answer

Here is one possible solution:

from nltk.corpus import wordnet
import pandas as pd

def process_genres(genres):
    # One (word, lemma) row per lemma of every synset; exact duplicates are dropped
    return (pd.DataFrame([(genre, l.name())
                          for genre in genres
                          for syn in wordnet.synsets(genre)
                          for l in syn.lemmas()], columns=['Col1', 'Col2'])
            .drop_duplicates())

Usage:

>>> genres = ['protest', 'riot', 'conflict']
>>> df = process_genres(genres)
>>> df
        Col1             Col2
0    protest          protest
1    protest     protestation
...
11      riot             riot
12      riot  public_violence
13      riot          rioting
...
34  conflict         conflict
35  conflict         struggle
36  conflict           battle
...
53  conflict       contravene
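
As an alternative sketch (not part of the original answer), the same long format can be reached by first storing one list of synonyms per word and then calling `DataFrame.explode`, which turns each list element into its own row. The helper name `synonyms_frame`, its `lookup` parameter, and the `SAMPLE` dictionary are hypothetical and exist only for illustration. `SAMPLE` is a small stand-in built from the sets printed in the question, so the block runs without the WordNet corpus installed.

```python
import pandas as pd


def synonyms_frame(words, lookup):
    """Return a two-column frame with one (word, synonym) pair per row.

    `lookup` is a hypothetical callable that maps a word to an iterable
    of synonym strings (for example, a thin wrapper around WordNet).
    """
    df = pd.DataFrame({"Col1": list(words)})
    # One cell per word, holding a sorted, de-duplicated list of synonyms.
    df["Col2"] = df["Col1"].map(lambda w: sorted(set(lookup(w))))
    # explode() gives every list element its own row and repeats Col1.
    # Note: a word with an empty list would yield a single row with NaN.
    return df.explode("Col2", ignore_index=True)


# Stand-in data copied from the output shown in the question (assumption:
# only a subset is used here, it is not the full WordNet result).
SAMPLE = {
    "protest": ["resist", "objection", "dissent", "protestation", "protest"],
    "riot": ["scream", "carouse", "saturnalia"],
}

df = synonyms_frame(["protest", "riot"], lambda w: SAMPLE.get(w, []))
print(df)
```

To use WordNet instead of the stand-in, `lookup` could be something like `lambda w: (l.name() for s in wordnet.synsets(w) for l in s.lemmas())`, assuming `from nltk.corpus import wordnet` and a downloaded corpus. Passing `ignore_index=True` renumbers the rows from 0, unlike the `drop_duplicates()` result above, which keeps gaps in the index.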

Regarding "python - How to store synonyms as a column in a DataFrame?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/66917904/
