
python - Making abbreviations - picking the first character of each non-stopword

Reposted. Author: 太空宇宙. Updated: 2023-11-04 09:35:39

Given a list of stopwords and a DataFrame with one column holding the full forms, as shown:

stopwords = ['of', 'and', '&', 'com', 'org']
df = pd.DataFrame({'Full form': ['World health organization', 'Intellectual property', 'royal bank of canada']})
df

+---+---------------------------+
|   | Full form                 |
+---+---------------------------+
| 0 | World health organization |
| 1 | Intellectual property     |
| 2 | royal bank of canada      |
+---+---------------------------+

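I am looking for a way to add an adjacent column holding the abbreviation of each full form, ignoring stopwords if there are any.

Expected output: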

+---+---------------------------+----------------+
|   | Full form                 | Abbreviation   |
+---+---------------------------+----------------+
| 0 | World health organization | WHO            |
| 1 | Intellectual property     | IP             |
| 2 | royal bank of canada      | RBC            |
+---+---------------------------+----------------+

Best answer

This should do it:

import pandas as pd

stopwords = ['of', 'and', '&', 'com', 'org']
df = pd.DataFrame({'Full form': ['World health organization', 'Intellectual property', 'royal bank of canada']})


def abbrev(t, stopwords=stopwords):
    # Take the first character of every word that is not a stopword, then uppercase.
    return ''.join(u[0] for u in t.split() if u not in stopwords).upper()


df['Abbreviation'] = df['Full form'].apply(abbrev)

print(df)

Output:

                   Full form  Abbreviation
0  World health organization           WHO
1      Intellectual property            IP
2       royal bank of canada           RBC
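
Note that the check `u not in stopwords` above is case-sensitive, so a capitalized stopword such as 'Of' would still contribute a letter. The following is a minimal sketch of one possible variant, not part of the original answer: the name `abbrev_ci`, the `STOPWORDS` constant, and the choice to return an empty string for non-string input are all assumptions made for illustration.

```python
# Hypothetical variant of abbrev(): case-insensitive stopword matching.
# STOPWORDS and abbrev_ci are illustrative names, not from the original answer.
STOPWORDS = frozenset({'of', 'and', '&', 'com', 'org'})


def abbrev_ci(text, stopwords=STOPWORDS):
    """Join the first character of each non-stopword token, uppercased."""
    if not isinstance(text, str):
        # Assumption: missing values (None, NaN) map to an empty abbreviation.
        return ''
    return ''.join(
        word[0] for word in text.split() if word.lower() not in stopwords
    ).upper()


for full_form in ['World health organization', 'Royal Bank Of Canada', None]:
    print(repr(full_form), '->', repr(abbrev_ci(full_form)))
```

Under these assumptions it can be applied the same way as the original function, with df['Full form'].apply(abbrev_ci). A frozenset is used only to make membership tests cheap and to avoid a mutable default argument.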

Regarding "python - Making abbreviations - picking the first character of each non-stopword", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53874357/
