
Python: string from a text file with words separated by \n will not split

Reposted. Author: 行者123. Updated: 2023-12-01 01:03:34

I was given a very long .txt file. Reading it returns one long string, a large corpus of words separated by \n, like this:

\na+\nabound\nabounds\nabundance\nabundant\naccessable\naccessible\nacclaim\nacclaimed\nacclamation\naccolade\naccolades\naccommodative\naccomodative\naccomplish\naccomplished\naccomplishment...\nworld-famous\nworth\nworth-while\nworthiness\nworthwhile\nworthy\nwow\nwowed\nwowing\nwows\nyay\nyouthful\nzeal\nzenith\nzest\nzippy\n

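I need to split this string into a list of these words, but none of the commands I usually use for .csv files work. I have tried strip(), replace(), split(), and splitlines(), but nothing breaks it up into a list of the words. Any help would be appreciated.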

punctuation_chars = ["'", '"', ",", ".", "!", ":", ";", '#', '[', ']', '@']
punctuation_chars2 = ["'", '"', ",", ".", "!", ":", ";", '#', '[', ']', '@', '\n']
# list of positive words to use
positive_words = []
wrd_list = []
new_list = []
with open("positive_words.txt", 'r', encoding="utf-16") as pos_f:
    for lin in pos_f:
        # skip comment lines (starting with ';') and blank lines
        if lin[0] != ';' and lin[0] != '\n':
            positive_words.append(lin.strip())

pos_wrds = positive_words[0]
pos_wrds = pos_wrds.strip()  # str.strip() returns a new string, so keep the result
print(pos_wrds)
for p in punctuation_chars:
    pos_wrds = pos_wrds.replace(p, "")
print(pos_wrds)

wrd_list = pos_wrds.splitlines()
new_list = wrd_list[-1].splitlines()  # call the method; the parentheses were missing

I would like to see a Python list in which each word is a separate item: list = [a+, abound, abounds, abundance, abundant...]

Best answer

splitlines works fine:

In [1]: text = "\na+\nabound\nabounds\nabundance\nabundant\naccessable\naccessible\nacclaim\nacclaimed\nacclamation\naccolade\naccolades\naccommodative\naccomodative\naccomplish\naccomplished\naccomplishment...\nworld-famous\nworth\nworth-while\nworthiness\nworthwhile\nworthy\nwow\nwowed\nwowing\nwows\nyay\nyouthful\nzeal\nzenith\nzest\nzippy\n"

In [2]: text.splitlines()
Out[2]:
['',
'a+',
'abound',
'abounds',
'abundance',
'abundant',
'accessable',
'accessible',
'acclaim',
'acclaimed',
'acclamation',
'accolade',
'accolades',
'accommodative',
'accomodative',
'accomplish',
'accomplished',
'accomplishment...',
'world-famous',
'worth',
'worth-while',
'worthiness',
'worthwhile',
'worthy',
'wow',
'wowed',
'wowing',
'wows',
'yay',
'youthful',
'zeal',
'zenith',
'zest',
'zippy']
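
The output above still starts with an empty string, because the text begins with a newline. The sketch below drops such empty entries. It also covers one possible reason why splitlines() might appear to do nothing: the file may contain a literal backslash followed by "n" (two characters) instead of real line breaks. That is only an assumption, since the question does not confirm it, and split_words is a hypothetical helper name used for illustration.

```python
def split_words(text: str) -> list:
    """Split a newline-delimited corpus into a list of non-empty words."""
    # Assumption: the text may hold a literal backslash + "n" (two characters)
    # instead of real line breaks, so convert those sequences first.
    text = text.replace("\\n", "\n")
    # splitlines() breaks on real line boundaries; the filter drops empty
    # entries such as the one produced by the leading newline.
    return [word.strip() for word in text.splitlines() if word.strip()]


real_newlines = "\na+\nabound\nabounds\nzippy\n"
literal_escapes = r"\na+\nabound\nabounds\nzippy\n"  # raw string: backslash + n

print(split_words(real_newlines))
print(split_words(literal_escapes))
```

Both calls print ['a+', 'abound', 'abounds', 'zippy']. Note that the replace step would also rewrite a backslash followed by "n" that is genuinely part of a word, which seems unlikely in a plain word list.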

Regarding "Python: string from a text file with words separated by \n will not split", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55582677/
