python - How to compare each line and get the last complete sentence in Python

Reposted. Author: 太空宇宙. Updated: 2023-11-03 20:35:37

I have a file with the following content.

BEFORE
BEFORE THE
BEFORE THE PARLIAMENT
BEFORE THE PARLIAMENT ON
BEFORE THE PARLIAMENT ON BRITAIN'S
BEFORE THE PARLIAMENT ON BRITAIN'S RELATIONS
BEFORE THE PARLIAMENT ON BRITAIN'S RELATIONS WITH
BEFORE THE PARLIAMENT ON BRITAIN'S RELATIONS WITH SCOTLAND
BRITAIN'S RELATIONS WITH SCOTLAND FOLLOWING
BRITAIN'S RELATIONS WITH SCOTLAND FOLLOWING THE
BRITAIN'S RELATIONS WITH SCOTLAND FOLLOWING THE REFERENDUM
SCOTLAND FOLLOWING THE REFERENDUM VOTE.
SCOTLAND FOLLOWING THE REFERENDUM VOTE. LAST
SCOTLAND FOLLOWING THE REFERENDUM VOTE. LAST MONTH
SCOTLAND FOLLOWING THE REFERENDUM VOTE. LAST MONTH SCOTLAND
REFERENDUM VOTE. LAST MONTH SCOTLAND VOTED
REFERENDUM VOTE. LAST MONTH SCOTLAND VOTED IN
REFERENDUM VOTE. LAST MONTH SCOTLAND VOTED IN FAVOR
REFERENDUM VOTE. LAST MONTH SCOTLAND VOTED IN FAVOR OF
REFERENDUM VOTE. LAST MONTH SCOTLAND VOTED IN FAVOR OF STAYING
REFERENDUM VOTE. LAST MONTH SCOTLAND VOTED IN FAVOR OF STAYING WITH
LAST MONTH SCOTLAND VOTED IN FAVOR OF STAYING WITH THE
LAST MONTH SCOTLAND VOTED IN FAVOR OF STAYING WITH THE UNITED
LAST MONTH SCOTLAND VOTED IN FAVOR OF STAYING WITH THE UNITED KINGDOM
LAST MONTH SCOTLAND VOTED IN FAVOR OF STAYING WITH THE UNITED KINGDOM AFTER
LAST MONTH SCOTLAND VOTED IN FAVOR OF STAYING WITH THE UNITED KINGDOM AFTER THE

I am trying to ignore the repeated sentences and keep only the last complete one, so the result should look like this:

BEFORE THE PARLIAMENT ON BRITAIN'S RELATIONS WITH SCOTLAND
BRITAIN'S RELATIONS WITH SCOTLAND FOLLOWING THE REFERENDUM
SCOTLAND FOLLOWING THE REFERENDUM VOTE. LAST MONTH SCOTLAND
REFERENDUM VOTE. LAST MONTH SCOTLAND VOTED IN FAVOR OF STAYING WITH
LAST MONTH SCOTLAND VOTED IN FAVOR OF STAYING WITH THE UNITED KINGDOM AFTER THE

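I check whether the previous line is contained in the next line. If it is, I want to keep iterating; if it is not, I want to add the last sentence to a list. However, my logic below does not work.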

with open("data.txt", 'r') as f:
    data = f.read()
data_list = []
comp_word = "BEFORE"
for line in data:
    if comp_word in line:
        comp_word == line
    elif comp_word not in line:
        data_list.append(line)

print(data_list)

What would be an alternative way to solve this?

Best answer

data = []
with open("data.txt") as infile:
    cache = ''
    for line in infile:
        line = line.strip()
        # if the current line is an extension of the last line, update the cache and move on
        if line.startswith(cache):
            cache = line
        else:
            # we see a brand new content line. Write out the cache and reset it to the current line's contents
            data.append(cache)
            cache = line
    # flush the last cached line (skipped if the file was empty)
    if cache:
        data.append(cache)
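
The approach in the answer above can also be written as a small function that takes any iterable of lines, which makes it easier to try without a file. This is only a sketch: the name last_complete_lines and the shortened sample list are made up for illustration, and skipping blank lines is an assumption that the original answer does not state.

```python
def last_complete_lines(lines):
    """Collapse runs of progressively extended lines, keeping the last line of each run."""
    result = []
    cache = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines carry no content and are skipped
        # A line that does not extend the cached one starts a new run,
        # so the cached line was the last complete line of the previous run.
        if cache and not line.startswith(cache):
            result.append(cache)
        cache = line
    if cache:
        result.append(cache)  # flush the final run
    return result


# Hypothetical, shortened sample in the same shape as the file in the question.
sample = [
    "BEFORE",
    "BEFORE THE",
    "BEFORE THE PARLIAMENT",
    "THE PARLIAMENT ON",
    "THE PARLIAMENT ON BRITAIN'S",
]
print(last_complete_lines(sample))
# ['BEFORE THE PARLIAMENT', "THE PARLIAMENT ON BRITAIN'S"]
```

Passing an open file object works the same way, because iterating over a file yields one line at a time, whereas iterating over the string returned by f.read() yields single characters. Note that startswith compares raw characters, so a line such as "BEFOREHAND" would count as an extension of "BEFORE"; comparing line.split() lists instead would be one way to work on whole words.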

Regarding "python - How to compare each line and get the last complete sentence in Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57189895/
