python - Remove specific text from a text file

Reposted. Author: 行者123. Updated: 2023-11-28 17:46:05

I have a text file:

>E8|E2|E9D
Football is a good game
Its good for health
you can play it every day
>E8|E2|E10D
Sequence unavailable
>E8|E2|EKB
Cricket

I wrote the following code to detect the unavailable sequences in the text file and write them to a new text file:

lastline = None
with open('output.txt', 'w') as W:
    with open('input.txt', 'r') as f:
        for line in f.readlines():
            if not lastline:
                lastline = line.rstrip('\n')
                continue
            if line.rstrip('\n') == 'Sequence unavailable':
                _, _, id = lastline.split('|')
                data = 'Sequence unavailable|' + id
                W.write(data)
                W.write('\n')
            lastline = None

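It works fine: it detects the unavailable sequences in the text file and writes them to a new file. But I also want it to remove them from the file it reads.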

input.txt

>E8|E2|E9D
Football is a good game
Its good for health
you can play it every day
>E8|E2|E10D
Sequence unavailable
>E8|E2|EKB
Cricket

After the code runs, input.txt should look like this:

>E8|E2|E9D
Football is a good game
Its good for health
you can play it every day
>E8|E2|EKB
Cricket

Best answer

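I have not used the file.readlines method here, because it reads every line of the file into a list at once, which is not memory-efficient. Iterating over the file object directly yields one line at a time.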
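
To make the difference concrete, here is a minimal sketch of line-by-line iteration. The function name count_unavailable is hypothetical and not part of the original answer; it only counts the marker lines, assuming the marker is exactly 'Sequence unavailable' as in the question.

```python
def count_unavailable(path):
    """Count 'Sequence unavailable' lines without loading the whole file."""
    count = 0
    with open(path) as f:
        # Iterating over the file object reads one line at a time,
        # whereas f.readlines() would build a list of every line first.
        for line in f:
            if line.rstrip('\n') == 'Sequence unavailable':
                count += 1
    return count
```

Only the current line and the counter are held by the loop, so the same pattern should scale to files much larger than the example.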

Method 1: Using a temporary file

import os

with open('input.txt') as f1, open('output.txt', 'w') as f2, \
        open('temp_file', 'w') as f3:
    lines = []  # store lines between two `>` in this list
    for line in f1:
        if line.startswith('>'):
            if lines:
                # previous record had no 'Sequence unavailable' line: keep it
                f3.writelines(lines)
                lines = [line]
            else:
                lines.append(line)
        elif line.rstrip('\n') == 'Sequence unavailable':
            # move the header and the marker line to output.txt
            f2.writelines(lines + [line])
            lines = []
        else:
            lines.append(line)

    f3.writelines(lines)  # write the last record

# replace the original file with the filtered copy
os.remove('input.txt')
os.rename('temp_file', 'input.txt')

Demo:

$ cat input.txt
>E8|E2|E9D
Football is a good game
Its good for health
you can play it every day
>E8|E2|E10D
Sequence unavailable
>E8|E2|EKB
Cricket

$ python so.py

$ cat input.txt
>E8|E2|E9D
Football is a good game
Its good for health
you can play it every day
>E8|E2|EKB
Cricket
$ cat output.txt
>E8|E2|E10D
Sequence unavailable

To generate the temporary file, you can also use the tempfile module.
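
As a rough sketch of that suggestion (not the original answer's code), Method 1 could be wrapped in a function that uses tempfile.NamedTemporaryFile. The function name strip_unavailable and its parameters are assumptions made for illustration, and os.replace requires Python 3.3 or later.

```python
import os
import tempfile

MARKER = 'Sequence unavailable'

def strip_unavailable(input_path, removed_path):
    """Drop records marked MARKER from input_path; save them to removed_path."""
    # Create the temp file next to the input so the final replace
    # stays on the same filesystem.
    directory = os.path.dirname(os.path.abspath(input_path))
    with open(input_path) as src, \
            open(removed_path, 'w') as removed, \
            tempfile.NamedTemporaryFile('w', dir=directory,
                                        delete=False) as tmp:
        record = []  # lines of the current record, starting at its '>' header
        for line in src:
            if line.startswith('>'):
                tmp.writelines(record)  # previous record is kept
                record = [line]
            elif line.rstrip('\n') == MARKER:
                removed.writelines(record + [line])
                record = []
            else:
                record.append(line)
        tmp.writelines(record)  # write the last record
    # Swap the filtered copy into place (overwrites the original).
    os.replace(tmp.name, input_path)
```

Calling strip_unavailable('input.txt', 'output.txt') would then play the role of the script above, with the temporary file name chosen by the tempfile module instead of being hard-coded.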

Method 2: The fileinput module

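This method does not need an explicit temporary file, because fileinput with inplace=True redirects standard output into the input file and handles the backup copy itself: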

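import fileinput

with open('output.txt', 'w') as f2:
    lines = []
    # with inplace=True, standard output is redirected into input.txt
    for line in fileinput.input('input.txt', inplace=True):
        if line.startswith('>'):
            if lines:
                # Python 3 form of the original `print "".join(lines),`
                print("".join(lines), end="")
                lines = [line]
            else:
                lines.append(line)
        elif line.rstrip('\n') == 'Sequence unavailable':
            f2.writelines(lines + [line])
            lines = []
        else:
            lines.append(line)

# the last record is still buffered in `lines`; append it to the file
with open('input.txt', 'a') as f:
    f.writelines(lines)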
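
Both methods write the removed record to output.txt as its header line followed by the marker line, whereas the script in the question wrote 'Sequence unavailable|' plus the record ID. If that format is preferred, a small helper along these lines could be used. The name format_removed is hypothetical, and the sketch assumes the ID is the last '|'-separated field of the header.

```python
def format_removed(header):
    """Turn a header such as '>E8|E2|E10D' into 'Sequence unavailable|E10D'."""
    # Assumption: the record ID is the last '|'-separated field of the header.
    record_id = header.rstrip('\n').split('|')[-1]
    return 'Sequence unavailable|' + record_id + '\n'
```

In either method, f2.writelines(lines + [line]) could then be replaced with f2.write(format_removed(lines[0])), provided every marker line is preceded by a header.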

Regarding "python - Remove specific text from a text file", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/17859479/
