
python - Replace words with replacement words from another file

Reposted. Author: 太空宇宙. Last updated: 2023-11-04 08:08:00

The words in my text file (mytext.txt) need to be replaced with other words supplied in a second text file (replace.txt).

cat mytext.txt
this is here. and it should be there.
me is this will become you is that.

cat replace.txt
this that
here there
me you

The following code does not work as expected.

with open('mytext.txt', 'r') as myf:
    with open('replace.txt', 'r') as myr:
        for line in myf.readlines():
            for l2 in myr.readlines():
                original, replace = l2.split()
                print(line.replace(original, replace))

Expected output:

that is there. and it should be there. 
you is that will become you is that.

Best answer

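Edit: I stand corrected. The OP is asking for word-by-word replacement rather than plain string replacement ('become' should stay 'become', not turn into 'becoyou'). I guess a dict version might look like this, using the regex split method found in the comments on the accepted answer to "Splitting a string into words and punctuation":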

import re

def clean_split(string_input):
    """
    Split a string into its component tokens and return them as a list.
    Spaces and punctuation, including in-word apostrophes, are separate tokens.

    >>> clean_split("it's a good day today!")
    ['it', "'", 's', ' ', 'a', ' ', 'good', ' ', 'day', ' ', 'today', '!']
    """
    return re.findall(r"[\w]+|[^\w]", string_input)

# Build the lookup table: each line of replace.txt is "<original> <replacement>".
with open('replace.txt', 'r') as myr:
    replacements = dict(line.split() for line in myr if line.strip())

with open('mytext.txt', 'r') as myf:
    for line in myf:
        # The newline is itself a token, so suppress print's own line ending.
        print(''.join(replacements.get(word, word) for word in clean_split(line)), end='')
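
To see the difference between substring replacement and token-wise replacement without touching any files, here is a minimal sketch run on the second sample line. The replacement table is hard-coded from replace.txt, and the names naive and tokenized are only illustrative; it assumes Python 3.7+ so that the dict keeps insertion order.

```python
import re

def clean_split(string_input):
    # Same tokenizer as in the answer: runs of word characters, or one non-word character.
    return re.findall(r"[\w]+|[^\w]", string_input)

# Hard-coded copy of replace.txt for the demonstration.
replacements = {"this": "that", "here": "there", "me": "you"}
line = "me is this will become you is that."

# Substring replacement: str.replace also hits the "me" inside "become".
naive = line
for original, replacement in replacements.items():
    naive = naive.replace(original, replacement)

# Token-wise replacement: only whole tokens are looked up in the dict.
tokenized = "".join(replacements.get(tok, tok) for tok in clean_split(line))

print(naive)      # you is that will becoyou you is that.
print(tokenized)  # you is that will become you is that.
```

The second result matches the expected output from the question, while the first shows the 'becoyou' problem.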

I can't reason very well about regex efficiency, so I'd appreciate it if someone pointed out any glaring inefficiencies.
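
On the efficiency question, one possible alternative (not part of the original answer, and not benchmarked here) is to skip tokenizing altogether and let a single precompiled pattern find whole-word matches, with re.sub calling a function for each match. The helper name replace_words is hypothetical, and the table is again hard-coded from replace.txt. Whether this is actually faster would have to be measured, for example with timeit.

```python
import re

# Hard-coded copy of replace.txt for the demonstration.
replacements = {"this": "that", "here": "there", "me": "you"}

# Escape each key and try longer keys first, so that a key which is a
# prefix of another key cannot shadow it inside the alternation.
keys = sorted((re.escape(k) for k in replacements), key=len, reverse=True)

# \b on both sides restricts matches to whole words, so the "here" inside
# "there" and the "me" inside "become" are left alone.
pattern = re.compile(r"\b(?:" + "|".join(keys) + r")\b")

def replace_words(text):
    # re.sub calls the lambda once per match; the matched word is the dict key.
    return pattern.sub(lambda m: replacements[m.group(0)], text)

print(replace_words("this is here. and it should be there."))
print(replace_words("me is this will become you is that."))
```

Note that \b follows the regex definition of a word character, so this treats "it's" the same way the tokenizer does: "it" and "s" count as separate words.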

Edit 2: OK, I was inserting spaces between words and punctuation. This is now fixed by treating spaces as tokens and using ''.join() instead of ' '.join().
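
The effect described in Edit 2 can be reproduced in a few lines. The earlier version of the answer is not shown, so the whitespace-dropping pattern below is only an assumption about what it might have looked like; the variable names are illustrative.

```python
import re

line = "this is here. and it should be there."

# Assumed earlier behaviour: whitespace is discarded while tokenizing,
# so tokens have to be glued back together with ' '.join().
words_only = re.findall(r"\w+|[^\w\s]", line)
spaced = " ".join(words_only)

# Fixed behaviour: whitespace characters are tokens too, so ''.join()
# reproduces the input exactly.
with_spaces = re.findall(r"[\w]+|[^\w]", line)
exact = "".join(with_spaces)

print(spaced)  # this is here . and it should be there .
print(exact)   # this is here. and it should be there.
```

Keeping whitespace as tokens means the original spacing and line endings survive the round trip unchanged.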

Regarding "python - Replace words with replacement words from another file", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/27773802/
