
python - Updating a text file from a Python dictionary

Reposted. Author: 太空宇宙. Updated: 2023-11-04 11:18:45

Hello community members,

Suppose I have a Python dictionary:

dict = {'fresh air', 'entertainment system', 'ice cream', 'milk', 'dog', 'blood pressure'}

and a list of texts, for example:

text_file = ['is vitamin d in milk enough', 'try to improve quality level by automatic intake of fresh air', 'turn on the tv or entertainment system based on that individual preferences', 'blood pressure monitor', 'I buy more ice cream', 'proper method to add frozen wild blueberries in ice cream']

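For every phrase that belongs to the dictionary (say, fresh air), I want each occurrence in the text file to appear as #fresh_air#, and for every single word in the dictionary (say, milk), the output should show #milk#. In other words, a special character is added at the start and end of every occurrence in text_file, and the space inside a phrase becomes an underscore.

My desired output has the following form (a list of lists):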

[[is vitamin d in #milk# enough], [try to improve quality level by automatic intake of #fresh_air#], [turn on the tv or #entertainment_system# based on the individual preferences], [#blood_pressure# monitor], [I buy more #ice_cream#], [proper method to add frozen wild blueberries in #ice_cream# with #milk#]]

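Is there a standard, time-efficient way to achieve this?

I am new to list and text processing in Python. I tried a list comprehension but could not get the expected result. Any help is greatly appreciated.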

Best answer

Use a regular expression.

For example:

import re

# Phrases to mark up (a set of strings)
data = {'fresh air', 'entertainment system', 'ice cream', 'milk', 'dog', 'blood pressure'}
# One alternation that matches any phrase; the parentheses capture the match
pattern = re.compile("("+"|".join(data)+")")
text_file = ['is vitamin d in milk enough', 'try to improve quality level by automatic intake of fresh air', 'turn on the tv or entertainment system based on that individual preferences', 'blood pressure monitor', 'I buy more ice cream', 'proper method to add frozen wild blueberries in ice cream']

# Wrap every match in '#' by referring back to the captured group \1
result = [pattern.sub(r"#\1#", i) for i in text_file]
print(result)

Output:

['is vitamin d in #milk# enough',
'try to improve quality level by automatic intake of #fresh air#',
'turn on the tv or #entertainment system# based on that individual preferences',
'#blood pressure# monitor',
'I buy more #ice cream#',
'proper method to add frozen wild blueberries in #ice cream#']
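
To see what this first pattern does on a single string, here is a small sketch. The sample sentence and the names phrases, loose and bounded are made up for illustration and are not part of the original answer. It suggests that a plain alternation also matches inside longer words, and that adding \b word boundaries is one way to avoid that.

```python
import re

phrases = {'milk', 'dog'}
# sorted() only makes the order of the alternatives deterministic
alternation = "|".join(sorted(phrases))

loose = re.compile("(" + alternation + ")")
bounded = re.compile(r"\b(" + alternation + r")\b")

sample = "hotdog with milk"
loose_result = loose.sub(r"#\1#", sample)      # also tags "dog" inside "hotdog"
bounded_result = bounded.sub(r"#\1#", sample)  # tags whole words only

print(loose_result)
print(bounded_result)
```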

Note that your dict variable is actually a set object, not a dictionary.
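
As a quick illustration of that distinction (the variable names here are hypothetical): braces containing bare values create a set, while braces containing key: value pairs create a dictionary. Separately, naming a variable dict hides Python's built-in dict type, so a different name is usually safer.

```python
# Braces with bare values build a set
phrases = {'fresh air', 'milk'}
# Braces with key: value pairs build a dictionary
mapping = {'fresh air': 'fresh_air', 'milk': 'milk'}

print(type(phrases).__name__)
print(type(mapping).__name__)
```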


Updated the snippet as requested in the comments.

Demo:

import re

data = {'fresh air', 'entertainment system', 'ice cream', 'milk', 'dog', 'blood pressure'}
# Map each phrase to its underscore-joined form, e.g. 'fresh air' -> 'fresh_air'
data = {i: i.replace(" ", "_") for i in data}
# Previous pattern, without word boundaries:
#pattern = re.compile("("+"|".join(data)+")")
# \b restricts matches to whole words
pattern = re.compile(r"\b("+"|".join(data)+r")\b")
text_file = ['is vitamin d in milk enough', 'try to improve quality level by automatic intake of fresh air', 'turn on the tv or entertainment system based on that individual preferences', 'blood pressure monitor', 'I buy more ice cream', 'proper method to add frozen wild blueberries in ice cream']

# Look up each matched phrase in the mapping and wrap the result in '#'
result = [pattern.sub(lambda x: "#{}#".format(data[x.group()]), i) for i in text_file]
print(result)

Output:

['is vitamin d in #milk# enough',
'try to improve quality level by automatic intake of #fresh_air#',
'turn on the tv or #entertainment_system# based on that individual preferences',
'#blood_pressure# monitor',
'I buy more #ice_cream#',
'proper method to add frozen wild blueberries in #ice_cream#']
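
If the phrase list might contain regex metacharacters, or one phrase that is a prefix of another, a slightly more defensive variant could look like the sketch below. The function name tag_phrases and the sample data are hypothetical and not from the original answer. The sketch assumes every phrase starts and ends with a word character, since \b would not behave as intended otherwise.

```python
import re

def tag_phrases(lines, phrases):
    """Wrap each known phrase in '#' and join its words with '_'."""
    # Longest phrases first, so 'ice cream' is tried before 'ice'
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape keeps characters such as '.' or '+' literal
    alternation = "|".join(re.escape(p) for p in ordered)
    pattern = re.compile(r"\b(" + alternation + r")\b")

    def wrap(match):
        return "#" + match.group(1).replace(" ", "_") + "#"

    return [pattern.sub(wrap, line) for line in lines]

phrases = {'ice cream', 'ice', 'milk'}
lines = ['I buy more ice cream', 'ice in milk']
tagged = tag_phrases(lines, phrases)
print(tagged)
```

The question asked for a list of lists; if that shape is really needed, wrapping each string, as in [[s] for s in tagged], would produce it.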

Regarding "python - Updating a text file from a Python dictionary", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/56444717/
