
python - Open and read multiple text files and match words

Reposted. Author: 太空宇宙. Updated: 2023-11-04 04:02:37

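How can I create a script that reads two text files and prints the words in the second file that match words in text file number 1?

The code below is the most complete version I have. It matches words within strings and prints them, but I need it to read two or more large text files and print the words they have in common. Thanks.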

import re

def get_words_from_string(s):
    # Lowercase the text and return its unique words
    return set(re.findall(r'\w+', s.lower()))

def get_words_from_file(fname):
    # Open in text mode so the str pattern above can be applied
    with open(fname, 'r') as inf:
        return get_words_from_string(inf.read())

def all_words(needle, haystack):
    # True if every word in needle also occurs in haystack
    return set(needle).issubset(set(haystack))

def any_words(needle, haystack):
    # The words that needle and haystack share
    return set(needle).intersection(set(haystack))

search_words = get_words_from_string("this my test")
find_in = get_words_from_string("If this were my test, I is passing")

print(search_words)

Best answer

This could be condensed with list comprehensions, but it gets the job done:

import os

def get_words(filename):
    # Collect each distinct whitespace-separated word, in order of first appearance
    wordlist = []
    with open(filename) as fp:
        for line in fp:
            wordsinline = line.strip().split()
            for item in wordsinline:
                if item not in wordlist:
                    wordlist.append(item)
    return wordlist

def find_common_words(filename1, filename2):
    wordlist1 = get_words(filename1)
    wordlist2 = get_words(filename2)

    # Set intersection keeps only the words present in both files
    matching_words = set(wordlist1) & set(wordlist2)
    print(matching_words)

def testit():
    # Work from the directory that contains this script
    os.chdir(os.path.abspath(os.path.dirname(__file__)))
    filename1 = 'words1.txt'
    filename2 = 'words2.txt'
    find_common_words(filename1, filename2)

if __name__ == '__main__':
    testit()
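
Since the question mentions two or more large files, here is a minimal sketch of a set-based variant that accepts any number of files. It is not part of the original answer. The names words_in_file and common_words and the utf-8 default encoding are assumptions made for illustration. It also borrows the \w+ pattern and lowercasing from the question's code instead of splitting on whitespace, so "test," and "Test" both count as "test".

```python
import re

def words_in_file(path, encoding="utf-8"):
    """Return the set of lowercase words in a text file.

    The file is read line by line, so only the distinct words
    (not the whole file) are held in memory.
    """
    words = set()
    with open(path, encoding=encoding) as fp:
        for line in fp:
            words.update(re.findall(r"\w+", line.lower()))
    return words

def common_words(*paths):
    """Return the words that occur in every one of the given files."""
    if not paths:
        return set()
    common = words_in_file(paths[0])
    for path in paths[1:]:
        # Keep only the words that also appear in this file
        common &= words_in_file(path)
    return common
```

Calling print(sorted(common_words('words1.txt', 'words2.txt'))) would list the shared words alphabetically. Adding to a set avoids the list membership check in get_words, which gets slower as the word list grows.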

For more on python - opening and reading multiple text files and matching words, see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/57943225/
