
python - Why does my regex work on regexr.com but throw an error when run from the command line?

Reposted. Author: 行者123. Updated: 2023-12-01 07:32:03

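I need to solve two problems with a regular expression that I use to locate file paths.

1) Main problem: I am getting an error message that I don't understand. 2) Before I changed a few small things, the script would run, but the regex search returned nothing.

The regex does work when tested on regexr.com and pythex.org, with the matches in the right places. It does not work when I run it from the command line.

Here is the regex I am using: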

([a-zA-Z]:\\)([a-zA-Z0-9 ]*\\)*([a-zA-Z0-9 ]*\/)*([a-zA-Z0-9 ])*(\.[a-zA-Z]*)*

Here is the code it is used in:

import os
import re

#run script from directory the script is in - place it in the dir being processed
start_path = os.path.dirname(os.path.realpath(__file__))
metadata_path = start_path + "\Metadata"

#change directory to the metadata folder where email.txt is
try:
    os.chdir(metadata_path)
except: print ('Could not change directory. Please try again.')

with open("email.txt", 'r', encoding = 'utf-8') as file:
    all_lines = file.readlines()
    no_header = all_lines[5:] #remove the header lines from email.txt
    new_lines =[]
    all_files=[]
    unique_files =[]
    for i in range(len(no_header)):#remove square character
        new_lines.append(re.sub('\S\-\d+', '',no_header[i]))

    for i in range(len(new_lines)):#capture all the names of files containing personal emails
        test = re.search('([a-zA-Z]:\\)([a-zA-Z0-9 ]*\\)*([a-zA-Z0-9 ]*\/)*([a-zA-Z0-9 ])*(\.[a-zA-Z]*)*',new_lines[i])
        print (test)

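I get the error message "re.error: missing ), unterminated subpattern at position 0".

It has an even number of parentheses, and as far as I can tell they appear to match each other. My guess is that it has something to do with how I am grouping things in the pattern.

As for it returning nothing, am I missing a Python-specific rule that the online testers wouldn't catch?

Thanks!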

Best answer

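My guess is that the pattern is missing the r (raw string) prefix, or a parenthesis somewhere in the expression. Without r, Python collapses each \\ in the string literal into a single backslash before the re module sees it, so the engine receives \) (an escaped, literal parenthesis) and the group opened at position 0 is never closed.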

Test

import re

regex = r"([a-zA-Z]:\\)([a-zA-Z0-9 ]*\\)*([a-zA-Z0-9 ]*\/)*([a-zA-Z0-9 ])*(\.[a-zA-Z]*)*"

test_str = "a:\\a\\a/a.a"

print(re.search(regex, test_str))
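
To make the role of the r prefix concrete, here is a minimal sketch that uses only the first group of the pattern, written once as a plain string and once as a raw string. The helper name try_compile is made up for this illustration; it is not part of the original code.

```python
import re

# First group of the pattern, written without and with the r prefix.
plain = '([a-zA-Z]:\\)'   # Python turns "\\" into ONE backslash
raw = r'([a-zA-Z]:\\)'    # the raw string keeps BOTH backslashes

print(plain)  # what the regex engine receives: ([a-zA-Z]:\)
print(raw)    # what the regex engine receives: ([a-zA-Z]:\\)


def try_compile(pattern):
    """Hypothetical helper: return the compiled pattern, or the re.error raised."""
    try:
        return re.compile(pattern)
    except re.error as exc:
        return exc


plain_result = try_compile(plain)  # an re.error: the "\)" is a literal paren
raw_result = try_compile(raw)      # a compiled pattern

print(plain_result)
print(raw_result.search("C:\\Users"))
```

Online testers such as regexr.com take the pattern exactly as typed, with no string-literal layer in between, which would explain why the same text works there but fails once it is pasted into a plain Python string.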

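regex101.com explains the expression in its top-right panel, if you want to explore, simplify, or modify it, and it also lets you watch how the expression matches against some sample inputs.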
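
For an explanation that stays next to the code, the same pattern can be written with re.VERBOSE so that each group carries a comment. This is only a sketch: the name PATH_PATTERN and the comments are my reading of the groups, not something stated in the question.

```python
import re

# Same pattern as above, split across lines. In VERBOSE mode whitespace
# outside character classes is ignored; the space inside [a-zA-Z0-9 ] is kept.
PATH_PATTERN = re.compile(
    r"""
    ([a-zA-Z]:\\)          # drive letter, colon, one backslash
    ([a-zA-Z0-9 ]*\\)*     # zero or more folders, each ending in a backslash
    ([a-zA-Z0-9 ]*\/)*     # zero or more folders, each ending in a forward slash
    ([a-zA-Z0-9 ])*        # file name, one character per repetition
    (\.[a-zA-Z]*)*         # zero or more dot-separated extensions
    """,
    re.VERBOSE,
)

test_str = "a:\\a\\a/a.a"  # the sample string from the test above
match = PATH_PATTERN.search(test_str)
print(match.group(0))
print(match.groups())
```

Because the `*` in the fourth group sits outside the parentheses, that group only remembers the last character it matched; match.group(0) is the value to use for the whole path.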

Code

import os
import re

# Run the script from the directory it lives in; place it in the directory being processed.
start_path = os.path.dirname(os.path.realpath(__file__))
# os.path.join avoids the invalid "\M" escape in the literal "\Metadata".
metadata_path = os.path.join(start_path, "Metadata")

# Change directory to the Metadata folder, where email.txt is.
try:
    os.chdir(metadata_path)
except OSError:
    print('Could not change directory. Please try again.')

with open("email.txt", 'r', encoding='utf-8') as file:
    all_lines = file.readlines()

no_header = all_lines[5:]  # remove the header lines from email.txt
new_lines = []
all_files = []
unique_files = []
for line in no_header:  # remove the square character
    new_lines.append(re.sub(r'\S\-\d+', '', line))

# Raw string: the regex engine receives every backslash exactly as written.
path_pattern = r'([a-zA-Z]:\\)([a-zA-Z0-9 ]*\\)*([a-zA-Z0-9 ]*\/)*([a-zA-Z0-9 ])*(\.[a-zA-Z]*)*'
for line in new_lines:  # capture all the names of files containing personal emails
    test = re.search(path_pattern, line)
    print(test)
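
On the "returns nothing" part of the question: re.search returns None for any line that contains no match, so printing the result of every search shows None for those lines. The sketch below keeps only real matches and fills the otherwise unused all_files and unique_files lists. The function name extract_paths and the sample lines are hypothetical, since the contents of email.txt are not shown.

```python
import re

PATH_RE = re.compile(
    r'([a-zA-Z]:\\)([a-zA-Z0-9 ]*\\)*([a-zA-Z0-9 ]*\/)*([a-zA-Z0-9 ])*(\.[a-zA-Z]*)*'
)


def extract_paths(lines):
    """Hypothetical helper: return (all_files, unique_files) found in the lines."""
    all_files = []
    for line in lines:
        match = PATH_RE.search(line)
        if match is None:  # re.search returns None when nothing matches
            continue
        all_files.append(match.group(0))  # the whole matched path
    # dict.fromkeys drops duplicates while keeping first-seen order
    unique_files = list(dict.fromkeys(all_files))
    return all_files, unique_files


# Made-up sample lines standing in for the cleaned lines of email.txt.
sample_lines = [
    "C:\\Mail\\Inbox\\notes.txt    personal\n",
    "no path on this line\n",
    "C:\\Mail\\Inbox\\notes.txt    personal\n",
    "D:\\Archive\\old mail.msg    personal\n",
]
all_files, unique_files = extract_paths(sample_lines)
print(all_files)
print(unique_files)
```

Compiling the pattern once and reusing it is a small design choice here; calling re.search with the raw string on every line, as in the code above, behaves the same way.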

Regarding "python - Why does my regex work on regexr.com but throw an error when run from the command line?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57168572/
