Python: How do I get the strings between matches?

Reposted. Author: 行者123. Updated: 2023-12-01 05:04:33

I have

import re

with open("file.txt", "r") as FILE:  # long text file
    TEXT = FILE.read()

# long identification code with dots (.) and a hyphen (-)
regex = r"process \d\d\d\d\d\d\d\-\d\d\.\d\d\d\d\.\d+\.\d\d\.\d\d\d\d"
SRC = re.findall(regex, TEXT, flags=re.IGNORECASE|re.MULTILINE)

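How do I get the text between the first character of one occurrence, SRC[i], and the first character of the next occurrence, SRC[i+1], and so on? I couldn't find a directly satisfying answer...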

Edit with more information:

pattern = r'process \d{7}\-\d{2}\.\d{4}\.\d+\.\d{2}\.\d{4}'

sample_input = "Process 1234567-89.1234.12431242.12.1234 - text title and long text description with no assured pattern Process 2234567-89.1234.12431242.12.1234 : chars and more text Process 3234567-89.1234.12431242.12.1234 - more text process 3234567-89.1234.12431242.12.1234 (...)"

sample_output[0] = "Process 1234567-89.1234.12431242.12.1234 - text title and long text description with no assured pattern "
sample_output[1] = "Process 2234567-89.1234.12431242.12.1234 : chars and more text "
sample_output[2] = "Process 3234567-89.1234.12431242.12.1234 - more text "
sample_output[3] = "process 3234567-89.1234.12431242.12.1234 "

Best answer

You can use this regular expression:

(Process \d{7}\-\d{2}\.\d{4}\.\d+\.\d{2}\.\d{4}.*?)(?=Process)|(Process \d{7}\-\d{2}\.\d{4}\.\d+\.\d{2}\.\d{4}.*)
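To make the pieces of that expression easier to follow, here is a minimal sketch of a simplified variant written in verbose mode. It is not the answer's exact pattern: the `|\Z` inside the lookahead stands in for the second alternative, and the `re.IGNORECASE` and `re.DOTALL` flags are assumptions added so that the lowercase "process" in the sample and multi-line descriptions are handled. The names `ENTRY` and `split_entries` are hypothetical.

```python
import re

# Under re.VERBOSE, unescaped whitespace in the pattern is ignored,
# so the literal space after the keyword is written as "[ ]".
ENTRY = re.compile(
    r"""
    Process[ ]        # literal keyword and the space after it
    \d{7}-\d{2}       # seven digits, a hyphen, two digits
    \.\d{4}           # a dot and four digits
    \.\d+             # a dot and one or more digits
    \.\d{2}\.\d{4}    # a dot and two digits, then a dot and four digits
    .*?               # the description, as short as possible
    (?=Process|\Z)    # stop right before the next keyword or at the end
    """,
    re.VERBOSE | re.IGNORECASE | re.DOTALL,
)


def split_entries(text):
    """Return each entry (identifier plus trailing description) as a string."""
    return [m.group(0) for m in ENTRY.finditer(text)]
```

Because the lookahead consumes nothing, each match ends exactly where the next one begins, so no text between entries is lost.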

Match information

MATCH 1
1. [0-105] `Process 1234567-89.1234.12431242.12.1234 - text title and long text description with no assured pattern `
MATCH 2
1. [105-168] `Process 2234567-89.1234.12431242.12.1234 : chars and more text `
MATCH 3
1. [168-221] `Process 3234567-89.1234.12431242.12.1234 - more text `
MATCH 4
2. [221-267] `Process 3234567-89.1234.12431242.12.1234 (...)`

You can use this code:

sample_input = "Process 1234567-89.1234.12431242.12.1234 -  text title and long text description with no assured pattern Process 2234567-89.1234.12431242.12.1234 : chars and more text Process 3234567-89.1234.12431242.12.1234 - more text process 3234567-89.1234.12431242.12.1234 (...)"
import re
pattern = r"(Process \d{7}\-\d{2}\.\d{4}\.\d+\.\d{2}\.\d{4}.*?)(?=Process)|(Process \d{7}\-\d{2}\.\d{4}\.\d+\.\d{2}\.\d{4}.*)"
# re.match only tests the start of the string; re.finditer yields every match.
# re.IGNORECASE is needed for the lowercase "process" in the sample.
for m in re.finditer(pattern, sample_input, flags=re.IGNORECASE):
    print(m.group(1) or m.group(2))  # group 2 is set when no further "Process" follows
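One caveat with a lookahead on the bare word "Process": an entry is cut short if that word happens to appear inside a description. A possible alternative, sketched below, follows the question's wording more literally by slicing the text between the start offsets of consecutive identifier matches. The names `ID_PATTERN` and `split_at_ids` are hypothetical, and the sketch assumes that any text before the first identifier can be discarded.

```python
import re

# Identifier pattern from the question, compiled case-insensitively.
ID_PATTERN = re.compile(
    r"process \d{7}-\d{2}\.\d{4}\.\d+\.\d{2}\.\d{4}", re.IGNORECASE
)


def split_at_ids(text):
    """Slice text from the start of each identifier to the start of the next."""
    starts = [m.start() for m in ID_PATTERN.finditer(text)]
    # Each slice ends where the next identifier begins; the last one
    # runs to the end of the text.
    ends = starts[1:] + [len(text)]
    return [text[s:e] for s, e in zip(starts, ends)]


if __name__ == "__main__":
    sample_input = (
        "Process 1234567-89.1234.12431242.12.1234 - text title and long text "
        "description with no assured pattern "
        "Process 2234567-89.1234.12431242.12.1234 : chars and more text "
        "Process 3234567-89.1234.12431242.12.1234 - more text "
        "process 3234567-89.1234.12431242.12.1234 (...)"
    )
    for piece in split_at_ids(sample_input):
        print(repr(piece))
```

Since only full identifiers act as boundaries, a stray "process" in the description text does not split an entry, and newlines inside a description need no extra flag.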

Regarding "Python: How do I get the strings between matches?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/25319049/
