
Named groups in regular expressions in Go

Reposted. Author: 行者123. Updated: 2023-12-01 22:07:36

I need help using a regular expression in Go.
I want to parse a log file, and I wrote a regular expression that looks fine on https://regex101.com/r/p4mbiS/1/
A log line looks like this:

57.157.87.86 - - [06/Feb/2020:00:11:04 +0100] "GET /?parammore=1&customer_id=1&version=1.56&param=meaningful&customer_name=somewebsite.de&some_id=4&cachebuster=1580944263903 HTTP/1.1" 204 0 "https://www.somewebsite.com/more/andheresomemore/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0"
The regular expression is:
(?P<ip>([^\s]+)).+?\[(?P<localtime>(.*?))\].+?GET\s\/\?(?P<request>.+?)\".+?\"(?P<ref>.+?)\".\"(?P<agent>.+?)\"
The named groups should produce the following result:

ip: 57.157.87.86

localtime: 06/Feb/2020:00:11:04 +0100

request: parammore=1&customer_id=1&...HTTP/1.1

ref: https://www.somewebsite.com/more/andheresomemore/

agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:72.0)...


The Go code that regex101.com generates does not work for me. I tried to improve it, without success.
The Go code returns only the whole matched string, not the groups.
package main

import (
	"fmt"
	"regexp"
)

func main() {
	var re = regexp.MustCompile(`(?P<ip>([^\s]+)).+?\[(?P<localtime>(.*?))\].+?GET\s\/\?(?P<request>.+?)\".+?\"(?P<ref>.+?)\".\"(?P<agent>.+?)\"`)
	var str = `57.157.87.86 - - [06/Feb/2020:00:11:04 +0100] "GET /?parammore=1&customer_id=1&version=1.56&param=meaningful&customer_name=somewebsite.de&some_id=4&cachebuster=1580944263903 HTTP/1.1" 204 0 "https://www.somewebsite.com/more/andheresomemore/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0"`

	if len(re.FindStringIndex(str)) > 0 {
		fmt.Println(re.FindString(str), "found at index", re.FindStringIndex(str)[0])
	}
}
The fiddle is here: https://play.golang.org/p/e0_8PM-Nv6i

Best answer

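Since you defined capturing groups and need to extract their values, you need to use .FindStringSubmatch, which returns the whole match at index 0 followed by the text of each capturing group: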

package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Simplified version of the question's pattern. The nested groups are removed,
	// so indices 1 to 5 map directly to the five named fields.
	var re = regexp.MustCompile(`(?P<ip>\S+).+?\[(?P<localtime>.*?)\].+?GET\s/\?(?P<request>.+?)".+?"(?P<ref>.+?)"\s*"(?P<agent>.+?)"`)
	var str = `57.157.87.86 - - [06/Feb/2020:00:11:04 +0100] "GET /?parammore=1&customer_id=1&version=1.56&param=meaningful&customer_name=somewebsite.de&some_id=4&cachebuster=1580944263903 HTTP/1.1" 204 0 "https://www.somewebsite.com/more/andheresomemore/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0"`
	// FindStringSubmatch returns nil when the pattern does not match.
	match := re.FindStringSubmatch(str)
	if match == nil {
		fmt.Println("no match")
		return
	}
	fmt.Printf("IP: %s\nLocal Time: %s\nRequest: %s\nRef: %s\nAgent: %s\n", match[1], match[2], match[3], match[4], match[5])
}

Output:
IP: 57.157.87.86
Local Time: 06/Feb/2020:00:11:04 +0100
Request: parammore=1&customer_id=1&version=1.56&param=meaningful&customer_name=somewebsite.de&some_id=4&cachebuster=1580944263903 HTTP/1.1
Ref: https://www.somewebsite.com/more/andheresomemore/
Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0
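
Since the question asks for the values by group name, here is a minimal sketch of looking them up by name rather than by numeric index. It relies on the standard regexp method SubexpNames; the helper namedGroups and the variable logRe are hypothetical names introduced only for this illustration and are not part of the original answer.

```go
package main

import (
	"fmt"
	"regexp"
)

// logRe is the named-group pattern from the answer above.
var logRe = regexp.MustCompile(`(?P<ip>\S+).+?\[(?P<localtime>.*?)\].+?GET\s/\?(?P<request>.+?)".+?"(?P<ref>.+?)"\s*"(?P<agent>.+?)"`)

// namedGroups is a hypothetical helper (not part of the regexp package).
// It returns a map from group name to captured text, or nil if s does not match.
func namedGroups(re *regexp.Regexp, s string) map[string]string {
	match := re.FindStringSubmatch(s)
	if match == nil {
		return nil
	}
	groups := make(map[string]string)
	// SubexpNames is index-aligned with match: entry 0 (the whole match)
	// and unnamed groups have an empty name and are skipped here.
	for i, name := range re.SubexpNames() {
		if name != "" {
			groups[name] = match[i]
		}
	}
	return groups
}

func main() {
	line := `57.157.87.86 - - [06/Feb/2020:00:11:04 +0100] "GET /?parammore=1&customer_id=1&version=1.56&param=meaningful&customer_name=somewebsite.de&some_id=4&cachebuster=1580944263903 HTTP/1.1" 204 0 "https://www.somewebsite.com/more/andheresomemore/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0"`
	groups := namedGroups(logRe, line)
	if groups == nil {
		fmt.Println("no match")
		return
	}
	// Print in a fixed order, because Go map iteration order is not defined.
	for _, name := range []string{"ip", "localtime", "request", "ref", "agent"} {
		fmt.Printf("%s: %s\n", name, groups[name])
	}
}
```

A map keeps the calling code independent of group order, so adding or reordering groups in the pattern should not require touching the lookups.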

Note that named capturing groups are not required; you can simply use numbered ones:
^(\S+)[\s-]+\[([^][]*)]\s+"GET\s+/\?([^"]+)"[^"]+"([^"]+)"\s+"([^"]+)"$

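Using .+? frequently in a pattern is not a good idea because it degrades performance, so I replaced those dot patterns with negated character classes and tried to make the pattern more verbose.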
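
As a rough sketch of how the numbered pattern might be applied to a whole log rather than a single string, the following reads the input line by line and skips lines that do not match. The helper parseLines, the variable numberedRe, and the sample input are assumptions made for this illustration; only the pattern itself comes from the answer.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// numberedRe is the numbered-group pattern from the answer above.
// It is anchored with ^ and $, so it is applied to one line at a time.
var numberedRe = regexp.MustCompile(`^(\S+)[\s-]+\[([^][]*)]\s+"GET\s+/\?([^"]+)"[^"]+"([^"]+)"\s+"([^"]+)"$`)

// parseLines is a hypothetical helper. For every matching line it returns
// the five captured fields in order: ip, local time, request, ref, agent.
func parseLines(input string) [][]string {
	var rows [][]string
	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		match := numberedRe.FindStringSubmatch(scanner.Text())
		if match == nil {
			continue // not a GET line in the expected format
		}
		rows = append(rows, match[1:]) // drop index 0, the whole match
	}
	return rows
}

func main() {
	// Made-up sample input: one matching line and one that is skipped.
	input := `57.157.87.86 - - [06/Feb/2020:00:11:04 +0100] "GET /?version=1.56 HTTP/1.1" 204 0 "https://www.somewebsite.com/more/andheresomemore/" "Mozilla/5.0"
this line is not in the expected format`
	for _, row := range parseLines(input) {
		fmt.Printf("IP: %s\nLocal Time: %s\nRequest: %s\nRef: %s\nAgent: %s\n", row[0], row[1], row[2], row[3], row[4])
	}
}
```

Because the pattern requires GET and a query string starting with /?, lines for other request types are silently skipped in this sketch; a real parser would probably want to count or log them.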

This post on named groups in Go regular expressions is based on a similar question on Stack Overflow: https://stackoverflow.com/questions/60109288/
