
regex - Perl regular expression capture

Reposted. Author: 行者123. Updated: 2023-12-04 18:17:02

I have the following text:

GET /mac/_base_v1/images/chrome/background_repeat.jpg HTTP/1.1  
Host: www.microsoft.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:13.0) Gecko/20100101 Firefox/13.0
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://www.microsoft.com/mac/base-css
DNT: 1
Connection: keep-alive
HTTP/1.1 200 OK
Cache-Control: max-age=900
Content-Type: image/jpegGET /mac/_base_v1/modules/button/images /buttonlarge_yellownormal.png HTTP/1.1
Host: www.microsoft.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:13.0) Gecko/20100101 Firefox/13.0
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://www.microsoft.com/mac/css
DNT: 1

and the following Perl regular expression:
while ($1 =~m/((GET|PUT|POST|CONNECT)\s+\S+)(?:(?!GET|PUT|POST|CONNECT\s+\S+).)*?Host:\s([^\n]+).*?User-Agent:\s([^\n]+).*?Referer:\s([^\n]+).*?Connection:/msg) {
# do something
}

It matches this fine:
GET /mac/_base_v1/modules/button/images/buttonlarge_yellownormal.png  
www.microsoft.com
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:13.0) Gecko/20100101 Firefox/13.0
http://www.microsoft.com/mac/css

However, I also need it to handle the following text, which has no Referer header:
GET /vi/k_dbVP4r4V4/hqdefault.jpg HTTP/1.1  
Host: i.ytimg.com
User-Agent: Apple iPad v4.3.5 YouTube v1.0.0.8L1
Accept-Language: en-us, *;q=0.5
Gdata-Version: 2
X-Gdata-Client: ytapi-apple-ipad
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Q2J}

and match the following:
GET /vi/k_dbVP4r4V4/hqdefault.jpg HTTP/1.1  
i.ytimg.com
Apple iPad v4.3.5 YouTube v1.0.0.8L1

while still matching the earlier text shown above.

Best answer

HTTP request and response headers are not as easy to parse as one might expect. For example, the following are all equivalent:

Accept-Encoding: gzip, deflate

Accept-Encoding: gzip,
    deflate

Accept-Encoding: gzip
Accept-Encoding: deflate
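
To make the equivalence of the three forms above concrete, here is a minimal sketch in core Perl of the normalization a parser has to perform: a continuation line (one that starts with a space or tab) is joined to the previous line, and repeated fields are combined into one comma-separated value. The helper name `normalize_headers` is hypothetical and the sketch ignores many details of real HTTP parsing. It only illustrates why a single-pass regex over raw lines is fragile.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any library): turn a raw header block
# into a hash of lower-cased field name => combined value.
sub normalize_headers {
    my ($raw) = @_;

    # Unfold continuation lines: a line break followed by spaces or tabs
    # continues the previous header's value.
    $raw =~ s/\r?\n[ \t]+/ /g;

    my %headers;
    for my $line (split /\r?\n/, $raw) {
        # Skip anything that does not look like "Name: value".
        my ($name, $value) = $line =~ /^([^\s:]+)[ \t]*:[ \t]*(.*?)[ \t]*$/
            or next;
        $name = lc $name;    # field names are case-insensitive

        # Repeated fields combine into one comma-separated value.
        $headers{$name} = exists $headers{$name}
            ? "$headers{$name}, $value"
            : $value;
    }
    return \%headers;
}

my @variants = (
    "Accept-Encoding: gzip, deflate\n",
    "Accept-Encoding: gzip,\n    deflate\n",
    "Accept-Encoding: gzip\nAccept-Encoding: deflate\n",
);

# Each variant normalizes to the same value.
print normalize_headers($_)->{'accept-encoding'}, "\n" for @variants;
```

All three variants print `gzip, deflate`, whereas a pattern such as `Accept-Encoding:\s([^\n]+)` would capture three different strings.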

Therefore, I recommend using an existing parser:
use strict;
use warnings;
use feature qw( say );
use HTTP::Request qw( );

my $s = <<'__EOI__';
GET /mac/_base_v1/images/chrome/background_repeat.jpg HTTP/1.1
Host: www.microsoft.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:13.0) Gecko/20100101 Firefox/13.0
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://www.microsoft.com/mac/base-css
DNT: 1
Connection: keep-alive
HTTP/1.1 200 OK
Cache-Control: max-age=900
Content-Type: image/jpegGET /mac/_base_v1/modules/button/images /buttonlarge_yellownormal.png HTTP/1.1
Host: www.microsoft.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:13.0) Gecko/20100101 Firefox/13.0
Accept: image/png,image/*;q=0.8,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://www.microsoft.com/mac/css
DNT: 1
__EOI__

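# Split before each line that begins a response ("HTTP/...");
# the first piece is the first request.
my ($raw_req, $raw_resp) = split qr{(?=^HTTP/)}m, $s;
my $req = HTTP::Request->parse($raw_req);
say $req->method;
say $req->uri;
say $req->user_agent;
say $req->header('User-Agent'); # Same as previous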
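
The question's second sample has no Referer header, which is what defeats the original pattern. As a hedged sketch of how the same parser copes with that case, the snippet below feeds that sample to `HTTP::Request->parse` and treats Referer as optional. The variable names are illustrative, and skipping the line when the header is missing is an assumption about the desired output.

```perl
use strict;
use warnings;
use feature qw( say );
use HTTP::Request qw( );

# Second sample from the question (no Referer header).
my $raw = <<'__EOI__';
GET /vi/k_dbVP4r4V4/hqdefault.jpg HTTP/1.1
Host: i.ytimg.com
User-Agent: Apple iPad v4.3.5 YouTube v1.0.0.8L1
Accept-Language: en-us, *;q=0.5
Gdata-Version: 2
X-Gdata-Client: ytapi-apple-ipad
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
__EOI__

my $req = HTTP::Request->parse($raw);

# In scalar context, header() yields undef for a field that is absent.
my $host    = $req->header('Host');
my $agent   = $req->header('User-Agent');
my $referer = $req->header('Referer');

say join ' ', $req->method, $req->uri;
say $host;
say $agent;
say $referer if defined $referer;    # skipped here: no Referer was sent
```

Because each field is looked up by name, the order of the headers and the absence of Referer no longer matter, so the same code handles both samples.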

Regarding regex - Perl regular expression capture, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/11488953/
