regex - Perl slurp regex capture

Reposted. Author: 行者123. Updated: 2023-12-01 13:25:49

Using Perl, I have slurped in a large file containing the text below, and I am trying to capture every $1 match of my regex in that file. My regex is

=~ /((GET|PUT|POST|CONNECT).*?(Content-Type: (image\/jpeg)))/sgm 

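It currently captures the text shown in bold; however, the last capture also includes the lines from

"GET /~sgtatham/putty/latest/x86/pscp.exe HTTP/1.1" to "Content-Type: text/html; charset=iso-8859-1" 

as part of that capture, which it should not, because "text/html" does not match the (image\/jpeg) group in my regex. I would like to capture the matches without

"GET /~sgtatham/putty/latest/x86/pscp.exe HTTP/1.1" to "Content-Type: text/html; charset=iso-8859-1" being included.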

Any help is appreciated, thanks.

**GET /~sgtatham/putty/latest/x86/pscp.exe HTTP/1.1  
Host: the.earth.li
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:13.0) Gecko/20100101 Firefox/13.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
\.+"
GET /~sgtatham/putty/0.62/x86/pscp.exe HTTP/1.1
Host: the.earth.li
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:13.0) Gecko/20100101 Firefox/13.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Content-Length: 315392
Keep-Alive: timeout=15, max=99
Connection: Keep-Alive
Content-Type: image/jpeg**
Platform: Digital Engagement Platform; Version: 1.1.0.0

Best answer

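You can do this easily with (?!pattern), a negative lookahead assertion. Checking the lookahead before every character stops the match from running across the start of another request. For a refresher, see the article "Positive examples of positive and negative lookahead" (ourcraft.wordpress.com).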

Regex

$text =~ /
    (                                # start capture
        (?:GET|PUT|POST|CONNECT)     # start phrase
        (?:
            (?!GET|PUT|POST|CONNECT) # make sure none of these phrases starts here
            .                        # accept any character (newlines too, because of the s flag)
        )*?                          # any number of times (not greedy)
        Content-Type:\simage\/jpeg   # end phrase
    )                                # end capture
/msx;
print $1;
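
To see why the lookahead matters, here is a minimal sketch that runs both styles of pattern against a tiny, made-up sample (the two requests and the variable names `$lazy` and `$tempered` are assumptions for illustration, not part of the original question). The lazy `.*?` pattern still begins at the first GET, because the regex engine takes the leftmost position where a match is possible; the lookahead version refuses to cross into another request.

```perl
use strict;
use warnings;

# Hypothetical sample: two requests, only the second one is image/jpeg.
my $text = join "\n",
    'GET /a.html HTTP/1.1',
    'Content-Type: text/html',
    'GET /b.jpg HTTP/1.1',
    'Content-Type: image/jpeg',
    '';

# Lazy style from the question: the match starts at the first GET and
# stretches until an image/jpeg header is found, spanning both requests.
my ($lazy) = $text =~ /((?:GET|PUT|POST|CONNECT).*?Content-Type:\simage\/jpeg)/s;

# Lookahead style from the answer: no character inside the match may be
# the start of another request method.
my ($tempered) = $text =~ /
    (
        (?:GET|PUT|POST|CONNECT)
        (?: (?!GET|PUT|POST|CONNECT) . )*?
        Content-Type:\simage\/jpeg
    )
/sx;

print "lazy:\n$lazy\n\n";
print "tempered:\n$tempered\n";
```

Note that the lookahead rejects the method names anywhere in the request, not only at the start of a line, so a header value containing an uppercase "GET" or "POST" would also end the scan. Anchoring the alternatives with `^` under the `m` flag may be a reasonable tightening if that matters for your data.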

All occurrences

while ($text =~ m/REGEXP/msxg) {    # REGEXP stands for the pattern above
    print $1;
}
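
As a fuller sketch of the loop, the script below slurps its input, keeps the pattern in a `qr//` object, and collects every capture. The sample requests, the in-memory filehandle standing in for the large file, and the names `$method`, `$request`, and `@matches` are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the large file; an in-memory filehandle
# keeps the sketch self-contained.
my $data = join "\n",
    'GET /one.html HTTP/1.1',
    'Content-Type: text/html',
    'GET /two.jpg HTTP/1.1',
    'Content-Type: image/jpeg',
    'POST /three.jpg HTTP/1.1',
    'Content-Type: image/jpeg',
    '';

open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $text = do { local $/; <$fh> };    # undefined $/ reads the whole file at once
close $fh;

# Define the method list once so the start phrase and the lookahead agree.
my $method  = qr/GET|PUT|POST|CONNECT/;
my $request = qr/
    (
        $method
        (?: (?!$method) . )*?
        Content-Type:\simage\/jpeg
    )
/sx;

# With the g flag in scalar context, each iteration resumes where the
# previous match ended.
my @matches;
while ($text =~ /$request/g) {
    push @matches, $1;
}

print scalar(@matches), " match(es)\n";
print "$_\n---\n" for @matches;
```

Defining the method alternation once avoids the two copies drifting apart when a method such as DELETE is added later.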

Output

GET /~sgtatham/putty/0.62/x86/pscp.exe HTTP/1.1  
Host: the.earth.li
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:13.0) Gecko/20100101 Firefox/13.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Content-Length: 315392
Keep-Alive: timeout=15, max=99
Connection: Keep-Alive
Content-Type: image/jpeg

Regarding "regex - Perl slurp regex capture", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/11372265/
