
ruby - Regular expression to match strings with repeating patterns

Reposted. Author: 数据小太阳. Updated: 2023-10-29 08:57:14

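I'm trying to find a regex that matches URLs containing three or more repeating segments (where the repeated unit may span any number of directories), such as: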

  • s1 = 'http://www.foo.com/bar/bar/bar/'
  • s2 = 'http://www.foo.com/baz/biz/baz/biz/baz/biz/etc'
  • s3 = '/foo/bar/foo/bar/foo/bar/'

and that does not match URLs such as:

  • s4 = '/foo/bar/foo/bar/foo/barbaz'

First I tried:

re1 = /((.+\/)+)\1\1/

which works:

re1 === s1 #=> true
re1 === s2 #=> true

But as the number of segments increases, the time the regex takes to match grows exponentially:

require 'benchmark'
Benchmark.bm do |b|
  (10..15).each do |num|
    str = '/foo/bar' * num
    puts str
    b.report("#{num} repeats:") { /((.+\/)+)\1\1/ === str }
  end
end

user system total real
/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar
10 repeats: 0.060000 0.000000 0.060000 ( 0.054839)
/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar
11 repeats: 0.210000 0.000000 0.210000 ( 0.213492)
/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar
12 repeats: 0.870000 0.000000 0.870000 ( 0.871879)
/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar
13 repeats: 3.370000 0.010000 3.380000 ( 3.399224)
/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar
14 repeats: 13.580000 0.110000 13.690000 ( 13.790675)
/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar
15 repeats: 54.090000 0.210000 54.300000 ( 54.562672)
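One way to picture where the blow-up comes from: because . also matches /, a run of path segments can be cut into (.+\/) chunks in many different ways, and a backtracking engine may have to try a large share of them. The sketch below just counts those cuts. It is an illustration of the search space, not a model of the regex engine's internals; count_splits is a hypothetical helper, and the input 'foo/bar/' * n is a simplified stand-in for the benchmark string.

```ruby
# Count the ways a string can be cut into chunks that each fully match /.+\//,
# i.e. every chunk has at least two characters and ends with "/".
def count_splits(str)
  ways = Array.new(str.length + 1, 0)
  ways[0] = 1 # the empty prefix has exactly one (empty) split
  (1..str.length).each do |stop|
    next unless str[stop - 1] == '/'
    # A chunk str[start...stop] needs length >= 2 so that .+ is non-empty.
    (0..stop - 2).each { |start| ways[stop] += ways[start] }
  end
  ways[str.length]
end

(1..4).each do |n|
  puts "#{n} repeats: #{count_splits('foo/bar/' * n)} ways"
end
# 1 repeats: 2 ways
# 2 repeats: 8 ways
# 3 repeats: 32 ways
# 4 repeats: 128 ways
```

Each extra repeat multiplies the count by four, which is in line with the roughly fourfold increase per repeat in the timings above.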

Then I tried a regex similar to the one given here:

re2 = /(\/.+)(?=.*\1)\1\1/

This has no performance problem and matches the strings I want it to match:

re2 === s3 #=> true

But it also matches strings I don't want it to match, for example:

re2 === s4 #=> true, but should be false

I'm close with the second regex. What am I missing?
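To see why s4 slips through, it can help to inspect what re2 actually consumes. The sketch below uses only re2 and s4 from the question plus the standard MatchData accessors; the local names m, unit, matched and rest are illustrative.

```ruby
re2 = /(\/.+)(?=.*\1)\1\1/
s4  = '/foo/bar/foo/bar/foo/barbaz'

m = re2.match(s4)
unit    = m[1]         # the repeated unit captured by group 1
matched = m[0]         # the whole matched substring
rest    = m.post_match # whatever follows the match

puts unit    # /foo/bar
puts matched # /foo/bar/foo/bar/foo/bar
puts rest    # baz
```

The three copies of /foo/bar are found, and the match simply stops in the middle of the last segment, leaving baz behind. Nothing in re2 requires the repeated unit to end at a segment boundary.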

Best answer

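Change . to [^\/]. This should reduce the complexity of the regex, because it no longer tries to match "any" character: each repetition of the inner group can then cover exactly one path segment, so there is only one way to split a run of segments.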

require 'benchmark'

Benchmark.bm do |b|
  (10..15).each do |num|
    str = '/foo/bar' * num
    puts str
    b.report("#{num} repeats:") { /(([^\/]+\/)+)\1\1/ === str }
  end
end

10 repeats: 0.000000 0.000000 0.000000 ( 0.000015)
/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar
11 repeats: 0.000000 0.000000 0.000000 ( 0.000004)
/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar
12 repeats: 0.000000 0.000000 0.000000 ( 0.000004)
/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar
13 repeats: 0.000000 0.000000 0.000000 ( 0.000004)
/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar
14 repeats: 0.000000 0.000000 0.000000 ( 0.000004)
/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar/foo/bar
15 repeats: 0.000000 0.000000 0.000000 ( 0.000005)
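As a quick sanity check, the adjusted regex can also be run against the four sample strings from the question. This is a minimal sketch; the name re3 and the results hash are introduced here only for illustration.

```ruby
# The regex from the answer; the name re3 is hypothetical.
re3 = /(([^\/]+\/)+)\1\1/

samples = {
  's1' => 'http://www.foo.com/bar/bar/bar/',
  's2' => 'http://www.foo.com/baz/biz/baz/biz/baz/biz/etc',
  's3' => '/foo/bar/foo/bar/foo/bar/',
  's4' => '/foo/bar/foo/bar/foo/barbaz'
}

# Map each sample name to whether the regex matches it.
results = samples.transform_values { |str| re3 === str }

results.each { |name, matched| puts "#{name}: #{matched}" }
# s1: true
# s2: true
# s3: true
# s4: false
```

The first three strings match and s4 is rejected, because every copy of the repeated unit now has to end with a /.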

Regarding "ruby - Regular expression to match strings with repeating patterns", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/49180974/
