regex - Performance of character classes vs. "shorthand" classes

Reposted. Author: 行者123. Updated: 2023-12-01 01:33:20

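While answering another question, it was suggested that explicit character classes ([0-9]) and "shorthand" classes (\d) might differ in performance. My initial reaction was that any difference, if there is one at all, would be negligible, but I don't have (and couldn't find) any information about it, nor could I work out how to test it.

Similarly, is there a non-negligible difference between [^0-9], [^\d], and \D?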

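Best answer

When in doubt, benchmark!

I benchmarked a simple regex comparison in Perl and found that \d+ is indeed faster, though only by a few percent in the runs below.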

use strict; use warnings;
use Benchmark qw(:all);

my $r1 = '\d+';
my $r2 = '[0-9]+';
my $r3 = '[[:digit:]]+';

# Read the sample lines once. <DATA> is exhausted after the first read,
# so slurping it inside test() would leave later iterations with no input.
my @lines = <DATA>;

# Map the test number to the pattern under test.
my %regex_for = (1 => $r1, 2 => $r2, 3 => $r3);

sub test {
    my $which = shift;
    my $RegEx = $regex_for{$which}
        or die "test() can't deal with $which\n";
    foreach my $line (@lines) {
        # Every sample line ends with the sum of the numbers before it.
        my @numbers = $line =~ /($RegEx)/g;
        my $total = 0;
        $total += $_ for @numbers;
        my $ln = $numbers[-1];
        $total -= $ln;
        if ($ln != $total) {
            print "Bad RegEx result: Last Num != total in line!\n";
            print "Total=$total, ln=$ln, line:$line";
            die;
        }
    }
}

# Run each variant for at least 10 CPU seconds and compare the rates.
cmpthese(-10, {$r1=>'test(1)', $r2=>'test(2)', $r3=>'test(3)'});


__DATA__
Clip clap clock 1 mouse ran up the clock with 3 hands. The total here is 4.
The mouse with 2 ears followed. The total here is 2.
After that, the 6 wiskered mouse did dances with 14 second timing. 20.
It is hard to make up 5 lines with 2 or 3 numbers in each line. 10.
You start thinking about nurserey rhymes with 1 o 2 or 3 number. 6.
1 12 13 123 23 13 55 66 21 45 1 373

On OS X with 64-bit Perl 5.10 I got the following results:
                   Rate [[:digit:]]+       [0-9]+          \d+
[[:digit:]]+ 200781/s           --          -1%          -2%
[0-9]+       202831/s           1%           --          -1%
\d+          204605/s           2%           1%           --

And the following on Ubuntu 10.04 with Perl 5.10.1:
                   Rate [[:digit:]]+       [0-9]+          \d+
[[:digit:]]+ 264412/s           --          -3%          -6%
[0-9]+       273202/s           3%           --          -3%
\d+          280541/s           6%           3%           --
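
The same approach could be applied to the negated forms from the question ([^0-9], [^\d], \D). The following is only a minimal sketch, not part of the original answer. The sample text and the hash %re are hypothetical, and the input is assumed to be ASCII-only, so that all three patterns match exactly the same text. No timings are claimed here; they would have to be measured on the target Perl build.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample input: alternating runs of non-digits and digits.
my $text = join ' ', map { "word$_ " . ($_ * 37) } 1 .. 200;

# The three negated forms from the question, precompiled once.
my %re = (
    '[^0-9]+' => qr/[^0-9]+/,
    '[^\d]+'  => qr/[^\d]+/,
    '\D+'     => qr/\D+/,
);

# Sanity check: on ASCII-only input all three must find the same
# number of matches, otherwise the comparison is meaningless.
my %count;
for my $name (keys %re) {
    my $re = $re{$name};
    my @matches = $text =~ /$re/g;
    $count{$name} = scalar @matches;
}
my ($first, @rest) = values %count;
die "patterns disagree\n" if grep { $_ != $first } @rest;

# Run each pattern for at least 1 CPU second and compare the rates.
cmpthese(-1, {
    map {
        my $re = $re{$_};
        ($_ => sub { my $n = () = $text =~ /$re/g; return $n });
    } keys %re
});
```

Precompiling with qr// and passing code references keeps pattern compilation and string eval out of the timed loop, so the measured difference should come mostly from the matching itself.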

Regarding "regex - Performance of character classes vs. "shorthand" classes", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/3516545/