Why do these two RegEx benchmarks differ so much?
Published: 2020-12-16 06:16:44  Category: Big Data  Source: compiled from the web
They use the same RegEx: one is written in place, the other is stored via qr//.

Results:

                               Rate  rege1.FIND_AT_END  rege2.FIND_AT_END
    rege1.FIND_AT_END      661157/s                 --               -85%
    rege2.FIND_AT_END     4384042/s               563%                 --

                               Rate       rege1.NOFIND       rege2.NOFIND
    rege1.NOFIND           678702/s                 --               -87%
    rege2.NOFIND          5117707/s               654%                 --

                               Rate rege1.FIND_AT_START rege2.FIND_AT_START
    rege1.FIND_AT_START    657765/s                 --               -85%
    rege2.FIND_AT_START   4268032/s               549%                 --

    # Benchmark
    use Benchmark qw(:all);

    my $count = 10000000;
    my $re    = qr/abc/o;

    my %tests = (
        "NOFIND "        => "cvxcvidgds.sdfpkisd[s",
        "FIND_AT_END "   => "cvxcvidgds.sdfpabcd[s",
        "FIND_AT_START " => "abccvidgds.sdfpkisd[s",
    );

    foreach my $type (keys %tests) {
        my $str = $tests{$type};
        cmpthese($count, {
            # rege1: match through the stored qr// object
            "rege1.$type" => sub { my $idx = ($str =~ $re); },
            # rege2: match with the literal pattern written in place
            "rege2.$type" => sub { my $idx = ($str =~ /abc/o); },
        });
    }

Answer
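You are dealing with an operation that is inherently very fast, so you need to run a few more tests to narrow down where the time goes. I also switched the benchmark from external amplification (letting cmpthese do the repetition) to internal amplification (a for loop inside each sub). This minimizes the overhead of the subroutine call and of whatever work cmpthese itself has to do. Finally, it is important to test whether the disparity scales with the magnitude (in this case it does not).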
    use feature 'say';    # needed for say (perl 5.10 or later)
    use Benchmark 'cmpthese';

    my $re = qr/abc/o;

    my %tests = (
        'fail ' => 'cvxcvidgds.sdfpkisd[s',
        'end '  => 'cvxcvidgds.sdfpabcd[s',
        'start' => 'abccvidgds.sdfpkisd[s',
    );

    # $mag is the number of matches performed inside each timed sub
    for my $mag (map 10**$_, 1 .. 5) {
        say "\n$mag:";
        for my $type (keys %tests) {
            my $str = $tests{$type};
            cmpthese -1, {
                '$re '    . $type => sub { my $i; $i = ($str =~ $re    ) for 0 .. $mag },
                '/abc/o ' . $type => sub { my $i; $i = ($str =~ /abc/o ) for 0 .. $mag },
                '/$re/ '  . $type => sub { my $i; $i = ($str =~ /$re/  ) for 0 .. $mag },
                '/$re/o ' . $type => sub { my $i; $i = ($str =~ /$re/o ) for 0 .. $mag },
            }
        }
    }

    10:
                      Rate     $re fail   /$re/ fail  /$re/o fail  /abc/o fail
    $re fail      106390/s           --          -8%         -72%         -74%
    /$re/ fail    115814/s           9%           --         -70%         -71%
    /$re/o fail   384635/s         262%         232%           --          -5%
    /abc/o fail   403944/s         280%         249%           5%           --

                      Rate      $re end    /$re/ end   /$re/o end   /abc/o end
    $re end       105527/s           --          -5%         -71%         -72%
    /$re/ end     110902/s           5%           --         -69%         -71%
    /$re/o end    362544/s         244%         227%           --          -5%
    /abc/o end    382242/s         262%         245%           5%           --

                      Rate    $re start  /$re/ start /$re/o start /abc/o start
    $re start     111002/s           --          -3%         -72%         -73%
    /$re/ start   114094/s           3%           --         -71%         -73%
    /$re/o start  390693/s         252%         242%           --          -6%
    /abc/o start  417123/s         276%         266%           7%           --

    100:
                      Rate   /$re/ fail     $re fail  /$re/o fail  /abc/o fail
    /$re/ fail     12329/s           --          -4%         -77%         -79%
    $re fail       12789/s           4%           --         -76%         -78%
    /$re/o fail    53194/s         331%         316%           --          -9%
    /abc/o fail    58377/s         373%         356%          10%           --

                      Rate      $re end    /$re/ end   /$re/o end   /abc/o end
    $re end        12440/s           --          -1%         -75%         -77%
    /$re/ end      12623/s           1%           --         -75%         -77%
    /$re/o end     50127/s         303%         297%           --          -7%
    /abc/o end     53941/s         334%         327%           8%           --

                      Rate    $re start  /$re/ start /$re/o start /abc/o start
    $re start      12810/s           --          -3%         -76%         -78%
    /$re/ start    13190/s           3%           --         -75%         -77%
    /$re/o start   52512/s         310%         298%           --          -8%
    /abc/o start   57045/s         345%         332%           9%           --

    1000:
                      Rate     $re fail   /$re/ fail  /$re/o fail  /abc/o fail
    $re fail        1248/s           --          -8%         -76%         -80%
    /$re/ fail      1354/s           9%           --         -74%         -79%
    /$re/o fail     5284/s         323%         290%           --         -16%
    /abc/o fail     6311/s         406%         366%          19%           --

                      Rate      $re end    /$re/ end   /$re/o end   /abc/o end
    $re end         1316/s           --          -1%         -74%         -77%
    /$re/ end       1330/s           1%           --         -74%         -77%
    /$re/o end      5119/s         289%         285%           --         -11%
    /abc/o end      5757/s         338%         333%          12%           --

                      Rate  /$re/ start    $re start /$re/o start /abc/o start
    /$re/ start     1283/s           --          -1%         -75%         -81%
    $re start       1302/s           1%           --         -75%         -80%
    /$re/o start    5119/s         299%         293%           --         -22%
    /abc/o start    6595/s         414%         406%          29%           --

    10000:
                      Rate   /$re/ fail     $re fail  /$re/o fail  /abc/o fail
    /$re/ fail       130/s           --          -6%         -76%         -80%
    $re fail         139/s           7%           --         -74%         -79%
    /$re/o fail      543/s         317%         291%           --         -17%
    /abc/o fail      651/s         400%         368%          20%           --

                      Rate    /$re/ end      $re end   /$re/o end   /abc/o end
    /$re/ end        128/s           --          -3%         -76%         -79%
    $re end          132/s           3%           --         -76%         -78%
    /$re/o end       541/s         322%         311%           --         -10%
    /abc/o end       598/s         366%         354%          11%           --

                      Rate  /$re/ start    $re start /$re/o start /abc/o start
    /$re/ start      132/s           --          -1%         -77%         -80%
    $re start        133/s           1%           --         -76%         -79%
    /$re/o start     566/s         330%         325%           --         -13%
    /abc/o start     650/s         394%         388%          15%           --

    100000:
                      Rate   /$re/ fail     $re fail  /$re/o fail  /abc/o fail
    /$re/ fail      13.2/s           --          -8%         -76%         -78%
    $re fail        14.2/s           8%           --         -74%         -76%
    /$re/o fail     55.9/s         325%         292%           --          -8%
    /abc/o fail     60.5/s         360%         324%           8%           --

                      Rate    /$re/ end      $re end   /$re/o end   /abc/o end
    /$re/ end       12.8/s           --          -3%         -75%         -79%
    $re end         13.2/s           3%           --         -75%         -78%
    /$re/o end      52.3/s         308%         297%           --         -12%
    /abc/o end      59.7/s         365%         353%          14%           --

                      Rate    $re start  /$re/ start /$re/o start /abc/o start
    $re start       13.4/s           --          -2%         -77%         -78%
    /$re/ start     13.6/s           2%           --         -77%         -78%
    /$re/o start    58.2/s         334%         328%           --          -6%
    /abc/o start    62.2/s         364%         357%           7%           --

You can easily see that the tests fall into two groups: the variants whose match operator carries /o in the source (/$re/o and /abc/o), and the variants whose match operator does not ($re and /$re/). Since this is a syntactic difference, it gives you a clue that this is probably a case the compiler is optimizing (or one that lets the runtime cache something). It might be dropping the check on the variable after it has been done once, or simplifying the stack; it is hard to say without looking at the source.

The results may also depend on the version of perl used. The tests above were run on v5.10.1.
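Because the numbers may vary between perl versions, a reduced comparison along the following lines could be rerun on another interpreter. This is only a sketch: the labels, the fixed 1000-iteration inner loop, the sanity check and the version printout are assumptions added for illustration and are not part of the original benchmark.

```perl
use strict;
use warnings;
use Benchmark 'cmpthese';

# Same pattern and one of the same test strings as in the post.
my $re  = qr/abc/o;
my $str = 'cvxcvidgds.sdfpabcd[s';

# Sanity check (an addition, not in the original): both forms must
# agree on the match result before their speeds are compared.
my $via_qr      = ($str =~ $re)    ? 1 : 0;
my $via_literal = ($str =~ /abc/o) ? 1 : 0;
die "variants disagree\n" unless $via_qr == $via_literal;

# Record which interpreter produced the numbers.
printf "perl %vd\n", $^V;

# A negative count asks Benchmark to run each sub for at least
# that many CPU seconds; the inner loop amplifies the match itself.
cmpthese(-1, {
    'qr object' => sub { my $i; $i = ($str =~ $re)    for 1 .. 1000 },
    'literal'   => sub { my $i; $i = ($str =~ /abc/o) for 1 .. 1000 },
});
```

Running it under two different perl builds and comparing the two tables would show whether the gap reported above for v5.10.1 still holds.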