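I'm (finally) starting to learn regular expressions, and I'd like to know whether there is any notable difference between these two pattern strings. I'm trying to match lines such as “Title = Blah”, capturing “Title” and “Blah” in two groups.
The tricky case is a title such as “Title = The = operator”, where the value itself contains an =. Here are two options that solve the problem: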
^([^=]+)=(.+)$
^(.+?)=(.+)$
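Is there any difference between the two, either in performance or in functionality?
The first requires at least one non-= character before the = in order to match. The second does not: it only requires at least one character of any kind, so it will also match a line that starts with ==.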
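To make the functional difference concrete, here is a small sketch using Python's `re` module. The helper name `groups` and the sample lines are assumptions made for illustration, and other regex engines may differ in details. It also shows one further difference in this flavor: `[^=]` matches a newline, while `.` does not unless a dot-all flag is set.

```python
import re

# The two patterns from the question.
NEGATED = re.compile(r"^([^=]+)=(.+)$")
LAZY = re.compile(r"^(.+?)=(.+)$")


def groups(pattern, line):
    """Return the two captured groups, or None if the line does not match."""
    m = pattern.match(line)
    return m.groups() if m else None


samples = [
    "Title = Blah",            # both patterns capture the same groups
    "Title = The = operator",  # both split at the first '='
    "== leading",              # only the lazy pattern matches
    "a\nb=c",                  # only the negated class crosses the newline
]

for line in samples:
    print(repr(line), groups(NEGATED, line), groups(LAZY, line))
```

Neither pattern strips the spaces around the =, so the captured groups keep them (for example `'Title '` and `' Blah'`).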
Depending on your input, the first can run significantly faster. Here’s why:
An Alternative to Laziness
In this case, there is a better option than making the plus lazy. We can use a greedy plus and a negated character class: <[^>]+>. The reason why this is better is because of the backtracking. When using the lazy plus, the engine has to backtrack for each character in the HTML tag that it is trying to match. When using the negated character class, no backtracking occurs at all when the string contains valid HTML code. Backtracking slows down the regex engine. You will not notice the difference when doing a single search in a text editor. But you will save plenty of CPU cycles when using such a regex repeatedly in a tight loop in a script that you are writing…
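If you want to check the performance claim against your own data, a rough timing sketch along these lines may help. It assumes Python's `re` and `timeit` modules. The test line with a long key before the first = is a made-up worst case, and the repeat count is arbitrary. Results depend heavily on the engine and the input, so treat the numbers as indicative only.

```python
import re
import timeit

NEGATED = re.compile(r"^([^=]+)=(.+)$")
LAZY = re.compile(r"^(.+?)=(.+)$")

# Hypothetical input: a long key before the first '=', so the lazy plus
# has to extend one character at a time many times over.
line = "k" * 10_000 + " = value"

for name, pattern in [("negated class", NEGATED), ("lazy plus", LAZY)]:
    # Time 200 match attempts against the same line.
    seconds = timeit.timeit(lambda: pattern.match(line), number=200)
    print(f"{name}: {seconds:.4f}s for 200 matches")
```

For short lines like “Title = Blah” any gap is likely to be negligible. It matters mostly when the pattern runs many times or the text before the = is long.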