Regular expressions: greedy vs. reluctant vs. possessive quantifiers

Published: 2020-12-14 00:40:25. Category: Encyclopedia. Source: compiled from the web.


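I found this excellent tutorial on regular expressions, and while I intuitively understand what "greedy", "reluctant" and "possessive" quantifiers do, there seems to be a serious hole in my understanding.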

Specifically, in the following example:

Enter your regex: .*foo  // greedy quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfooxxxxxxfoo" starting at index 0 and ending at index 13.

Enter your regex: .*?foo  // reluctant quantifier
Enter input string to search: xfooxxxxxxfoo
I found the text "xfoo" starting at index 0 and ending at index 4.
I found the text "xxxxxxfoo" starting at index 4 and ending at index 13.

Enter your regex: .*+foo // possessive quantifier
Enter input string to search: xfooxxxxxxfoo
No match found.

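The explanation mentions eating the entire input string, letters being consumed, the matcher backing off, the rightmost occurrence of "foo" being regurgitated, and so on.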

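Unfortunately, despite the nice metaphors, I still don't understand what is eaten by whom... Do you know of another tutorial that explains (concisely) how regular expression engines work?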

Alternatively, if someone could explain the following paragraphs in somewhat different phrasing, that would be much appreciated:

The first example uses the greedy
quantifier .* to find “anything”, zero
or more times, followed by the letters
“f” “o” “o”. Because the quantifier is
greedy, the .* portion of the
expression first eats the entire input
string. At this point, the overall
expression cannot succeed, because the
last three letters (“f” “o” “o”) have
already been consumed (by whom?). So the matcher
slowly backs off (from right-to-left?) one letter at a time
until the rightmost occurrence of
“foo” has been regurgitated (what does this mean?), at which
point the match succeeds and the
search ends.

The second example, however, is
reluctant, so it starts by first
consuming (by whom?) “nothing”. Because “foo”
doesn’t appear at the beginning of the
string, it’s forced to swallow (who swallows?) the
first letter (an “x”), which triggers
the first match at 0 and 4. Our test
harness continues the process until
the input string is exhausted. It
finds another match at 4 and 13.

The third example fails to find a
match because the quantifier is
possessive. In this case, the entire
input string is consumed by .*+, (how?)
leaving nothing left over to satisfy
the “foo” at the end of the
expression. Use a possessive
quantifier for situations where you
want to seize all of something without
ever backing off (what does back off mean?); it will outperform
the equivalent greedy quantifier in
cases where the match is not
immediately found.

I'll give it a shot.

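A greedy quantifier first matches as much as possible. So the .* matches the entire string. Then the matcher tries to match the f that follows, but there are no characters left. So it "backtracks", making the greedy quantifier match one character fewer (leaving the "o" at the end of the string unmatched). That still doesn't match the f in the regex, so it backtracks one more step, making the greedy quantifier match one character fewer again (leaving the "oo" at the end of the string unmatched). That still doesn't match the f in the regex, so it backtracks one more step (leaving the "foo" at the end of the string unmatched). Now the matcher finally matches the f in the regex, and the o and the next o match too. Success!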
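The steps above can be sketched by hand in Java. This is only an illustration of the idea, not how java.util.regex is implemented: the class GreedyTrace and its method are hypothetical names, the search is anchored at index 0, and the input is assumed to contain no line terminators (which . would not match by default).

```java
// Hand-rolled sketch of the greedy strategy for ".*foo", anchored at index 0.
// GreedyTrace and greedyDotStarLength are hypothetical names for illustration.
class GreedyTrace {
    /** Returns how many characters ".*" keeps once the literal matches, or -1. */
    static int greedyDotStarLength(String input, String literal, StringBuilder log) {
        // Greedy: ".*" starts by taking every character it can.
        for (int taken = input.length(); taken >= 0; taken--) {
            String rest = input.substring(taken);
            log.append(".* takes ").append(taken)
               .append(" chars, rest=\"").append(rest).append("\"\n");
            if (rest.startsWith(literal)) {
                return taken; // the literal matches right after what ".*" kept
            }
            // Otherwise backtrack: ".*" gives up its last character and we retry.
        }
        return -1;
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        int kept = greedyDotStarLength("xfooxxxxxxfoo", "foo", log);
        System.out.print(log);
        System.out.println("\".*\" kept " + kept + " characters");
    }
}
```

For "xfooxxxxxxfoo" the loop tries 13, 12, 11 and then 10 characters. At 10 the remainder is "foo", so the method returns 10 and the overall match covers indices 0 to 13.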

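A reluctant or "non-greedy" quantifier first matches as little as possible. So the .*? matches nothing at first, leaving the entire string unmatched. Then the matcher tries to match the f that follows, but the unmatched portion of the string starts with "x", so that doesn't work. So the matcher backtracks, making the non-greedy quantifier match one more character (now it matches the "x", leaving "fooxxxxxxfoo" unmatched). Then it tries to match the f, which succeeds, and the o and the next o in the regex match too. Success!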
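The reluctant case can be sketched the same way, with the loop running in the opposite direction. Again this is a hand-rolled illustration under the same assumptions (anchored at index 0, no line terminators in the input); ReluctantTrace and its method are hypothetical names, not part of any library.

```java
// Hand-rolled sketch of the reluctant strategy for ".*?foo", anchored at index 0.
// ReluctantTrace and reluctantDotStarLength are hypothetical names for illustration.
class ReluctantTrace {
    /** Returns how many characters ".*?" ends up taking, or -1 if the literal never matches. */
    static int reluctantDotStarLength(String input, String literal) {
        // Reluctant: ".*?" starts by taking nothing and grows only when forced to.
        for (int taken = 0; taken <= input.length(); taken++) {
            if (input.startsWith(literal, taken)) {
                return taken; // the literal matches right after what ".*?" took
            }
            // Otherwise ".*?" is forced to swallow one more character and we retry.
        }
        return -1;
    }

    public static void main(String[] args) {
        int taken = reluctantDotStarLength("xfooxxxxxxfoo", "foo");
        System.out.println("\".*?\" took " + taken + " character(s)");
    }
}
```

For "xfooxxxxxxfoo" the method returns 1: taking nothing fails because the string starts with "x", and taking just the "x" lets "foo" match at indices 1 to 4, giving the match "xfoo".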

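In your example, it then starts the process over with the remaining unmatched portion of the string ("xxxxxxfoo"), following the same procedure.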
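That restart is what a loop over Matcher.find() does in java.util.regex: each call resumes where the previous match ended. The sketch below is an assumption about what the tutorial's test harness roughly does, not its actual source; FindAllDemo and findAll are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of a find() loop; FindAllDemo and findAll are hypothetical names.
class FindAllDemo {
    /** Lists every match as: "text" start-end, one per line. */
    static String findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder out = new StringBuilder();
        // Each find() continues from the end of the previous match.
        while (m.find()) {
            out.append('"').append(m.group()).append("\" ")
               .append(m.start()).append('-').append(m.end()).append('\n');
        }
        return out.length() == 0 ? "No match found.\n" : out.toString();
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        System.out.print(findAll(".*foo", input));  // greedy: one match
        System.out.print(findAll(".*?foo", input)); // reluctant: two matches
    }
}
```

With the reluctant pattern the first find() stops at index 4, so the second call starts there and yields "xxxxxxfoo" at 4-13. With the greedy pattern the first match already ends at 13, so there is nothing left for a second one.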

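A possessive quantifier is just like the greedy quantifier, but it doesn't backtrack. So it starts out with .*+ matching the entire string, leaving nothing unmatched. Then there is nothing left for it to match with the f in the regex. Since the possessive quantifier doesn't backtrack, the match fails there.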
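A short check against java.util.regex shows the difference. The class and method names (PossessiveDemo, found) are hypothetical, and the last two patterns are extra illustrations that are not in the question: an atomic group (?>.*) behaves like the possessive form, and a possessive quantifier can still match when what it grabs cannot include the text that follows.

```java
import java.util.regex.Pattern;

// Sketch comparing greedy and possessive behaviour; names are hypothetical.
class PossessiveDemo {
    /** True if the regex matches anywhere in the input. */
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "xfooxxxxxxfoo";
        System.out.println(found(".*foo", input));     // greedy: backtracks, so it matches
        System.out.println(found(".*+foo", input));    // possessive: keeps everything, fails
        System.out.println(found("(?>.*)foo", input)); // atomic group: same effect as .*+
        // [^f]*+ can never swallow an 'f', so "foo" is still available afterwards.
        System.out.println(found("[^f]*+foo", input));
    }
}
```

The second and third patterns fail at every starting position, because whatever .* grabs always runs to the end of the input and is never given back. The fourth succeeds because [^f]*+ stops in front of the first "f" on its own.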

(Editor: 李大同)
