How does the dash work in regular expressions?

Published: 2020-12-14 06:01:56 | Category: Encyclopedia | Source: compiled from the web


I'm curious about the algorithm that decides which characters are included when a - is used in a regular expression.

Example: [a-zA-Z0-9]

This matches any letter from a to z in either case, as well as the digits 0 to 9.

I originally thought the ranges worked like macros, for example that a-z expands to a, b, c, d, e, and so on. But after I saw the following in an open source project,

text.tr('A-Za-z1-90','Ⓐ-Ⓩⓐ-ⓩ①-⑨⓪')

my mental model of regular expressions changed completely. These are not typical characters, so I wondered how this could possibly work correctly.
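For reference, the call can be reproduced with a minimal Ruby sketch. Note that `tr` is Ruby's `String#tr`, a string method rather than a regular expression, although it accepts the same `x-y` range notation. The helper name `circle` is hypothetical, and the target set is an assumption: it is taken to be the circled letters and digits, written as `\u` escapes, which is consistent with the ①-⑨ in the call above.

```ruby
# Hypothetical helper: maps ASCII letters and digits to circled characters.
# String#tr expands each "x-y" range by code point, then maps characters
# position by position from the first set onto the second.
def circle(text)
  from = 'A-Za-z1-90'
  # Assumed target set: circled A-Z (U+24B6..U+24CF), circled a-z
  # (U+24D0..U+24E9), circled 1-9 (U+2460..U+2468), circled 0 (U+24EA).
  to = "\u24B6-\u24CF\u24D0-\u24E9\u2460-\u2468\u24EA"
  text.tr(from, to)
end

sample = circle('Ab1 0')
puts sample
# Print the resulting code points to show the position-by-position mapping.
puts sample.codepoints.map { |cp| format('U+%04X', cp) }.join(' ')
```

Both sets expand to 62 characters, so each source character has exactly one counterpart; the space in the sample is not in the source set and passes through unchanged.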

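My theory is that - literally means:

Any ASCII value between the left character and the right character (e.g. a-z [97-122]).

Can anyone confirm whether my theory is correct? Do regular expression patterns actually compute the range from the character codes of the two endpoints?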

Also, if it is correct, you could write a regular expression range such as

A-z

because A is 65 and z is 122, so in theory it should also match every character between those values.
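One way to check this is to scan the ASCII table and list the characters that [A-z] accepts but [A-Za-z] does not. This is a small Ruby sketch; the helper name `extra_matches` is made up for illustration.

```ruby
# Hypothetical helper: ASCII characters matched by [A-z] but not by [A-Za-z].
def extra_matches
  (0..127).map(&:chr).select do |c|
    c.match?(/[A-z]/) && !c.match?(/[A-Za-z]/)
  end
end

puts 'A'.ord                          # code of the left endpoint
puts 'z'.ord                          # code of the right endpoint
puts extra_matches.inspect            # the characters that slip in
puts extra_matches.map(&:ord).inspect # and their codes
```

The extra characters are the six that sit between Z (90) and a (97) in ASCII: [ \ ] ^ _ and the backtick, with codes 91 to 96. That is why A-z is rarely what one wants when the intent is "letters only".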

Solution

From MSDN, Character Classes in Regular Expressions (emphasis mine):

The syntax for specifying a range of characters is as follows:

[firstCharacter-lastCharacter]

where firstCharacter is the character that begins the range and lastCharacter is the character that ends the range. A character range is a contiguous series of characters defined by specifying the first character in the series, a hyphen (-), and then the last character in the series. Two characters are contiguous if they have adjacent Unicode code points.

So your assumption is correct, but the effect is in fact broader: the range is defined over Unicode code points, not just ASCII.
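The same behaviour can be sketched outside ASCII. The MSDN text describes .NET; the example below uses Ruby only to stay consistent with the tr snippet in the question, on the assumption that its regex engine treats ranges the same way on this point. The helper names are hypothetical.

```ruby
# Hypothetical helper: true if char is one of the circled digits
# U+2460 (circled 1) through U+2468 (circled 9), tested with a regex range.
def circled_digit?(char)
  char.match?(/\A[\u2460-\u2468]\z/)
end

# The same test written as a plain code point comparison.
def circled_digit_by_code?(char)
  (0x2460..0x2468).cover?(char.ord)
end

["\u2464", "\u2469", '5'].each do |c|
  puts format('U+%04X regex=%s code=%s', c.ord, circled_digit?(c), circled_digit_by_code?(c))
end

# A range whose left endpoint has the higher code point is rejected.
begin
  Regexp.new('[z-a]')
rescue RegexpError => e
  puts "rejected: #{e.message}"
end
```

The two helpers agree on every input here, which is the point of the answer: the range is just an interval of code points, and the circled 10 (U+2469) falls one step outside it.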

(Editor: 李大同)
