Regular expressions – extracting data between square brackets "[]" with Perl

Published: 2020-12-14 06:07:03 | Category: Encyclopedia | Source: compiled from the web

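I am using a regular expression to extract data from round brackets (parentheses), for example extracting a,b from (a,b), as shown below. I have a file in which every line looks like this: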

this is the range of values (a1,b1) and [b1|a1]
this is the range of values (a2,b2) and [b2|a2]
this is the range of values (a3,b3) and [b3|a3]

I am using the following pattern to extract a1, b1, a2, b2, and so on:

@numbers = $_ =~ /\((.*),(.*)\)/;

But what should I do if I want to extract the data from the square brackets [] instead? For example:

this is the range of values (a1,b1) and [b1|a1]
this is the range of values (a1,b1) and [b2|a2]

I need to extract/match the data in the square brackets rather than in the round brackets.

Solution

[Update] In the meantime, I have written a blog post about the specific problem with .* that I describe below: Why Using .* in Regular Expressions Is Almost Never What You Actually Want

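If your identifiers a1, b1, etc. never contain commas or square brackets themselves, you should use a pattern along the following lines to avoid backtracking hell. (The pattern is written for a comma separator; for the sample lines above, where the bracketed values are separated by |, put | in place of the comma, escaped as \| outside the character class.)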

/\[([^,\]]+),([^,\]]+)\]/
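
A minimal Perl sketch of this approach, adapted to the | separator used in the sample lines. The helper name extract_bracket_pair is hypothetical and only serves the illustration; the sketch assumes the values themselves contain neither | nor ].

```perl
use strict;
use warnings;

# Sample lines from the question; the bracketed values are separated by "|".
my @lines = (
    'this is the range of values (a1,b1) and [b1|a1]',
    'this is the range of values (a2,b2) and [b2|a2]',
    'this is the range of values (a3,b3) and [b3|a3]',
);

# extract_bracket_pair is a hypothetical helper name used for illustration.
sub extract_bracket_pair {
    my ($line) = @_;
    # \[ and \] match literal square brackets. [^|\]]+ matches one or more
    # characters that are neither "|" nor "]", so a capture can never run
    # past the separator or the closing bracket.
    my @pair = $line =~ /\[([^|\]]+)\|([^|\]]+)\]/;
    return @pair;    # empty list when the line does not match
}

for my $line (@lines) {
    my ($first, $second) = extract_bracket_pair($line);
    print "$first $second\n" if defined $first;
}
```

In list context the match operator returns the captured groups, which is why the result can be assigned straight to an array, just like @numbers in the question.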

Here is a working example on Regex101.

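The problem with greedy quantifiers like .* is that they are likely to consume too much at first, so the regex engine has to do a lot of backtracking. Even if you use a non-greedy quantifier, the engine still makes more match attempts than necessary, because it consumes only one character at a time and then tries to advance its position in the pattern.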
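
The difference can be made visible with a small sketch. The input line here is a hypothetical one with two bracketed pairs (it is not from the question), chosen so that over-consumption changes the result and not just the amount of work.

```perl
use strict;
use warnings;

# Hypothetical input with two bracketed pairs, for illustration only.
my $line = 'first [b1|a1] then [b2|a2]';

# Greedy: the first .* runs to the end of the line and backtracks only as
# far as the LAST "|", so the first capture spans both bracket groups.
my @greedy = $line =~ /\[(.*)\|(.*)\]/;

# Non-greedy: gives the intended result here, but the engine grows each
# capture one character at a time, retrying the rest of the pattern each time.
my @lazy = $line =~ /\[(.*?)\|(.*?)\]/;

# Negated character class: each capture stops at "|" or "]" by construction.
my @strict = $line =~ /\[([^|\]]+)\|([^|\]]+)\]/;

print "greedy: @greedy\n";
print "lazy:   @lazy\n";
print "strict: @strict\n";
```

On this input the greedy pattern captures "b1|a1] then [b2" and "a2", while the other two capture "b1" and "a1".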

(You could even use atomic groups to make the match more efficient.)
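
As a sketch of what that could look like in Perl, again assuming the | separator from the sample lines: (?>...) is an atomic group, and the possessive quantifier ++ (available since Perl 5.10) is a shorthand for the same behaviour. Both tell the engine not to backtrack into the part that has already matched. The second input line below is a hypothetical malformed line added for illustration.

```perl
use strict;
use warnings;

my $line = 'this is the range of values (a1,b1) and [b1|a1]';

# Atomic groups: once [^|\]]+ has matched, the engine never gives
# characters back from inside the (?>...) group.
my @atomic = $line =~ /\[((?>[^|\]]+))\|((?>[^|\]]+))\]/;

# Possessive quantifier: same effect, shorter notation.
my @possessive = $line =~ /\[([^|\]]++)\|([^|\]]++)\]/;

print "atomic:     @atomic\n";
print "possessive: @possessive\n";

# Hypothetical malformed line with no closing bracket: the match fails
# without the engine retrying shorter captures.
my @broken = 'values (a1,b1) and [b1|a1' =~ /\[((?>[^|\]]+))\|((?>[^|\]]+))\]/;
print "broken line matched\n" if @broken;
```

For well-formed lines the result is the same as with the plain negated character class; any benefit shows up mainly on lines that fail to match.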

(Editor: 李大同)
