regex – A regular expression for recognizing in-text citations
Published: 2020-12-14 06:26:04 Category: Encyclopedia Source: compiled from the web
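I am trying to create a regular expression to capture in-text citations.
Here are a few example sentences containing in-text citations:
1. … and the reported results in (Nivre et al.,2007) were not representative …
2. … two systems used a Markov chain approach (Sagae and Tsujii 2007).
3. Nivre (2007) showed that …
4. … for attaching and labelling dependencies (Chen et al.,2007; Dre?e et al.,2007).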
Currently, my regular expression is \(\D*\d\d\d\d\) which matches examples 1-3 but not example 4. How can I modify it to capture example 4 as well? Thanks!
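The failure on example 4 can be reproduced with a short check. The sketch below uses Python's `re` module purely for illustration (the question does not name a language). The second author name in example 4 is a stand-in, because that name is garbled in the source, and `find_simple` is a hypothetical helper. The pattern expects a closing parenthesis immediately after the first four digits, and `\D*` cannot step over a year, so a parenthesis holding two citations separated by a semicolon never matches.

```python
import re

# The pattern from the question: "(", any run of non-digits, four digits, ")".
simple_rx = re.compile(r"\(\D*\d\d\d\d\)")

examples = [
    "... and the reported results in (Nivre et al.,2007) were not representative ...",
    "... two systems used a Markov chain approach (Sagae and Tsujii 2007).",
    "Nivre (2007) showed that ...",
    # Stand-in second citation (assumption): two citations in one parenthesis.
    "... for attaching and labelling dependencies (Chen et al.,2007; Nivre et al.,2007).",
]


def find_simple(text):
    """Return every substring matched by the question's pattern."""
    return simple_rx.findall(text)


for line in examples:
    # The last example yields an empty list: after the first year comes ";",
    # not ")", and \D* cannot consume the digits of a year.
    print(find_simple(line))
```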
I recently used something like this for that purpose:
    #!/usr/bin/env perl

    use 5.010;
    use utf8;
    use strict;
    use autodie;
    use warnings qw< FATAL all >;
    use open qw< :std IO :utf8 >;

    my $citation_rx = qr{
        \( (?:
            \s*

            # optional author list
            (?:
                # has to start capitalized
                \p{Uppercase_Letter}
                # then have a lower case letter, or maybe an apostrophe
                (?= [\p{Lowercase_Letter}\p{Quotation_Mark}] )
                # before a run of letters and admissible punctuation
                [\p{Alphabetic}\p{Dash_Punctuation}\p{Quotation_Mark}\s,.] +
            ) ?  # hook if and only if you want the authors to be optional!!

            # a reasonable year
            \b (18|19|20) \d\d
            # citation series suffix, up to a six-parter
            [a-f] ? \b

            # trailing semicolon to separate multiple citations
            ; ?
            \s*
        ) +
        \)
    }x;

    while (<DATA>) {
        while (/$citation_rx/gp) {
            say ${^MATCH};
        }
    }

    __END__
    ... and the reported results in (Nivré et al.,2007) were not representative ...
    ... two systems used a Markov chain approach (Sagae and Tsujii 2007).
    Nivre (2007) showed that ...
    ... for attaching and labelling dependencies (Chen et al.,2007; Dre?e et al.,2007).

When run, it produces:

    (Nivré et al.,2007)
    (Sagae and Tsujii 2007)
    (2007)
    (Chen et al.,2007)
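For readers not working in Perl, the same idea (an optional author run, a plausible year, an optional semicolon, all repeated inside one pair of parentheses) can be approximated with Python's standard `re` module. This is a rough sketch, not a faithful port. `re` has no `\p{...}` properties, so `[^\W\d_]` stands in for "any letter", the capital-letter requirement and the lookahead are dropped, and the admissible punctuation is cut down to comma, period, apostrophe and hyphen. `CITATION_RX` and `find_citations` are hypothetical names, and the second citation in the last sample is a stand-in.

```python
import re

# Approximation of the Perl pattern using only the standard library.
CITATION_RX = re.compile(r"""
    \(
    (?:
        \s*
        (?:                              # optional author list
            [^\W\d_]                     # starts with a letter (any case here)
            (?: [^\W\d_] | [\s,.'-] )*   # letters and a few punctuation marks
        )?
        \b (?:18|19|20) \d\d             # a plausible year
        [a-f]? \b                        # optional series suffix such as 2007a
        ;?                               # separator between citations
        \s*
    )+
    \)
""", re.VERBOSE)


def find_citations(text):
    """Return each parenthesised citation group found in text."""
    return [m.group(0) for m in CITATION_RX.finditer(text)]


samples = [
    "... and the reported results in (Nivre et al.,2007) were not representative ...",
    "Nivre (2007) showed that ...",
    # Stand-in second citation (assumption): two citations in one parenthesis.
    "... labelling dependencies (Chen et al.,2007; Nivre et al.,2007).",
]

for line in samples:
    print(find_citations(line))
```

Because the repeated group ends with an optional semicolon and whitespace, several citations inside one parenthesis are returned as a single match. The third-party `regex` package supports `\p{...}` properties if a closer port is needed.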