regex – Regular expression for recognizing in-text citations
Published: 2020-12-14 06:26:04  Category: Encyclopedia  Source: collected from the web
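I'm trying to create a regular expression to capture in-text citations.

Here are a few example sentences containing in-text citations:

1. … and the reported results in (Nivre et al., 2007) were not representative …
2. … two systems used a Markov chain approach (Sagae and Tsujii 2007).
3. Nivre (2007) showed that …
4. … for attaching and labelling dependencies (Chen et al., 2007; Dredze et al., 2007).

Currently, my regular expression is \(\D*\d\d\d\d\), which matches examples 1-3 but not example 4. How can I modify it to capture example 4 as well? Thanks!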

I recently used something like this for that purpose:

#!/usr/bin/env perl

use 5.010;
use utf8;
use strict;
use autodie;
use warnings qw< FATAL all >;
use open qw< :std IO :utf8 >;

my $citation_rx = qr{
    \( (?:
        \s*

        # optional author list
        (?:
            # has to start capitalized
            \p{Uppercase_Letter}
            # then have a lower case letter, or maybe an apostrophe
            (?=  [\p{Lowercase_Letter}\p{Quotation_Mark}] )
            # before a run of letters and admissible punctuation
            [\p{Alphabetic}\p{Dash_Punctuation}\p{Quotation_Mark}\s,.] +
        ) ?  # hook if and only if you want the authors to be optional!!

        # a reasonable year
        \b (18|19|20) \d\d
        # citation series suffix, up to a six-parter
        [a-f] ?         \b
        # trailing semicolon to separate multiple citations
        ; ?
        \s*
    ) +
    \)
}x;

# print every citation found on each line of the sample data
while (<DATA>) {
    while (/$citation_rx/gp) {
        say ${^MATCH};
    }
}

__END__
... and the reported results in (Nivré et al., 2007) were not representative ...
... two systems used a Markov chain approach (Sagae and Tsujii 2007).
Nivre (2007) showed that ...
... for attaching and labelling dependencies (Chen et al., 2007; Dredze et al., 2007).

When run, it produces:

(Nivré et al., 2007)
(Sagae and Tsujii 2007)
(2007)
(Chen et al., 2007; Dredze et al., 2007)

(Editor: 李大同)
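
For comparison, a much smaller change to the pattern from the question also seems to be enough for example 4: let the "non-digits followed by a four-digit year" unit repeat, with an optional semicolon after each year. The following is only a minimal sketch. The names `$original_rx`, `$repeated_rx` and `@examples` are made up for illustration and are not part of the answer above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;
use open qw< :std :utf8 >;

# The pattern from the question: one run of non-digits, then one
# four-digit year, all inside a single pair of parentheses.
my $original_rx = qr{ \( \D* \d\d\d\d \) }x;

# Hypothetical variant (an assumption, not the answer's pattern):
# the "non-digits + year" unit may repeat, and each year may be
# followed by a semicolon that separates it from the next citation.
my $repeated_rx = qr{ \( (?: \D* \d\d\d\d ;? )+ \) }x;

# The four example sentences from the question.
my @examples = (
    '... and the reported results in (Nivre et al., 2007) were not representative ...',
    '... two systems used a Markov chain approach (Sagae and Tsujii 2007).',
    'Nivre (2007) showed that ...',
    '... for attaching and labelling dependencies (Chen et al., 2007; Dredze et al., 2007).',
);

# Show what each pattern captures for every example.
for my $text (@examples) {
    my ($old) = $text =~ /($original_rx)/;
    my ($new) = $text =~ /($repeated_rx)/;
    print "original: ", $old // 'no match', "\n";
    print "repeated: ", $new // 'no match', "\n\n";
}
```

The original pattern fails on example 4 because, after the first year, it requires a closing parenthesis but finds a semicolon. The variant is deliberately loose: `\D*` accepts any non-digit text (including other parentheses) and `\d\d\d\d` accepts any four digits, so it will produce false positives that the stricter pattern above, with its author-name and year checks, avoids.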
                  
