Converting a Perl regular expression to a Python regular expression
Published: 2020-12-16 06:22:13  Category: Big Data  Source: compiled from the web
I'm having trouble converting a Perl regular expression to Python. The text I want to match has the following pattern:

Author(s)    : Firstname Lastname  
               Firstname Lastname  
               Firstname Lastname  
               Firstname Lastname
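
In Perl I was able to match this and extract the authors with

/Author\(s\) :((.+\n)+?)/

When I try

re.compile(r'Author\(s\) :((.+\n)+?)')

in Python, it matches the first author twice and ignores the rest. Can anyone explain what I'm doing wrong here?

Solution
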
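As for why the pattern in the question behaves that way, a likely explanation is that the +? quantifier is lazy. With nothing after it in the pattern, it stops after a single repetition, so the outer and inner groups both hold only the first author line. The sketch below illustrates this on made-up sample data (the names and the single space before the colon are assumptions chosen to fit the question's pattern). It also shows that simply making the quantifier greedy over-matches, which is why an explicit end condition is needed.

```python
import re

# Hypothetical sample input, shaped like the question's text.
sample = (
    "Author(s) : Ann Able\n"
    "            Bob Baker\n"
    "            Cy Clark\n"
    "Other(s) : not an author\n"
)

# Lazy quantifier: one repetition is enough to satisfy the pattern,
# so both capture groups contain just the first author line.
lazy = re.search(r'Author\(s\) :((.+\n)+?)', sample)
print(lazy.groups())        # (' Ann Able\n', ' Ann Able\n')

# Greedy quantifier: every following non-empty line is consumed,
# including the unrelated "Other(s)" line.
greedy = re.search(r'Author\(s\) :((.+\n)+)', sample)
print(repr(greedy.group(1)))
```

The two identical groups are what shows up as "the first author twice".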
You can do this:

# find lines with authors
import re
# multiline string to simulate possible input
text = '''
Stuff before
This won't be matched...
Author(s)    : Firstname Lastname  
               Firstname Lastname  
               Firstname Lastname  
               Firstname Lastname
Other(s)     : Something else we won't match
               More shenanigans....
Only the author names will be matched.
'''
# run the regex to pull author lines from the sample input
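authors = re.search(r'Author\(s\)\s*:\s*(.*?)^[^\s]', text, re.DOTALL | re.MULTILINE).group(1)

The regex above matches the leading text (Author(s), whitespace, colon, whitespace), then lazily captures everything up to the next line that starts with a non-whitespace character. In effect it collects the following lines that begin with whitespace, which gives this result:

'''Firstname Lastname  
               Firstname Lastname  
               Firstname Lastname  
               Firstname Lastname
'''

You can then use the following regex to split that result into individual authors:

# grab authors from the lines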
import re
authors = '''Firstname Lastname  
           Firstname Lastname  
           Firstname Lastname  
           Firstname Lastname
'''
# run the regex to pull a list of individual authors from the author lines
authors = re.findall(r'^\s*(.+?)\s*$', authors, re.MULTILINE)

This gives the list of authors:

['Firstname Lastname', 'Firstname Lastname', 'Firstname Lastname', 'Firstname Lastname']

Combined example code:

text = '''
Stuff before
This won't be matched...
Author(s)    : Firstname Lastname  
               Firstname Lastname  
               Firstname Lastname  
               Firstname Lastname
Other(s)     : Something else we won't match
               More shenanigans....
Only the author names will be matched.
'''
import re
stage1 = re.compile(r'Author\(s\)\s*:\s*(.*?)^[^\s]', re.DOTALL | re.MULTILINE)
stage2 = re.compile(r'^\s*(.+?)\s*$', re.MULTILINE)
preliminary = stage1.search(text).group(1)
authors = stage2.findall(preliminary)

This sets authors to:

['Firstname Lastname', 'Firstname Lastname', 'Firstname Lastname', 'Firstname Lastname']

Success!

(Editor: 李大同)
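The two-stage approach can also be wrapped in a small helper. The following is a minimal sketch, not part of the original answer: the names extract_authors, AUTHOR_BLOCK and AUTHOR_LINE are hypothetical, and the sample names are made up. It writes the first pattern with re.VERBOSE so each piece can carry a comment, and it makes two assumptions beyond the answer: the block may end at the end of the text (the added \Z alternative), and a missing Author(s) block should yield an empty list rather than an AttributeError from calling .group(1) on None.

```python
import re

# Stage 1, written with re.VERBOSE so each part can be commented.
# The \Z alternative is an addition, not part of the answer's pattern.
AUTHOR_BLOCK = re.compile(r'''
    Author\(s\)      # the literal label "Author(s)"
    \s*:\s*          # colon, with optional whitespace on either side
    (.*?)            # lazily capture the author lines
    (?: ^\S | \Z )   # stop at a line starting with non-whitespace, or at end of text
''', re.DOTALL | re.MULTILINE | re.VERBOSE)

# Stage 2: one author per line, surrounding whitespace stripped.
AUTHOR_LINE = re.compile(r'^\s*(.+?)\s*$', re.MULTILINE)


def extract_authors(text):
    """Return the author names found in text, or [] if there is no Author(s) block."""
    block = AUTHOR_BLOCK.search(text)
    if block is None:
        return []
    return AUTHOR_LINE.findall(block.group(1))


# Hypothetical input where the author block is the last thing in the text.
sample = (
    "Title        : Some report\n"
    "Author(s)    : Ann Able  \n"
    "               Bob Baker  \n"
    "               Cy Clark\n"
)
print(extract_authors(sample))                  # ['Ann Able', 'Bob Baker', 'Cy Clark']
print(extract_authors("no author block here"))  # []
```

Compiling both patterns once at module level keeps the helper cheap to call repeatedly, for example when scanning many files.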
