Regular Expressions and Their Use in Python
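I spent this morning studying regular expressions. Parts of what follows are reproduced from 《读懂正则表达式就这么简单》 (Understanding Regular Expressions Made Simple).
1.1 What is a regular expression
3. http://tools.aspzz.cn/regex/create_reg
4. txt2re: an online tool that parses a sample sentence, generates a matching regular expression from it, and can emit the corresponding code in many languages. Supported languages: Perl, PHP, Python, Java, Javascript, ColdFusion, C, C++, Ruby, VB, VBScript, J#.net, C#.net, C++.net, VB.net.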
Python regular expressions
In Python, regular expressions are handled mainly through the re module.
We introduce Python regular expressions through a concrete task. Suppose we are given the following text:

I1113 23:35:50.763059 4460 solver.cpp:218] Iteration 400 (27.3075 iter/s,0.7324s/20 iters),loss = 0.0202583
I1113 23:35:50.763141 4460 solver.cpp:237] Train net output #0: rpn_cls_loss = 0.00101873 (* 1 = 0.00101873 loss)
I1113 23:35:50.763165 4460 solver.cpp:237] Train net output #1: rpn_loss_bbox = 0.0192396 (* 1 = 0.0192396 loss)
I1113 23:35:50.763175 4460 sgd_solver.cpp:105] Iteration 400,lr = 0.001
I1113 23:35:51.751206 4460 solver.cpp:218] Iteration 420 (20.2456 iter/s,0.987868s/20 iters),loss = 0.00228514
I1113 23:35:51.751341 4460 solver.cpp:237] Train net output #0: rpn_cls_loss = 0.00140554 (* 1 = 0.00140554 loss)
I1113 23:35:51.751379 4460 solver.cpp:237] Train net output #1: rpn_loss_bbox = 0.000879596 (* 1 = 0.000879596 loss)
I1113 23:35:51.751410 4460 sgd_solver.cpp:105] Iteration 420,lr = 0.001
I1113 23:35:52.523890 4460 solver.cpp:218] Iteration 440 (25.8933 iter/s,0.772401s/20 iters),loss = 0.0132958
I1113 23:35:52.523974 4460 solver.cpp:237] Train net output #0: rpn_cls_loss = 0.00312161 (* 1 = 0.00312161 loss)
I1113 23:35:52.523988 4460 solver.cpp:237] Train net output #1: rpn_loss_bbox = 0.0101742 (* 1 = 0.0101742 loss)
I1113 23:35:52.523998 4460 sgd_solver.cpp:105] Iteration 440,lr = 0.001
I1113 23:35:53.461998 4460 solver.cpp:218] Iteration 460 (21.3325 iter/s,0.937539s/20 iters),loss = 0.0154897
I1113 23:35:53.462057 4460 solver.cpp:237] Train net output #0: rpn_cls_loss = 0.00780452 (* 1 = 0.00780452 loss)
I1113 23:35:53.462069 4460 solver.cpp:237] Train net output #1: rpn_loss_bbox = 0.00768522 (* 1 = 0.00768522 loss)
I1113 23:35:53.462082 4460 sgd_solver.cpp:105] Iteration 460,lr = 0.001
I1113 23:35:54.356657 4460 solver.cpp:218] Iteration 480 (22.3584 iter/s,0.894517s/20 iters),loss = 0.00275768
I1113 23:35:54.356729 4460 solver.cpp:237] Train net output #0: rpn_cls_loss = 0.00107937 (* 1 = 0.00107937 loss)
I1113 23:35:54.356739 4460 solver.cpp:237] Train net output #1: rpn_loss_bbox = 0.00167831 (* 1 = 0.00167831 loss)
I1113 23:35:54.356748 4460 sgd_solver.cpp:105] Iteration 480,lr = 0.001
I1113 23:35:55.153437 4460 solver.cpp:218] Iteration 500 (25.1734 iter/s,0.79449s/20 iters),loss = 0.0230187
I1113 23:35:55.153519 4460 solver.cpp:237] Train net output #0: rpn_cls_loss = 0.0105348 (* 1 = 0.0105348 loss)
I1113 23:35:55.153530 4460 solver.cpp:237] Train net output #1: rpn_loss_bbox = 0.0124839 (* 1 = 0.0124839 loss)
I1113 23:35:55.153542 4460 sgd_solver.cpp:105] Iteration 500,lr = 0.001
I1113 23:35:56.104395 4460 solver.cpp:218] Iteration 520 (21.0352 iter/s,0.950785s/20 iters),loss = 0.0144106
I1113 23:35:56.104485 4460 solver.cpp:237] Train net output #0: rpn_cls_loss = 0.00135394 (* 1 = 0.00135394 loss)
I1113 23:35:56.104504 4460 solver.cpp:237] Train net output #1: rpn_loss_bbox = 0.0130567 (* 1 = 0.0130567 loss)
I1113 23:35:56.104521 4460 sgd_solver.cpp:105] Iteration 520,lr = 0.001
I1113 23:35:56.854631 4460 solver.cpp:218] Iteration 540 (26.6699 iter/s,0.749909s/20 iters),loss = 0.0167331
I1113 23:35:56.854696 4460 solver.cpp:237] Train net output #0: rpn_cls_loss = 0.00285695 (* 1 = 0.00285695 loss)
I1113 23:35:56.854710 4460 solver.cpp:237] Train net output #1: rpn_loss_bbox = 0.0138762 (* 1 = 0.0138762 loss)
I1113 23:35:56.854720 4460 sgd_solver.cpp:105] Iteration 540,lr = 0.001
I1113 23:35:57.824692 4460 solver.cpp:218] Iteration 560 (20.6206 iter/s,0.969902s/20 iters),loss = 0.00817935
I1113 23:35:57.824774 4460 solver.cpp:237] Train net output #0: rpn_cls_loss = 0.00557839 (* 1 = 0.00557839 loss)
I1113 23:35:57.824791 4460 solver.cpp:237] Train net output #1: rpn_loss_bbox = 0.00260096 (* 1 = 0.00260096 loss)
I1113 23:35:57.824806 4460 sgd_solver.cpp:105] Iteration 560,lr = 0.001
I1113 23:35:58.670575 4460 solver.cpp:218] Iteration 580 (23.6486 iter/s,0.845714s/20 iters),loss = 0.00420315
I1113 23:35:58.670637 4460 solver.cpp:237] Train net output #0: rpn_cls_loss = 0.0020043 (* 1 = 0.0020043 loss)
I1113 23:35:58.670648 4460 solver.cpp:237] Train net output #1: rpn_loss_bbox = 0.00219884 (* 1 = 0.00219884 loss)
I1113 23:35:58.670658 4460 sgd_solver.cpp:105] Iteration 580,lr = 0.001
I1114 00:34:17.348683 4460 sgd_solver.cpp:105] Iteration 79980,lr = 0.0001
speed: 0.044s / iter
Wrote snapshot to: /data1/caiyong.wang/program/py-faster-rcnn/output/faster_rcnn_alt_opt/voc_2007_trainval/zf_rpn_stage1_iter_80000.caffemodel

The goal is to extract the Iteration and loss values from lines such as Iteration 500 (25.1734 iter/s,0.79449s/20 iters),loss = 0.0230187. The text is in fact part of a log file produced by faster rcnn.
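Using the syntax covered above, we built the following regular expression in MTracer:

\bIteration\s(?<Iteration>\d+)\s\(.*\).*loss\s=\s(?<loss>\d*\.*\d+)\b

Run in multiline mode, it extracted exactly the results we wanted. The expression uses capturing groups: (?<Iteration>...) captures the iteration number and (?<loss>...) captures the loss value, while \( and \) match the literal parentheses around the speed figures.
[Figure: screenshot of the result in MTracer]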
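So how do we turn this into Python code? The correct code is:

    import re

    # Python's re module spells named groups as (?P<name>...).
    pattern = re.compile(r'\bIteration\s(?P<Iteration>\d+)\s\(.*\).*loss\s=\s(?P<loss>\d*\.*\d+)\b')
    # The input is the first line of the log above.
    arr = pattern.search("I1113 23:35:50.763059 4460 solver.cpp:218] Iteration 400 (27.3075 iter/s,0.7324s/20 iters),loss = 0.0202583")
    arr.groups()            # all captured groups, in order
    arr.group()             # the whole matched text
    arr.group("Iteration")  # a single group, looked up by name
    arr.group("loss")

The results are:

    arr.groups()
    Out[147]: ('400', '0.0202583')
    arr.group()
    Out[148]: 'Iteration 400 (27.3075 iter/s,0.7324s/20 iters),loss = 0.0202583'
    arr.group("Iteration")
    Out[149]: '400'
    arr.group("loss")
    Out[150]: '0.0202583'

Note that Python's syntax for named groups differs from the one used earlier: it is (?P<name>exp) rather than (?<name>exp). The pattern passed to compile should also be written as a raw string (the r prefix). Without it, Python interprets \b as a backspace character before the regex engine ever sees it, so the word-boundary assertions are lost and the pattern no longer matches.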
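The single-line search above can be extended to a whole log. The following is a minimal sketch, assuming the log text is already held in a string; LOG, PATTERN and parse_losses are names chosen here for illustration and are not part of the original post. Because `.` does not match a newline by default, each match stays within one log line, so finditer can walk the whole text without any extra flags.

```python
import re

# A few lines copied from the log excerpt above (illustrative sample).
LOG = """\
I1113 23:35:50.763059 4460 solver.cpp:218] Iteration 400 (27.3075 iter/s,0.7324s/20 iters),loss = 0.0202583
I1113 23:35:50.763141 4460 solver.cpp:237] Train net output #0: rpn_cls_loss = 0.00101873 (* 1 = 0.00101873 loss)
I1113 23:35:50.763175 4460 sgd_solver.cpp:105] Iteration 400,lr = 0.001
I1113 23:35:51.751206 4460 solver.cpp:218] Iteration 420 (20.2456 iter/s,0.987868s/20 iters),loss = 0.00228514
"""

# Same pattern as in the text, written as a raw string with Python-style named groups.
PATTERN = re.compile(
    r'\bIteration\s(?P<Iteration>\d+)\s\(.*\).*loss\s=\s(?P<loss>\d*\.*\d+)\b'
)


def parse_losses(text):
    """Return a list of (iteration, loss) pairs found in the log text."""
    return [
        (int(m.group("Iteration")), float(m.group("loss")))
        for m in PATTERN.finditer(text)
    ]


print(parse_losses(LOG))
```

Only the solver.cpp:218 lines match: the "Iteration 400,lr = 0.001" lines have a comma rather than whitespace after the number, and the "Train net output" lines contain no "Iteration" at all.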
References:
http://blog.csdn.net/lwnylslwnyls/article/details/8901273
https://www.cnblogs.com/tk091/p/3702307.html
PYTHON的RE模块理解(RE.COMPILE、RE.MATCH、RE.SEARCH) (Understanding Python's re module: re.compile, re.match, re.search)
Below are a few more uses of Python regular expressions.
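As one possible illustration, the sketch below runs four common re functions (re.match, re.search, re.findall, re.sub) on a single line taken from the log above. The variable names and the `<lr>` placeholder are assumptions made for this example only.

```python
import re

line = "I1113 23:35:50.763175 4460 sgd_solver.cpp:105] Iteration 400,lr = 0.001"

# re.match only succeeds if the pattern matches at the very start of the string.
at_start = re.match(r'Iteration', line)          # None: the line starts with "I1113"

# re.search scans the string and returns the first match found anywhere.
anywhere = re.search(r'Iteration\s(\d+)', line)
iteration = anywhere.group(1)

# re.findall returns every non-overlapping match as a list of strings.
decimals = re.findall(r'\d+\.\d+', line)

# re.sub replaces every match; here the learning-rate value is masked.
masked = re.sub(r'lr = [\d.]+', 'lr = <lr>', line)

print(at_start, iteration, decimals, masked, sep="\n")
```

The difference between re.match and re.search is the one that most often surprises: a pattern that should be found in the middle of a log line needs re.search (or a leading `.*`), not re.match.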