regex – Perl regular expression to find a Java stack trace by keyword
Published: 2020-12-16 06:12:47 | Category: Big Data | Source: compiled from the web | Editor: 李大同
I need to grep complete stack traces out of a log file by keyword.
This code works fine, but it is slow on large files (the bigger the file, the slower it gets). I think the best fix would be to improve the regular expression so that it finds the keyword itself, but I could not get that to work.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $regexp;
    my $stacktrace;

    undef $/;

    $regexp = shift;
    $regexp = quotemeta($regexp);

    while (<>) {
        while ( $_ =~ /(?<LEVEL>^[E|W|D|I])\s
                       (?<TIMESTAMP>\d{6}\s\d{6}\.\d{3})\s
                       (?<THREAD>.*?)\/
                       (?<CLASS>.*?)\s-\s
                       (?<MESSAGE>.*?[\r|\n](?=^[[E|W|D|I]\s\d{6}\s\d{6}\.\d{3}]?))/gsmx ) {
            $stacktrace = $&;
            if ( $+{MESSAGE} =~ /$regexp/ ) {
                print "$stacktrace";
            }
        }
    }

Usage: ./grep_log4j.pl <pattern> <file>
Example: ./grep_log4j.pl Exception sample.log

I think the problem is the line $stacktrace = $&; because if I remove it and simply print every matched entry, the script runs fast:

    #!/usr/bin/perl
    use strict;
    use warnings;

    undef $/;

    while (<>) {
        while ( $_ =~ /(?<LEVEL>^[E|W|D|I])\s
                       (?<TIMESTAMP>\d{6}\s\d{6}\.\d{3})\s
                       (?<THREAD>.*?)\/
                       (?<CLASS>.*?)\s-\s
                       (?<MESSAGE>.*?[\r|\n](?=^[[E|W|D|I]\s\d{6}\s\d{6}\.\d{3}]?))/gsmx ) {
            print_result();
        }
    }

    sub print_result {
        print "LEVEL: $+{LEVEL}\n";
        print "TIMESTAMP: $+{TIMESTAMP}\n";
        print "THREAD: $+{THREAD}\n";
        print "CLASS: $+{CLASS}\n";
        print "MESSAGE: $+{MESSAGE}\n";
    }

Usage: ./grep_log4j.pl <file>
Example: ./grep_log4j.pl sample.log

Log4j pattern: %-1p %d %t/%c{1} - %m%n

Sample log file:

    I 111012 141506.000 thread/class - Received message: something
    E 111012 141606.000 thread/class - Failed handling mobile request
    java.lang.NullPointerException
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
      at java.lang.Thread.run(Thread.java:619)
    W 111012 141706.000 thread/class - Received message: something
    E 111012 141806.000 thread/class - Failed with Exception
    java.lang.NullPointerException
      at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
      at java.lang.Thread.run(Thread.java:619)
    D 111012 141906.000 thread/class - Received message: something
    S 111012 142006.000 thread/class - Received message: something
    I 111012 142106.000 thread/class - Received message: something
    I 111013 142206.000 thread/class - Metrics:0/1

You can find my regular expression on http://gskinner.com/RegExr/ by searching for the keyword log4j.

Solution
You are using:
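    $/ = undef;

which makes Perl read the whole file into memory. I would instead process the file line by line, like this (assuming a stack trace belongs to the message on the line directly above it):

    my $matched;

    while (<>) {
        if (m/^(?<LEVEL>\S+) \s+ (?<TIMESTAMP>(\d+) \s+ ([\d.])+) \s+ (?<THREADCLASS>\S+) \s+ - \s+ (?<REST>.*)/x) {
            # Copy the named captures first: the next match resets %+.
            my %captures = %+;
            $matched = ($+{REST} =~ $regexp);
            if ($matched) {
                print "LEVEL: $captures{LEVEL}\n";
                # ... print the other captured fields here ...
            }
        } elsif ($matched) {
            # Continuation line (stack trace) of an entry that matched.
            print;
        }
    }

This is a general technique for parsing multi-line blocks.

    my $first;
    my $stack = "";

    while (<STDIN>) {
        if (m/^\S /) {
            # A new header line: hand over the previous block, start a new one.
            process($first, $stack) if $first;
            $first = $_;
            $stack = "";
        } else {
            $stack .= $_;
        }
    }
    # The last block has no header after it, so process it explicitly.
    process($first, $stack) if $first;

    sub process {
        my ($first, $stack) = @_;
        # ... do whatever you want here ...
    }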
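To make the line-by-line idea concrete, here is a minimal, self-contained sketch that prints every complete log entry (header line plus stack trace) containing a keyword. It is not the answerer's code: the helper name grep_entries, the callback argument, and the header pattern $HEADER are assumptions made for illustration, with the pattern derived from the sample log in the question. Unlike the question's script, it looks for the keyword anywhere in the entry, header included, and treats it as a literal string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed header shape, taken from the sample log:
# "E 111012 141606.000 thread/class - message"
my $HEADER = qr/^[A-Z]\s\d{6}\s\d{6}\.\d{3}\s/;

# grep_entries($fh, $keyword, $on_hit): hypothetical helper.
# Reads $fh one line at a time and calls $on_hit->($entry) for every
# entry (header plus continuation lines) that contains $keyword as a
# literal substring. Only the current entry is held in memory.
sub grep_entries {
    my ($fh, $keyword, $on_hit) = @_;
    my $entry;

    # Report the entry collected so far if it contains the keyword.
    my $flush = sub {
        $on_hit->($entry)
            if defined $entry && index($entry, $keyword) >= 0;
    };

    while (my $line = <$fh>) {
        if ($line =~ $HEADER) {
            $flush->();          # a new header closes the previous entry
            $entry = $line;
        }
        elsif (defined $entry) {
            $entry .= $line;     # stack trace line of the open entry
        }
    }
    $flush->();                  # the last entry has no header after it
    return;
}

# Demo on a few lines of the sample log, read via an in-memory filehandle.
my $sample = <<'LOG';
I 111012 141506.000 thread/class - Received message: something
E 111012 141606.000 thread/class - Failed handling mobile request
java.lang.NullPointerException
  at java.lang.Thread.run(Thread.java:619)
W 111012 141706.000 thread/class - Received message: something
LOG

open my $fh, '<', \$sample or die "cannot open in-memory log: $!";
grep_entries($fh, 'NullPointerException', sub { print $_[0] });
close $fh;
```

The demo prints only the E entry together with its two stack trace lines. For a real log, the in-memory handle could be replaced by a handle opened on the file. Because the script never slurps the file and never uses $&, memory use depends on the largest single entry rather than on the file size.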