
perl – Count rows by header and date

Published: 2020-12-16 06:15:33 | Category: Big Data | Source: compiled from the web
I have a tab-delimited file in the following format:

Business System Name:  OK_CR                      

Serial Numbr  Service Name          Program Name          Epoch Start Time     
------------  --------------------  --------------------  -------------------  
GI1001TAA266  PPV 10 (50106)        We Bought A Zoo       Aug 14 2012  4:15AM  
GI1002TB3596  PPV 5 (50101)         Help, The (2011)      Aug 14 2012  6:30PM  
GI1002TDH825  PPV 2 (50098)         Safe House            Sep  7 2012  2:15AM  

Business System Name:  OK_SV                      

Serial Numbr  Service Name          Program Name          Epoch Start Time     
------------  --------------------  --------------------  -------------------  
GI1001TAA266  PPV 10 (50106)        We Bought A Zoo       Aug 14 2012  4:15AM  
GI1002TB3596  PPV 5 (50101)         Help, The (2011)      Aug 14 2012  6:30PM  
GI1002TDH825  PPV 2 (50098)         Safe House            Sep  7 2012  2:15AM

I want to count the rows for each date, grouped by business system header. In other words, the script's output should look like this:

Business System Name:  OK_CR
Aug 14: 2
Sep 7: 1

Business System Name:  OK_SV
Aug 14: 2
Sep 7: 1

So far I have built a hash, but I am stuck on how to count each date and reset the counter after each business system header. Here is my script:

#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

open my $fh, '<', 'ppv.txt' or die $!;

my %data;
my $sect;
while (<$fh>) {
  next if /^\s+/;    # skip blank lines
  if (/^Business System Name:\s+(\w+)/) {
    $sect = $1;      # a new section starts here
    next;
  }
  #print "$sect\n";
  if (defined $sect) {
    next if /^Serial Numbr/;
    next if /^------------/;
    push @{ $data{$sect} }, $_;
  }
}
print Dumper \%data;

This is the script's output:

$VAR1 = {
          'OK_CR' => [
                       'GI1001TAA266  PPV 10 (50106)        We Bought A Zoo       Aug 14 2012  4:15AM
',
                       'GI1002TB3596  PPV 5 (50101)         Help, The (2011)      Aug 14 2012  6:30PM
',
                       'GI1002TDH825  PPV 2 (50098)         Safe House            Sep  7 2012  2:15AM
'
                     ],
          'OK_SV' => [
                       'GI1001TAA266  PPV 10 (50106)        We Bought A Zoo       Aug 14 2012  4:15AM
',
                       'GI1002TB3596  PPV 5 (50101)         Help, The (2011)      Aug 14 2012  6:30PM
',
                       'GI1002TDH825  PPV 2 (50098)         Safe House            Sep  7 2012  2:15AM
'
                     ]
        };

Any ideas on how to proceed from here?

Solution

Using the unpack function, as suggested in the comments, you only need to keep a count for each date:

use strict;
use warnings;
use Data::Dumper;

open my $fh, '<', 'ppv.txt' or die $!;

my %data;
my $sect;
while (<$fh>) {
  next if /^\s+/;    # skip blank lines
  if (/^Business System Name:\s+(\w+)/) {
    $sect = $1;      # a new section starts here
    next;
  }
  #print "$sect\n";
  if (defined $sect) {
    next if /^Serial Numbr/;
    next if /^------------/;
    # Fixed-width fields: 57 characters before the date, then 13 for the date itself
    my $format = 'A57 A13 A*';
    my ($prefixes, $date, $suffixes) = unpack($format, $_);
    $data{$sect}{$date}++;    # one counter per section and date
  }
}
print Dumper \%data;

__END__

$VAR1 = {
          'OK_CR' => {
                       ' Aug 14 2012' => 2,
                       ' Sep  7 2012' => 1
                     },
          'OK_SV' => {
                       ' Aug 14 2012' => 2,
                       ' Sep  7 2012' => 1
                     }
        };
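
The keys in the hash above still carry the year and the column padding, while the question asked for lines such as "Aug 14: 2". What follows is a minimal sketch of one way to get there, assuming the fixed-width layout of the sample data. The helper names count_dates and format_report, the first-seen ordering of sections and dates, and the in-memory sample built with sprintf are illustrative assumptions and not part of the original answer.

```perl
use strict;
use warnings;

# Count rows per "Mon D" date within each business-system section.
# Returns the counts plus the first-seen order of sections and dates.
sub count_dates {
    my ($fh) = @_;
    my (%count, @sections, %dates, $sect);
    while (my $line = <$fh>) {
        if ($line =~ /^Business System Name:\s+(\w+)/) {
            $sect = $1;
            push @sections, $sect unless exists $dates{$sect};
            $dates{$sect} ||= [];
            next;
        }
        next unless defined $sect;
        # Skip blank lines, the column header, and the dashed rule.
        next if $line =~ /^\s*$/ || $line =~ /^Serial Numbr/ || $line =~ /^-+/;

        # Same fixed-width assumption as the answer: skip 57 characters,
        # then read the 13-character date field.
        my (undef, $date) = unpack 'A57 A13', $line;
        # Drop the year and squeeze the padding: " Sep  7 2012" becomes "Sep 7".
        next unless defined $date && $date =~ /^\s*([A-Za-z]{3})\s+(\d{1,2})\b/;
        my $key = "$1 $2";
        # Post-increment yields the old value, so the push runs only on first sight.
        push @{ $dates{$sect} }, $key unless $count{$sect}{$key}++;
    }
    return (\%count, \@sections, \%dates);
}

# Render the counts in the layout requested in the question.
sub format_report {
    my ($count, $sections, $dates) = @_;
    my @blocks;
    for my $sect (@$sections) {
        my @lines = ("Business System Name:  $sect");
        push @lines, "$_: $count->{$sect}{$_}" for @{ $dates->{$sect} };
        push @blocks, join("\n", @lines) . "\n";
    }
    return join("\n", @blocks);
}

# Hypothetical stand-in for ppv.txt, rebuilt with sprintf so the columns line up.
my $layout = "%-12s  %-20s  %-20s  %s\n";
my @rows = (
    [ 'GI1001TAA266', 'PPV 10 (50106)', 'We Bought A Zoo',  'Aug 14 2012  4:15AM' ],
    [ 'GI1002TB3596', 'PPV 5 (50101)',  'Help, The (2011)', 'Aug 14 2012  6:30PM' ],
    [ 'GI1002TDH825', 'PPV 2 (50098)',  'Safe House',       'Sep  7 2012  2:15AM' ],
);
my $sample = '';
for my $name (qw(OK_CR OK_SV)) {
    $sample .= "Business System Name:  $name\n\n";
    $sample .= sprintf $layout, 'Serial Numbr', 'Service Name', 'Program Name', 'Epoch Start Time';
    $sample .= sprintf $layout, '-' x 12, '-' x 20, '-' x 20, '-' x 19;
    $sample .= sprintf $layout, @$_ for @rows;
    $sample .= "\n";
}

# Read the sample through an in-memory filehandle, as if it were the file.
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my $report = format_report(count_dates($fh));
close $fh;
print $report;
```

To run this against the real file, open 'ppv.txt' instead of the in-memory string. Because the date is read by column position, any row whose columns are shifted (for example by tabs instead of padding spaces) would need a different extraction, such as splitting on the delimiter.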
