
How do I sum the values of duplicate rows with awk?

Published: 2020-12-14 00:01:54 | Category: Linux | Source: compiled from the web
I have a CSV file with 11 columns that looks like this:

Order Date,Username,Order Number,No Resi,Quantity,Title,Update Date,Status,Price Per Item,Status Tracking,Alamat
05 Jun 2018,Mildred@email.com,205583995140400,2,Gold,05 Jun 2018 – 10:01,In Process,Rp3.000.000,Done,Syahrul Address
05 Jun 2018,1,Martha@email.com,205486016644400,Faishal  Address
05 Jun 2018,Misty@email.com,205588935534900,Rutwan Address
05 Jun 2018,Rutwan Address

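I want to remove the duplicate rows in this file and sum the values in the Quantity column. I would like the result to look like this: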

Order Date,3,4,Rutwan Address

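I only want to sum the values in the Quantity column and leave everything else as it is. I tried the solution from this question, but it only works when the file has 2 columns. Mine has 11, so it does not work. How can I do this with awk?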

Solution

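This is adapted directly from Karafka's solution, with some code added so that the rows are printed in the order in which they appear in Input_file, as the OP requested.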

awk -F, '
FNR==1{
  print;
  next}
{
  val=$5;
  $5="~";
  a[$0]+=val
}
!b[$0]++{
  c[++count]=$0}
END{
  for(i=1;i<=count;i++){
     sub("~",a[c[i]],c[i]);
     print c[i]}
}' OFS=, Input_file

Explanation: the same code as above, now with comments.

awk -F, '                         ##Set the field separator to a comma.
FNR==1{                           ##If this is the first line (the header), do the following.
  print;                          ##Print the current line.
  next}                           ##Skip all remaining statements for this line.
{
  val=$5;                         ##Save the 5th field (Quantity) of the current line in val.
  $5="~";                         ##Replace the 5th field with ~ so duplicate lines become identical (this is the index for array a).
  a[$0]+=val                      ##Add val to the element of array a indexed by the modified line.
}
!b[$0]++{                         ##If the modified line has not been seen before (its entry in array b is still empty), do the following.
  c[++count]=$0}                  ##Increment count and store the modified line in array c at that index, preserving input order.
END{                              ##Start of the END block, run after all input has been read.
  for(i=1;i<=count;i++){          ##Loop from 1 up to the value of count.
     sub("~",a[c[i]],c[i]);       ##Replace ~ in the stored line with the summed value of $5.
     print c[i]}                  ##Print the stored line, whose 5th field now holds the sum.
}' OFS=, Input_file               ##Set OFS to a comma and name Input_file as the input.
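
One caveat with the ~ placeholder: sub() replaces the first ~ in the line, so a row that already contains ~ in one of fields 1 to 4 would get the sum written into the wrong place. The following is a sketch of a variant that avoids the placeholder by blanking the 5th field to build the key and then re-splitting the key in the END block. It is an alternative technique, not part of the original answer, and the function name sum_duplicates_no_placeholder is hypothetical.

```shell
# Hypothetical variant; $1 is the CSV file and Quantity is assumed to be field 5.
sum_duplicates_no_placeholder() {
  awk '
  BEGIN { FS = OFS = "," }        # read and write comma-separated fields
  NR==1 { print; next }           # pass the header through unchanged
  {
    qty = $5                      # remember this row'"'"'s Quantity
    $5 = ""                       # blank it so duplicate rows share one key
    key = $0
    if (!(key in sum))            # first time this key is seen:
      order[++n] = key            #   remember its position in the input
    sum[key] += qty               # accumulate Quantity per key
  }
  END {
    for (i = 1; i <= n; i++) {
      $0 = order[i]               # re-split the stored key into fields
      $5 = sum[order[i]]          # write the summed Quantity into field 5
      print
    }
  }' "$1"
}
```

It is called the same way as the original command, for example sum_duplicates_no_placeholder Input_file. Assigning to $0 inside END re-splits the line using FS, which is why the sum can be written back by field number instead of by text substitution.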

