
Regular expressions - getting values from columns in two files

Published: 2020-12-14 05:58:23 | Category: Encyclopedia | Source: compiled from the web
My original observations look like this:

name Analyte
spring 0.1
winter 0.4

To compute p-values, I ran a bootstrap simulation:

name Analyte
spring 0.001
winter 0
spring 0
winter 0.2
spring 0.03
winter 0
spring 0.01
winter 0.02
spring 0.1
winter 0.5
spring 0
winter 0.04
spring 0.2
winter 0
spring 0
winter 0.06
spring 0
winter 0
.....

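Now I want to compute the empirical p-value. In the original data the winter Analyte is 0.4. If the bootstrap data contains a winter Analyte >= 0.4 once, for example, and the bootstrap was run 100 times, then the empirical p-value for the winter Analyte is:

1/100 = 0.01

(the number of times the bootstrap value is equal to or higher than the original value,
divided by the total number of observations)
For the spring Analyte, the p-value is: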

2/100 = 0.02
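
Stated as a formula, the rule described above reads as follows. The symbols are notation introduced here for illustration and do not appear in the original question: x_obs is the observed value for a name, x*_b is the b-th bootstrap value for the same name, and B is the number of bootstrap values for that name.

```latex
\hat{p} = \frac{\#\{\, b : x^{*}_{b} \ge x_{\mathrm{obs}} \,\}}{B}
```

With the example numbers from the question, winter gives 1/100 = 0.01 and spring gives 2/100 = 0.02.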

I want to compute these p-values with awk.
My solution for spring is:

awk -v VAR="spring" '($1==VAR && $2>=0.1) {n++} END {print VAR,"p-value=",n/100}'

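spring p-value= 0.02
What I need help with is passing the original file (the names spring and winter with their Analyte values) to awk, so that the observed value and the number of observations are taken from the files instead of being hard-coded.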

Solution

Explanation and script contents:

Run it like this: awk -f script.awk original bootstrap

# Read the original file into array a, skipping its header.
# Key: name (first column). Value: observed value (second column).

NR==FNR {
    if (FNR>1)
        a[$1]=$2
    next
}

# Bootstrap file, skipping its header and any name not in a:
# c counts every bootstrap value per name,
# b counts the values greater than or equal to the observed one.

FNR>1 && ($1 in a) {
    c[$1]++
    if ($2>=a[$1])
        b[$1]++
}

# For each name in a, print the name and its empirical p-value:
# the count of values at or above the observed value divided by
# the number of bootstrap values for that name.

END {
    for (type in a)
        print type,b[type]/c[type]
}

Test:

$ cat original
name Analyte
spring 0.1
winter 0.4

$ cat bootstrap
name Analyte
spring 0.001
winter 0
spring 0
winter 0.2
spring 0.03
winter 0
spring 0.01
winter 0.02
spring 0.1
winter 0.5
spring 0
winter 0.04
spring 0.2
winter 0
spring 0
winter 0.06
spring 0
winter 0

$ awk -f script.awk original bootstrap
spring 0.222222
winter 0.111111

Analysis:

Spring original value is 0.1
Winter original value is 0.4
Bootstrapping was done 9 times per name in this sample file
Count of values greater than or equal to the spring original value = 2 (0.1 and 0.2)
Count of values greater than or equal to the winter original value = 1 (0.5)
So spring = 2/9 = 0.222222 and winter = 1/9 = 0.111111
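
The same calculation can also be sketched as a self-contained shell script that prints the counts next to each p-value and guards against a name that has no bootstrap rows, a case in which a bare division would make awk abort with a division-by-zero error. The temporary directory, the function name pvalues, and the array names obs, hits, and total are illustrative choices, not part of the original answer. The data is the sample from the question, and the comparison uses >= as the question defines it.

```shell
#!/bin/sh
# Sample data copied from the question; the file locations are hypothetical.
dir=$(mktemp -d)
printf '%s\n' 'name Analyte' 'spring 0.1' 'winter 0.4' > "$dir/original"
printf '%s\n' 'name Analyte' \
  'spring 0.001' 'winter 0' 'spring 0' 'winter 0.2' 'spring 0.03' 'winter 0' \
  'spring 0.01' 'winter 0.02' 'spring 0.1' 'winter 0.5' 'spring 0' 'winter 0.04' \
  'spring 0.2' 'winter 0' 'spring 0' 'winter 0.06' 'spring 0' 'winter 0' \
  > "$dir/bootstrap"

# pvalues ORIGINAL BOOTSTRAP: print "name hits/total p-value", sorted by name.
pvalues() {
  awk '
    FNR == 1  { next }                     # skip the header of either file
    NR == FNR { obs[$1] = $2; next }       # first file: observed value per name
    ($1 in obs) {                          # second file: bootstrap values
      total[$1]++
      if ($2 + 0 >= obs[$1] + 0)           # "+ 0" forces a numeric comparison
        hits[$1]++
    }
    END {
      for (name in obs)
        if (total[name] > 0)
          printf "%s %d/%d %.4f\n", name, hits[name], total[name], hits[name] / total[name]
        else
          printf "%s no bootstrap rows\n", name
    }
  ' "$1" "$2" | sort
}

pvalues "$dir/original" "$dir/bootstrap"
```

On the sample data this prints "spring 2/9 0.2222" and "winter 1/9 0.1111". Piping through sort makes the line order predictable, since awk does not define the order in which for (name in obs) visits the array.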

(Editor: 李大同)
