Genomic Data Processing 9: BWA Small-Dataset Test (Successful)
Published: 2020-12-14 02:00:03  Category: Big Data  Source: collected from the web
1. FASTQ file with 20 lines, i.e. 5 reads:

hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ cat SRR003161.fastq |head -20 >SRR003161h20.fastq
hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h20.fastq >SRR003161h20.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 0.01 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 5 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h20.fastq
[main] Real time: 824.956 sec; CPU: 8.711 sec
hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ wc -l SRR003161h20.*
 20 SRR003161h20.fastq
  2 SRR003161h20.sai
 22 total
hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ cat SRR003161h20.sai
SAI ?=???? Xshellhadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ Xshell

But the .sai file has only two lines? Note that .sai is a binary intermediate format, so the line count reported by wc -l only counts newline bytes and does not correspond to the number of reads.

2. FASTQ file with 1000 lines, i.e. 250 reads:

hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h1000.fastq >SRR003161h1000.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 1.04 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 250 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h1000.fastq
[main] Real time: 327.850 sec; CPU: 5.880 sec
hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ wc -l SRR003161h1000
SRR003161h10000.fastq  SRR003161h10000.sai  SRR003161h1000.fastq  SRR003161h1000.sai
hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ wc -l SRR003161h1000*
 10000 SRR003161h10000.fastq
     0 SRR003161h10000.sai
  1000 SRR003161h1000.fastq
     2 SRR003161h1000.sai
 11002 total
hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ cat SRR003161h1000.sai
SAI ?=???? @-`x-`xhadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ Xshell ll

2500 reads:

hadoop@Mcnode1:~/cloud/adam/xubo/data/data_HDFS/GRCH38/GCA_000001405.15_GRCh38/test20160310$ bwa aln -t 4 GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h10000.fastq >SRR003161h10000.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 14.03 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 2500 sequences have been processed.
[main] Version: 0.7.12-r1039
[main] CMD: bwa aln -t 4 GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h10000.fastq
[main] Real time: 1443.877 sec; CPU: 19.653 sec

On the other node (149), the same runs are much faster:

5 reads:

hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h20.fastq >SRR003161h20.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 0.00 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 5 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h20.fastq
[main] Real time: 47.452 sec; CPU: 3.616 sec
hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h20.sai SRR003161h20.fastq >SRR003161h20bwa.sam
[bwa_aln_core] convert to sequence coordinate... 5.85 sec
[bwa_aln_core] refine gapped alignments... 0.66 sec
[bwa_aln_core] print alignments... 0.00 sec
[bwa_aln_core] 5 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h20.sai SRR003161h20.fastq
[main] Real time: 100.443 sec; CPU: 6.523 sec

250 reads:

hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h1000.fastq >SRR003161h1000.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 0.19 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 250 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h1000.fastq
[main] Real time: 40.316 sec; CPU: 4.745 sec
hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h1000.sai SRR003161h1000.fastq >SRR003161h1000bwa.sam
[bwa_aln_core] convert to sequence coordinate... 5.96 sec
[bwa_aln_core] refine gapped alignments... 0.73 sec
[bwa_aln_core] print alignments... 0.00 sec
[bwa_aln_core] 250 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h1000.sai SRR003161h1000.fastq
[main] Real time: 156.192 sec; CPU: 6.695 sec

2500 reads:

hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h10000.fastq >SRR003161h10000.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 6.23 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 2500 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h10000.fastq
[main] Real time: 43.496 sec; CPU: 10.163 sec
hadoop@Master:~/cloud/adam/xubo/data/test20160310$ ll -h
total 2.2G
drwxrwxr-x 3 hadoop hadoop 4.0K 3月 13 14:45 ./
drwxrwxr-x 3 hadoop hadoop 4.0K 3月 12 14:51 ../
drwxrwxr-x 2 hadoop hadoop 4.0K 3月 12 15:24 GCA_000001405.15_GRCh38/
-rw-rw-r-- 1 hadoop hadoop    0 3月 12 15:49 SRR003161a.sam
-rw-rw-r-- 1 hadoop hadoop  12K 3月 12 16:15 SRR003161b.sam
-rw-rw-r-- 1 hadoop hadoop    0 3月 12 22:50 SRR003161c.sam
-rw-rw-r-- 1 hadoop hadoop 1.6G 3月 12 15:49 SRR003161.fastq
-rw-rw-r-- 1 hadoop hadoop 527M 3月 12 16:10 SRR003161.fastq.gz
-rw-rw-r-- 1 hadoop hadoop 3.1M 3月 12 22:50 SRR003161h10000.fastq
-rw-rw-r-- 1 hadoop hadoop  11K 3月 13 14:46 SRR003161h10000.sai
-rw-rw-r-- 1 hadoop hadoop 3.3M 3月 13 00:50 SRR003161h10000.sam
-rw-rw-r-- 1 hadoop hadoop 336K 3月 12 22:08 SRR003161h1000.fastq
-rw-rw-r-- 1 hadoop hadoop 1.1K 3月 13 14:41 SRR003161h1000.sai
-rw-rw-r-- 1 hadoop hadoop    0 3月 13 14:45 SRR003161h1000.sam
-rw-rw-r-- 1 hadoop hadoop 5.7K 3月 12 21:56 SRR003161h20.fastq
-rw-rw-r-- 1 hadoop hadoop  25K 3月 12 22:02 SRR003161h20.sam
-rw-rw-r-- 1 hadoop hadoop 1.1M 3月 12 15:49 SRR003161.sai
hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h10000.sai SRR003161h10000.fastq >SRR003161h10000bwa.sam
[bwa_aln_core] convert to sequence coordinate... 6.17 sec
[bwa_aln_core] refine gapped alignments... 0.70 sec
[bwa_aln_core] print alignments... 0.01 sec
[bwa_aln_core] 2500 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h10000.sai SRR003161h10000.fastq
[main] Real time: 131.808 sec; CPU: 6.896 sec
hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h100000.fastq >SRR003161h100000.sai
[bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 65.66 sec
[bwa_aln_core] write to the disk... 0.00 sec
[bwa_aln_core] 25000 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa aln GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h100000.fastq
[main] Real time: 105.888 sec; CPU: 70.249 sec
hadoop@Master:~/cloud/adam/xubo/data/test20160310$ bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h100000.sai SRR003161h100000.fastq >SRR003161h100000bwa.sam
[bwa_aln_core] convert to sequence coordinate... 6.44 sec
[bwa_aln_core] refine gapped alignments... 0.89 sec
[bwa_aln_core] print alignments... 0.07 sec
[bwa_aln_core] 25000 sequences have been processed.
[main] Version: 0.7.13-r1126
[main] CMD: bwa samse GCA_000001405.15_GRCh38/GCA_000001405.15_GRCh38_full_analysis_set.fna SRR003161h100000.sai SRR003161h100000.fastq
[main] Real time: 177.881 sec; CPU: 7.552 sec

More testing tomorrow.

(Editor: 李大同)
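Every run in this post ends with a "[main] Real time: ... sec; CPU: ... sec" line, and in most of them the real time is far larger than the CPU time (for example 824.956 sec against 8.711 sec for 5 reads on Mcnode1). The following is a minimal sketch, not part of the original tests, for pulling those figures out of a saved log so the two nodes can be compared side by side. It assumes bwa's messages were captured with a stderr redirect such as 2> aln.log; the function name summarize_bwa_times, the log file name and the non_cpu_fraction label are all hypothetical. The fraction is only (real - cpu) / real; reading it as time spent waiting, for instance while the index is loaded from disk, is an assumption that the logs alone do not confirm.

```shell
#!/usr/bin/env bash
# Summarize bwa timing lines of the form
#   [main] CMD: bwa aln <ref> <reads>
#   [main] Real time: 824.956 sec; CPU: 8.711 sec
# from a saved stderr log. Function and file names are illustrative only.

summarize_bwa_times() {
    # $1: path to a file holding the stderr output of one or more bwa runs
    awk '
        /^\[main\] CMD:/ {
            # Field 4 is the bwa sub-command (aln, samse, ...).
            cmd = $4
        }
        /^\[main\] Real time:/ {
            real = $4 + 0   # wall-clock seconds
            cpu  = $7 + 0   # CPU seconds
            # Share of wall-clock time not accounted for by CPU time.
            frac = (real > 0) ? (real - cpu) / real : 0
            printf "%s real=%.3f cpu=%.3f non_cpu_fraction=%.2f\n", cmd, real, cpu, frac
        }
    ' "$1"
}

# Example usage (file names are placeholders):
#   bwa aln ref.fna reads.fastq > reads.sai 2> aln.log
#   summarize_bwa_times aln.log
```

Run on the first Mcnode1 log, this prints a single line giving real=824.956, cpu=8.711 and a non-CPU fraction of 0.99. For the 25000-read aln run on the Master node (105.888 sec real, 70.249 sec CPU) the fraction is about 0.34, so that run spent most of its wall-clock time on computation.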