Genomic Data Processing 21: BWA-SW with the ref split into chunks, each chunk indexed and then aligned (ref split into four segments, 250 reads)

1. Timing analysis

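When the ref is a single chromosome segment, the first alignment takes roughly 3 to 5 s (2.5 to 2.9 s in the session recorded below); aligning against chr1-4 combined takes about 20 s on the first run.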

After several consecutive runs, the single-chromosome alignment drops to about 1 s and the chr1-4 alignment to about 2 s.

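I don't understand why the first alignment takes longer while the following runs are faster. In the log below, the slow runs show real time far above CPU time (for example 20.819 s real against 3.195 s cpu), so most of the extra time is spent waiting rather than computing.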
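
The repeated runs can also be scripted so that the first (cold) run and the later (warm) runs are measured the same way. The following is only a minimal sketch: the helper name `time_runs` is hypothetical, it relies on the Unix-only `resource` module, and it measures wall-clock and child CPU time from the outside rather than reading bwa's own `[main] Real time` line.

```python
import resource
import subprocess
import sys
import time


def time_runs(cmd, n_runs, stdout_path=None):
    """Run cmd n_runs times; return a list of (real_seconds, cpu_seconds)."""
    results = []
    for _ in range(n_runs):
        # CPU time already accumulated by earlier (waited-for) child processes.
        before = resource.getrusage(resource.RUSAGE_CHILDREN)
        start = time.perf_counter()
        if stdout_path is None:
            subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        else:
            # Same effect as the shell redirection `> file` used in the log.
            with open(stdout_path, "wb") as out:
                subprocess.run(cmd, check=True, stdout=out)
        real = time.perf_counter() - start
        after = resource.getrusage(resource.RUSAGE_CHILDREN)
        cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
        results.append((real, cpu))
    return results


# Harmless demo command so the sketch runs anywhere; for the experiment in this
# post the command would be something like:
#   ["bwa", "bwasw", "GRCH38chr1L3556522.fna", "SRR003161h1000.fastq"]
# with stdout_path="SRR003161h1000chr1.sam".
demo = time_runs([sys.executable, "-c", "pass"], 3)
for i, (real, cpu) in enumerate(demo, start=1):
    print(f"run {i}: real={real:.3f}s cpu={cpu:.3f}s")
```

If the first entry shows a much larger real time than the rest while cpu stays similar, the difference is waiting time, not alignment work.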


Commands and output

hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 2.885 sec; cpu: 1.118 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.068 sec; cpu: 1.022 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.068 sec; cpu: 1.017 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.068 sec; cpu: 1.019 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 2.511 sec; cpu: 1.056 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 0.999 sec; cpu: 0.950 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.017 sec; cpu: 0.964 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.009 sec; cpu: 0.965 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.071 sec; cpu: 1.019 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.072 sec; cpu: 1.015 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.068 sec; cpu: 1.018 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000chr1.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.065 sec; cpu: 1.017 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000chr1.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.070 sec; cpu: 1.017 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq >SRR003161h1000chr1.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.050 sec; cpu: 1.009 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000chr2.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.017 sec; cpu: 0.969 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000chr2.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.015 sec; cpu: 0.969 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq >SRR003161h1000chr2.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr2L3459909.fna SRR003161h1000.fastq
[main] Real time: 1.023 sec; cpu: 0.966 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq >SRR003161h1000chr3.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq
[main] Real time: 0.940 sec; cpu: 0.885 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq >SRR003161h1000chr3.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq
[main] Real time: 0.933 sec; cpu: 0.888 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq >SRR003161h1000chr3.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr3L2832795.fna SRR003161h1000.fastq
[main] Real time: 0.915 sec; cpu: 0.872 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq >SRR003161h1000chr4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq
[main] Real time: 0.918 sec; cpu: 0.871 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq >SRR003161h1000chr4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq
[main] Real time: 0.919 sec; cpu: 0.868 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq >SRR003161h1000chr4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38chr4L2717352.fna SRR003161h1000.fastq
[main] Real time: 0.889 sec; cpu: 0.853 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 20.819 sec; cpu: 3.195 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 17.380 sec; cpu: 2.803 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 14.140 sec; cpu: 2.454 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 4.305 sec; cpu: 2.166 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.034 sec; cpu: 1.970 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.059 sec; cpu: 1.995 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.079 sec; cpu: 2.000 sec
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ rm SRR003161h1000chr1-4.sam 
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq >SRR003161h1000chr1-4.sam
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[bsw2_aln] read 250 sequences/pairs (161179 bp) ...
[main] Version: 0.7.13-r1126
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.046 sec; cpu: 1.997 sec
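
To make the pattern in the log above easier to see, the `[main] CMD` and `[main] Real time` lines can be tabulated and the gap between real and CPU time computed per run. This is a sketch, not part of the original experiment: `parse_runs` and `wait_time` are hypothetical helper names, and the sample embeds only four runs copied from the log. Reading the gap as disk I/O on a cold cache is an assumption; the log itself does not confirm it, and it would not explain why cpu time also falls from 3.195 s to about 2 s for chr1-4.

```python
import re

# Patterns for the two bwa log lines of interest.
CMD = re.compile(r"\[main\] CMD: bwa bwasw (\S+) ")
TIMING = re.compile(r"\[main\] Real time: ([\d.]+) sec; cpu: ([\d.]+) sec")


def parse_runs(log_text):
    """Return a list of (reference, real_s, cpu_s), one per bwa run, in log order."""
    runs = []
    ref = None
    for line in log_text.splitlines():
        m = CMD.search(line)
        if m:
            ref = m.group(1)  # remember which reference this run used
            continue
        m = TIMING.search(line)
        if m:
            runs.append((ref, float(m.group(1)), float(m.group(2))))
    return runs


def wait_time(run):
    """Real time minus CPU time: time the process spent not computing."""
    _, real, cpu = run
    return real - cpu


# Four runs copied from the log: first and second chr1 run, first and fifth chr1-4 run.
sample = """\
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 2.885 sec; cpu: 1.118 sec
[main] CMD: bwa bwasw GRCH38chr1L3556522.fna SRR003161h1000.fastq
[main] Real time: 1.068 sec; cpu: 1.022 sec
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 20.819 sec; cpu: 3.195 sec
[main] CMD: bwa bwasw GRCH38L12566578.fna SRR003161h1000.fastq
[main] Real time: 2.034 sec; cpu: 1.970 sec
"""

for run in parse_runs(sample):
    ref, real, cpu = run
    print(f"{ref}: real={real:.3f}s cpu={cpu:.3f}s wait={wait_time(run):.3f}s")
```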

2. Accuracy analysis:

hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr1.sam 
264 + 0 in total (QC-passed reads + QC-Failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
105 + 0 mapped (39.77% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr2.sam 
260 + 0 in total (QC-passed reads + QC-Failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
83 + 0 mapped (31.92% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr3.sam 
256 + 0 in total (QC-passed reads + QC-Failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
80 + 0 mapped (31.25% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr4.sam 
254 + 0 in total (QC-passed reads + QC-Failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
58 + 0 mapped (22.83% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub$ samtools flagstat SRR003161h1000chr1-4.sam 
264 + 0 in total (QC-passed reads + QC-Failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
146 + 0 mapped (55.30% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
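
The flagstat figures above can be compared across the four chunks and the combined chr1-4 run with a few lines of arithmetic. This is a sketch under stated assumptions: `parse_flagstat` and `mapped_rate` are hypothetical helpers, and the counts in `counts` are copied by hand from the output above. Note that flagstat counts SAM records, not reads, which is presumably why the totals (254 to 264) exceed the 250 input reads.

```python
import re


def parse_flagstat(text):
    """Extract (total_records, mapped_records) from samtools flagstat output."""
    total = int(re.search(r"^(\d+) \+ \d+ in total", text, re.M).group(1))
    mapped = int(re.search(r"^(\d+) \+ \d+ mapped", text, re.M).group(1))
    return total, mapped


def mapped_rate(total, mapped):
    """Mapped records as a percentage of all records."""
    return 100.0 * mapped / total


# First lines of the chr1 flagstat output, copied from above.
sample_flagstat = """\
264 + 0 in total (QC-passed reads + QC-Failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
105 + 0 mapped (39.77% : N/A)
"""

# (total, mapped) per output file, copied by hand from the flagstat runs above.
counts = {
    "chr1": (264, 105),
    "chr2": (260, 83),
    "chr3": (256, 80),
    "chr4": (254, 58),
    "chr1-4": (264, 146),
}

for name, (total, mapped) in counts.items():
    print(f"{name}: {mapped}/{total} mapped ({mapped_rate(total, mapped):.2f}%)")

# Mapped records summed over the four separate chunks.
chunk_sum = sum(mapped for name, (_, mapped) in counts.items() if name != "chr1-4")
print("sum over chunks:", chunk_sum, "vs combined run:", counts["chr1-4"][1])
```

The per-chunk mapped counts add up to more than both the 250 input reads and the 146 records mapped in the combined run, which suggests that many reads find a hit in more than one chunk. Merging chunked results would therefore need a per-read choice among candidate hits rather than a simple concatenation.
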
3. The alignment result files are too long, so they are not pasted here.
