Comparing BAM, BEDPE and tagAlign
Note: positions printed by `samtools view` (SAM/BAM) are 1-based; BED coordinates are 0-based and half-open (start included, end excluded).
BAM
cmd:
#samtools view xxx.trim.PE2SE.bam | head -n1
samtools view Mouse_brain_Islet_R1.trim.PE2SE.bam chr14:22142572-22142725 | grep "7001113:845:HYH22BCXY:1:1101:10000:44312"
Output:
7001113:845:HYH22BCXY:1:1101:10000:44312 163 chr14 22142573 44 50M = 22142676 153 GTCTTTTCCTTGGAAGGAAAAGATGTAATAATCTCAGTTTTGGATAAAAT \
DDDDDIIIIIIIIIIIIIIIGIIIIIIIIIIIIIIIIIIGIIIIHIIIII AS:i:100 XS:i:42 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:50 YS:i:100 YT:Z:CP
7001113:845:HYH22BCXY:1:1101:10000:44312 83 chr14 22142676 44 50M = 22142573 -153 CCCAGCATCTACATTACAGACTTCAATGAAGGAAGTAAAAATATCTCAAT \
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDDDDD AS:i:100 XS:i:44 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:50 YS:i:100 YT:Z:CP
For the comparison with the BED-style files below, the relevant fields start at column 3 (RNAME).
Col | Field | Type | Brief Description |
---|---|---|---|
1 | QNAME | String | Query template NAME |
2 | FLAG | Int | bitwise FLAG |
3 | RNAME | String | Reference sequence NAME |
4 | POS | Int | 1-based leftmost mapping POSition |
5 | MAPQ | Int | MAPping Quality |
6 | CIGAR | String | CIGAR String |
7 | RNEXT | String | Ref. name of the mate/next read |
8 | PNEXT | Int | Position of the mate/next read |
9 | TLEN | Int | observed Template LENgth |
10 | SEQ | String | segment SEQuence |
11 | QUAL | String | ASCII of Phred-scaled base QUALity+33 |
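The FLAG column is a bit field, so the strand and mate number of each read above can be read off with bit tests. A minimal sketch, assuming the standard SAM bit meanings (16 = read on reverse strand, 64 = first in pair, 128 = second in pair); the function names `flag_strand` and `flag_mate` are made up for this note:

```shell
# Print the strand encoded in a SAM FLAG (bit 16 set means reverse strand).
flag_strand() {
    if [ $(( $1 & 16 )) -ne 0 ]; then echo "-"; else echo "+"; fi
}

# Print which mate the read is (bit 64 = first in pair, bit 128 = second in pair).
flag_mate() {
    if [ $(( $1 & 64 )) -ne 0 ]; then echo "read1"
    elif [ $(( $1 & 128 )) -ne 0 ]; then echo "read2"
    else echo "unpaired"
    fi
}

# The two flags from the BAM output above.
echo "163: $(flag_strand 163) $(flag_mate 163)"
echo "83: $(flag_strand 83) $(flag_mate 83)"
```

For the example pair this gives `+ read2` for FLAG 163 and `- read1` for FLAG 83, which matches the `- +` strand columns in the BED-style output further down.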
BEDPE
cmd:
zcat xxx.trim.PE2SE.nodup.bedpe.gz | head -n1
Output:
chr14 22142675 22142725 chr14 22142572 22142622 7001113:845:HYH22BCXY:1:1101:10000:44312 44 - +
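The coordinates here follow from the BAM record by the usual 1-based to 0-based, half-open conversion: start = POS - 1, end = start + aligned length. A small sketch to check the numbers, assuming a plain `50M` CIGAR so the aligned reference length is 50; `sam_to_bed` is a hypothetical helper, not part of any tool:

```shell
# Convert a 1-based SAM POS plus the aligned reference length into a
# 0-based, half-open BED interval (tab-separated start and end).
sam_to_bed() {
    pos=$1   # 1-based leftmost mapping position (SAM column 4)
    len=$2   # reference bases covered by the alignment (50 for CIGAR 50M)
    printf '%d\t%d\n' "$(( pos - 1 ))" "$(( pos - 1 + len ))"
}

sam_to_bed 22142573 50   # forward read of the example pair
sam_to_bed 22142676 50   # reverse read of the example pair
```

The first call yields 22142572 and 22142622, the second 22142675 and 22142725, i.e. the two intervals in the line above.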
tagAlign
cmd:
zcat xxx.trim.PE2SE.nodup.tagAlign.gz | head -n2
Output:
chr14 22142675 22142725 N 1000 -
chr14 22142572 22142622 N 1000 +
- Column 5 is the score: 1000/alignmentCount
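Each BEDPE line corresponds to two tagAlign lines, one per mate. The sketch below shows one way to do that split with awk; it is an illustration, not necessarily the command that produced the files above, and it assumes the name column is set to `N` and the score is fixed at 1000 as in the output shown. `bedpe_to_tagalign` is a hypothetical name:

```shell
# Split each BEDPE record (read on stdin) into two tagAlign records.
# BEDPE columns used: 1-3 = first mate interval, 4-6 = second mate interval,
# 9 = strand of first mate, 10 = strand of second mate.
bedpe_to_tagalign() {
    awk 'BEGIN { OFS = "\t" }
         {
             print $1, $2, $3, "N", 1000, $9
             print $4, $5, $6, "N", 1000, $10
         }'
}

# The BEDPE line from the example above.
printf 'chr14\t22142675\t22142725\tchr14\t22142572\t22142622\t7001113:845:HYH22BCXY:1:1101:10000:44312\t44\t-\t+\n' \
    | bedpe_to_tagalign
```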
tagAlign - Tn5 shifted
cmd:
zcat xxx.trim.PE2SE.nodup.tagAlign.gz | head -n2
Output:
chr14 22142675 22142720 N 1000 -
chr14 22142576 22142622 N 1000 +
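Comparing this with the unshifted tagAlign output, the + strand start moves by +4 (22142572 to 22142576) and the - strand end moves by -5 (22142725 to 22142720). A minimal awk sketch that reproduces this shift, assuming tab-separated six-column tagAlign input; `tn5_shift` is a made-up name and this may differ from the command actually used:

```shell
# Apply the Tn5 offset to tagAlign records read on stdin:
# + strand: start + 4; - strand: end - 5.
tn5_shift() {
    awk 'BEGIN { FS = OFS = "\t" }
         {
             if ($6 == "+") { $2 = $2 + 4 }
             else if ($6 == "-") { $3 = $3 - 5 }
             print
         }'
}

# The two unshifted tagAlign lines from the previous section.
printf 'chr14\t22142675\t22142725\tN\t1000\t-\nchr14\t22142572\t22142622\tN\t1000\t+\n' \
    | tn5_shift
```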