tracking the number of removed duplicate reads in a SAM tag - je markdupes removeduplicates
Dear all, thanks for the great je suite.
When using
je markdupes
with the option
REMOVE_DUPLICATES=True
it would be great for further analysis of read duplication if the number of duplicates were stored in the BAM file as a tag.
Currently, we use the following tag to count read occurrences:
"YB:<number of reads>"
to deal with that. YB would be 1 if the read is not duplicated, and 2, 3, 4, ... if it had duplicates.
This would enable users to compute statistics on which regions are duplicated most and, for certain sequencing types, which RNA types are duplicated by the protocol.
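To illustrate the kind of downstream analysis we have in mind, here is a minimal sketch. It is not part of je. It assumes the count is written as a standard integer SAM tag of the form YB:i:<n>, and that a read without the tag counts as 1. The function names and the bin_size parameter are hypothetical and chosen only for this example. It reads SAM text lines (for example from samtools view -h) and sums YB per reference and fixed-size bin.

```python
from collections import defaultdict

def yb_count(sam_line):
    """Return the YB value of one SAM record, or 1 if the tag is absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start at column 12 (index 11) as TAG:TYPE:VALUE.
    # Assumption: the duplicate count is stored as an integer tag, YB:i:<n>.
    for tag in fields[11:]:
        if tag.startswith("YB:i:"):
            return int(tag[5:])
    return 1

def reads_per_bin(sam_lines, bin_size=1000):
    """Sum YB per (reference name, bin start) over mapped records."""
    totals = defaultdict(int)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.split("\t")
        rname, pos = fields[2], int(fields[3])
        if rname == "*":
            continue  # skip unmapped reads
        # POS is 1-based; bins are reported by their 0-based start coordinate.
        bin_start = (pos - 1) // bin_size * bin_size
        totals[(rname, bin_start)] += yb_count(line)
    return dict(totals)

# Made-up records, for illustration only.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t0\tchr1\t101\t60\t4M\t*\t0\t0\tACGT\tIIII\tYB:i:3",
    "r2\t0\tchr1\t250\t60\t4M\t*\t0\t0\tACGT\tIIII\tYB:i:1",
    "r3\t0\tchr2\t1500\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
print(reads_per_bin(example))
```

Replacing the fixed-size bins with gene or transcript annotations would give the per-RNA-type statistics mentioned above.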
Thank you very much.