Collapse fastq(.gz) files¶
Definition
Normally quality values are lost in small RNA-seq pipelines due to collapsing after adapter recognition. This option allow to collapse reads after adapter removal with cutadapt
or any other tool. This way the mapping can use quality values, allowing to map using bwa
for instance, or any other alignment tool that doesn’t support FASTA files.
Methods
The new quality values are the average of each of the sequence collapse.
Example
seqcluster collapse -f sample_trimmed.fastq -o collapse
-f
is the fastq(.gz) file-o
the folder where the outout will be created. A new FASTQ file, where the name stand for:@seq_[0-9]_x[0-9]
The number right after _x
means the abundance of this sequence in the sample