Final thoughts on short-read trimming¶
A summary of recommendations¶
- Run FastQC before trimming; trim; look again.
- Always trim paired sequences together.
- Always adapter trim!
- Impose a length filter, 50.
- for quantification (RNAseq), trim lightly
- for RNAseq assembly, trim lightly
- for variant calling, trim stringently
- use the same trimming parameters on all your data unless you have a VERY good reason otherwise!
- ignore the first 10 bp composition bias in RNAseq;
- ignore sequence duplication levels in high-coverage RNAseq;
- look at your read positional bias with mapping (or de novo) as well;
Some references¶
MacManes, 2014, http://journal.frontiersin.org/article/10.3389/fgene.2014.00013/full - recommends gentle trimming for RNAseq.
Williams et al., 2015, http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-016-0956-2 - recommends imposing a length filter.
Mbandi et al., 2014, http://journal.frontiersin.org/article/10.3389/fgene.2014.00017/full - complicated, but start with gentle trimming.
Del Fabbro et al., 2013, http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0085024 - evaluation across data sets.
LICENSE: This documentation and all textual/graphic site content is licensed under the Creative Commons - 0 License (CC0) -- fork @ github. Presentations (PPT/PDF) and PDFs are the property of their respective owners and are under the terms indicated within the presentation.