Trimming reads with Trimmomatic¶
Note: the trimmomatic manual is the go-to-guide for parameters and commands.
E. coli trimming¶
Trim the E. coli data lightly:
cd ~/notebooks
cd ecoli
java -jar /home/Trimmomatic-0.36/trimmomatic-0.36.jar PE \
ecoli-R1.fq.gz ecoli-R2.fq.gz \
ecoli-R1-pe.fq ecoli-R1-orphans.fq ecoli-R2-pe.fq ecoli-R2-orphans.fq \
ILLUMINACLIP:/home/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:2:40:15 \
LEADING:2 TRAILING:2 \
SLIDINGWINDOW:4:2 \
MINLEN:50
You should see:
Input Read Pairs: 100000 Both Surviving: 99802 (99.80%) Forward Only Surviving: 186 (0.19%) Reverse Only Surviving: 12 (0.01%) Dropped: 0 (0.00%)
Trim the E. coli data stringently:
java -jar /home/Trimmomatic-0.36/trimmomatic-0.36.jar PE \
ecoli-R1.fq.gz ecoli-R2.fq.gz \
ecoli-R1-pe.fq ecoli-R1-orphans.fq ecoli-R2-pe.fq ecoli-R2-orphans.fq \
ILLUMINACLIP:/home/Trimmomatic-0.36/adapters/TruSeq3-PE.fa:2:40:15 \
LEADING:2 TRAILING:2 \
SLIDINGWINDOW:4:20 \
MINLEN:50
You should see:
Input Read Pairs: 100000 Both Surviving: 83439 (83.44%) Forward Only Surviving: 8960 (8.96%) Reverse Only Surviving: 4821 (4.82%) Dropped: 2780 (2.78%)
You can use khmer’s readstats.py to evaluate the loss of sequence –
readstats.py ecoli-R?.fq.gz
readstats.py ecoli-R?-*.fq
Now, run fastqc on the trimmed data:
fastqc ecoli-R1-pe.fq
fastqc ecoli-R2-pe.fq
Trimming the yeast RNAseq data¶
Next, try building your own commands to trim the yeast RNAseq data - stringently, and lightly.
Next: Evaluating read mismatch statistics: mapping and de novo
LICENSE: This documentation and all textual/graphic site content is licensed under the Creative Commons - 0 License (CC0) -- fork @ github. Presentations (PPT/PDF) and PDFs are the property of their respective owners and are under the terms indicated within the presentation.