Day 6: Resequencing

Welcome to Day 6 of the Short-Read Sequencing Analysis Workshop. The videos for Day 6 cover a resequencing pipeline, including variant detection methods. The two videos show you how to work with both single-end and paired-end reads, and with both artificial/model data and real Illumina sequencing data.

Day 6 Videos

Notes for Video1 and Video2 (2015) Please read through these notes regarding Video1 and Video2. The data you will be using for both videos has changed location; the current file paths can be found here. Additionally, Video1 demonstrates how to run programs on the head node without using PBS scripts, as well as why (and when) this can be useful.

Video1: This video will use an artificial FASTQ file created by splitting the yeast Sigma 1278b genome into 50-base reads, then map those reads back to the yeast S288C genome to identify variants. At the beginning you will be introduced to an organization structure for these types of datasets. This pipeline will use Bowtie to map the reads to the reference genome, Samtools to convert the output formats, and GATK Unified Genotyper to call variants, including SNPs and indels. Video1 Slides
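To make the idea of an artificial FASTQ file concrete, here is a minimal Python sketch of splitting a sequence into fixed-length reads. It is not the script used to build the Video1 data. The function names, the non-overlapping step of 50, the read naming scheme, and the constant quality character `I` (Phred 40 in Phred+33 encoding) are all assumptions made for illustration.

```python
def split_into_reads(sequence, read_length=50, step=50):
    """Yield (start, read) for every full-length window of the sequence.

    Assumption: windows shorter than read_length at the end are dropped.
    """
    for start in range(0, len(sequence) - read_length + 1, step):
        yield start, sequence[start:start + read_length]


def to_fastq(name, sequence, read_length=50, step=50, quality_char="I"):
    """Return FASTQ text with one four-line record per artificial read.

    Assumption: every base gets the same placeholder quality character,
    since artificial reads have no real base-call qualities.
    """
    records = []
    for start, read in split_into_reads(sequence, read_length, step):
        records.append(f"@{name}_{start}\n{read}\n+\n{quality_char * len(read)}\n")
    return "".join(records)


# Toy 120-base "genome" (hypothetical data, not a real yeast sequence).
toy = "ACGT" * 30
fastq = to_fastq("toy_chr", toy)
print(fastq)
```

With a step equal to the read length, the reads tile the sequence without overlap; a smaller step would give overlapping reads and higher coverage.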

Video2: This video will use a real paired-end sequencing data set (2x150 reads) of beer yeast and map it to a reference genome using BWA. Starting with FastQC, the reads will be analyzed for quality and trimmed as necessary, then mapped with BWA. PCR duplicates will then be removed with Samtools, reads realigned around indels using GATK, and finally SNPs and indels called using GATK Unified Genotyper. Video2 Slides
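The trimming step can be pictured with a short Python sketch of 3' quality trimming. The description above does not name the trimming tool or its settings, so this is only an illustration of the general idea: the function names, the Phred+33 offset, and the threshold of 20 are assumptions, and real trimmers typically use more elaborate rules (for example sliding windows).

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in quality_string]


def trim_3prime(sequence, quality_string, min_quality=20, offset=33):
    """Remove trailing bases whose quality is below min_quality.

    Assumption: trimming stops at the first base from the 3' end that
    meets the threshold; bases before it are kept regardless of quality.
    """
    scores = phred_scores(quality_string, offset)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


# Hypothetical read: five high-quality bases ('I' = Q40), three low ('#' = Q2).
seq, qual = trim_3prime("ACGTACGT", "IIIII###")
print(seq, qual)  # prints: ACGTA IIIII
```

For paired-end data, both mates of a pair need to be kept in sync after trimming, which this single-read sketch does not handle.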

Day 6 Files

Day 6 In-class slides (available by the start of class)

Additional Resources

GATK Guide This guide provides information about all of the available GATK tools, as well as links to the GATK "Best Practices" for DNA and RNA sequencing analysis

GATK IndelRealigner documentation Documentation for the IndelRealigner tool in the GATK toolkit

GATK UnifiedGenotyper documentation Documentation for the UnifiedGenotyper tool in the GATK toolkit

