
SACNAS_RNAseq_Workshop_2023

Audience | Computational skills required | Duration
SACNAS Attendees | None | 75 minute workshop

Description

This repository contains learning materials for a 75-minute, hands-on workshop: Introduction to RNA-Seq analysis with R/RStudio. R is a simple programming environment that handles data effectively and provides excellent graphical support. RStudio is a tool that provides a user-friendly environment for working with R.

These materials provide a general overview of RNA-Seq data analysis, starting from processed counts files.

Learning Objectives

Contents

Time Topic
~10 mins Module 1: RNAseq experimental setup and considerations
~10 mins Module 2: Post sequencing processing steps
~40 mins Module 3: Hands-on portion of workshop
~15 mins Questions from attendees

Workshop Slides

The slides presented in Modules 1 and 2 can be found here.

Dataset

Download the R project and data for this workshop here. Decompress and move the folder to the location on your computer where you would like to perform the analysis.

Installation Requirements

Download R and RStudio for your laptop:

Install the required R packages by running the following code in RStudio:

# Install CRAN packages
install.packages(c("BiocManager", "RColorBrewer", "tidyverse", "devtools", "pheatmap"))

# Install Bioconductor packages
BiocManager::install(c("clusterProfiler", "DESeq2", "org.Hs.eg.db", "EnhancedVolcano", "biomaRt", "enrichplot"))

Load the libraries to make sure the packages installed properly:

library(DESeq2) 
library(RColorBrewer)
library(pheatmap)
library(ggplot2)
library(EnhancedVolcano)
library(biomaRt)
library(clusterProfiler)
library(org.Hs.eg.db)
library(enrichplot)
library(tidyverse)
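
If one of the `library()` calls above fails, it can help to see every missing package at once rather than stopping at the first error. The following is a minimal sketch using only base R; the package list is copied from the `library()` calls above, and the variable names (`pkgs`, `installed`, `missing_pkgs`) are chosen here for illustration.

```r
# Packages loaded in this workshop (copied from the library() calls above)
pkgs <- c("DESeq2", "RColorBrewer", "pheatmap", "ggplot2", "EnhancedVolcano",
          "biomaRt", "clusterProfiler", "org.Hs.eg.db", "enrichplot", "tidyverse")

# requireNamespace() returns TRUE or FALSE without attaching the package,
# so one missing package does not stop the check for the others
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Report which packages, if any, still need to be installed
missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) == 0) {
  message("All workshop packages are installed.")
} else {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
```

Any package reported as missing can be installed with the matching `install.packages()` or `BiocManager::install()` call from the installation step.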

NOTE: The library used for gene annotations (here, org.Hs.eg.db) changes with the organism. For example, if you are studying mouse, you would need to install and load org.Mm.eg.db instead. The list of organism packages is given here.
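
As an illustration of the note above, the sketch below picks the annotation package by organism. The helper `orgdb_for()` is hypothetical, written only for this example, and it covers just the two packages named in this README (human and mouse); other organisms would need their own entries.

```r
# Hypothetical helper: map an organism label to its annotation package.
# Only the two packages mentioned in this README are listed.
orgdb_for <- function(organism) {
  switch(organism,
         human = "org.Hs.eg.db",
         mouse = "org.Mm.eg.db",
         stop("No annotation package listed here for: ", organism))
}

annotation_pkg <- orgdb_for("mouse")

# Load the package if it is installed; otherwise show the install command
if (requireNamespace(annotation_pkg, quietly = TRUE)) {
  library(annotation_pkg, character.only = TRUE)
} else {
  message("Install it first with: BiocManager::install(\"", annotation_pkg, "\")")
}
```

`library()` needs `character.only = TRUE` here because the package name is held in a variable rather than typed directly.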

Additional Resources

For an overview of bioinformatics, the tools required for RNA-seq analysis, and high performance computing, see these tutorials (the HPC parts will vary depending on your local cluster):

Bioinformatics Training
High Performance Computing
RNA-seq analysis
Informatics Technology for Cancer Research Training Network Courses
R for Data Science

Need help with Unix?

Unix Cheat Sheet
Vim - command line text editor
Common commands

Need help with R/RStudio?

R/RStudio

R for Beginners

Stack Overflow

dplyr Cheat Sheet

ggplot2 Cheat Sheet

Multiple RNAseq comparisons / DESeq2:

Differential Expression Analysis
Overview

Non-model organisms:

Full-length transcriptome assembly from RNA-seq data without a reference genome

FASTQC and multiQC

FastQC video

Introduction to Nextflow and workflow management:

Nextflow video
Nextflow Documentation

Here are some resources for publicly available gene expression data: