Nicolas Descostes committed
# Analyzing ChIP-Seq data with Galaxy

This is the material for the Galaxy workshop on ChIP-seq data analysis at EMBL Rome. This is the first time this workshop is offered. We will start with a two-day time frame and adjust as needed.

All materials used in this workshop are on the [Galaxy training website](https://galaxyproject.github.io/training-material/).


## What is Galaxy?

**Slides**: [Introduction to Galaxy](https://galaxyproject.github.io/training-material/topics/introduction/slides/introduction.html#1).

## Galaxy at EMBL

EMBL has its own instance of Galaxy at [https://galaxy.embl.de/](https://galaxy.embl.de/).

=> Demo of data import at EMBL.

## Introduction to Genomics and Galaxy

This practical aims to familiarize you with the Galaxy user interface. It will teach you how to perform basic tasks such as importing data, running tools, working with histories, creating workflows, and sharing your work.

**For this workshop of September 2019, ignore the "Log in to Galaxy" section: we are going to use the EMBL instance. Also ignore the last part about repeating the analysis with a workflow.**

**Tutorial**: [Introduction to Genomics and Galaxy](https://galaxyproject.github.io/training-material/topics/introduction/tutorials/galaxy-intro-strands/tutorial.html).

**Differences from the tutorial:**
  * In UCSC, select "table: Comprehensive (wgEncodeGencodeCompV24)" instead of "table: known genes".
  * When editing the name of the dataset, the button is not "Save attributes" but "Save".
  * To split the sequences, search for the term "filter" instead of "split".
  * To intersect the data, use "Intersect intervals" in the "Bedtools" section.
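
To make the idea of intersecting intervals concrete, here is a minimal Python sketch of the concept. It is not how the Bedtools tool is implemented: it uses a simple quadratic scan, and the names `Interval`, `overlaps`, and `intersect` as well as the example coordinates are hypothetical. It assumes BED-style coordinates (0-based start, exclusive end).

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive (BED convention)


def overlaps(a: Interval, b: Interval) -> bool:
    """Two intervals overlap if they share a chromosome and at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def intersect(a_intervals: List[Interval], b_intervals: List[Interval]) -> List[Interval]:
    """Keep every interval of A that overlaps at least one interval of B."""
    return [a for a in a_intervals if any(overlaps(a, b) for b in b_intervals)]


# Made-up coordinates, for illustration only.
exons = [Interval("chr22", 100, 200), Interval("chr22", 500, 600)]
features = [Interval("chr22", 150, 160), Interval("chr21", 500, 600)]

kept = intersect(exons, features)
print(kept)
```

Only the first exon is kept: the second one shares coordinates with a feature, but on a different chromosome.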

## Manipulating your first genomic data

In this section, we will look at the practical aspects of manipulating next-generation sequencing data. We will start with the FASTQ format, which is produced by most sequencing machines, and finish with the SAM/BAM format, which represents mapped reads.

**Tutorial**: [Manipulating NGS data with Galaxy](https://galaxyproject.org/tutorials/ngs/).
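
As a quick illustration of what a FASTQ file contains, the following Python sketch reads records of four lines each (header, sequence, separator, quality string) and decodes the quality string into Phred scores. It is a simplified sketch and not part of the workshop material: the names `FastqRecord`, `read_fastq`, and `phred_scores` are hypothetical, the read is made up, and the code assumes unwrapped four-line records with the Phred+33 encoding.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream, assuming four lines per record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # separator line starting with '+'
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed record: {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Convert a quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality]


# A made-up read, for illustration only.
example = io.StringIO("@read1\nGATTACA\n+\nIIIIII#\n")
records = list(read_fastq(example))
for record in records:
    print(record.name, record.sequence, phred_scores(record.quality))
```

In this encoding, the character `I` corresponds to a score of 40 and `#` to a score of 2, so the last base of the example read is of low quality.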

## What is ChIP-Seq?

**Slides**: [ChIP-seq data analysis](https://galaxyproject.github.io/training-material/topics/epigenetics/tutorials/formation_of_super-structures_on_xi/slides.html#1)

## Process and analyze ChIP-Seq data

In the upcoming tutorial, we will use wild-type data from Wang et al. 2018 and analyze the ChIP-seq data step by step:

  * CTCF with 2 replicates: wt_CTCF_rep1 and wt_CTCF_rep2
  * H3K4me3 with 2 replicates: wt_H3K4me3_rep1 and wt_H3K4me3_rep2
  * H3K27me3 with 2 replicates: wt_H3K27me3_rep1 and wt_H3K27me3_rep2
  * ‘input’ with 2 replicates: wt_input_rep1 and wt_input_rep2
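
The sample names above follow a `genotype_target_replicate` pattern. As a small sketch of how this naming scheme can be used to group replicates, the Python snippet below splits each name on underscores. The function name `group_replicates` is hypothetical, and the snippet only illustrates the naming convention; it is not a step of the tutorial or of Galaxy itself.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

sample_names = [
    "wt_CTCF_rep1", "wt_CTCF_rep2",
    "wt_H3K4me3_rep1", "wt_H3K4me3_rep2",
    "wt_H3K27me3_rep1", "wt_H3K27me3_rep2",
    "wt_input_rep1", "wt_input_rep2",
]


def group_replicates(names: List[str]) -> Dict[Tuple[str, str], List[str]]:
    """Group sample names by (genotype, target), assuming 'genotype_target_repN'."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for name in names:
        genotype, target, _replicate = name.split("_")
        groups[(genotype, target)].append(name)
    return dict(groups)


groups = group_replicates(sample_names)
for (genotype, target), replicates in groups.items():
    print(genotype, target, replicates)
```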

**Tutorial**: [Formation of the Super-Structures on the Inactive X](https://galaxyproject.github.io/training-material/topics/epigenetics/tutorials/formation_of_super-structures_on_xi/tutorial.html)

## If time allows

If we have enough time, I will introduce collections, and we will build a full ChIP-seq workflow using this data structure.

**Tutorial**: [Processing many samples at once with collections](https://galaxyproject.org/tutorials/collections/)