Commit f7fa6c66 authored by Sascha Meiers

Require external VCF to be zipped and indexed

parent b56d81f0
@@ -15,17 +15,17 @@ Preliminary SV calling using Strand-seq data - summarized in a [Snakemake](https
 source("https://bioconductor.org/biocLite.R")
 biocLite('BSgenome.Hsapiens.UCSC.hg38')
 ```
-* [Strand-Phaser](https://github.com/daewoooo/StrandPhaseR) is installed automagically
+* [Strand-Phaser](https://github.com/daewoooo/StrandPhaseR) is installed automatically
 2. **Set up the configuration of the smakemake pipeline**
 * Open `Snake.config.json` and specify the path to the executatables
 (such as Mosaicatcher) and to the R scripts.
 * Create a subdirectory `bam/` and another subdirectory per sample (e.g.
-`bam/NA12878`). **Multiple samples can be run together not**.
+`bam/NA12878/`). **Multiple samples can be run together not**.
 Then copy (or soft-link) the Strand-seq single-cell libraries (one BAM
-file per cell) in there. Note that bam files need to contain a read group
-and should have duplicates marked.
+file per cell) in there. Note that bam files need to be sorted and indexed,
+contain a read group and should have duplicates marked.
 3. **Run Snakemake**
@@ -37,3 +37,16 @@ Preliminary SV calling using Strand-seq data - summarized in a [Snakemake](https
 --cluster-config Snake.cluster.json \
 --cluster "???"
 ```
+### SNV calls
+The pipeline will run simple SNV calling using [samtools](https://github.com/samtools/samtools)
+and [bcftools](https://github.com/samtools/bcftools). If you **already have
+SNV calls**, you can avoid that by entering your VCF files into the pipeline.
+To so, make sure the files are [tabix](https://github.com/samtools/tabix)-indexed
+and specifigy them inside the `Snake.config.json` file:
+```
+"snv_calls" : {
+"NA12878" : "path/to/snp/calls.vcf.gz"
+},
+```
@@ -422,7 +422,8 @@ rule merge_SNV_calls:
 rule split_external_snv_calls:
     input:
-        vcf = lambda wc: config["snv_calls"][wc.sample]
+        vcf = lambda wc: config["snv_calls"][wc.sample],
+        tbi = lambda wc: config["snv_calls"][wc.sample] + ".tbi"
     output:
         vcf = "external_snv_calls/{sample}/{chrom}.vcf"
     log: "log/{sample}/external_snv_calls.{chrom}.vcf.log"
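
Note on the new `tbi` input: because the rule now lists `<vcf path>.tbi` as an input, Snakemake will presumably refuse to run it when the index is missing. A minimal sketch of a pre-flight check is given below, assuming the config layout shown in the README hunk. The helper name `missing_snv_inputs` is hypothetical and not part of the pipeline, and the `.gz` suffix test is only a heuristic for "compressed", not a real bgzip check.

```python
import json
import os


def missing_snv_inputs(config):
    """Return a list of problems with the external SNV calls in `config`.

    Mirrors the two input functions of `split_external_snv_calls`:
    the VCF path itself and the same path with ".tbi" appended.
    (Hypothetical helper, not part of the pipeline.)
    """
    problems = []
    for sample, vcf in config.get("snv_calls", {}).items():
        # Heuristic only: a tabix-indexed VCF is normally bgzip-compressed.
        if not vcf.endswith(".gz"):
            problems.append(f"{sample}: {vcf} does not look compressed (.gz)")
        # Both files must exist for the rule's inputs to be satisfied.
        for path in (vcf, vcf + ".tbi"):
            if not os.path.isfile(path):
                problems.append(f"{sample}: missing {path}")
    return problems


if __name__ == "__main__":
    # Assumes the script is started next to Snake.config.json.
    with open("Snake.config.json") as handle:
        for problem in missing_snv_inputs(json.load(handle)):
            print(problem)
```

Running the script before Snakemake prints one line per missing or suspicious file and nothing when all external calls look usable.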