Commit a3674516 authored by Christian Arnold

Minor documentation changes and code cleanup

parent 2c7e10e7
......@@ -12,7 +12,7 @@ Transcription factor (TF) activity constitutes an important readout of cellular
For more information, please check our paper on biorxiv:
[Quantification of differential transcription factor activity and multiomic-based classification into activators and repressors: diffTF](https://www.biorxiv.org/content/early/2018/07/13/368498).
[Quantification of differential transcription factor activity and multiomic-based classification into activators and repressors: diffTF](https://www.biorxiv.org/content/early/2018/12/01/368498).
Documentation
-------------
......@@ -33,4 +33,4 @@ Citation
--------
*Please cite the following article if you use diffTF in your research*:
Ivan Berest*, Christian Arnold*, Armando Reyes-Palomares, Giovanni Palla, Kasper Dindler Rassmussen, Kristian Helin & Judith B. Zaugg. Quantification of differential transcription factor activity and multiomic-based classification into activators and repressors: diffTF. 2018. *Molecular Systems Biology*. in review.
Ivan Berest*, Christian Arnold*, Armando Reyes-Palomares, Giovanni Palla, Kasper Dindler Rassmussen, Kristian Helin & Judith B. Zaugg. Quantification of differential transcription factor activity and multiomic-based classification into activators and repressors: diffTF. 2018. in review.
......@@ -40,8 +40,8 @@ We now show which rules are executed by *Snakemake* for a specific example (see
diffTF is implemented as a *Snakemake* pipeline. For a gentle introduction about *Snakemake*, see Section :ref:`workingWithPipeline`. As you can see, the workflow consists of the following steps or *rules*:
- ``checkParameterValidity``: R script that checks whether the specified peak file has the correct format, whether the provided *fasta* file and the *BAM* files are compatible, and other checks
- ``produceConsensusPeaks``: R script that generates the consensus peaks if none are provided
- ``filterSexChromosomesAndSortPeaks``: Filters various chromosomes 8sex, unassembled ones, contigs, etc) from the peak file.
- ``produceConsensusPeaks``: R script that generates the consensus peaks using the R package ``DiffBind`` if none are provided
- ``filterSexChromosomesAndSortPeaks``: Filters various chromosomes (sex, unassembled ones, contigs, etc) from the peak file.
- ``sortTFBSParallel``: Sort the TFBS lists by position
- ``resortBAM``: Sort the *BAM* file for optimized processing (only run if data are paired-end)
- ``intersectPeaksAndBAM``: Count all reads for peak regions across all input files
......@@ -347,7 +347,7 @@ Summary
String. Default "" (empty). Path to the consensus peak file.
Details
If set to the empty string "", the pipeline will generate a consensus peaks out of the peak files from each individual sample. For this, you need to provide the following two things:
If set to the empty string "", the pipeline will generate a consensus peaks out of the peak files from each individual sample using the R package ``DiffBind``. For this, you need to provide the following two things:
- a peak file for each sample in the metadata file in the column *peaks*, see the section :ref:`section_metadata` for details.
- The format of the peak files, as specified in ``peakType`` (:ref:`parameter_peakType`)
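The two requirements above can be sketched as a small pre-flight check. This is an illustrative sketch only, not code from the diffTF pipeline: the function name ``check_consensus_peak_inputs``, the dictionary representation of metadata rows, and the ``sampleID`` key are assumptions made for the example. Only the *peaks* column and the ``peakType`` parameter come from the documentation above.

```python
def check_consensus_peak_inputs(consensus_peaks, sample_rows, peak_type):
    """Return a list of problems; an empty list means the inputs look usable.

    consensus_peaks: value of the consensus peak parameter ("" means "generate one")
    sample_rows:     list of dicts, one per metadata row (hypothetical representation)
    peak_type:       value of the peakType parameter
    """
    problems = []

    # A consensus peak file was provided, so per-sample peak files are not needed.
    if consensus_peaks != "":
        return problems

    # Requirement 1: every sample needs an entry in the "peaks" column.
    for row in sample_rows:
        if not row.get("peaks"):
            sample = row.get("sampleID", "?")  # "sampleID" is an assumed column name
            problems.append("sample %s has no entry in column 'peaks'" % sample)

    # Requirement 2: the format of the peak files must be specified.
    if not peak_type:
        problems.append("peakType must be set when no consensus peak file is given")

    return problems
```

Calling ``check_consensus_peak_inputs("", rows, "narrow")`` with a row whose *peaks* entry is empty would report that sample, while passing a non-empty consensus peak path skips both checks.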
......
......@@ -134,7 +134,6 @@ for sectionCur in configDict:
#############################
# Maximum number of cores per rule.
# For local computation, the minimum of this value and the --cores parameter will define the number of CPUs per rule,
# while in a cluster setting, the minimum of this value and the number of cores on the node the job runs on is used.
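The rule described in the comment above amounts to taking a minimum. A minimal sketch, assuming a hypothetical helper ``effective_cores`` that is not part of the Snakefile:

```python
def effective_cores(max_cores_per_rule, available_cores):
    """Number of CPUs a rule would get under the rule described above.

    max_cores_per_rule: the configured maximum number of cores per rule
    available_cores:    the --cores value for local runs, or the number of
                        cores on the node the job runs on in a cluster setting
    """
    return min(max_cores_per_rule, available_cores)
```

For example, with a configured maximum of 16 and a local run started with ``--cores 4``, a rule would use 4 CPUs; with a maximum of 2 on an 8-core node, it would use 2.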
......@@ -577,9 +576,6 @@ rule summary1:
script: dir_scripts + script_summary1
# Python: Read in 640 files one by one, and for each loop through all 1000 permutations and output to the correpsonding output file.
rule concatenateMotifsPerm:
input:
diagnosticPlots = rules.summary1.output,
......@@ -614,7 +610,6 @@ rule calcNucleotideContent:
singularity: "shub://chrarnold/Singularity_images:difftf_conda"
params:
motifsShort = TF_DIR + "/*/" + extDir + "/" + compType + "*.output.tsv.gz",
# TFMotifes = construct a string that resembles the call to tail,
refGenome = config["additionalInputFiles"]["refGenome_fasta"]
shell:
"""
......