Commit a3674516 authored by Christian Arnold

Minor documentation changes and code cleanup

parent 2c7e10e7
......@@ -12,7 +12,7 @@ Transcription factor (TF) activity constitutes an important readout of cellular
For more information, please check our paper on biorxiv:
[Quantification of differential transcription factor activity and multiomic-based classification into activators and repressors: diffTF](https://www.biorxiv.org/content/early/2018/07/13/368498).
[Quantification of differential transcription factor activity and multiomic-based classification into activators and repressors: diffTF](https://www.biorxiv.org/content/early/2018/12/01/368498).
Documentation
-------------
......@@ -33,4 +33,4 @@ Citation
--------
*Please cite the following article if you use diffTF in your research*:
Ivan Berest*, Christian Arnold*, Armando Reyes-Palomares, Giovanni Palla, Kasper Dindler Rassmussen, Kristian Helin & Judith B. Zaugg. Quantification of differential transcription factor activity and multiomic-based classification into activators and repressors: diffTF. 2018. *Molecular Systems Biology*. in review.
Ivan Berest*, Christian Arnold*, Armando Reyes-Palomares, Giovanni Palla, Kasper Dindler Rassmussen, Kristian Helin & Judith B. Zaugg. Quantification of differential transcription factor activity and multiomic-based classification into activators and repressors: diffTF. 2018. in review.
......@@ -40,8 +40,8 @@ We now show which rules are executed by *Snakemake* for a specific example (see
diffTF is implemented as a *Snakemake* pipeline. For a gentle introduction about *Snakemake*, see Section :ref:`workingWithPipeline`. As you can see, the workflow consists of the following steps or *rules*:
- ``checkParameterValidity``: R script that checks whether the specified peak file has the correct format, whether the provided *fasta* file and the *BAM* files are compatible, and other checks
- ``produceConsensusPeaks``: R script that generates the consensus peaks if none are provided
- ``filterSexChromosomesAndSortPeaks``: Filters various chromosomes 8sex, unassembled ones, contigs, etc) from the peak file.
- ``produceConsensusPeaks``: R script that generates the consensus peaks using the R package ``DiffBind`` if none are provided
- ``filterSexChromosomesAndSortPeaks``: Filters various chromosomes (sex, unassembled ones, contigs, etc) from the peak file.
- ``sortTFBSParallel``: Sort the TFBS lists by position
- ``resortBAM``: Sort the *BAM* file for optimized processing (only run if data are paired-end)
- ``intersectPeaksAndBAM``: Count all reads for peak regions across all input files
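As a rough illustration of what a step like ``filterSexChromosomesAndSortPeaks`` does, the following Python sketch drops peaks on sex chromosomes and unassembled contigs and then sorts the remainder by position. It is not the pipeline's actual implementation: the function name, the human-style ``chr1`` to ``chr22`` naming, and the tuple representation of peaks are all assumptions made for this example.

```python
import re

# Assumption: autosomes are named chr1..chr22 (human-style naming).
AUTOSOME = re.compile(r"^chr([1-9]|1[0-9]|2[0-2])$")


def filter_and_sort_peaks(peaks):
    """Keep only autosomal peaks and sort them by chromosome and position.

    `peaks` is a list of (chrom, start, end) tuples, e.g. parsed from a BED-like file.
    """
    kept = [p for p in peaks if AUTOSOME.match(p[0])]
    # Sort by numeric chromosome index, then by start, then by end coordinate.
    return sorted(kept, key=lambda p: (int(p[0][3:]), p[1], p[2]))


# Hypothetical input data, for illustration only.
peaks = [
    ("chrX", 100, 200),
    ("chr2", 500, 900),
    ("chr1", 300, 400),
    ("chrUn_gl000220", 10, 50),
    ("chr1", 100, 250),
]
filtered = filter_and_sort_peaks(peaks)
```

Here ``filtered`` contains only the three autosomal peaks, ordered by chromosome and start position.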
......@@ -347,7 +347,7 @@ Summary
String. Default "" (empty). Path to the consensus peak file.
Details
If set to the empty string "", the pipeline will generate consensus peaks from the peak files of each individual sample. For this, you need to provide the following two things:
If set to the empty string "", the pipeline will generate consensus peaks from the peak files of each individual sample using the R package ``DiffBind``. For this, you need to provide the following two things:
- a peak file for each sample in the metadata file in the column *peaks*, see the section :ref:`section_metadata` for details.
- The format of the peak files, as specified in ``peakType`` (:ref:`parameter_peakType`)
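The two requirements above can be expressed as a small consistency check. The Python sketch below is only an illustration and is not part of diffTF: the function name, its arguments, and the example values (including ``"bed"`` as a peak type) are hypothetical placeholders. It assumes the metadata has already been read into a mapping from sample name to the value of its *peaks* column.

```python
def check_consensus_peak_config(consensus_peaks, peak_type, sample_peak_files):
    """Return a list of problems; an empty list means the settings are consistent.

    consensus_peaks   -- path to the consensus peak file, or "" if it should be generated
    peak_type         -- value of the ``peakType`` parameter
    sample_peak_files -- dict mapping sample name to its entry in the *peaks* column
    """
    problems = []
    if consensus_peaks != "":
        # A consensus peak file is provided, so per-sample peak files are not needed.
        return problems
    if not peak_type:
        problems.append("peakType must be set when no consensus peak file is given")
    for sample, path in sample_peak_files.items():
        if not path:
            problems.append(f"sample {sample!r} has no entry in the 'peaks' column")
    return problems


# Hypothetical values, for illustration only.
ok = check_consensus_peak_config("", "bed", {"s1": "s1.bed", "s2": "s2.bed"})
bad = check_consensus_peak_config("", "", {"s1": "s1.bed", "s2": ""})
```

In this example ``ok`` is empty, while ``bad`` lists one problem for the missing peak type and one for the sample without a peak file.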
......
......@@ -134,7 +134,6 @@ for sectionCur in configDict:
#############################
# Maximum number of cores per rule.
# For local computation, the minimum of this value and the --cores parameter will define the number of CPUs per rule,
# while in a cluster setting, the minimum of this value and the number of cores on the node the job runs on is used.
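# The rule described in the comment above amounts to taking a minimum. A minimal
# Python sketch, with a hypothetical helper name and made-up core counts:

```python
def cores_for_rule(max_cores_per_rule, available_cores):
    """Number of CPUs a rule may use.

    available_cores is the --cores value for local runs, or the number of
    cores on the node the job runs on in a cluster setting.
    """
    return min(max_cores_per_rule, available_cores)


# Local run: at most 16 cores per rule configured, but invoked with --cores 4.
local = cores_for_rule(16, 4)
# Cluster run: at most 16 cores per rule configured, job lands on a 32-core node.
cluster = cores_for_rule(16, 32)
```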
......@@ -577,9 +576,6 @@ rule summary1:
script: dir_scripts + script_summary1
# Python: Read in 640 files one by one, and for each, loop through all 1000 permutations and write to the corresponding output file.
rule concatenateMotifsPerm:
input:
diagnosticPlots = rules.summary1.output,
......@@ -614,7 +610,6 @@ rule calcNucleotideContent:
singularity: "shub://chrarnold/Singularity_images:difftf_conda"
params:
motifsShort = TF_DIR + "/*/" + extDir + "/" + compType + "*.output.tsv.gz",
# TFMotifes = construct a string that resembles the call to tail,
refGenome = config["additionalInputFiles"]["refGenome_fasta"]
shell:
"""
......