@@ -166,4 +166,4 @@ Adaptations and notes when running with Singularity
Please read the following additional notes and warnings related to ``Singularity``:
- .. warning:: If you use ``Singularity`` version 3, make sure you have at least version 3.0.2 installed or the latest pull from version 3.0.1, as there was an issue with Snakemake and particular ``Singularity`` versions. For more details, see `here <https://bitbucket.org/snakemake/snakemake/issues/1017/snakemake-process-suspended-upon-execution>`_.
- .. warning:: If you use ``Singularity`` version 3, make sure you have at least version 3.0.3 installed, as there was an issue with Snakemake and particular ``Singularity`` versions. For more details, see `here <https://bitbucket.org/snakemake/snakemake/issues/1017/snakemake-process-suspended-upon-execution>`_.
This currently affects only rules involving *featureCounts*, that is, *intersectPeaksAndBAM*; for rule *intersectTFBSAndBAM*, the number of cores is hard-coded to 4. When running *Snakemake* locally, each rule will use at most this number of cores, while in a cluster setting, this value refers to the maximum number of CPUs an individual job / rule will occupy. If the node the job is executed on has fewer cores, then the maximum number of cores on the node will be taken.
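As an illustration of the capping behaviour described above, the following minimal sketch computes how many cores a multithreaded rule would end up using. The function name ``effective_cores`` and its arguments are hypothetical and not part of the pipeline; the default of 16 is taken from the warning message that is printed when ``coresPerRule`` is not defined.

```python
# Hypothetical helper, not part of the pipeline: illustrates how the
# requested number of cores is capped by what the node provides.
DEFAULT_CORES_PER_RULE = 16  # default reported when "coresPerRule" is undefined


def effective_cores(node_cores, cores_per_rule=None):
    """Return the number of cores a multithreaded rule would use.

    node_cores: number of cores available on the executing node.
    cores_per_rule: value of "coresPerRule", or None if it is not defined.
    """
    requested = DEFAULT_CORES_PER_RULE if cores_per_rule is None else cores_per_rule
    if requested < 1 or node_cores < 1:
        raise ValueError("core counts must be positive integers")
    # A job can never use more cores than the node offers.
    return min(requested, node_cores)
```

For example, with ``coresPerRule`` set to 16 on a node with 8 cores, the rule would run with 8 cores under these assumptions.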
.. _parameter_dir_TFBS_sorted:
PARAMETER ``dir_TFBS_sorted``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Summary
Logical. true or false. Default false. Are the files in ``dir_TFBS`` (:ref:`parameter_dir_TFBS`) already pre-sorted?
Details
If set to true, no additional sorting will be done, saving computation time because the rule *sortTFBSParallel* is not executed. Note that sorting is assumed to be according to the chromosome (first column) and start position (second column), essentially invoking *sort -k1,1 -k2,2n*. If set to false, all files in ``dir_TFBS`` (:ref:`parameter_dir_TFBS`) will be sorted.
.. _parameter_regionExtension:
...
...
@@ -444,7 +432,6 @@ Details
However, you may also manually create these files to include additional TF of your choice or to be more or less stringent with the predicted TFBS. For this, you only need PWMs for the TF of interest and then a motif prediction tool such as *FIMO* or *MOODS*.
Also see the parameter ``dir_TFBS_sorted`` (:ref:`parameter_dir_TFBS_sorted`) to specify whether the files are already sorted or not.
.. _parameter_RNASeqCounts:
...
...
@@ -594,7 +581,7 @@ Summary
Only present if no consensus peak file was provided (``consensusPeaks``, :ref:`parameter_consensusPeaks`). Produced in rule ``filterSexChromosomesAndSortPeaks``. Generated consensus peaks, before filtering (see below).
Details
Filtered consensus peaks (removal of peaks from one of the following chromosomes: chrX, chrY, chrM, chrUn\*, \*random*, \*hap|_gl\*
Filtered consensus peaks (removal of peaks from one of the following chromosomes: chrX, chrY, chrM, chrUn\*, and all contig names that do not start with "chr" such as \*random\* or \*hap|_gl\*).
@@ -803,7 +790,7 @@ Stores results related to the user-specified extension size (``regionExtension``
- ``conditionComparison.rds``: Produced in rule ``DiffPeaks``. Stores the condition comparison as a string. Some steps in diffTF need this file as input.
- ``{comparisonType}.motifs.coord.permutation{perm}.bed.gz`` and ``{comparisonType}.motifs.coord.nucContent.permutation{perm}.bed.gz`` for each permutation ``{perm}``: Produced in rule ``calcNucleotideContent``, and needed subsequently for the binning. Temporary and result file of *bedtools nuc*, respectively. The latter contains the GC content for all TFBS.
- ``{comparisonType}.checkParameterValidity.done``: temporary flag file
- ``{TF}_TFBS.sorted.bed`` for each TF ``{TF}``: Produced in rule ``sortTFBSParallel``. Coordinate-sorted version of the input TFBS.
- ``{TF}_TFBS.sorted.bed`` for each TF ``{TF}``: Produced in rule ``sortTFBSParallel``. Coordinate-sorted version of the input TFBS. Only "regular" chromosomes starting with "chr" are kept, while sex chromosomes (chrX, chrY), chrM and unassembled contigs such as chrUn are additionally removed.
- ``{comparisonType}.allTFBS.peaks.bed.gz``: Produced in rule ``intersectPeaksAndTFBS``. *BED* file containing all TFBS from all TF that overlap with the peaks before motif extension
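The chromosome filtering and coordinate sorting described for ``{TF}_TFBS.sorted.bed`` can be sketched as follows. This is a rough illustration only and not the code of the rule ``sortTFBSParallel``: the function names are hypothetical, the exclusion patterns (``chrUn``, ``random``, ``hap``, ``_gl``) are assumptions derived from the examples in this documentation, and plain string ordering of chromosome names is assumed (comparable to *sort -k1,1 -k2,2n* in the C locale).

```python
# Hypothetical sketch of the filtering and sorting step; the real rule
# may be implemented differently (for example, with shell tools).
EXCLUDED_CHROMS = {"chrX", "chrY", "chrM"}
# Assumed markers for unassembled or alternative contigs.
EXCLUDED_SUBSTRINGS = ("chrUn", "random", "hap", "_gl")


def keep_chromosome(chrom):
    """Return True only for regular chromosomes starting with 'chr'."""
    if not chrom.startswith("chr"):
        return False
    if chrom in EXCLUDED_CHROMS:
        return False
    return not any(marker in chrom for marker in EXCLUDED_SUBSTRINGS)


def filter_and_sort_bed(lines):
    """Filter tab-separated BED lines by chromosome, then sort them.

    Sorting is by chromosome name (string order) and start position
    (numeric order).
    """
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip empty lines and comments
        fields = line.split("\t")
        if keep_chromosome(fields[0]):
            records.append(fields)
    records.sort(key=lambda f: (f[0], int(f[1])))
    return ["\t".join(f) for f in records]
```

Under these assumptions, entries on ``chrX``, ``chrUn_gl000220`` or an ``HLA`` contig would be dropped, and the remaining entries would be ordered by chromosome name and start position.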
@@ -53,6 +53,9 @@ We also put the paper on *bioRxiv*, please read all methodological details here:
Change log
============================
Version 1.2.5 (2019-03-13)
- Updated the TFBS_hg38_FIMO_HOCOMOCOv11 archive one more time to exclude non-assembled contigs such as HLA*. To make the pipeline more stable for such edge cases, the parameter ``dir_TFBS_sorted`` has been removed, and sorting and filtering of chromosomes is now always performed. In both the consensus peak files and the TFBS bed files, only chromosomes are kept whose names start with ``chr`` and that are neither sex chromosomes (``chrX`` or ``chrY``) nor ``chrM``. If you want to keep sex chromosomes in your analysis (which we do not recommend), simply edit the Snakefile and remove the "chrX" and "chrY" occurrences in the two filtering rules.
Version 1.2.4 (2019-03-04)
- Fixed an issue with ``checkParameterValidity.R`` that caused an error message when loading TFBS files with a numeric score. Thanks to Scott Berry for pointing it out.
- Updated the TFBS_hg38_FIMO_HOCOMOCOv11 archive. The bed files are now properly pre-sorted.
print("The parameter \"coresPerRule\" in section \"par_general\" has not been defined. Jobs/rules with multithreading support will use the default of 16 cores.")