.. _docs-project:

Biological motivation
============================
Transcription factor (TF) activity is an important readout of cellular signalling pathways and is thus useful for assessing regulatory differences across conditions. However, current technologies lack the ability to simultaneously assess activity changes for multiple TFs, and surprisingly little is known about whether a TF acts as a repressor or an activator. To this end, we introduce the widely applicable genome-wide method diffTF, which assesses differential TF binding activity and classifies TFs as activators or repressors by integrating any type of genome-wide chromatin data with RNA-Seq data and in-silico predicted TF binding sites.

For a graphical summary of the idea, see the section :ref:`workflow`.

The paper is also available on *bioRxiv*; please read it for all methodological details:
`Quantification of differential transcription factor activity and multiomic-based classification into activators and repressors: diffTF <https://www.biorxiv.org/content/early/2018/07/13/368498>`_.


Help, contribute and contact
============================

If you have questions or comments, feel free to contact us. We will be happy to answer any questions related to this project as well as questions related to the software implementation. For method-related questions, contact Judith B. Zaugg (judith.zaugg@embl.de) or Ivan Berest (berest@embl.de). For technical questions, contact Christian Arnold (christian.arnold@embl.de).

If you have questions, doubts, ideas or problems, please use the `Bitbucket Issue Tracker <https://bitbucket.org/chrarnold/diffTF>`_. We will respond in a timely manner.

Citation
============================

If you use this software, please cite the following reference:

Ivan Berest*, Christian Arnold*, Armando Reyes-Palomares, Giovanni Palla, Kasper Dindler Rasmussen, Kristian Helin & Judith B. Zaugg. *Quantification of differential transcription factor activity and multiomic-based classification into activators and repressors: diffTF*. 2018. *Molecular Systems Biology*. in review.

The paper is also available on *bioRxiv*; please read it for all methodological details:
`Quantification of differential transcription factor activity and multiomic-based classification into activators and repressors: diffTF <https://www.biorxiv.org/content/early/2018/07/13/368498>`_.

Change log
============================

Version 1.1.5 (2018-08-14)
    - optimized ``checkParameterValidity.R`` script, only TFBS files for TFs included in the analysis are now checked
    - addressed an R library compatibility issue, independent of *diffTF*, that users reported. In some cases, for particular versions of R and Bioconductor, R exited with a *segfault* (memory not mapped) error in the ``checkParameterValidity.R`` script that seems to be caused by the combination of *DiffBind* and *DESeq2*. Specifically, when *DiffBind* is loaded *before* *DESeq2*, R crashes with a segmentation fault upon exiting, whereas loading *DiffBind* *after* *DESeq2* causes no issue. If there are further issues, please let us know. Thanks to Gyan Prakash Mishra, who first reported this.
    - fixed an issue when the number of peaks is very small so that some TFs have no overlapping TFBS at all in the peak regions. This caused the rule ``intersectTFBSAndBAM`` to exit with an error due to grep's policy of returning exit code 1 if no matches are returned (thanks to Jonas Ungerbeck, again).
    - removed the ``--timestamp`` option in the helper script ``startAnalysis.sh`` because this option has been removed for Snakemake >5.2.1
    - Documentation updates

Version 1.1.4 (2018-08-09)
    - minor: updated the ``checkParameterValidity.R`` script and the documentation (one package was not mentioned)

Version 1.1.3 (2018-08-06)
    - minor: fixed a small issue in the Volcano plot (the legends were wrong and the plot background was not colored properly)

Version 1.1.2 (2018-08-03)
    - fixed a bug that made the ``3.analyzeTF.R`` script fail when the number of permutations has been changed throughout the analysis or when the value is higher than the actual maximum number (thanks to Jonas Ungerbeck)

Version 1.1.1 (2018-08-01)
    - Documentation updates (referenced the bioRxiv paper, extended the section about errors)
    - updated the information on how to load the snakemake object into the R workspace in the corresponding R scripts
    - fixed a bug that made the labels in the Volcano plot switch sides (thanks to Jonas Ungerbeck)
    - merged some diagnostic plots for the AR classification in the last step
    - renamed R scripts and R log files to make them consistent with the cluster output and error files

Version 1.1 (2018-07-27)
    - added a new parameter ``dir_TFBS_sorted`` in the config file to specify that the TFBS input files are already sorted, which saves some computation time by not resorting them
    - updated the TFBS files that are available via download (some files were not presorted correctly)
    - added support for single-end BAM files. There is a new parameter ``pairedEnd`` in the config file that specifies whether reads are paired-end or not.
    - restructured some of the permutation-related output files to save space and computation time. The rule ``concatenateMotifsPerm`` should now be much faster, and the TF-specific ``...outputPerm.tsv.gz`` files are now much smaller due to an improved column structure

Version 1.0.1 (2018-07-25)
    - fixed a bug in ``2.DiffPeaks.R`` that sometimes caused the step to fail (thanks to Jonas Ungerbeck for letting us know)
    - fixed a bug in ``3.analyzeTF.R`` for rare corner cases when *DESeq* fails

Version 1.0 (2018-07-01)
    - released stable version

License
============================

diffTF is licensed under the MIT License:

.. literalinclude:: ../LICENSE.md
    :language: text