Commit 764c2d98 authored by Christian Arnold

Documentation update

parent bb9c2c3a
......@@ -105,7 +105,7 @@ PARAMETER ``designContrast``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Summary
String. Default "~ Treatment + conditionSummary". Design formula for the differential accessibility analysis in DESeq2.
String. Default "~ Treatment + conditionSummary". Design formula for the differential accessibility analysis in *DESeq2*.
Details
This important parameter defines the actual contrast that is done in the differential analysis. That is, which groups of samples are being compared? Examples include mutant vs wildtype, mutated vs. unmutated, etc. The last element in the formula must always be “conditionSummary”, which defines the two groups that are being compared. This name is currently hard-coded and required by the pipeline. Our pipeline allows including additional variables to model potential confounding variables, like gender, batches etc. For each additional variable that is part of the formula, a corresponding and identically named column in the sample summary file must be specified.
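The two constraints described above (``conditionSummary`` must come last, and every additional variable needs an identically named column in the sample summary file) can be checked before starting a run. The following is only a minimal sketch: the function names are hypothetical and not part of the pipeline, and it assumes a simple additive formula of the form ``~ A + B``.

```python
def design_terms(formula: str) -> list[str]:
    """Split a simple additive design formula such as '~ A + B' into its terms."""
    text = formula.strip()
    if not text.startswith("~"):
        raise ValueError("design formula must start with '~'")
    return [term.strip() for term in text[1:].split("+") if term.strip()]


def check_design(formula: str, sample_columns: list[str]) -> list[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    terms = design_terms(formula)
    # The last element of the formula must be the hard-coded name.
    if not terms or terms[-1] != "conditionSummary":
        problems.append("last term must be 'conditionSummary'")
    # Every additional variable needs an identically named column.
    for term in terms:
        if term != "conditionSummary" and term not in sample_columns:
            problems.append(f"no column named '{term}' in the sample summary file")
    return problems


print(check_design("~ Treatment + conditionSummary", ["sampleID", "Treatment"]))
```

Interaction terms or other formula operators are deliberately not handled here; the sketch only covers the additive case used by the default value.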
......@@ -362,6 +362,9 @@ Most files have one of the following file formats:
FOLDER ``FINAL_OUTPUT``
=============================================
In this folder, the final output files are stored. Most users want to examine the files in here for further analysis.
Subfolder ``extension{regionExtension}``
----------------------------------------
......@@ -369,16 +372,27 @@ Stores results related to the user-specified extension size (:ref:`parameter_reg
.. note:: Note that in all output files, in the column ``permutation``, 0 always refers to the non-permuted, real data, while permutations > 0 reflect real permutations.
- ``{comparisonType}.allMotifs.tsv.gz``: Summary table for each TFBS with the columns as follows:
FILE ``{comparisonType}.allMotifs.tsv.gz``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Summary
Summary table for each TFBS
Details
Columns are as follows:
- *permutation*: The number of the permutation.
- *TF: name of the TF
- *chr, *MSS*, *MES*, *strand*, *TFBSID*: Genomic location and identifier of the (extended) TFBS
- *PSS, *PES*, *peakID*: Genomic location and annotation of the overlapping peak region
- *baseMean*, *log2FoldChange*, *lfcSE*, *stat*, *pvalue*, *padj*: Results from the DESeq2 analysis. See the `DESeq2 documentation <https://www.bioconductor.org/help/course-materials/2015/LearnBioconductorFeb2015/B02.1.1_RNASeqLab.html>`_ for details.
- *TF*: name of the TF
- *chr*, *MSS*, *MES*, *strand*, *TFBSID*: Genomic location and identifier of the (extended) TFBS
- *PSS*, *PES*, *peakID*: Genomic location and annotation of the overlapping peak region
- *baseMean*, *log2FoldChange*, *lfcSE*, *stat*, *pvalue*, *padj*: Results from the *DESeq2* analysis. See the `DESeq2 documentation <https://www.bioconductor.org/help/course-materials/2015/LearnBioconductorFeb2015/B02.1.1_RNASeqLab.html>`_ for details.
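Because permutation 0 holds the real data and every value above 0 a permutation, a common first step is to separate the two. The sketch below illustrates this with the Python standard library only; the function name and the tiny in-memory table are made up for illustration, and only the column names are taken from the list above.

```python
import csv
import io


def split_real_and_permuted(tsv_text: str):
    """Split the rows of a tab-separated table by the 'permutation' column."""
    real, permuted = [], []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        # 0 is the non-permuted, real data; anything above 0 is a permutation.
        if int(row["permutation"]) == 0:
            real.append(row)
        else:
            permuted.append(row)
    return real, permuted


# Hypothetical miniature table with three of the documented columns.
example = (
    "permutation\tTF\tlog2FoldChange\n"
    "0\tCTCF\t0.8\n"
    "1\tCTCF\t-0.1\n"
    "0\tGATA1\t-1.2\n"
)
real_rows, permuted_rows = split_real_and_permuted(example)
print(len(real_rows), len(permuted_rows))
```

For the actual gzip-compressed file, the text could be obtained with ``gzip.open(path, "rt").read()`` before calling the function; for large files, streaming the rows instead of reading everything at once would be preferable.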
- ``{comparisonType}.TF_vs_peak_distribution.tsv``: TODO. The columns are as follows:
FILE ``{comparisonType}.TF_vs_peak_distribution.tsv``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Summary
TODO
Details
Columns are as follows:
- *TF*: name of the TF
- *permutation*: The number of the permutation.
- *Pos_l2FC*, *Mean_l2FC*, *Median_l2FC*, *sd_l2FC*, *Mode_l2FC*, *skewness_l2FC*: fraction of positive values, mean, median, standard deviation, mode value and Bickel's measure of skewness of the log2 fold change distribution across all TFBS
......@@ -387,7 +401,13 @@ Stores results related to the user-specified extension size (:ref:`parameter_reg
- *TFBS_num*: number of TFBS
- *Diff_mean*, *Diff_median*, *Diff_mode*, *Diff_skew*: Difference of the mean, median, mode, and skewness between the log2 fold-change distribution across all TFBS and the peaks, respectively
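To make the meaning of some of these columns concrete, the sketch below computes a subset of them for a list of log2 fold-changes. It is an illustration under assumptions, not the pipeline's code: the function names are hypothetical, the sample standard deviation is assumed for *sd_l2FC*, the sign convention (TFBS minus peaks) for the difference is assumed, and the mode and Bickel's skewness are omitted.

```python
import statistics


def l2fc_summary(values: list[float]) -> dict[str, float]:
    """Summarize a log2 fold-change distribution (subset of the documented columns)."""
    return {
        "Pos_l2FC": sum(v > 0 for v in values) / len(values),  # fraction of positive values
        "Mean_l2FC": statistics.mean(values),
        "Median_l2FC": statistics.median(values),
        "sd_l2FC": statistics.stdev(values),  # assumption: sample standard deviation
    }


def diff_mean(tfbs_values: list[float], peak_values: list[float]) -> float:
    """Difference of the means; the direction (TFBS minus peaks) is an assumption."""
    return statistics.mean(tfbs_values) - statistics.mean(peak_values)


tfbs = [1.0, -1.0, 2.0, 2.0]
peaks = [0.0, 0.5, -0.5, 1.0]
print(l2fc_summary(tfbs))
print(diff_mean(tfbs, peaks))
```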
- ``{comparisonType}.summary.tsv``: The final summary table that is also used for the final circular visualization (see below). The columns are as follows:
FILE ``{comparisonType}.summary.tsv``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Summary
The final summary table that is also used for the final circular visualization
Details
The columns are as follows:
- *TF*: name of the TF
- *permutation*: The number of the permutation.
......@@ -404,44 +424,127 @@ Stores results related to the user-specified extension size (:ref:`parameter_reg
- *classification*: RNA-Seq classification (either activator, undetermined, repressor or not-expressed)
- *Cohend_factor* , *weighted_CD*, *weighted_median*, *weighted_sd*: Not used.
- ``{comparisonType}.summary.circular.pdf``: The final visualization of the diffTF results. The PDF contains multiple pages, the structure of which varies depending on the parameters:
Number of permutations > 0
RNA-Seq integration
Pages 1-15: Visualization of the results including the permutations
Pages 16-30: Visualization of the results excluding the permutations
Within each of these two categories, the structure is identical (varying combinations of FDR threshold and RNA-Seq categories, see below)
No RNA-Seq integration:
Pages 1-5: Visualization of the results including the permutations for different FDR thresholds
Pages 6-10: Visualization of the results excluding the permutations for different FDR thresholds
Within each of these two categories, the structure is identical (varying combinations of FDR threshold and RNA-Seq categories, see below)
Number of permutations = 0
RNA-Seq integration: Pages 1-15 show the results in a format as described below for varying combinations of FDR threshold and RNA-Seq categories
No RNA-Seq integration: Pages 1-5 show the results for different FDR thresholds
If RNA-Seq data is integrated, different combinations of categories are shown on each page (1:activator-undetermined-repressor-not-expressed, 2:activator-undetermined-repressor, 3:activator-repressor), which along with different FDR thresholds gives rise to multiple individual plots. The values on the x-axis denote the effect size (if permutations are incorporated enrichment over background, otherwise the mean difference between the two conditions), with higher values indicating a higher differential TF activity between the two conditions. TFs that are similarly active between the two conditions are close to 0, while TFs more active in either condition are located in the left and right part of the plot. The y-axis (radial position) denotes the statistical significance (adjusted p-values). The significance threshold is indicated as red circular line. TFs that pass the significance threshold are labeled and, if RNA-Seq data is integrated, colored according to their predicted role (see above).
{comparisonType}.diagnosticPlots.pdf: Various diagnostic plots for the final TF activity values. If the number of permutations is larger than 0, the first three pages show various versions of the pemruted weighted_meanDifference values and how they relate to the real ones. Permutation 0, as used everywhere throughout the pipeline, contains the real values, while any permutation > 0 refers to an actual permutation. Page 1 shows real and permuted values, page 2 only permuted ones, page 3 a density plot of the real values with the permutation thresholds as dashed lines, inside of which TFs are not labeled as they fall within the permutation and therefore noise area. The next page shows various diagnostic plots from the locfdr package to estimate the distribution median, while the remaining plots show histograms of all relevant columns in the final output table for different sets of TFs depending on a specific FDR threshold.
FILE ``{comparisonType}.summary.circular.pdf``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Summary
The final visualization of the diffTF results
Details
The PDF contains multiple pages, the structure of which varies depending on the parameters:
- Number of permutations > 0
- RNA-Seq integration: Within each of the two following sections, the structure is identical (varying combinations of FDR threshold and RNA-Seq categories, see below)
- Pages 1-15: Visualization of the results including the permutations
- Pages 16-30: Visualization of the results excluding the permutations
- No RNA-Seq integration: Within each of the two following sections, the structure is identical (varying combinations of FDR threshold and RNA-Seq categories, see below)
- Pages 1-5: Visualization of the results including the permutations for different FDR thresholds
- Pages 6-10: Visualization of the results excluding the permutations for different FDR thresholds
- Number of permutations = 0
- RNA-Seq integration: Pages 1-15 show the results in a format as described below for varying combinations of FDR threshold and RNA-Seq categories
- No RNA-Seq integration: Pages 1-5 show the results for different FDR thresholds
- If RNA-Seq data is integrated, different combinations of categories are shown on each page (1: activator-undetermined-repressor-not-expressed, 2: activator-undetermined-repressor, 3: activator-repressor), which along with different FDR thresholds gives rise to multiple individual plots. The values on the x-axis denote the effect size (enrichment over background if permutations are incorporated, otherwise the mean difference between the two conditions), with higher values indicating a higher differential TF activity between the two conditions. TFs that are similarly active between the two conditions are close to 0, while TFs more active in either condition are located in the left and right part of the plot. The y-axis (radial position) denotes the statistical significance (adjusted p-values). The significance threshold is indicated as a red circular line. TFs that pass the significance threshold are labeled and, if RNA-Seq data is integrated, colored according to their predicted role (see above).
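The page layout described above follows a simple rule: 15 pages per section with RNA-Seq integration and 5 without, and two sections (including and excluding the permutations) whenever the number of permutations is larger than 0. A small helper, hypothetical and not part of the pipeline, can make this explicit when looking up a particular page:

```python
def circular_pdf_sections(n_permutations: int, rna_seq: bool) -> list[tuple[str, int, int]]:
    """Return (section label, first page, last page) for the circular summary PDF."""
    pages = 15 if rna_seq else 5
    if n_permutations > 0:
        return [
            ("including permutations", 1, pages),
            ("excluding permutations", pages + 1, 2 * pages),
        ]
    # Without permutations there is only a single section.
    return [("results", 1, pages)]


for label, first, last in circular_pdf_sections(n_permutations=100, rna_seq=True):
    print(f"{label}: pages {first}-{last}")
```

The section labels are free text chosen for this sketch; only the page ranges are taken from the description above.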
FILE ``{comparisonType}.diagnosticPlots.pdf``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Summary
Various diagnostic plots for the final TF activity values
Details
If the number of permutations is larger than 0, the first three pages show various versions of the permuted weighted_meanDifference values and how they relate to the real ones. Permutation 0, as used everywhere throughout the pipeline, contains the real values, while any permutation > 0 refers to an actual permutation. Page 1 shows real and permuted values, page 2 only permuted ones, and page 3 a density plot of the real values with the permutation thresholds as dashed lines. TFs inside these thresholds are not labeled because they fall within the range of the permuted values and are therefore considered noise. The next page shows various diagnostic plots from the *locfdr* package to estimate the distribution median, while the remaining plots show histograms of all relevant columns in the final output table for different sets of TFs depending on a specific FDR threshold.
FOLDER ``PEAKS``
=============================================
Stores peak-associated files.
- if no consensus peak file was provided (:ref:`parameter_consensusPeaks`):
- ``{comparisonType}.consensusPeaks.bed`` and ``consensusPeaks_lengthDistribution.pdf``: generated consensus peaks, before filtering (see below) as well as a diagnostic plot showing the length distribution of the peaks
- ``{comparisonType}.consensusPeaks.filtered.sorted.bed``: Produced in rule ``filterSexChromosomesAndSortPeaks``. Filtered consensus peaks (removal of peaks from one of the following chromosomes: chrX, chrY, chrM, chrUn\*, \*random*, \*hap|_gl\*
- ``{comparisonType}.allBams.peaks.overlaps.bed``: Produced in rule ``intersectPeaksAndBAM``. Counts for each consensus peak with each of the input BAM files
- ``{comparisonType}.sampleMetadata.rds``: Produced in rule ``DESeqPeaks``. Stores data for the input data (similar to the input sample table), for both the real data and the permutations.
- ``{comparisonType}.peaks.rds``: Produced in rule ``DESeqPeaks``. Stores all peaks that will be used in the analysis.
- ``{comparisonType}.peaks.tsv``: Produced in rule ``DESeqPeaks``. Stores the DESEq2 results of the differential accessibility analysis for the peaks.
- ``{comparisonType}.normFacs.rds``: Produced in rule ``DESeqPeaks``. Gene-specific normalization factors for each sample and peak. This file is produces after the differential accessibility analysis for the peaks. The normalization factors will be used for the TF-specific differential accessibility analysis.
- ``{comparisonType}.diagnosticPlots.peaks.pdf`` and ``{comparisonType}.diagnosticPlots.peaks_permutation{perm}.pdf`` for each permutation ``{perm}``: Produced in rule ``DESeqPeaks``. Various diagnostic plots for the differential accessibility peak analysis for the real and permuted data, respectively:
FILES ``{comparisonType}.consensusPeaks.bed`` and ``consensusPeaks_lengthDistribution.pdf``
--------------------------------------------
Summary
Only present if no consensus peak file was provided (:ref:`parameter_consensusPeaks`). Produced in rule ``filterSexChromosomesAndSortPeaks``. Generated consensus peaks, before filtering (see below) as well as a diagnostic plot showing the length distribution of the peaks.
Details
Filtered consensus peaks (removal of peaks from one of the following chromosomes: chrX, chrY, chrM, chrUn\*, \*random\*, \*hap|_gl\*).
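The chromosome patterns listed above can be read as a single regular expression. The sketch below is one possible interpretation, not the expression used in the rule ``filterSexChromosomesAndSortPeaks``: it assumes that chrX, chrY and chrM must match exactly, that chrUn is a prefix, and that "random", "hap" and "_gl" may occur anywhere in the chromosome name.

```python
import re

# Assumed reading of the documented patterns; the pipeline's own expression may differ.
EXCLUDED_CHROMOSOMES = re.compile(r"^chr[XYM]$|^chrUn|random|hap|_gl")


def is_filtered(chromosome: str) -> bool:
    """Return True if a peak on this chromosome would be removed under the assumed patterns."""
    return EXCLUDED_CHROMOSOMES.search(chromosome) is not None


for name in ["chr1", "chrX", "chrUn_gl000220", "chr6_cox_hap2", "chr1_gl000191_random"]:
    print(name, is_filtered(name))
```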
FILE ``{comparisonType}.allBams.peaks.overlaps.bed``
--------------------------------------------
Summary
Produced in rule ``intersectPeaksAndBAM``. Counts for each consensus peak with each of the input BAM files.
Details
FILE ``{comparisonType}.sampleMetadata.rds``
--------------------------------------------
Summary
Produced in rule ``DESeqPeaks``. Stores metadata for the input samples (similar to the input sample table), for both the real data and the permutations.
Details
FILE ``{comparisonType}.peaks.rds``
--------------------------------------------
Summary
Produced in rule ``DESeqPeaks``. Stores all peaks that will be used in the analysis.
Details
FILE ``{comparisonType}.peaks.tsv``
--------------------------------------------
Summary
Produced in rule ``DESeqPeaks``. Stores the *DESeq2* results of the differential accessibility analysis for the peaks.
Details
FILE ``{comparisonType}.normFacs.rds``
--------------------------------------------
Summary
Produced in rule ``DESeqPeaks``. Gene-specific normalization factors for each sample and peak.
Details
This file is produced after the differential accessibility analysis for the peaks. The normalization factors will be used for the TF-specific differential accessibility analysis.
FILES ``{comparisonType}.diagnosticPlots.peaks.pdf`` and ``{comparisonType}.diagnosticPlots.peaks_permutation{perm}.pdf`` for each permutation ``{perm}``
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
Summary
Produced in rule ``DESeqPeaks``. Various diagnostic plots for the differential accessibility peak analysis for the real and permuted data, respectively.
Details
The pages are as follows:
(1) MA plots
(2) density plots of normalized and non-normalized counts
(3) mean-average plots (average of the log-transformed counts vs the fold-change per peak) for each of the sample pairs
(4) mean SD plots (row standard deviations versus row means)
- ``{comparisonType}.DESeq.object.rds``: Produced in rule ``DESeqPeaks``. The DESeq2 object from the differential accessibility peak analysis.
FILE ``{comparisonType}.DESeq.object.rds``
--------------------------------------------
Summary
Produced in rule ``DESeqPeaks``. The *DESeq2* object from the differential accessibility peak analysis.
Details
FOLDER ``TF-SPECIFIC``
=============================================
......@@ -452,11 +555,11 @@ Subfolder ``extension{regionExtension}``
----------------------------------------
- ``{TF}.{comparisonType}.allBAMs.overlaps.bed.gz`` and ``{TF}.{comparisonType}.allBAMs.overlaps.bed.summary``: Overlap and featureCounts summary file of read counts across all TFBS for all input BAM files.
- ``{TF}.{comparisonType}.output.tsv``: Produced in rule ``analyzeTF``. A summary table for the DeSeq2 analysis. See the file ``{comparisonType}.allMotifs.tsv.gz`` in the ``FINAL_OUTPUT`` folder for a column description.
- ``{TF}.{comparisonType}.summary.rds``: Produced in rule ``analyzeTF``. A summary table for the log2 fold-changes across all TFBS DESeq2 results.
- ``{TF}.{comparisonType}.output.tsv``: Produced in rule ``analyzeTF``. A summary table for the *DESeq2* analysis. See the file ``{comparisonType}.allMotifs.tsv.gz`` in the ``FINAL_OUTPUT`` folder for a column description.
- ``{TF}.{comparisonType}.summary.rds``: Produced in rule ``analyzeTF``. A summary table for the log2 fold-changes across all TFBS *DESeq2* results.
- ``{TF}.{comparisonType}.diagnosticPlots.pdf`` and ``{TF}.{comparisonType}.diagnosticPlots_permutation{perm}.pdf``: Produced in rule ``analyzeTF``. Various diagnostic plots for the differential accessibility TFBS analysis for the real and permuted data, respectively. See the description of the file ``{comparisonType}.diagnosticPlots.peaks.pdf`` in the ``PEAKS`` folder, which has an identical structure.
- ``{TF}.{comparisonType}.summaryPlots.pdf`` and ``{TF}.{comparisonType}.summaryPlots_permutation{perm}.pdf``: Produced in rule ``analyzeTF``. A PDF with a summary of the DESeq2 analysis for the real and permuted data, respectively: Page 1 shows a density plot of the log2 fold-changes for the specific pairwise condition that the user selected, separately for the peaks only and across all TFBS from the specific TF. Page 2 shows the same but in a cumulative representation.
- ``{TF}.{comparisonType}.DESeq.object.rds``: Produced in rule ``analyzeTF``. Original DESeq2 object.
- ``{TF}.{comparisonType}.summaryPlots.pdf`` and ``{TF}.{comparisonType}.summaryPlots_permutation{perm}.pdf``: Produced in rule ``analyzeTF``. A PDF with a summary of the *DESeq2* analysis for the real and permuted data, respectively: Page 1 shows a density plot of the log2 fold-changes for the specific pairwise condition that the user selected, separately for the peaks only and across all TFBS from the specific TF. Page 2 shows the same but in a cumulative representation.
- ``{TF}.{comparisonType}.DESeq.object.rds``: Produced in rule ``analyzeTF``. Original *DESeq2* object.
- ``{TF}.{comparisonType}.permutationResults.rds``: Produced in rule ``binningTF``. Contains a data frame that stores the bin-specific results.
- ``{TF}.{comparisonType}.permutationSummary.tsv``: Produced in rule ``binningTF``. A final summary table that summarizes the results across bins by calculating weighted means. The data of this table are used for the final visualization.
- ``{TF}.{comparisonType}.covarianceResults.rds``: Produced in rule ``binningTF``. Contains a data frame that stores the results of the pairwise bin covariances and the bin-specific weights.
......
......@@ -138,10 +138,27 @@
</li>
<li class="toctree-l1"><a class="reference internal" href="#output">Output</a><ul>
<li class="toctree-l2"><a class="reference internal" href="#folder-final-output">FOLDER <code class="docutils literal"><span class="pre">FINAL_OUTPUT</span></code></a><ul>
<li class="toctree-l3"><a class="reference internal" href="#subfolder-extension-regionextension">Subfolder <code class="docutils literal"><span class="pre">extension{regionExtension}</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="#subfolder-extension-regionextension">Subfolder <code class="docutils literal"><span class="pre">extension{regionExtension}</span></code></a><ul>
<li class="toctree-l4"><a class="reference internal" href="#file-comparisontype-allmotifs-tsv-gz">FILE <code class="docutils literal"><span class="pre">{comparisonType}.allMotifs.tsv.gz</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="#file-comparisontype-tf-vs-peak-distribution-tsv">FILE <code class="docutils literal"><span class="pre">{comparisonType}.TF_vs_peak_distribution.tsv</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="#file-comparisontype-summary-tsv">FILE <code class="docutils literal"><span class="pre">{comparisonType}.summary.tsv</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="#file-comparisontype-summary-circular-pdf">FILE <code class="docutils literal"><span class="pre">{comparisonType}.summary.circular.pdf</span></code></a></li>
<li class="toctree-l4"><a class="reference internal" href="#file-comparisontype-diagnosticplots-pdf">FILE <code class="docutils literal"><span class="pre">{comparisonType}.diagnosticPlots.pdf</span></code></a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="#folder-peaks">FOLDER <code class="docutils literal"><span class="pre">PEAKS</span></code></a><ul>
<li class="toctree-l3"><a class="reference internal" href="#files-comparisontype-consensuspeaks-bed-and-consensuspeaks-lengthdistribution-pdf">FILES <code class="docutils literal"><span class="pre">{comparisonType}.consensusPeaks.bed</span></code> and <code class="docutils literal"><span class="pre">consensusPeaks_lengthDistribution.pdf</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="#file-comparisontype-allbams-peaks-overlaps-bed">FILE <code class="docutils literal"><span class="pre">{comparisonType}.allBams.peaks.overlaps.bed</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="#file-comparisontype-samplemetadata-rds">FILE <code class="docutils literal"><span class="pre">{comparisonType}.sampleMetadata.rds</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="#file-comparisontype-peaks-rds">FILE <code class="docutils literal"><span class="pre">{comparisonType}.peaks.rds</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="#file-comparisontype-peaks-tsv">FILE <code class="docutils literal"><span class="pre">{comparisonType}.peaks.tsv</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="#file-comparisontype-normfacs-rds">FILE <code class="docutils literal"><span class="pre">{comparisonType}.normFacs.rds</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="#files-comparisontype-diagnosticplots-peaks-pdf-and-comparisontype-diagnosticplots-peaks-permutation-perm-pdf-for-each-permutation-perm">FILES <code class="docutils literal"><span class="pre">{comparisonType}.diagnosticPlots.peaks.pdf</span></code> and <code class="docutils literal"><span class="pre">{comparisonType}.diagnosticPlots.peaks_permutation{perm}.pdf</span></code> for each permutation <code class="docutils literal"><span class="pre">{perm}</span></code></a></li>
<li class="toctree-l3"><a class="reference internal" href="#file-comparisontype-deseq-object-rds">FILE <code class="docutils literal"><span class="pre">{comparisonType}.DESeq.object.rds</span></code></a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="#folder-peaks">FOLDER <code class="docutils literal"><span class="pre">PEAKS</span></code></a></li>
<li class="toctree-l2"><a class="reference internal" href="#folder-tf-specific">FOLDER <code class="docutils literal"><span class="pre">TF-SPECIFIC</span></code></a><ul>
<li class="toctree-l3"><a class="reference internal" href="#id20">Subfolder <code class="docutils literal"><span class="pre">extension{regionExtension}</span></code></a></li>
</ul>
......@@ -312,7 +329,7 @@
<span id="id4"></span><h4>PARAMETER <code class="docutils literal"><span class="pre">designContrast</span></code><a class="headerlink" href="#parameter-designcontrast" title="Permalink to this headline"></a></h4>
<dl class="docutils">
<dt>Summary</dt>
<dd>String. Default “~ Treatment + conditionSummary”. Design formula for the differential accessibility analysis in DESeq2.</dd>
<dd>String. Default “~ Treatment + conditionSummary”. Design formula for the differential accessibility analysis in <em>DESeq2</em>.</dd>
<dt>Details</dt>
<dd>This important parameter defines the actual contrast that is done in the differential analysis. That is, which groups of samples are being compared? Examples include mutant vs wildtype, mutated vs. unmutated, etc. The last element in the formula must always be “conditionSummary”, which defines the two groups that are being compared. This name is currently hard-coded and required by the pipeline. Our pipeline allows including additional variables to model potential confounding variables, like gender, batches etc. For each additional variable that is part of the formula, a corresponding and identically named column in the sample summary file must be specified.</dd>
</dl>
......@@ -559,6 +576,7 @@
</ul>
<div class="section" id="folder-final-output">
<h2>FOLDER <code class="docutils literal"><span class="pre">FINAL_OUTPUT</span></code><a class="headerlink" href="#folder-final-output" title="Permalink to this headline"></a></h2>
<p>In this folder, the final output files are stored. Most users want to examine the files in here for further analysis.</p>
<div class="section" id="subfolder-extension-regionextension">
<h3>Subfolder <code class="docutils literal"><span class="pre">extension{regionExtension}</span></code><a class="headerlink" href="#subfolder-extension-regionextension" title="Permalink to this headline"></a></h3>
<p>Stores results related to the user-specified extension size (<a class="reference internal" href="#parameter-regionextension"><span class="std std-ref">PARAMETER regionExtension</span></a>)</p>
......@@ -566,26 +584,47 @@
<p class="first admonition-title">Note</p>
<p class="last">Note that in all output files, in the column <code class="docutils literal"><span class="pre">permutation</span></code>, 0 always refers to the non-permuted, real data, while permutations &gt; 0 reflect real permutations.</p>
</div>
<ul class="simple">
<li><code class="docutils literal"><span class="pre">{comparisonType}.allMotifs.tsv.gz</span></code>: Summary table for each TFBS with the columns as follows:<ul>
<div class="section" id="file-comparisontype-allmotifs-tsv-gz">
<h4>FILE <code class="docutils literal"><span class="pre">{comparisonType}.allMotifs.tsv.gz</span></code><a class="headerlink" href="#file-comparisontype-allmotifs-tsv-gz" title="Permalink to this headline"></a></h4>
<dl class="docutils">
<dt>Summary</dt>
<dd>Summary table for each TFBS</dd>
<dt>Details</dt>
<dd><p class="first">Columns are as follows:</p>
<ul class="last simple">
<li><em>permutation</em>: The number of the permutation.</li>
<li><a href="#id18"><span class="problematic" id="id19">*</span></a>TF: name of the TF</li>
<li><em>chr, *MSS</em>, <em>MES</em>, <em>strand</em>, <em>TFBSID</em>: Genomic location and identifier of the (extended) TFBS</li>
<li><em>PSS, *PES</em>, <em>peakID</em>: Genomic location and annotation of the overlapping peak region</li>
<li><em>baseMean</em>, <em>log2FoldChange</em>, <em>lfcSE</em>, <em>stat</em>, <em>pvalue</em>, <em>padj</em>: Results from the DESeq2 analysis. See the <a class="reference external" href="https://www.bioconductor.org/help/course-materials/2015/LearnBioconductorFeb2015/B02.1.1_RNASeqLab.html">DESeq2 documentation</a> for details.</li>
</ul>
</li>
<li><code class="docutils literal"><span class="pre">{comparisonType}.TF_vs_peak_distribution.tsv</span></code>: TODO. The columns are as follows:<ul>
<li><em>TF</em>: name of the TF</li>
<li><em>permutation</em>: The number of the permutation.</li>
<li><em>Pos_l2FC</em>, <em>Mean_l2FC</em>, <em>Median_l2FC</em>, <em>sd_l2FC</em>, <em>Mode_l2FC</em>, <em>skewness_l2FC</em>: fraction of positive values, mean, median, standard deviation, mode value and Bickel’s measure of skewness of the log2 fold change distribution across all TFBS</li>
<li><em>pvalue_raw</em> and <em>pvalue_adj</em>: raw and adjusted (fdr) p-value of the t-test</li>
<li><em>T_statistic</em>: the value of the T statistic from the t-test</li>
<li><em>TFBS_num</em>: number of TFBS</li>
<li><em>Diff_mean</em>, <em>Diff_median</em>, <em>Diff_mode</em>, <em>Diff_skew</em>: Difference of the mean, median, mode, and skewness between the log2 fold-change distribution across all TFBS and the peaks, respectively</li>
<li><em>chr</em>, <em>MSS</em>, <em>MES</em>, <em>strand</em>, <em>TFBSID</em>: Genomic location and identifier of the (extended) TFBS</li>
<li><em>PSS</em>, <em>PES</em>, <em>peakID</em>: Genomic location and annotation of the overlapping peak region</li>
<li><em>baseMean</em>, <em>log2FoldChange</em>, <em>lfcSE</em>, <em>stat</em>, <em>pvalue</em>, <em>padj</em>: Results from the <em>DESeq2</em> analysis. See the <a class="reference external" href="https://www.bioconductor.org/help/course-materials/2015/LearnBioconductorFeb2015/B02.1.1_RNASeqLab.html">DESeq2 documentation</a> for details.</li>
</ul>
</li>
<li><code class="docutils literal"><span class="pre">{comparisonType}.summary.tsv</span></code>: The final summary table that is also used for the final circular visualization (see below). The columns are as follows:<ul>
</dd>
</dl>
</div>
<div class="section" id="file-comparisontype-tf-vs-peak-distribution-tsv">
<h4>FILE <code class="docutils literal"><span class="pre">{comparisonType}.TF_vs_peak_distribution.tsv</span></code><a class="headerlink" href="#file-comparisontype-tf-vs-peak-distribution-tsv" title="Permalink to this headline"></a></h4>
<dl class="docutils">
<dt>Summary</dt>
<dd>TODO</dd>
<dt>Details</dt>
<dd>Columns are as follows:
- <em>TF</em>: name of the TF
- <em>permutation</em>: The number of the permutation.
- <em>Pos_l2FC</em>, <em>Mean_l2FC</em>, <em>Median_l2FC</em>, <em>sd_l2FC</em>, <em>Mode_l2FC</em>, <em>skewness_l2FC</em>: fraction of positive values, mean, median, standard deviation, mode value and Bickel’s measure of skewness of the log2 fold change distribution across all TFBS
- <em>pvalue_raw</em> and <em>pvalue_adj</em>: raw and adjusted (fdr) p-value of the t-test
- <em>T_statistic</em>: the value of the T statistic from the t-test
- <em>TFBS_num</em>: number of TFBS
- <em>Diff_mean</em>, <em>Diff_median</em>, <em>Diff_mode</em>, <em>Diff_skew</em>: Difference of the mean, median, mode, and skewness between the log2 fold-change distribution across all TFBS and the peaks, respectively</dd>
</dl>
</div>
<div class="section" id="file-comparisontype-summary-tsv">
<h4>FILE <code class="docutils literal"><span class="pre">{comparisonType}.summary.tsv</span></code><a class="headerlink" href="#file-comparisontype-summary-tsv" title="Permalink to this headline"></a></h4>
<dl class="docutils">
<dt>Summary</dt>
<dd>The final summary table that is also used for the final circular visualization</dd>
<dt>Details</dt>
<dd><p class="first">The columns are as follows:</p>
<ul class="last simple">
<li><em>TF</em>: name of the TF</li>
<li><em>permutation</em>: The number of the permutation.</li>
<li><em>weighted_meanDifference</em>: the weighted mean difference of the real and background distribution across all CG bins. This value is the basis for the final calculation of the x-axis position for the circular plot. Without permutations, this value is shown, while for permutations, the weighted_meanDifference_enrichment is used.</li>
......@@ -601,48 +640,128 @@
<li><em>classification</em>: RNA-Seq classification (either activator, undetermined, repressor or not-expressed)</li>
<li><em>Cohend_factor</em> , <em>weighted_CD</em>, <em>weighted_median</em>, <em>weighted_sd</em>: Not used.</li>
</ul>
</dd>
</dl>
</div>
<div class="section" id="file-comparisontype-summary-circular-pdf">
<h4>FILE <code class="docutils literal"><span class="pre">{comparisonType}.summary.circular.pdf</span></code><a class="headerlink" href="#file-comparisontype-summary-circular-pdf" title="Permalink to this headline"></a></h4>
<dl class="docutils">
<dt>Summary</dt>
<dd>The final visualization of the diffTF results</dd>
<dt>Details</dt>
<dd><p class="first">The PDF contains multiple pages, the structure of which varies depending on the parameters:</p>
<ul class="last simple">
<li>Number of permutations &gt; 0<ul>
<li>RNA-Seq integration: Within each of the two following sections, the structure is identical (varying combinations of FDR threshold and RNA-Seq categories, see below)<ul>
<li>Pages 1-15: Visualization of the results including the permutations</li>
<li>Pages 16-30: Visualization of the results excluding the permutations</li>
</ul>
</li>
<li><code class="docutils literal"><span class="pre">{comparisonType}.summary.circular.pdf</span></code>: The final visualization of the diffTF results. The PDF contains multiple pages, the structure of which varies depending on the parameters:</li>
<li>No RNA-Seq integration: Within each of the two following sections, the structure is identical (varying combinations of FDR threshold and RNA-Seq categories, see below)<ul>
<li>Pages 1-5: Visualization of the results including the permutations for different FDR thresholds</li>
<li>Pages 6-10: Visualization of the results excluding the permutations for different FDR thresholds</li>
</ul>
<p>Number of permutations &gt; 0
RNA-Seq integration
Pages 1-15: Visualization of the results including the permutations
Pages 16-30: Visualization of the results excluding the permutations
Within each of these two categories, the structure is identical (varying combinations of FDR threshold and RNA-Seq categories, see below)
No RNA-Seq integration:
Pages 1-5: Visualization of the results including the permutations for different FDR thresholds
Pages 6-10: Visualization of the results excluding the permutations for different FDR thresholds
Within each of these two categories, the structure is identical (varying combinations of FDR threshold and RNA-Seq categories, see below)
Number of permutations = 0
RNA-Seq integration: Pages 1-15 show the results in a format as described below for varying combinations of FDR threshold and RNA-Seq categories
No RNA-Seq integration: Pages 1-5 show the results for different FDR thresholds
If RNA-Seq data is integrated, different combinations of categories are shown on each page (1:activator-undetermined-repressor-not-expressed, 2:activator-undetermined-repressor, 3:activator-repressor), which along with different FDR thresholds gives rise to multiple individual plots. The values on the x-axis denote the effect size (if permutations are incorporated, enrichment over background; otherwise, the mean difference between the two conditions), with higher values indicating a higher differential TF activity between the two conditions. TFs that are similarly active between the two conditions are close to 0, while TFs more active in either condition are located in the left and right part of the plot. The y-axis (radial position) denotes the statistical significance (adjusted p-values). The significance threshold is indicated as a red circular line. TFs that pass the significance threshold are labeled and, if RNA-Seq data is integrated, colored according to their predicted role (see above).
{comparisonType}.diagnosticPlots.pdf: Various diagnostic plots for the final TF activity values. If the number of permutations is larger than 0, the first three pages show various versions of the permuted weighted_meanDifference values and how they relate to the real ones. Permutation 0, as used everywhere throughout the pipeline, contains the real values, while any permutation &gt; 0 refers to an actual permutation. Page 1 shows real and permuted values, page 2 only permuted ones, page 3 a density plot of the real values with the permutation thresholds as dashed lines, inside of which TFs are not labeled as they fall within the permutation and therefore noise area. The next page shows various diagnostic plots from the locfdr package to estimate the distribution median, while the remaining plots show histograms of all relevant columns in the final output table for different sets of TFs depending on a specific FDR threshold.</p>
</li>
</ul>
</li>
<li>Number of permutations = 0<ul>
<li>RNA-Seq integration: Pages 1-15 show the results in a format as described below for varying combinations of FDR threshold and RNA-Seq categories</li>
<li>No RNA-Seq integration: Pages 1-5 show the results for different FDR thresholds</li>
</ul>
</li>
<li>If RNA-Seq data is integrated, different combinations of categories are shown on each page (1: activator-undetermined-repressor-not-expressed, 2: activator-undetermined-repressor, 3: activator-repressor), which along with different FDR thresholds gives rise to multiple individual plots. The values on the x-axis denote the effect size (if permutations are incorporated, enrichment over background; otherwise, the mean difference between the two conditions), with higher values indicating a higher differential TF activity between the two conditions. TFs that are similarly active between the two conditions are close to 0, while TFs more active in either condition are located in the left and right part of the plot. The y-axis (radial position) denotes the statistical significance (adjusted p-values). The significance threshold is indicated as a red circular line. TFs that pass the significance threshold are labeled and, if RNA-Seq data is integrated, colored according to their predicted role (see above).</li>
</ul>
</dd>
</dl>
</div>
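<p>The page layout described above depends on only two settings: whether permutations were run and whether RNA-Seq data is integrated. As a rough sketch that simply encodes the page ranges stated in this section, the layout could be looked up as follows. The function name and the dictionary keys are hypothetical and do not correspond to anything in the pipeline.</p>

```python
def circular_pdf_layout(n_permutations, rnaseq_integration):
    """Map the two settings to the documented (first_page, last_page) ranges."""
    # 15 pages per block with RNA-Seq integration, 5 pages without.
    block = 15 if rnaseq_integration else 5
    if n_permutations > 0:
        return {
            "including permutations": (1, block),
            "excluding permutations": (block + 1, 2 * block),
        }
    return {"results": (1, block)}


print(circular_pdf_layout(n_permutations=100, rnaseq_integration=True))
print(circular_pdf_layout(n_permutations=0, rnaseq_integration=False))
```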
<div class="section" id="file-comparisontype-diagnosticplots-pdf">
<h4>FILE <code class="docutils literal"><span class="pre">{comparisonType}.diagnosticPlots.pdf</span></code><a class="headerlink" href="#file-comparisontype-diagnosticplots-pdf" title="Permalink to this headline"></a></h4>
<dl class="docutils">
<dt>Summary</dt>
<dd>Various diagnostic plots for the final TF activity values</dd>
<dt>Details</dt>
<dd>If the number of permutations is larger than 0, the first three pages show various versions of the permuted weighted_meanDifference values and how they relate to the real ones. Permutation 0, as used everywhere throughout the pipeline, contains the real values, while any permutation &gt; 0 refers to an actual permutation. Page 1 shows real and permuted values, page 2 shows only permuted ones, and page 3 shows a density plot of the real values with the permutation thresholds as dashed lines. TFs that fall between these thresholds are not labeled because they lie within the range of the permuted values and are therefore treated as noise. The next page shows various diagnostic plots from the <em>locfdr</em> package to estimate the distribution median, while the remaining plots show histograms of all relevant columns in the final output table for different sets of TFs depending on a specific FDR threshold.</dd>
</dl>
</div>
</div>
</div>
<div class="section" id="folder-peaks">
<h2>FOLDER <code class="docutils literal"><span class="pre">PEAKS</span></code><a class="headerlink" href="#folder-peaks" title="Permalink to this headline"></a></h2>
<p>Stores peak-associated files.</p>
<ul class="simple">
<li>if no consensus peak file was provided (<a class="reference internal" href="#parameter-consensuspeaks"><span class="std std-ref">PARAMETER consensusPeaks</span></a>):<ul>
<li><code class="docutils literal"><span class="pre">{comparisonType}.consensusPeaks.bed</span></code> and <code class="docutils literal"><span class="pre">consensusPeaks_lengthDistribution.pdf</span></code>: generated consensus peaks, before filtering (see below) as well as a diagnostic plot showing the length distribution of the peaks</li>
<li><code class="docutils literal"><span class="pre">{comparisonType}.consensusPeaks.filtered.sorted.bed</span></code>: Produced in rule <code class="docutils literal"><span class="pre">filterSexChromosomesAndSortPeaks</span></code>. Filtered consensus peaks (removal of peaks from one of the following chromosomes: chrX, chrY, chrM, chrUn*, *random*, *hap|_gl*)</li>
</ul>
</li>
<li><code class="docutils literal"><span class="pre">{comparisonType}.allBams.peaks.overlaps.bed</span></code>: Produced in rule <code class="docutils literal"><span class="pre">intersectPeaksAndBAM</span></code>. Counts for each consensus peak with each of the input BAM files</li>
<li><code class="docutils literal"><span class="pre">{comparisonType}.sampleMetadata.rds</span></code>: Produced in rule <code class="docutils literal"><span class="pre">DESeqPeaks</span></code>. Stores data for the input data (similar to the input sample table), for both the real data and the permutations.</li>
<li><code class="docutils literal"><span class="pre">{comparisonType}.peaks.rds</span></code>: Produced in rule <code class="docutils literal"><span class="pre">DESeqPeaks</span></code>. Stores all peaks that will be used in the analysis.</li>
<li><code class="docutils literal"><span class="pre">{comparisonType}.peaks.tsv</span></code>: Produced in rule <code class="docutils literal"><span class="pre">DESeqPeaks</span></code>. Stores the DESEq2 results of the differential accessibility analysis for the peaks.</li>
<li><code class="docutils literal"><span class="pre">{comparisonType}.normFacs.rds</span></code>: Produced in rule <code class="docutils literal"><span class="pre">DESeqPeaks</span></code>. Gene-specific normalization factors for each sample and peak. This file is produced after the differential accessibility analysis for the peaks. The normalization factors will be used for the TF-specific differential accessibility analysis.</li>
<li><code class="docutils literal"><span class="pre">{comparisonType}.diagnosticPlots.peaks.pdf</span></code> and <code class="docutils literal"><span class="pre">{comparisonType}.diagnosticPlots.peaks_permutation{perm}.pdf</span></code> for each permutation <code class="docutils literal"><span class="pre">{perm}</span></code>: Produced in rule <code class="docutils literal"><span class="pre">DESeqPeaks</span></code>. Various diagnostic plots for the differential accessibility peak analysis for the real and permuted data, respectively:<ol class="arabic">
<div class="section" id="files-comparisontype-consensuspeaks-bed-and-consensuspeaks-lengthdistribution-pdf">
<h3>FILES <code class="docutils literal"><span class="pre">{comparisonType}.consensusPeaks.bed</span></code> and <code class="docutils literal"><span class="pre">consensusPeaks_lengthDistribution.pdf</span></code><a class="headerlink" href="#files-comparisontype-consensuspeaks-bed-and-consensuspeaks-lengthdistribution-pdf" title="Permalink to this headline"></a></h3>
<dl class="docutils">
<dt>Summary</dt>
<dd>Only present if no consensus peak file was provided (<a class="reference internal" href="#parameter-consensuspeaks"><span class="std std-ref">PARAMETER consensusPeaks</span></a>). Produced in rule <code class="docutils literal"><span class="pre">filterSexChromosomesAndSortPeaks</span></code>. Generated consensus peaks, before filtering (see below) as well as a diagnostic plot showing the length distribution of the peaks.</dd>
<dt>Details</dt>
<dd>Filtered consensus peaks (removal of peaks from one of the following chromosomes: chrX, chrY, chrM, chrUn*, *random*, *hap|_gl*)</dd>
</dl>
</div>
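<p>The chromosome filter described above can be approximated with a single regular expression. This is only a sketch under assumptions: it reads chrX, chrY and chrM as exact names, chrUn* as a prefix, and *random*, *hap and _gl* as substrings. The rule in the pipeline may be implemented with a different tool or pattern, and the names below are hypothetical.</p>

```python
import re

# Assumed interpretation of the documented patterns; not the pipeline's own code.
EXCLUDED_CHROMOSOMES = re.compile(r"^(chrX|chrY|chrM)$|^chrUn|random|hap|_gl")


def keep_peak(bed_line):
    """Return True if the peak's chromosome (first BED column) is not excluded."""
    chromosome = bed_line.split("\t", 1)[0]
    return EXCLUDED_CHROMOSOMES.search(chromosome) is None


# Made-up BED lines for illustration.
peaks = [
    "chr1\t100\t200",
    "chrX\t100\t200",
    "chrUn_gl000211\t100\t200",
    "chr6_apd_hap1\t100\t200",
    "chr10\t500\t900",
]
filtered = [line for line in peaks if keep_peak(line)]
print(filtered)
```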
<div class="section" id="file-comparisontype-allbams-peaks-overlaps-bed">
<h3>FILE <code class="docutils literal"><span class="pre">{comparisonType}.allBams.peaks.overlaps.bed</span></code><a class="headerlink" href="#file-comparisontype-allbams-peaks-overlaps-bed" title="Permalink to this headline"></a></h3>
<dl class="docutils">
<dt>Summary</dt>
<dd>Produced in rule <code class="docutils literal"><span class="pre">intersectPeaksAndBAM</span></code>. Read counts for each consensus peak in each of the input BAM files.</dd>
</dl>
<p>Details</p>
</div>
<div class="section" id="file-comparisontype-samplemetadata-rds">
<h3>FILE <code class="docutils literal"><span class="pre">{comparisonType}.sampleMetadata.rds</span></code><a class="headerlink" href="#file-comparisontype-samplemetadata-rds" title="Permalink to this headline"></a></h3>
<dl class="docutils">
<dt>Summary</dt>
<dd>Produced in rule <code class="docutils literal"><span class="pre">DESeqPeaks</span></code>. Stores metadata for the input samples (similar to the input sample table), for both the real data and the permutations.</dd>
</dl>
<p>Details</p>
</div>
<div class="section" id="file-comparisontype-peaks-rds">
<h3>FILE <code class="docutils literal"><span class="pre">{comparisonType}.peaks.rds</span></code><a class="headerlink" href="#file-comparisontype-peaks-rds" title="Permalink to this headline"></a></h3>
<dl class="docutils">
<dt>Summary</dt>
<dd>Produced in rule <code class="docutils literal"><span class="pre">DESeqPeaks</span></code>. Stores all peaks that will be used in the analysis.</dd>
</dl>
<p>Details</p>
</div>
<div class="section" id="file-comparisontype-peaks-tsv">
<h3>FILE <code class="docutils literal"><span class="pre">{comparisonType}.peaks.tsv</span></code><a class="headerlink" href="#file-comparisontype-peaks-tsv" title="Permalink to this headline"></a></h3>
<dl class="docutils">
<dt>Summary</dt>
<dd>Produced in rule <code class="docutils literal"><span class="pre">DESeqPeaks</span></code>. Stores the <em>DESeq2</em> results of the differential accessibility analysis for the peaks.</dd>
</dl>
<p>Details</p>
</div>
<div class="section" id="file-comparisontype-normfacs-rds">
<h3>FILE <code class="docutils literal"><span class="pre">{comparisonType}.normFacs.rds</span></code><a class="headerlink" href="#file-comparisontype-normfacs-rds" title="Permalink to this headline"></a></h3>
<dl class="docutils">
<dt>Summary</dt>
<dd>Produced in rule <code class="docutils literal"><span class="pre">DESeqPeaks</span></code>. Gene-specific normalization factors for each sample and peak.</dd>
<dt>Details</dt>
<dd>This file is produced after the differential accessibility analysis for the peaks. The normalization factors will be used for the TF-specific differential accessibility analysis.</dd>
</dl>
</div>
<div class="section" id="files-comparisontype-diagnosticplots-peaks-pdf-and-comparisontype-diagnosticplots-peaks-permutation-perm-pdf-for-each-permutation-perm">
<h3>FILES <code class="docutils literal"><span class="pre">{comparisonType}.diagnosticPlots.peaks.pdf</span></code> and <code class="docutils literal"><span class="pre">{comparisonType}.diagnosticPlots.peaks_permutation{perm}.pdf</span></code> for each permutation <code class="docutils literal"><span class="pre">{perm}</span></code><a class="headerlink" href="#files-comparisontype-diagnosticplots-peaks-pdf-and-comparisontype-diagnosticplots-peaks-permutation-perm-pdf-for-each-permutation-perm" title="Permalink to this headline"></a></h3>
<dl class="docutils">
<dt>Summary</dt>
<dd>Produced in rule <code class="docutils literal"><span class="pre">DESeqPeaks</span></code>. Various diagnostic plots for the differential accessibility peak analysis for the real and permuted data, respectively</dd>
<dt>Details</dt>
<dd><p class="first">The pages are as follows:</p>
<ol class="last arabic simple">
<li>MA plots</li>
<li>density plots of normalized and non-normalized counts</li>
<li>mean-average plots (average of the log-transformed counts vs the fold-change per peak) for each of the sample pairs</li>
<li>mean SD plots (row standard deviations versus row means)</li>
</ol>
</li>
<li><code class="docutils literal"><span class="pre">{comparisonType}.DESeq.object.rds</span></code>: Produced in rule <code class="docutils literal"><span class="pre">DESeqPeaks</span></code>. The DESeq2 object from the differential accessibility peak analysis.</li>
</ul>
</dd>
</dl>
</div>
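<p>For the mean-average plots listed above, each peak contributes one point per sample pair: the average of the log-transformed counts and the log fold-change. A minimal sketch of that calculation is shown below. It assumes a log2 transformation and a pseudocount of 1 to avoid taking the logarithm of zero. Both choices, as well as the function name, are assumptions and may differ from what the pipeline does.</p>

```python
import math


def ma_values(count_a, count_b, pseudocount=1.0):
    """Return (A, M): mean of the log2 counts and the log2 fold-change."""
    log_a = math.log2(count_a + pseudocount)
    log_b = math.log2(count_b + pseudocount)
    return (log_a + log_b) / 2, log_a - log_b


# Made-up counts for one peak in two samples.
a_value, m_value = ma_values(3, 1)
print(a_value, m_value)
```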
<div class="section" id="file-comparisontype-deseq-object-rds">
<h3>FILE <code class="docutils literal"><span class="pre">{comparisonType}.DESeq.object.rds</span></code><a class="headerlink" href="#file-comparisontype-deseq-object-rds" title="Permalink to this headline"></a></h3>
<dl class="docutils">
<dt>Summary</dt>
<dd>Produced in rule <code class="docutils literal"><span class="pre">DESeqPeaks</span></code>. The <em>DESeq2</em> object from the differential accessibility peak analysis.</dd>
</dl>
<p>Details</p>
</div>
</div>
<div class="section" id="folder-tf-specific">
<h2>FOLDER <code class="docutils literal"><span class="pre">TF-SPECIFIC</span></code><a class="headerlink" href="#folder-tf-specific" title="Permalink to this headline"></a></h2>
......@@ -652,15 +771,15 @@ If RNA-Seq data is integrated, different combinations of categories are shown on
<ul>
<li><p class="first"><code class="docutils literal"><span class="pre">{TF}.{comparisonType}.allBAMs.overlaps.bed.gz</span></code> and <code class="docutils literal"><span class="pre">{TF}.{comparisonType}.allBAMs.overlaps.bed.summary</span></code>: Overlap and featureCounts summary file of read counts across all TFBS for all input BAM files.</p>
</li>
<li><p class="first"><code class="docutils literal"><span class="pre">{TF}.{comparisonType}.output.tsv</span></code>: Produced in rule <code class="docutils literal"><span class="pre">analyzeTF</span></code>. A summary table for the DeSeq2 analysis. See the file <code class="docutils literal"><span class="pre">{comparisonType}.allMotifs.tsv.gz</span></code> in the <code class="docutils literal"><span class="pre">FINAL_OUTPUT</span></code> folder for a column description.</p>
<li><p class="first"><code class="docutils literal"><span class="pre">{TF}.{comparisonType}.output.tsv</span></code>: Produced in rule <code class="docutils literal"><span class="pre">analyzeTF</span></code>. A summary table for the <em>DESeq2</em> analysis. See the file <code class="docutils literal"><span class="pre">{comparisonType}.allMotifs.tsv.gz</span></code> in the <code class="docutils literal"><span class="pre">FINAL_OUTPUT</span></code> folder for a column description.</p>
</li>
<li><p class="first"><code class="docutils literal"><span class="pre">{TF}.{comparisonType}.summary.rds</span></code>: Produced in rule <code class="docutils literal"><span class="pre">analyzeTF</span></code>. A summary table for the log2 fold-changes across all TFBS DESeq2 results.</p>
<li><p class="first"><code class="docutils literal"><span class="pre">{TF}.{comparisonType}.summary.rds</span></code>: Produced in rule <code class="docutils literal"><span class="pre">analyzeTF</span></code>. A summary table of the log2 fold-changes from the <em>DESeq2</em> results across all TFBS.</p>
</li>
<li><p class="first"><code class="docutils literal"><span class="pre">{TF}.{comparisonType}.diagnosticPlots.pdf</span></code> and <code class="docutils literal"><span class="pre">{TF}.{comparisonType}.diagnosticPlots_permutation{perm}.pdf</span></code>: Produced in rule <code class="docutils literal"><span class="pre">analyzeTF</span></code>. Various diagnostic plots for the differential accessibility TFBS analysis for the real and permuted data, respectively. See the description of the file <code class="docutils literal"><span class="pre">{comparisonType}.diagnosticPlots.peaks.pdf</span></code> in the <code class="docutils literal"><span class="pre">PEAKS</span></code> folder, which has an identical structure.</p>
</li>
<li><p class="first"><code class="docutils literal"><span class="pre">{TF}.{comparisonType}.summaryPlots.pdf</span></code> and <code class="docutils literal"><span class="pre">{TF}.{comparisonType}.summaryPlots_permutation{perm}.pdf</span></code>: Produced in rule <code class="docutils literal"><span class="pre">analyzeTF</span></code>. A PDF with a summary of the DESeq2 analysis for the real and permuted data, respectively: Page 1 shows a density plot of the log2 fold-changes for the specific pairwise condition that the user selected, separately for the peaks only and across all TFBS from the specific TF. Page 2 shows the same but in a cumulative representation.</p>
<li><p class="first"><code class="docutils literal"><span class="pre">{TF}.{comparisonType}.summaryPlots.pdf</span></code> and <code class="docutils literal"><span class="pre">{TF}.{comparisonType}.summaryPlots_permutation{perm}.pdf</span></code>: Produced in rule <code class="docutils literal"><span class="pre">analyzeTF</span></code>. A PDF with a summary of the <em>DESeq2</em> analysis for the real and permuted data, respectively: Page 1 shows a density plot of the log2 fold-changes for the specific pairwise condition that the user selected, separately for the peaks only and across all TFBS from the specific TF. Page 2 shows the same but in a cumulative representation.</p>
</li>
<li><p class="first"><code class="docutils literal"><span class="pre">{TF}.{comparisonType}.DESeq.object.rds</span></code>: Produced in rule <code class="docutils literal"><span class="pre">analyzeTF</span></code>. Original DESeq2 object.</p>
<li><p class="first"><code class="docutils literal"><span class="pre">{TF}.{comparisonType}.DESeq.object.rds</span></code>: Produced in rule <code class="docutils literal"><span class="pre">analyzeTF</span></code>. Original <em>DESeq2</em> object.</p>
</li>
<li><p class="first"><code class="docutils literal"><span class="pre">{TF}.{comparisonType}.permutationResults.rds</span></code>: Produced in rule <code class="docutils literal"><span class="pre">binningTF</span></code>. Contains a data frame with the bin-specific results.</p>
</li>
......
Search.setIndex({docnames:["chapter1","chapter2","index","projectInfo"],envversion:53,filenames:["chapter1.rst","chapter2.rst","index.rst","projectInfo.rst"],objects:{},objnames:{},objtypes:{},terms:{"8see