Commit b09d24c1 authored by Christian Arnold

GRaNIEdev: Important bugfix for enrichment analyses: a regression introduced in 0.14 is now fixed. Thanks to Nila for reporting it.
parent 8770eb71
Pipeline #28126 passed with stage in 11 seconds
Package: GRaNIEdev
Package: GRaNIE
Title: GRaNIE: Reconstruction cell type specific gene regulatory networks including enhancers using chromatin accessibility and RNA-seq data
Version: 0.14.0
Version: 0.14.3
Encoding: UTF-8
Authors@R: c(person("Christian", "Arnold", email =
"christian.arnold@embl.de", role = c("cre","aut")),
person("Judith", "Zaugg", email =
"judith.zaugg@embl.de", role = c("aut")),
person("Rim", "Moussa", email =
"rim.moussa01@gmail.com", role = "ctb"),
"rim.moussa01@gmail.com", role = "aut"),
person("Armando", "Reyes-Palomares", email =
"armandorp@gmail.com", role = "ctb"),
person("Giovanni", "Palla", email =
......
<!--
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-->
# GRaNIEdev Changelog and News
# GRaNIE 0.9-0.14 (2021-12-13)
## GRaNIEdev 0.9 - 0.14 (2021-12-13)
### Major changes
## Major changes
- major overhaul and continuous work on peak-gene QC plots
- the *filterData* function now has more filter parameters, such as filtering for CV. Also, all filters uniformly have a *min* and *max* filter.
......@@ -13,67 +8,63 @@
- handling of edge cases and rare events in various functions
- packages have been renamed to *GRaNIE* as basename (before: *GRN*)
### Bug fixes
## Bug fixes
- various minor bug fixes
### Minor changes
## Minor changes
- changed the object structure slightly and moved some gene and peak annotation data (such as mean, CV) to the appropriate annotation slot
## GRaNIEdev 0.8 (2021-05-07)
# GRaNIE 0.8 (2021-05-07)
### Major changes
## Major changes
- improved PCA plotting, PCA plots are now produced for both raw and normalized data
- new filters for the function *filterGRaNIEAndConnectGenes* (*peak_gene.maxDistance*) as well as more flexibility how to adjust the peak-gene raw p-values for multiple testing (including the possibility to use IHW - experimental)
- new function *plotDiagnosticPlots_TFPeaks* for plotting (this function was previously called only internally, but is now properly exported), in analogy to *plotDiagnosticPlots_peakGene*
### Bug fixes
## Bug fixes
- various minor bug fixes (PCA plotting, compatibility when providing pre-normalized data)
### Minor changes
## Minor changes
- changed the object structure slightly and cleaned the config slot, for example
- some functions have been added / renamed to make the workflow more clear and streamlined, see Vignette for details
- some default parameters changed
## GRaNIEdev 0.7 (2021-03-12)
# GRaNIE 0.7 (2021-03-12)
### Major changes
## Major changes
- improved PCA plotting, also works for pre-normalized counts now when provided as input originally
- more flexibility for data normalization
- homogenized wordings, function calls and workflow clarity, removed unnecessary warnings when plotting peak-gene diagnostic plots, added more R help documentation
- added IHW (Independent Hypothesis Weighting) as a multiple testing procedure for peak-gene p-values in addition to now allowing all methods that are supported by p.adjust
### Bug fixes
## Bug fixes
- various minor bug fixes
### Minor changes
## Minor changes
<!--
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-->
## GRaNIEdev 0.6 (2021-02-09)
# GRaNIE 0.6 (2021-02-09)
### Major changes
## Major changes
- significant speed improvements for the peak-FDR calculations and subsequent plotting
- TF-peak diagnostic plots now also show negatively correlated TF-peak statistics irrespective of whether they have been filtered out in the object / pipeline. This may be useful for diagnostic purposes to check whether excluding them is a sensible choice and to confirm the numbers are low
### Bug fixes
## Bug fixes
- Numbers for connections per correlation bin in the TF-peak diagnostic plots were wrong as they did not correctly differentiate between the different connection types in case multiple ones had been specified (e.g., expression and TF activity). This has been fixed.
### Minor changes
## Minor changes
<!--
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-->
## GRaNIEdev 0.5 (2021-02-02)
# GRaNIE 0.5 (2021-02-02)
first published package version
......@@ -4387,6 +4387,7 @@ calculateGeneralEnrichment <- function(GRN, ontology = c("BP", "MF"),
mapping = mapping)
GRN@stats[["Enrichment"]][["general"]] = general.enrichment
futile.logger::flog.info(paste0("Results stored in GRN@stats[[\"Enrichment\"]][[\"general\"]]"))
} else {
......@@ -4394,7 +4395,7 @@ calculateGeneralEnrichment <- function(GRN, ontology = c("BP", "MF"),
}
.printExecutionTime(start)
.printExecutionTime(start, prefix = "")
GRN
}
......@@ -4459,14 +4460,15 @@ calculateGeneralEnrichment <- function(GRN, ontology = c("BP", "MF"),
result.list = list()
# Construct a named vector with 0 or 1, 1 means in the foreground, 0 only in background
geneList = factor(as.integer(unique(background) %in% unique(foreground)))
names(geneList) = unique(background)
for (ontology_cur in ontology){
if (length(geneList == 0)) {
if (length(geneList) == 0) {
futile.logger::flog.warn(paste0("No enrichment found for ", ontology_cur, "."))
result.tbl = NULL
} else {
......@@ -4485,6 +4487,14 @@ calculateGeneralEnrichment <- function(GRN, ontology = c("BP", "MF"),
result.tbl = unique(topGO::GenTable(data, pval = result, orderBy = "pval", numChar = 1000,
topNodes = length(topGO::score(result))) )
result.tbl$GeneRatio = result.tbl$Significant / length(unique(foreground))
# Summarize results
futile.logger::flog.info(paste0(" Enrichment calculation finished for ontology ", ontology_cur, ". Checked ", nrow(result.tbl), " terms"))
for (pvalueThres in c(0.01, 0.05, 0.1, 0.2)) {
dataCur = dplyr::filter(result.tbl, pval <= pvalueThres)
futile.logger::flog.info(paste0(" Number of terms for which p-value <= ", pvalueThres, ": ", nrow(dataCur)))
}
}
......@@ -4702,6 +4712,8 @@ calculateCommunitiesEnrichment <- function(GRN,
}
}
futile.logger::flog.info(paste0("Results stored in GRN@stats[[\"Enrichment\"]][[\"byCommunity\"]]"))
.printExecutionTime(start)
GRN
......
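The one-character fix in the hunk above, `length(geneList == 0)` versus `length(geneList) == 0`, is easy to misread, so here is a minimal sketch of why it matters. The gene names and the two helper functions are hypothetical and only mirror the two conditions from the diff; they are not part of the package.

```r
# Hypothetical toy inputs; the construction of geneList mirrors the diff above
background <- c("geneA", "geneB", "geneC")
foreground <- c("geneA", "geneC")
geneList <- factor(as.integer(unique(background) %in% unique(foreground)))
names(geneList) <- unique(background)

# Old condition: the comparison is evaluated first, giving a logical vector as
# long as the input; its length is non-zero (hence TRUE in if()) for any
# non-empty input
is_empty_buggy <- function(x) as.logical(length(x == 0))

# New condition: TRUE only if the input really has no elements
is_empty_fixed <- function(x) length(x) == 0

buggy_nonempty <- is_empty_buggy(geneList)    # TRUE: enrichment wrongly skipped
fixed_nonempty <- is_empty_fixed(geneList)    # FALSE: enrichment is run
buggy_empty    <- is_empty_buggy(integer(0))  # FALSE: empty input not caught
fixed_empty    <- is_empty_fixed(integer(0))  # TRUE: empty input is caught
```

Under these assumptions, the old condition is inverted: every non-empty gene list takes the "No enrichment found" branch and gets `result.tbl = NULL`, which is consistent with the regression described in the commit message.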
......@@ -2,7 +2,7 @@
The companion packages *GRaNIE* (**G**ene **R**egul**a**tory **N**etwork **I**nference including **E**nhancers) and *GRaNPA* (**G**ene **R**egul**a**tory **N**etwork **P**erformance **A**nalysis) are currently under active development. If you have questions, please do not hesitate to contact us (see below).
This page describes *GRaNIE*. For *GRaNPA*, please visit [this page](https://git.embl.de/grp-zaugg/GRaNPA).
This page describes *GRaNIE*. For *GRaNPA*, please visit [this page](https://grp-zaugg.embl-community.io/GRaNPA).
### Summary
......
<!-- Generated by pkgdown: do not edit by hand -->
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<!-- Generated by pkgdown: do not edit by hand --><html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Page not found (404) • GRaNIEdev</title>
<!-- jquery -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js" integrity="sha256-CSXorXvZcTkaix6Yvo6HppcZGetbYMGWSFlBw8HfCJo=" crossorigin="anonymous"></script>
<!-- Bootstrap -->
<link href="https://cdnjs.cloudflare.com/ajax/libs/bootswatch/3.4.0/flatly/bootstrap.min.css" rel="stylesheet" crossorigin="anonymous" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.4.1/js/bootstrap.min.js" integrity="sha256-nuL8/2cJ5NDSSwnKD8VqreErSWHtnEP9E7AySL+1ev4=" crossorigin="anonymous"></script>
<!-- bootstrap-toc -->
<link rel="stylesheet" href="bootstrap-toc.css">
<script src="bootstrap-toc.js"></script>
<!-- Font Awesome icons -->
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous" />
<!-- clipboard.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" crossorigin="anonymous"></script>
<!-- headroom.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script>
<!-- pkgdown -->
<link href="pkgdown.css" rel="stylesheet">
<script src="pkgdown.js"></script>
<meta property="og:title" content="Page not found (404)" />
<!-- mathjax -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script>
<!--[if lt IE 9]>
<title>Page not found (404) • GRaNIE</title>
<!-- jquery --><script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js" integrity="sha256-CSXorXvZcTkaix6Yvo6HppcZGetbYMGWSFlBw8HfCJo=" crossorigin="anonymous"></script><!-- Bootstrap --><link href="https://cdnjs.cloudflare.com/ajax/libs/bootswatch/3.4.0/flatly/bootstrap.min.css" rel="stylesheet" crossorigin="anonymous">
<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.4.1/js/bootstrap.min.js" integrity="sha256-nuL8/2cJ5NDSSwnKD8VqreErSWHtnEP9E7AySL+1ev4=" crossorigin="anonymous"></script><!-- bootstrap-toc --><link rel="stylesheet" href="bootstrap-toc.css">
<script src="bootstrap-toc.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous">
<!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" crossorigin="anonymous"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- pkgdown --><link href="pkgdown.css" rel="stylesheet">
<script src="pkgdown.js"></script><meta property="og:title" content="Page not found (404)">
<!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=G-530L9SXFM1"></script>
<script>
<![endif]--><!-- Global site tag (gtag.js) - Google Analytics --><script async src="https://www.googletagmanager.com/gtag/js?id=G-530L9SXFM1"></script><script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-530L9SXFM1');
</script>
</head>
<body data-spy="scroll" data-target="#toc">
</head>
<body data-spy="scroll" data-target="#toc">
<div class="container template-title-body">
<header>
<div class="navbar navbar-default navbar-fixed-top" role="navigation">
<header><div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false">
......@@ -80,14 +37,14 @@
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class="navbar-link" href="index.html">GRaNIEdev</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.14.0</span>
<a class="navbar-link" href="index.html">GRaNIE</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="">0.14.3</span>
</span>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<li>
<a href="index.html"></a>
</li>
<li>
......@@ -100,7 +57,7 @@
<span class="caret"></span>
</a>
<ul class="dropdown-menu" role="menu">
<li>
<li>
<a href="articles/quickStart.html">Getting Started</a>
</li>
<li>
......@@ -118,19 +75,17 @@
<a href="news/index.html">Changelog &amp; News</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right">
</ul>
</div><!--/.nav-collapse -->
</div><!--/.container -->
</div><!--/.navbar -->
<ul class="nav navbar-nav navbar-right"></ul>
</div>
<!--/.nav-collapse -->
</div>
<!--/.container -->
</div>
<!--/.navbar -->
</header>
<div class="row">
</header><div class="row">
<div class="contents col-md-9">
<div class="page-header">
<h1>Page not found (404)</h1>
......@@ -141,31 +96,31 @@ Content not found. Please use links in the navbar.
</div>
<div class="col-md-3 hidden-xs hidden-sm" id="pkgdown-sidebar">
<nav id="toc" data-toggle="toc" class="sticky-top">
<h2 data-toc-skip>Contents</h2>
<nav id="toc" data-toggle="toc" class="sticky-top"><h2 data-toc-skip>Contents</h2>
</nav>
</div>
</div>
</div>
<footer>
<div class="copyright">
<p>Developed by Christian Arnold, Rim Moussa.</p>
<footer><div class="copyright">
<p></p>
<p>Developed by Christian Arnold, Judith Zaugg, Rim Moussa.</p>
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.6.1.</p>
<p></p>
<p>Site built with <a href="https://pkgdown.r-lib.org/" class="external-link">pkgdown</a> 2.0.1.</p>
</div>
</footer>
</div>
</div>
</body>
</html>
......@@ -5,14 +5,14 @@
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Introduction and Methodological Details • GRaNIEdev</title>
<title>Introduction and Methodological Details • GRaNIE</title>
<!-- jquery --><script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.min.js" integrity="sha256-CSXorXvZcTkaix6Yvo6HppcZGetbYMGWSFlBw8HfCJo=" crossorigin="anonymous"></script><!-- Bootstrap --><link href="https://cdnjs.cloudflare.com/ajax/libs/bootswatch/3.4.0/flatly/bootstrap.min.css" rel="stylesheet" crossorigin="anonymous">
<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.4.1/js/bootstrap.min.js" integrity="sha256-nuL8/2cJ5NDSSwnKD8VqreErSWHtnEP9E7AySL+1ev4=" crossorigin="anonymous"></script><!-- bootstrap-toc --><link rel="stylesheet" href="../bootstrap-toc.css">
<script src="../bootstrap-toc.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous">
<!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" crossorigin="anonymous"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- pkgdown --><link href="../pkgdown.css" rel="stylesheet">
<script src="../pkgdown.js"></script><meta property="og:title" content="Introduction and Methodological Details">
<meta property="og:description" content="GRaNIEdev">
<meta property="og:description" content="GRaNIE">
<!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
......@@ -25,6 +25,8 @@
</script>
</head>
<body data-spy="scroll" data-target="#toc">
<div class="container template-article">
<header><div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
......@@ -36,8 +38,8 @@
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">GRaNIEdev</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.14.0</span>
<a class="navbar-link" href="../index.html">GRaNIE</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="">0.14.3</span>
</span>
</div>
......@@ -84,13 +86,13 @@
</header><script src="Introduction_files/header-attrs-2.11/header-attrs.js"></script><div class="row">
</header><div class="row">
<div class="col-md-9 contents">
<div class="page-header toc-ignore">
<h1 data-toc-skip>Introduction and Methodological Details</h1>
<h4 class="author">Christian Arnold, Judith Zaugg</h4>
<h4 data-toc-skip class="author">Christian Arnold, Judith Zaugg</h4>
<h4 class="date">13 December 2021</h4>
<h4 data-toc-skip class="date">15 December 2021</h4>
<div class="hidden name"><code>Introduction.Rmd</code></div>
......@@ -103,9 +105,9 @@
<p>This vignette introduces the <code>GRaNIE</code> package and explains the main features, methods and necessary background.</p>
</div>
<div id="motivation" class="section level1">
<h1 class="hasAnchor">
<a href="#motivation" class="anchor"></a>Motivation and Necessity</h1>
<div class="section level1">
<h1 id="motivation">Motivation and Necessity<a class="anchor" aria-label="anchor" href="#motivation"></a>
</h1>
<!-- <div align="center"> -->
<!-- <figure> -->
<!-- <img src="figs/Logo.png" height="200px"/> -->
......@@ -116,18 +118,18 @@
<p>Genetic variants associated with diseases often affect non-coding regions, thus likely having a regulatory role. To understand the effects of genetic variants in these regulatory regions, identifying genes that are modulated by specific regulatory elements (REs) is crucial. The effect of gene regulatory elements, such as enhancers, is often cell-type specific, likely because the combinations of transcription factors (TFs) that regulate a given enhancer have cell-type-specific activity. This TF activity can be quantified with existing tools such as <em>diffTF</em> and captures differences in binding of a TF in open chromatin regions. Collectively, this forms a gene regulatory network (GRN) with cell-type and data-specific TF-RE and RE-gene links. Here, we reconstruct such a GRN using bulk RNA-seq and open chromatin data (e.g., from ATAC-seq or ChIP-seq for open chromatin marks) and optionally TF activity data. Our network contains different types of links, connecting TFs to regulatory elements, which are in turn connected to genes in the vicinity or within the same chromatin domain (TAD). We use a statistical framework to assign empirical FDRs and weights to all links using a permutation-based approach.</p>
<p>In summary, we present a framework to reconstruct predictive enhancer-mediated regulatory network models that are based on integrating expression and chromatin accessibility/activity patterns across individuals, and provide a comprehensive resource of cell-type specific gene regulatory networks for particular cell types.</p>
</div>
<div id="installation" class="section level1">
<h1 class="hasAnchor">
<a href="#installation" class="anchor"></a>Installation and Example Workflow</h1>
<div class="section level1">
<h1 id="installation">Installation and Example Workflow<a class="anchor" aria-label="anchor" href="#installation"></a>
</h1>
<p>Please see the <a href="quickStart.html">quick start vignette for how to install our <code>GRaNIE</code> package(s)</a> and the <a href="workflow.html">workflow vignette for an example workflow</a>.</p>
</div>
<div id="input" class="section level1">
<h1 class="hasAnchor">
<a href="#input" class="anchor"></a>Input</h1>
<div class="section level1">
<h1 id="input">Input<a class="anchor" aria-label="anchor" href="#input"></a>
</h1>
<p>In our <code>GRN</code> approach, we integrate multiple data modalities. Here, we describe them in detail and their required format.</p>
<div id="input_peaks" class="section level2">
<h2 class="hasAnchor">
<a href="#input_peaks" class="anchor"></a>Open chromatin and RNA-seq data</h2>
<div class="section level2">
<h2 id="input_peaks">Open chromatin and RNA-seq data<a class="anchor" aria-label="anchor" href="#input_peaks"></a>
</h2>
<p>Open chromatin data may come from ATAC-seq, DNAse-seq or ChIP-seq data for particular histone modifications that associate with open chromatin such as histone acetylation (e.g., H3K27ac). They all capture open chromatin either directly or indirectly, and while we primarily tested and used ATAC-seq while developing the package, the others should also be applicable for our framework. <em>From here on, we will refer to these regions simply as peaks.</em></p>
<p>For RNA-seq, the data represent expression counts per gene across samples.</p>
<p>Here is a quick graphical representation of the format that is required to be compatible with our framework:</p>
......@@ -139,7 +141,7 @@
<ul>
<li>The name of the ID column can be anything and can be specified later in the pipeline. For peaks, we usually use <code>peakID</code> while for RNA-seq, we use <code>EnsemblID</code>
</li>
<li>for peaks, the required format is “chr:start-end”, with <code>chr</code> denoting the chromosome, followed by <code><a href="https://rdrr.io/r/base/Colon.html">:</a></code>, and then <code>start</code>, <code><a href="https://rdrr.io/r/base/Arithmetic.html">-</a></code>, and <code>end</code> for the peak start and end, respectively.</li>
<li>for peaks, the required format is “chr:start-end”, with <code>chr</code> denoting the chromosome, followed by <code>:</code>, and then <code>start</code>, <code>-</code>, and <code>end</code> for the peak start and end, respectively.</li>
</ul>
</li>
<li>counts should be raw if possible (that is, integers), but we also support pre-normalized data. <a href="#methods_dataNorm">See here for more information.</a>
......@@ -149,9 +151,9 @@
<p>Note that peaks must not overlap. If they do, an informative error message is thrown and the user is requested to modify the peak input data so that no overlaps exist among all peaks. This can be done by either merging overlapping peaks or deleting those that overlap with other peaks based on other criteria such as peak signal, by keeping only the strongest peak, for example.</p>
<p>For guidelines on how many peaks are necessary or recommended, see <a href="#guidelines">the section below</a>.</p>
</div>
<div id="input_TF" class="section level2">
<h2 class="hasAnchor">
<a href="#input_TF" class="anchor"></a>TF and TFBS data</h2>
<div class="section level2">
<h2 id="input_TF">TF and TFBS data<a class="anchor" aria-label="anchor" href="#input_TF"></a>
</h2>
<p>TF and TFBS data are mandatory as input. Specifically, the package requires a <code>bed</code> file per TF with TF binding sites (TFBS). TFBS can either be in-silico predicted, or experimentally verified, as long as genome-wide TFBS can be used. For convenience and orientation, we provide TFBS predictions for HOCOMOCO-based TF motifs that were used with <code>PWMScan</code> for <code>hg19</code>, <code>hg38</code> and <code>mm10</code>. Check the <a href="workflow.html">workflow vignette for an example</a>.</p>
<p>However, you may also use your own TFBS data, and we provide full flexibility in doing so. Only some manual preparation is necessary. Briefly, if you decide to use your own TFBS data, you have to prepare the following:</p>
<ul>
......@@ -161,56 +163,56 @@
</ul>
<p>For more methodological details, including how to construct these files and their exact format, we refer to the <code>diffTF</code> paper.</p>
</div>
<div id="input_metadata" class="section level2">
<h2 class="hasAnchor">
<a href="#input_metadata" class="anchor"></a>Sample metadata (optional but highly recommended)</h2>
<div class="section level2">
<h2 id="input_metadata">Sample metadata (optional but highly recommended)<a class="anchor" aria-label="anchor" href="#input_metadata"></a>
</h2>
<p>Providing sample metadata is optional but highly recommended. If available, the sample metadata is integrated into the PCA plots to show where the variation in the data comes from and whether any of the metadata (e.g., age, sex, sequencing batch) is associated with the principal components (PCs), which would indicate a batch effect that needs to be addressed before running the <code>GRaNIE</code> pipeline.</p>
<p>The integration of sample metadata is in the <code>addData</code> function, see <code><a href="../reference/addData.html">?addData</a></code> for more information.</p>
</div>
<div id="input_HiC" class="section level2">
<h2 class="hasAnchor">
<a href="#input_HiC" class="anchor"></a>Hi-C data (optional)</h2>
<div class="section level2">
<h2 id="input_HiC">Hi-C data (optional)<a class="anchor" aria-label="anchor" href="#input_HiC"></a>
</h2>
<p>Integration of Hi-C data is optional and serves as alternative to identifying peak-gene pairs to test for correlation based on a predefined and fixed <em>neighborhood</em> size (see <a href="#methods_peakGene">Methods</a>).</p>
<p>If Hi-C data are available, the pipeline expects a BED file format with at least 3 columns: chromosome name, start, and end. An ID column is optional and assumed to be in the 4th column, all additional columns are ignored.</p>
<p>For more details, see the R help (<code><a href="../reference/addConnections_peak_gene.html">?addConnections_peak_gene</a></code>) and the <a href="#methods_peakGene">Methods</a>.</p>
</div>
<div id="input_SNP" class="section level2">
<h2 class="hasAnchor">
<a href="#input_SNP" class="anchor"></a>SNP data (optional, coming soon)</h2>
<div class="section level2">
<h2 id="input_SNP">SNP data (optional, coming soon)<a class="anchor" aria-label="anchor" href="#input_SNP"></a>
</h2>
<p>We also plan to integrate SNP data soon, stay tuned!</p>
</div>
</div>
<div id="methods" class="section level1">
<h1 class="hasAnchor">
<a href="#methods" class="anchor"></a>Methodological Details and Basic Mode of Action</h1>
<div class="section level1">
<h1 id="methods">Methodological Details and Basic Mode of Action<a class="anchor" aria-label="anchor" href="#methods"></a>
</h1>
<p>In this section, we give methodological details and guidelines.</p>
<div id="methods_dataNorm" class="section level2">
<h2 class="hasAnchor">
<a href="#methods_dataNorm" class="anchor"></a>Data normalization</h2>
<div class="section level2">
<h2 id="methods_dataNorm">Data normalization<a class="anchor" aria-label="anchor" href="#methods_dataNorm"></a>
</h2>
<p>An important consideration is data normalization for RNA and open chromatin data. We currently support three choices of normalization of either peak or RNA-Seq data: <code>quantile</code>, <code>DESeq_sizeFactor</code> and <code>none</code> and refer to the R help for more details (<code><a href="../reference/addData.html">?addData</a></code>). The default for RNA-Seq is a quantile normalization, while for the open chromatin peak data, it is <code>DESeq_sizeFactor</code> (i.e., a “regular” <code>DESeq</code> size factor normalization). Importantly, <code>DESeq_sizeFactor</code> requires raw data, while <code>quantile</code> does not necessarily. We nevertheless recommend raw data as input, although it is also possible to provide pre-normalized data as input and then topping this up with another normalization method or “none”.</p>
<p>While we recommend raw counts for both peaks and RNA-Seq as input and offer several normalization choices in the pipeline, it is also possible to provide pre-normalized data. Note that the normalization method may have a large influence on the resulting <code>eGRN</code> network, so make sure the choice of normalization is reasonable. For more details, see the next sections.</p>
</div>
<div id="methods_permutedData" class="section level2">
<h2 class="hasAnchor">
<a href="#methods_permutedData" class="anchor"></a>Permutations</h2>
<div class="section level2">
<h2 id="methods_permutedData">Permutations<a class="anchor" aria-label="anchor" href="#methods_permutedData"></a>
</h2>
<p>RNA-Seq is shuffled, this is permutation 1. TODO: More</p>
</div>
<div id="methods_TF_peak" class="section level2">
<h2 class="hasAnchor">
<a href="#methods_TF_peak" class="anchor"></a>TF-peak connections</h2>
<div id="methods_TF_peak_build" class="section level3">
<h3 class="hasAnchor">
<a href="#methods_TF_peak_build" class="anchor"></a>Establishing TF-peak links</h3>
<div class="section level2">
<h2 id="methods_TF_peak">TF-peak connections<a class="anchor" aria-label="anchor" href="#methods_TF_peak"></a>
</h2>
<div class="section level3">
<h3 id="methods_TF_peak_build">Establishing TF-peak links<a class="anchor" aria-label="anchor" href="#methods_TF_peak_build"></a>
</h3>
<p>TODO: Describe how we establish TF-peak links</p>
</div>
<div id="methods_TF_peak_TFActivity" class="section level3">
<h3 class="hasAnchor">
<a href="#methods_TF_peak_TFActivity" class="anchor"></a>TF Activity connections</h3>
<div class="section level3">
<h3 id="methods_TF_peak_TFActivity">TF Activity connections<a class="anchor" aria-label="anchor" href="#methods_TF_peak_TFActivity"></a>
</h3>
<p>As explained above, TF-peak connections are found by correlating TF <em>expression</em> with peak accessibility. In addition to <em>expression</em>, we also offer to identify statistically significant TF-peak links based on <em>TF Activity</em> and not expression of the TFs. The concept of TF Activity is described in more detail in our <em>diffTF</em> paper. In short, we define TF motif activity, or TF activity for short, as the effect of a TF on the state of chromatin as measured by chromatin accessibility or active chromatin marks (i.e., ATAC-seq, DNase sequencing [DNase-seq], or histone H3 lysine 27 acetylation [H3K27ac] ChIP-seq). A <em>TF Activity</em> score is therefore needed <em>for each TF and each sample</em>.</p>
<p>TF Activity information can either be calculated within the <code>GRaNIE</code> framework <a href="#methods_TF_peak_TFActivity_calculating">using a simplified and empirical approach</a> or it can be calculated outside of our framework using designated methods and then <a href="#methods_TF_peak_TFActivity_importing">imported into our framework</a>. We now describe these two choices in more detail.</p>
<div id="methods_TF_peak_TFActivity_calculating" class="section level4">
<h4 class="hasAnchor">
<a href="#methods_TF_peak_TFActivity_calculating" class="anchor"></a>Calculating TF Activity</h4>
<div class="section level4">
<h4 id="methods_TF_peak_TFActivity_calculating">Calculating TF Activity<a class="anchor" aria-label="anchor" href="#methods_TF_peak_TFActivity_calculating"></a>
</h4>
<p>In our <em>GRaNIE</em> approach, we empirically estimate TF Activity for each TF with the following approach:</p>
<ul>
<li>normalize the raw peak counts by one of the supported normalization methods (see below)</li>
......@@ -227,53 +229,53 @@
<li>No normalization</li>
</ol>
</div>
<div id="methods_TF_peak_TFActivity_importing" class="section level4">
<h4 class="hasAnchor">
<a href="#methods_TF_peak_TFActivity_importing" class="anchor"></a>Importing TF Activity</h4>
<div class="section level4">
<h4 id="methods_TF_peak_TFActivity_importing">Importing TF Activity<a class="anchor" aria-label="anchor" href="#methods_TF_peak_TFActivity_importing"></a>
</h4>
<p>Soon, it will also be possible to import TF Activity data into our framework as opposed to calculating it using the procedure as described above. This feature is currently in development and will be available soon.</p>
</div>
<div id="methods_TF_peak_TFActivity_adding" class="section level4">
<h4 class="hasAnchor">
<a href="#methods_TF_peak_TFActivity_adding" class="anchor"></a>Adding TF Activity TF-peak connections</h4>
<div class="section level4">
<h4 id="methods_TF_peak_TFActivity_adding">Adding TF Activity TF-peak connections<a class="anchor" aria-label="anchor" href="#methods_TF_peak_TFActivity_adding"></a>