# SIAMCAT <img src="man/figures/logo.png" align="right" width="120" />

[![Build Status](https://travis-ci.com/zellerlab/siamcat.svg?branch=master)](https://travis-ci.com/zellerlab/siamcat)

## Overview
`SIAMCAT` is a pipeline for Statistical Inference of Associations between
Microbial Communities And host phenoTypes. A primary goal of analyzing
microbiome data is to determine changes in community composition that are
associated with environmental factors. In particular, linking human microbiome
composition to host phenotypes such as diseases has become an area of intense
research. For this, robust statistical modeling and biomarker extraction
toolkits are essential. `SIAMCAT` provides a full pipeline supporting
data preprocessing, statistical association testing, and statistical modeling
(LASSO logistic regression), together with tools for evaluating and
interpreting these models (such as cross-validation, parameter selection, ROC
analysis, and diagnostic model plots).

<a href='https://microbiome-tools.embl.de'> <img src="man/figures/embl_microbiome_tools_logo.png" align="right" width="200"> </a>

`SIAMCAT` is developed in the
[Zeller group](https://www.embl.de/research/units/scb/zeller/index.html)
and is part of the suite of computational microbiome analysis tools hosted at
[EMBL](https://www.embl.org/).

## Starting with SIAMCAT

### Installation

In order to start with `SIAMCAT`, you need to install it from Bioconductor:
```R
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("SIAMCAT", version = "3.8")
```

Alternatively, you can install the current development version via `devtools`:
```R
require("devtools")
devtools::install_github(repo = 'zellerlab/siamcat')
```

### Quick start

A few vignettes will help you get started with analysing your data with
`SIAMCAT`. You can find links to them on the
[Bioconductor website of SIAMCAT](https://bioconductor.org/packages/release/bioc/html/SIAMCAT.html),
or you can open them from within `R`:
```R
browseVignettes("SIAMCAT")
# Please Note:
# `browseVignettes` only works if `SIAMCAT` has been installed via Bioconductor
```

## Contact

If you run into any issues:
- create an
[issue in this repository](https://github.com/zellerlab/siamcat/issues/new), or
- mail [Georg Zeller](mailto:zeller@embl.de), or
- ask at the
[SIAMCAT support group](https://groups.google.com/forum/#!forum/siamcat-users)

If you found `SIAMCAT` useful, please consider giving us
[feedback](https://www.surveymonkey.de/r/denbi-service?sc=hd-hub&tool=siamcat).

## Citation

If you use `SIAMCAT`, please cite us by using

```R
citation("SIAMCAT")
```

or by citing:

> Zych K, Wirbel J, Essex M, Breuer K, Karcher N, Costea PI, Sunagawa S,
Bork P, Zeller G (2018). _SIAMCAT: Statistical Inference of Associations
between Microbial Communities And host phenoTypes_.
doi: 10.18129/B9.bioc.SIAMCAT (URL: http://doi.org/10.18129/B9.bioc.SIAMCAT),
R package version 1.0.1, (URL: https://bioconductor.org/packages/SIAMCAT/).

## Examples of primary package output

To give you a small preview of the primary package output, here are some
example plots taken from the main `SIAMCAT` vignette.

In this vignette, we use an example dataset which is also included in
the `SIAMCAT` package. The dataset is taken from the publication of
[Zeller et al](http://europepmc.org/abstract/MED/25432777), which demonstrated
the potential of microbial species in fecal samples to distinguish patients
with colorectal cancer (CRC) from healthy controls.

### Association testing

The result of the `check.associations` function is an association plot.
For significantly associated microbial features, the plot shows:
- the abundances of the features across the two different classes (CRC vs.
controls)
- the significance of the enrichment calculated by a Wilcoxon test (after
multiple hypothesis testing correction)
- the generalized fold change of each feature
- the prevalence shift between the two classes, and
- the Area Under the Receiver Operating Characteristics Curve (AU-ROC) as
a non-parametric effect size measure.

![Association testing](man/figures/associations_plot.png)
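
The statistics behind such a plot can be illustrated with a minimal base-R
sketch. This is not the implementation used by `check.associations`; the data
are simulated, the names `feat`, `label`, and `assoc` are made up for this
example, and Benjamini-Hochberg is assumed as the multiple testing correction.
It shows a Wilcoxon test per feature, the correction of the resulting P values,
and the AU-ROC derived from the Wilcoxon statistic:

```R
# Simulated relative abundances (hypothetical data, not the CRC dataset):
# 3 features (rows) x 40 samples (columns), 20 controls and 20 cases.
set.seed(42)
label <- rep(c("CTR", "CRC"), each = 20)
feat <- matrix(rlnorm(3 * 40, meanlog = -6), nrow = 3,
               dimnames = list(paste0("feature_", 1:3), NULL))
# Make feature_1 clearly more abundant in cases
feat[1, label == "CRC"] <- feat[1, label == "CRC"] * 10

assoc <- t(apply(feat, 1, function(x) {
    cases <- x[label == "CRC"]
    controls <- x[label == "CTR"]
    w <- wilcox.test(cases, controls, exact = FALSE)
    c(p.val = w$p.value,
      # W counts case/control pairs in which the case value is larger
      # (ties count 0.5), so W / (n1 * n2) equals the AU-ROC
      auroc = unname(w$statistic) / (length(cases) * length(controls)))
}))
assoc <- as.data.frame(assoc)
# Multiple hypothesis testing correction (Benjamini-Hochberg, an assumption)
assoc$p.adj <- p.adjust(assoc$p.val, method = "fdr")
print(assoc)
```

An AU-ROC near 0.5 indicates no separation between the classes, while values
close to 0 or 1 indicate strong depletion or enrichment in cases.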


### Model interpretation plot

After statistical models have been trained to distinguish cancer cases
from controls, the models can be investigated with the function
`model.interpretation.plot`. The plot shows:
- the median relative feature weight for selected features (barplot on the left)
- the robustness of features (i.e. in how many of the models the specific
feature has been selected)
- the distribution of selected features across samples (central heatmap)
- the proportion of the total weight of each model that is shown in the plot
(boxplot on the right), and
- the distribution of metadata across samples (heatmap below).

![Model interpretation plot](man/figures/interpretation_plot.png)
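
To make the first two quantities concrete, here is a small base-R sketch. It
does not reproduce the internals of `model.interpretation.plot`; the weight
matrix and the names `weights`, `rel.weights`, and `robustness` are
hypothetical, and the normalization (dividing by the sum of absolute weights
per model) is an assumption made for illustration:

```R
# Hypothetical weight matrix: 4 features (rows) x 5 cross-validation
# models (columns); a zero means the feature was not selected by LASSO.
weights <- matrix(c( 0.8,  0.6,  0.9,  0.7,  0.5,
                    -0.4, -0.2,  0.0, -0.3, -0.5,
                     0.0,  0.2,  0.0,  0.0,  0.0,
                     0.0,  0.0,  0.1,  0.0,  0.0),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(paste0("feature_", 1:4),
                                  paste0("model_", 1:5)))

# Relative weights: scale each model so its absolute weights sum to 1
rel.weights <- sweep(weights, 2, colSums(abs(weights)), "/")

# Median relative weight of each feature across all models
median.rel.weight <- apply(rel.weights, 1, median)

# Robustness: fraction of models in which the feature was selected
robustness <- rowMeans(weights != 0)

print(data.frame(median.rel.weight, robustness))
```

In this toy example, `feature_1` is selected in every model (robustness 1),
whereas `feature_3` and `feature_4` each appear in only one of five models and
therefore have a median relative weight of zero.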

## Where SIAMCAT has been used already

Several publications have already used `SIAMCAT` (or previous versions thereof).

- __[Potential of fecal microbiota for early-stage detection of colorectal cancer](http://europepmc.org/abstract/MED/25432777)__  
_Zeller G,  Tap J,  Voigt AY,  Sunagawa S,  Kultima JR,  Costea PI,  Amiot A,
Böhm J,  Brunetti F,  Habermann N,  Hercog R,  Koch M,  Luciani A,  Mende DR,
Schneider MA,  Schrotz-King P,  Tournigand C,  Tran Van Nhieu J,  Yamada T,
Zimmermann J,  Benes V,  Kloor M,  Ulrich CM,  von Knebel Doeberitz M,
Sobhani I,  Bork P_  
Molecular Systems Biology, (__2014__) 10, 766  
> Original publication that inspired `SIAMCAT`

- __[Gut Microbiota Linked to Sexual Preference and HIV Infection](https://doi.org/10.1016/j.ebiom.2016.01.032)__  
_Noguera-Julian M, Rocafort M, Guillén Y, Rivera J, Casadellà M, Nowak P,
Hildebrand F, Zeller G, Parera M, Bellido R, Rodríguez C, Carrillo J, Mothe B,
Coll J, Bravo I, Estany C, Herrero C, Saz J, Sirera G, Torrela A, Navarro J,
Crespo M, Brander C, Negredo E, Blanco J, Guarner F, Calle ML, Bork P,
Sönnerborg A, Clotet B, Paredes R_  
EBioMedicine 5 (__2016__) 135-146  
> See Figure 5

- __[Extensive transmission of microbes along the gastrointestinal tract](https://elifesciences.org/articles/42693)__  
_Schmidt TSB, Hayward MR, Coelho LP, Li SS, Costea PI, Voigt AY, Wirbel J,
Maistrenko OM, Alves RJC, Bergsten E, de Beaufort C, Sobhani I,
Heintz-Buschart A, Sunagawa S, Zeller G, Wilmes P, Bork P_  
eLife, (__2019__) 8:e42693  
> See Figure 3 - figure supplement 1

- __[Meta-analysis of fecal metagenomes reveals global microbial signatures
that are specific for colorectal cancer](https://www.nature.com/articles/s41591-019-0406-6)__  
_Wirbel J, Pyl PT, Kartal E, Zych K, Kashani A, Milanese A, Fleck JS, Voigt AY,
Palleja A, Ponnudurai R, Sunagawa S, Coelho LP, Schrotz-King P, Vogtmann E,
Habermann N, Niméus E, Thomas AM, Manghi P, Gandini S, Serrano D, Mizutani S,
Shiroma H, Shiba S, Shibata T, Yachida S, Yamada T, Waldron L, Naccarati A,
Segata N, Sinha R, Ulrich CM, Brenner H, Arumugam M, Bork P, Zeller G_  
Nature Medicine, (__2019__) [Epub ahead of print]  
> In this publication, `SIAMCAT` is used extensively for holdout testing

If you used `SIAMCAT` in your publication,
[let us know](mailto:zeller@embl.de)!