Commit 2d3df76e authored by CosteaPaul

Readme update

parent 9ede1bb6
@@ -13,7 +13,6 @@ Via Git:
or [download](https://git.embl.de/rmuench/metaSNP/repository/archive.zip?ref=master) a zip file of the repository.
Dependencies
============
@@ -31,10 +30,9 @@ MetaSNP is mainly implemented in C/C++ and needs to be compiled.
I. Setup
--------
Compilation requires [htslib](http://www.htslib.org/) and
[boost](http://www.boost.org/users/download/). If installed correctly in a
standard location, they should be detected by the build procedure, but, if not,
edit the file `SETUPFILE` to contain the correct paths.
Compilation requires
[htslib](http://www.htslib.org/)
[boost](http://www.boost.org/users/download/).
#### Dependencies on Ubuntu/debian
@@ -48,7 +46,6 @@ universe repository before):
sudo apt-get install libhts-dev libboost-dev
### Dependencies using anaconda
If you use [anaconda](https://www.continuum.io/downloads), you can create an
@@ -58,7 +55,6 @@ environment with all necessary dependencies using the following commands:
source activate metaSNV
export CFLAGS=-I$CONDA_ENV_PATH/include
export LD_LIBRARY_PATH=$CONDA_ENV_PATH/lib:$LD_LIBRARY_PATH
make
Setting the `CFLAGS` environment variable is necessary so that the boost
includes are found. The `LD_LIBRARY_PATH` variable is not necessary to build
@@ -75,41 +71,29 @@ If you do not have a C++ compiler, anaconda can also install G++:
export CFLAGS=-I$CONDA_ENV_PATH/include
export LD_LIBRARY_PATH=$CONDA_ENV_PATH/lib:$LD_LIBRARY_PATH
make
### Download reference database
In order to download our reference database run the provided script in the
parent directory:
In order to download a reference database, run:
./getRefDB.sh
II. Compilation:
----------------
1) run `make` in the parent directory of metaSNV to compile qaTools and the snpCaller.
make
III. Environmental Variables:
-----------------------------
We assume the metaSNV parent directory, `samtools` and `python` are present in your PATH.
You can set this variable temporarily by running the following commands and
permanently by putting these lines into your .bashrc:
export PATH=/path/2/metaSNV:$PATH
Note: Replace '/path/2' with the corresponding global path.
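
A quick way to confirm the PATH assumption above is to ask the shell whether each required program can be found. The snippet below is only a sketch: the helper name `check_tools` is hypothetical and not part of metaSNV, and the tool names are taken from this README.

```shell
# Hypothetical helper (not shipped with metaSNV): report which of the
# given programs are reachable through PATH.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=$((missing + 1))
    fi
  done
  # The return status is the number of missing tools (0 means all found).
  return "$missing"
}

# Tool names as listed in this README; adjust them to your installation.
check_tools metaSNV.py samtools python || echo "Fix your PATH before running the pipeline."
```

If a tool is reported as missing, re-check the `export PATH=...` line above or your `.bashrc`.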
Workflow:
=========
## Required Files:
* **'all\_samples'** = a list of all BAM files, one /path/2/sample.bam per line (avoid duplicates!)
* **'all\_samples'** = a list of all BAM files, one /path/2/sample.bam per line (no duplicates)
* **'ref\_db'** = the reference database in fasta format (e.g. multi-sequence fasta)
* **'gen\_pos'** = a list with start and end positions for each sequence in the reference (format: `sequence\_id start end`)
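
As an illustration of the required files listed above, the following sketch builds `all_samples` and `gen_pos` from toy data in a scratch directory. Everything here is an assumption made for the example: the directory layout, the sample names, the space-separated output, and in particular the coordinate convention (1-based, inclusive, covering each whole sequence), which this README does not specify.

```shell
# Scratch directory with toy inputs (hypothetical names, stand-ins for real data).
workdir=$(mktemp -d)
mkdir -p "$workdir/bams"
touch "$workdir/bams/sampleA.bam" "$workdir/bams/sampleB.bam"
printf '>seq1\nACGT\nAC\n>seq2 description\nGGG\n' > "$workdir/ref_db.fasta"

# all_samples: one BAM path per line; sort -u drops duplicate entries.
find "$workdir/bams" -name '*.bam' | sort -u > "$workdir/all_samples"

# gen_pos: "sequence_id start end" per sequence.
# ASSUMPTION: start is 1 and end is the sequence length.
awk '
  /^>/ { if (id != "") print id, 1, len   # flush the previous sequence
         id = substr($1, 2); len = 0; next }
       { len += length($0) }              # sum the sequence line lengths
  END  { if (id != "") print id, 1, len }
' "$workdir/ref_db.fasta" > "$workdir/gen_pos"

cat "$workdir/all_samples" "$workdir/gen_pos"
```

With real data, replace the toy files with your own BAM directory and reference fasta, and verify the expected coordinate convention before using the generated `gen_pos`.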
## Optional Files:
* **'db\_ann'** = a gene annotation file for the reference database.
@@ -118,7 +102,6 @@ Workflow:
metaSNV.py project_dir/ all_samples ref_db ref_fasta [options]
## 3. Part II: Post-Processing (Filtering & Analysis)
### a) Filtering:
@@ -194,7 +177,7 @@ Example Tutorial
### b) Compute pair-wise distances between samples on their SNP profiles and create a PCoA plot.
TODO!
Advanced usage (tools and scripts)
@@ -204,10 +187,6 @@ If you are interested in using the pipeline in a more manual way (for example
the metaSNV caller stand alone), you will find the executables for the
individual steps in the `src/` directory.
You will find scripts as well as the binaries for qaCompute and the metaSNP
caller in their corresponding directories (src/qaCompute and src/snpCaller)
post compilation.
metaSNV caller
--------------
Calls SNVs from samtools pileup format and generates two outputs.
......