Commit da30ad55 authored by Alessio Milanese

Update README.md

parent 0e41d82a
@@ -421,33 +421,64 @@ We use [mOTUs2](https://github.com/motu-tool/mOTUs_v2) to create taxonomic profi
- https://www.embl.de/download/zeller/metaG_course_2021/human_gut_sample_for.fq
- https://www.embl.de/download/zeller/metaG_course_2021/human_gut_sample_rev.fq
and profile with mOTUs3 and MetaPhlAn3 (already installed and can be run with `metaphlan`, manual: [link](https://github.com/biobakery/MetaPhlAn/wiki/MetaPhlAn-3.0#basic-usage)).
and profile with mOTUs3 and MetaPhlAn3 (already installed and can be run with `metaphlan`, manual: [link](https://github.com/biobakery/MetaPhlAn/wiki/MetaPhlAn-3.0#basic-usage)). How many species are profiled by the tools? How many different families?
<details><summary>SOLUTION</summary>
<p>
We can profile the unfiltered reads with:
We can download the samples with:
```
motus profile -f raw_reads_1.fastq -r raw_reads_2.fastq -t 8 -n unfiltered -o unfiltered.motus
wget https://www.embl.de/download/zeller/metaG_course_2021/human_gut_sample_for.fq
wget https://www.embl.de/download/zeller/metaG_course_2021/human_gut_sample_rev.fq
```
Let's have a look at the species that are profiled by both:
We can run motus with:
```
motus merge -i unfiltered.motus,test_sample.motus | grep -v "0.0000000000" | grep -v "#" | head
motus profile -f human_gut_sample_for.fq -r human_gut_sample_rev.fq -o human_gut.motus -t 8 -A
```
Which results in:
We can run metaphlan with:
```
Escherichia coli [ref_mOTU_v3_00095] 0.0002054152 0.0001797485
Eggerthella lenta [ref_mOTU_v3_00719] 0.0001531478 0.0001633324
Ruminococcus bromii [ref_mOTU_v3_00853] 0.0046285150 0.0049443856
Bacteroides uniformis [ref_mOTU_v3_00855] 0.0179185757 0.0188356677
Anaerostipes hadrus [ref_mOTU_v3_00857] 0.0000810850 0.0000864773
Roseburia faecis [ref_mOTU_v3_00859] 0.0003944823 0.0004207160
Roseburia inulinivorans [ref_mOTU_v3_00860] 0.0001580503 0.0001690509
Roseburia hominis [ref_mOTU_v3_00861] 0.0004800660 0.0005119912
Bifidobacterium longum [ref_mOTU_v3_01099] 0.0023339573 0.0027482230
Megasphaera elsdenii [ref_mOTU_v3_01516] 0.0006391750 0.0006816812
metaphlan --input_type fastq --nproc 16 --bowtie2out human_gut.bowtie2.bz2 -o human_gut.metaphlan human_gut_sample_for.fq,human_gut_sample_rev.fq
```
The two profiles are very similar. mOTUs filters the reads internally based on how well they map to the marker gene sequences; hence, trimming and filtering the reads beforehand has little effect on the profiles. For other analyses (such as building metagenome-assembled genomes), trimming the reads improves the result.
We can have a look at the result. The `-A` option makes mOTUs report all taxonomic levels together, which changes the output (`head human_gut.motus`):
```
#mOTUs2_clade unnamed sample
k__Bacteria 0.9481725845
k__Bacteria|p__Proteobacteria 0.0089415128
k__Bacteria|p__Firmicutes 0.3280255558
k__Bacteria|p__Actinobacteria 0.0533709217
k__Bacteria|p__Fusobacteria 0.0439658758
k__Bacteria|p__Bacteroidetes 0.5136232781
k__Bacteria|p__Bacteria phylum incertae sedis 0.0002454402
k__Bacteria|p__Proteobacteria|c__Gammaproteobacteria 0.0078059168
k__Bacteria|p__Proteobacteria|c__Deltaproteobacteria 0.0011355960
```
How many species are there for motus:
```
cat human_gut.motus | grep "s__" | wc -l
```
There are 122 species measured.
How many species are there for metaphlan:
```
cat human_gut.metaphlan | grep "s__" | wc -l
```
There are 48 species measured.
How many families are there for motus:
```
cat human_gut.motus | grep "f__" | grep -v "g__" | wc -l
```
There are 32 families measured.
How many families are there for metaphlan:
```
cat human_gut.metaphlan | grep "f__" | grep -v "g__" | wc -l
```
There are 16 families measured.
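The `grep` pipelines above count any line that contains the rank prefix. As a minimal sketch, assuming both profiles are tab-separated with a `|`-delimited lineage in the first column (as in the `head` output above), a hypothetical helper `count_rank` (not part of mOTUs or MetaPhlAn) could count only the rows whose deepest rank matches:

```shell
# Hypothetical helper (an illustration, not part of mOTUs or MetaPhlAn):
# count the clades whose deepest rank has the given one-letter prefix
# (k, p, c, o, f, g or s).
# Assumption: columns are tab-separated and the lineage is in column 1.
count_rank() {
    rank="$1"
    profile="$2"
    awk -F'\t' -v prefix="${rank}__" '
        /^#/ { next }                     # skip header and comment lines
        {
            n = split($1, levels, "|")    # split the lineage into its ranks
            # keep the row only if its last (deepest) rank has the prefix
            if (index(levels[n], prefix) == 1) count++
        }
        END { print count + 0 }           # print 0 when nothing matched
    ' "$profile"
}
```

For example, `count_rank s human_gut.motus` or `count_rank f human_gut.metaphlan`. The family counts should match the `grep "f__" | grep -v "g__"` pipelines, and the species counts should match `grep "s__"` as long as the profile contains no rows below species level.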
</p>