Commit 5e7f1b5d authored by Niko Papadopoulos
Merge branch 'ruperti-main-patch-43727' into 'main'

Update README.md

See merge request grp-arendt/spongfold!1
parents b0737a3b 49dcc056
1 merge request: !1 Update README.md
@@ -15,6 +15,13 @@ The question of the threshold turns out to be an important one. Burkhardt Rost f
However, structure is more conserved than sequence. In theory, predicted structures can be compared against known structures that are otherwise annotated, allowing for the transfer of functional annotations (albeit less specific than sequence-based ones, since we will be detecting very remote homology at best). This is of particular interest for non-model organisms, especially ones outside the well-studied taxonomic groups (e.g. vertebrates or ecdysozoans).
## Follow-up ideas
Having a phylome, scRNAseq/cell type annotation, functional proteomic data, and predicted protein structures for a non-bilaterian metazoan is a unique combination that would allow us to ask many fundamental questions. Potential follow-up analyses include:
- Correlation between AF prediction accuracy (overall, domain-specific, etc.) and the sequence identity/similarity or bitscore of the best FoldSeek hit, i.e. "Does higher sequence identity mean better prediction accuracy?"
- Relationship between homologs identified through sequence search (orthofinder, eggnog-mapper, blast, phylome) and the best FoldSeek hits for single sponge proteins, i.e. "Do the best AF hits also include proteins identified as homologs in the phylome? Is there a sequence identity threshold for that?"
- Biological meaning of the best FoldSeek hits for un-annotated, highly expressed genes in the scRNAseq dataset or differentially regulated hits in the functional proteomic datasets, i.e. "Can we functionally annotate previously un-annotated hits in the scRNAseq and proteomics data and, most importantly, do these annotations make sense in light of prior knowledge?"
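
As a rough illustration of the first idea, the sketch below computes a Spearman rank correlation between a per-protein prediction confidence score and the sequence identity of the best FoldSeek hit. It is only a minimal sketch: the record layout, the protein names, and all numbers are hypothetical placeholders rather than results, and it assumes the two values per protein have already been extracted from the prediction and search outputs.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical placeholder records, NOT real data:
# (protein id, mean prediction confidence, identity of best FoldSeek hit)
records = [
    ("protein_1", 91.0, 0.62),
    ("protein_2", 55.0, 0.18),
    ("protein_3", 78.0, 0.35),
    ("protein_4", 83.0, 0.29),
]

confidence = [r[1] for r in records]
identity = [r[2] for r in records]
print(f"Spearman rho = {spearman(confidence, identity):.2f}")
```

A rank-based measure is used here because there is no reason to assume the relationship is linear; the same helper could be applied to bitscore instead of sequence identity.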
## Usage
(eventually a tutorial on what order to use the scripts in, if we don't have a master script).