From e1bc8290e60d75454d48dec14dcd57ff69417a88 Mon Sep 17 00:00:00 2001
From: Malvika Sharan <malvika.sharan@embl.de>
Date: Wed, 9 Nov 2016 11:34:18 +0100
Subject: [PATCH] Update motif_visualization.md

---
 TeachingMaterials/motif_visualization.md | 130 ++++++++++++++++++++++-
 1 file changed, 127 insertions(+), 3 deletions(-)
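
The WebLogo description quoted in this patch says that stack height reflects sequence conservation at a position and that stack width reflects the fraction of non-gap symbols. As a rough illustration of those two quantities, the Python sketch below computes them per alignment column. It assumes a 20-letter protein alphabet and leaves out any small-sample correction that WebLogo may apply, so the numbers will not match a real logo exactly. The helper names `parse_fasta` and `column_information` and the toy alignment are hypothetical and are not part of the teaching material.

```python
import math
from collections import Counter


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def column_information(sequences, alphabet_size=20):
    """Per-column (bits, occupancy) for equal-length aligned sequences.

    bits = log2(alphabet_size) minus the Shannon entropy of the non-gap
    residues; occupancy = fraction of sequences with a residue (not '-')
    in the column. No small-sample correction is applied.
    """
    max_bits = math.log2(alphabet_size)
    result = []
    for column in zip(*sequences):
        residues = [c for c in column if c != "-"]
        if not residues:
            # A column made only of gaps carries no information.
            result.append((0.0, 0.0))
            continue
        total = len(residues)
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in Counter(residues).values())
        result.append((max_bits - entropy, total / len(column)))
    return result


# Toy alignment (illustrative only, not taken from the exercise data).
toy = """>a
MDSE-
>b
MDSQA
>c
MDSEV
>d
MDSN-
"""
seqs = [seq for _, seq in parse_fasta(toy)]
info = column_information(seqs)
for position, (bits, occupancy) in enumerate(info, start=1):
    print(position, round(bits, 2), occupancy)

# Removing gap characters turns aligned sequences back into raw sequences.
ungapped = [seq.replace("-", "") for seq in seqs]
```

Fully conserved columns reach the maximum of log2(20) bits, while columns with many gaps get a low occupancy, which corresponds to the thin stacks mentioned in the WebLogo text. The last line shows how the unaligned input used for MEME relates to the aligned input used for WebLogo: the sequences are the same once the `-` characters are removed.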

diff --git a/TeachingMaterials/motif_visualization.md b/TeachingMaterials/motif_visualization.md
index 793356f..ad2872d 100644
--- a/TeachingMaterials/motif_visualization.md
+++ b/TeachingMaterials/motif_visualization.md
@@ -1,4 +1,4 @@
-# Motif visualization using [Weblogo](http://weblogo.berkeley.edu/logo.cgi) and [MEME](http://meme-suite.org/)
+# Motif visualization using [Weblogo](http://weblogo.threeplusone.com/create.cgi) and [MEME](http://meme-suite.org/)
 
 ### Example protein: Anaphase promotion coplex
 
@@ -8,9 +8,133 @@ MVNTDNKENEPPNMEKAHMDSSNALYRVQRPLQRRPLQELSIELVKPSQTITVKKSKKST
 NSSSYFAQLHAASGQNPPPSVHSSHKQPSKARSPNPLLSMR
 ````
 
-### BLAST this sequence to get all the homologs (remote homology)
+### Query table for selected homologs (remote homology) of the anaphase-promoting complex
+
+|Organism|Entry|
+|:----------:|:----------:|
+|Homo sapiens (Human)|Q9BS18|
+|Mus musculus (Mouse)|Q8R034|
+|Rattus norvegicus (Rat)|D4A427|
+|Drosophila melanogaster (Fruit fly)|A1ZAZ9|
+|Xenopus laevis (African clawed frog)|A1L2Q2|
+|Pongo abelii (Sumatran orangutan) (Pongo pygmaeus abelii)|Q5RBV4|
+|Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast)|O74358|
+|Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)|Q12379|
+
+**Exercise to extract query sequences**
+
+- Retrieve the FASTA sequences for the UniProt IDs listed in the table and save them in a text file (or keep the browser window open)
+- Use them for a multiple sequence alignment: use either [Clustal Omega](https://www.ebi.ac.uk/Tools/msa/clustalo/) directly or via [align UniProt](http://www.uniprot.org/align/)
+- Export the MSA in aligned FASTA format: use EMBOSS [extractalign](http://www.bioinformatics.nl/cgi-bin/emboss/extractalign)
+    - Visualize the alignment in [MView](http://www.ebi.ac.uk/Tools/msa/mview/) to observe the conservation
+- Save the aligned sequences in a text file (or keep the browser window open)
+
+## [WebLogo 3](http://weblogo.threeplusone.com/create.cgi)
+
+**About:**
+
+`WebLogo is a web based application designed to make the generation of sequence logos as easy and painless as possible.
+A sequence logo is a graphical representation of an amino acid or nucleic acid multiple sequence alignment. Each logo consists of stacks of symbols, one stack for each position in the sequence. The overall height of the stack indicates the sequence conservation at that position, while the height of symbols within the stack indicates the relative frequency of each amino or nucleic acid at that position. The width of the stack is proportional to the fraction of valid symbols in that position. (Positions with many gaps have thin stacks.) In general, a sequence logo provides a richer and more precise description of, for example, a binding site, than would a consensus sequence.`
+
+**When to use:**
+
+**What to expect:**
+
+**Input: aligned sequences**
+
+````
+>TR|A8K3Z6|A8K3Z6_HUMAN
+------MDSEV---QRDGRILDLIDDAWREDKLPYEDVAIPLN----E----------LP
+EPEQDNGGTTES----------VKEQEMKWTDLALQYLHENVP-----------PIGN--
+------------------------------------------------------------
+--
+>SP|Q9BS18|APC13_HUMAN
+------MDSEV---QRDGRILDLIDDAWREDKLPYEDVAIPLN----E----------LP
+EPEQDNGGTTES----------VKEQEMKWTDLALQYLHENVP-----------PIGN--
+------------------------------------------------------------
+--
+>SP|Q8R034|APC13_MOUSE
+------MDSEV---QRDGRILDLIDDAWREDKLPYEDVAIPLS----E----------LP
+EPEQDNGGTTES----------VKEQEMKWTDLALQGLHENVP-----------PAGN--
+------------------------------------------------------------
+--
+>TR|D4A427|D4A427_RAT
+------MDSEV---QRDGRILDLIDDAWREDKLPYEDVAIPLS----E----------LP
+EPEQDNGGTTES----------VKEQEMKWTDLALQGLHENVP-----------PTGN--
+------------------------------------------------------------
+--
+>TR|A1ZAZ9|A1ZAZ9_DROME
+------MDSQA---PIDDLLLDIVDNAWRMEVLPFDQILVPRE----K----------LP
+DPEADGGDSHLT----------VSEQEQKWTDLALGSLAPDAA-----------LIDQL-
+--NITSI-----------------------------------------------------
+--
+>TR|A1L2Q2|A1L2Q2_XENLA
+------MDSEV---LRDGRILDLIDDAWREDKLPYEDVTIPLN----E----------LP
+EPEQDNGGATES----------VKEQEMKWADLALQYLHENIP-----------SSGS--
+------------------------------------------------------------
+--
+>SP|Q5RBV4|APC13_PONAB
+------MDSEV---QRDGRILDLIDDAWREDKLPYEDVAIPLN----E----------LP
+DPEQDNGGTTES----------VKEQEMKWTDLALQYLHENVP-----------PIGN--
+------------------------------------------------------------
+--
+>SP|O74358|APC13_SCHPO
+------MDSNYNYVHMNKPGVVLFASDWLKDRLPVDDVEVRVE----H----------LP
+PVTEDEMTIQHSSANLILMKNKQLRHEPAWKDLELEDLVNAFA-----------FIQGS-
+--SNAEGKNTIEDNFETDPFKSVKEAPMAPFLEANRRHQGEHASMRYFR-----------
+--
+>SP|Q12379|SWM1_YEAST
+MSSSSYRDSYFQYRHLPAPHHI-LYAEWNQDILALPDEVANITMAMKDNTRTDAEEGRAP
+QDGERNSNVRESAQGKALMTSEQ-NSNRYWNSFHDEDDWNLFNGMELESNGVVTFAGQAF
+DHSLNGGTNSRNDGA-NEPRKET---ITGSIFD------RRITQLAYARNNGWHELALPQ
+SR
+````
+
+
+## [MEME](http://meme-suite.org/)
+
+**About:**
+
+**When to use:**
+
+**What to expect:**
+
+**Input: unaligned raw FASTA sequences**
+
+````
+>tr|A8K3Z6|A8K3Z6_HUMAN Anaphase promoting complex subunit 13, isoform CRA_a OS=Homo sapiens GN=ANAPC13 PE=2 SV=1
+MDSEVQRDGRILDLIDDAWREDKLPYEDVAIPLNELPEPEQDNGGTTESVKEQEMKWTDL
+ALQYLHENVPPIGN
+>sp|Q9BS18|APC13_HUMAN Anaphase-promoting complex subunit 13 OS=Homo sapiens GN=ANAPC13 PE=1 SV=1
+MDSEVQRDGRILDLIDDAWREDKLPYEDVAIPLNELPEPEQDNGGTTESVKEQEMKWTDL
+ALQYLHENVPPIGN
+>sp|Q8R034|APC13_MOUSE Anaphase-promoting complex subunit 13 OS=Mus musculus GN=Anapc13 PE=3 SV=1
+MDSEVQRDGRILDLIDDAWREDKLPYEDVAIPLSELPEPEQDNGGTTESVKEQEMKWTDL
+ALQGLHENVPPAGN
+>tr|D4A427|D4A427_RAT Protein Anapc13 OS=Rattus norvegicus GN=Anapc13 PE=1 SV=1
+MDSEVQRDGRILDLIDDAWREDKLPYEDVAIPLSELPEPEQDNGGTTESVKEQEMKWTDL
+ALQGLHENVPPTGN
+>tr|A1ZAZ9|A1ZAZ9_DROME GH27216p1 OS=Drosophila melanogaster GN=fab1-RA PE=2 SV=1
+MDSQAPIDDLLLDIVDNAWRMEVLPFDQILVPREKLPDPEADGGDSHLTVSEQEQKWTDL
+ALGSLAPDAALIDQLNITSI
+>tr|A1L2Q2|A1L2Q2_XENLA LOC100037155 protein OS=Xenopus laevis GN=anapc13.2 PE=2 SV=1
+MDSEVLRDGRILDLIDDAWREDKLPYEDVTIPLNELPEPEQDNGGATESVKEQEMKWADL
+ALQYLHENIPSSGS
+>sp|Q5RBV4|APC13_PONAB Anaphase-promoting complex subunit 13 OS=Pongo abelii GN=ANAPC13 PE=3 SV=1
+MDSEVQRDGRILDLIDDAWREDKLPYEDVAIPLNELPDPEQDNGGTTESVKEQEMKWTDL
+ALQYLHENVPPIGN
+>sp|O74358|APC13_SCHPO Anaphase-promoting complex subunit 13 OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) GN=apc13 PE=1 SV=1
+MDSNYNYVHMNKPGVVLFASDWLKDRLPVDDVEVRVEHLPPVTEDEMTIQHSSANLILMK
+NKQLRHEPAWKDLELEDLVNAFAFIQGSSNAEGKNTIEDNFETDPFKSVKEAPMAPFLEA
+NRRHQGEHASMRYFR
+>sp|Q12379|SWM1_YEAST Anaphase-promoting complex subunit SWM1 OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=SWM1 PE=1 SV=1
+MSSSSYRDSYFQYRHLPAPHHILYAEWNQDILALPDEVANITMAMKDNTRTDAEEGRAPQ
+DGERNSNVRESAQGKALMTSEQNSNRYWNSFHDEDDWNLFNGMELESNGVVTFAGQAFDH
+SLNGGTNSRNDGANEPRKETITGSIFDRRITQLAYARNNGWHELALPQSR
+````
+
+[result](http://meme-suite.org/opal-jobs/appMEME_4.11.21478686878460-1748534452/meme.html)
 
-We will use UniProt interface for this.
 
 
 
-- 
GitLab