Commit e1bc8290 authored by Malvika Sharan
Update motif_visualization.md

parent f56379cd
# Motif visualization using [Weblogo](http://weblogo.berkeley.edu/logo.cgi) and [MEME](http://meme-suite.org/)
# Motif visualization using [Weblogo](http://weblogo.threeplusone.com/create.cgi) and [MEME](http://meme-suite.org/)
### Example protein: Anaphase-promoting complex
......@@ -8,9 +8,133 @@ MVNTDNKENEPPNMEKAHMDSSNALYRVQRPLQRRPLQELSIELVKPSQTITVKKSKKST
NSSSYFAQLHAASGQNPPPSVHSSHKQPSKARSPNPLLSMR
````
### BLAST this sequence to get all the homologs (remote homology)
### Query table for selected homologs (remote homology) of anaphase promoting complex
|Organism|Entry|
|:----------:|:----------:|
|Homo sapiens (Human)|Q9BS18|
|Mus musculus (Mouse)|Q8R034|
|Rattus norvegicus (Rat)|D4A427|
|Drosophila melanogaster (Fruit fly)|A1ZAZ9|
|Xenopus laevis (African clawed frog)|A1L2Q2|
|Pongo abelii (Sumatran orangutan) (Pongo pygmaeus abelii)|Q5RBV4|
|Schizosaccharomyces pombe (strain 972 / ATCC 24843) (Fission yeast)|O74358|
|Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast)|Q12379|
**Exercise to extract query sequences**
- Extract the FASTA sequences for the UniProt IDs in the table and save them in a text file (or keep the browser window open)
- Run a multiple sequence alignment on them: use either [Clustal Omega](https://www.ebi.ac.uk/Tools/msa/clustalo/) directly or via [align UniProt](http://www.uniprot.org/align/)
- Extract the MSA in aligned FASTA format using EMBOSS [extractalign](http://www.bioinformatics.nl/cgi-bin/emboss/extractalign)
- Visualize the alignment in [Mview](http://www.ebi.ac.uk/Tools/msa/mview/) to observe the conservation
- Save the aligned sequences in a text file (or keep the browser window open)
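
The first step of the exercise can also be scripted instead of done in the browser. The sketch below is only an illustration: it assumes the UniProt REST endpoint pattern `https://rest.uniprot.org/uniprotkb/<accession>.fasta`, and the function names and the output file name `apc13_homologs.fasta` are hypothetical choices, not part of the original exercise.

```python
from urllib.request import urlopen

# UniProt accessions taken from the query table above.
ACCESSIONS = ["Q9BS18", "Q8R034", "D4A427", "A1ZAZ9",
              "A1L2Q2", "Q5RBV4", "O74358", "Q12379"]

# Assumed endpoint pattern of the UniProt REST service.
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def fasta_url(accession):
    """Build the download URL for one UniProt accession."""
    return UNIPROT_FASTA_URL.format(accession=accession.strip())


def fetch_fasta(accession, timeout=30):
    """Download one FASTA record as text (needs network access)."""
    with urlopen(fasta_url(accession), timeout=timeout) as response:
        return response.read().decode("utf-8")


def save_all(accessions, path="apc13_homologs.fasta"):
    """Fetch every accession and write the records to a single file."""
    with open(path, "w") as handle:
        for accession in accessions:
            record = fetch_fasta(accession)
            # Make sure each record ends with a newline before the next header.
            handle.write(record if record.endswith("\n") else record + "\n")


# Usage (performs network requests):
# save_all(ACCESSIONS)
```

The resulting file holds non-aligned sequences, so it can be passed to Clustal Omega for the alignment step.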
## [Weblogo 3](http://weblogo.threeplusone.com/create.cgi)
**About:**
`WebLogo is a web based application designed to make the generation of sequence logos as easy and painless as possible.
A sequence logo is a graphical representation of an amino acid or nucleic acid multiple sequence alignment. Each logo consists of stacks of symbols, one stack for each position in the sequence. The overall height of the stack indicates the sequence conservation at that position, while the height of symbols within the stack indicates the relative frequency of each amino or nucleic acid at that position. The width of the stack is proportional to the fraction of valid symbols in that position. (Positions with many gaps have thin stacks.) In general, a sequence logo provides a richer and more precise description of, for example, a binding site, than would a consensus sequence.`
**When to use:**
**What to expect:**
**Input: aligned sequences**
````
>TR|A8K3Z6|A8K3Z6_HUMAN
------MDSEV---QRDGRILDLIDDAWREDKLPYEDVAIPLN----E----------LP
EPEQDNGGTTES----------VKEQEMKWTDLALQYLHENVP-----------PIGN--
------------------------------------------------------------
--
>SP|Q9BS18|APC13_HUMAN
------MDSEV---QRDGRILDLIDDAWREDKLPYEDVAIPLN----E----------LP
EPEQDNGGTTES----------VKEQEMKWTDLALQYLHENVP-----------PIGN--
------------------------------------------------------------
--
>SP|Q8R034|APC13_MOUSE
------MDSEV---QRDGRILDLIDDAWREDKLPYEDVAIPLS----E----------LP
EPEQDNGGTTES----------VKEQEMKWTDLALQGLHENVP-----------PAGN--
------------------------------------------------------------
--
>TR|D4A427|D4A427_RAT
------MDSEV---QRDGRILDLIDDAWREDKLPYEDVAIPLS----E----------LP
EPEQDNGGTTES----------VKEQEMKWTDLALQGLHENVP-----------PTGN--
------------------------------------------------------------
--
>TR|A1ZAZ9|A1ZAZ9_DROME
------MDSQA---PIDDLLLDIVDNAWRMEVLPFDQILVPRE----K----------LP
DPEADGGDSHLT----------VSEQEQKWTDLALGSLAPDAA-----------LIDQL-
--NITSI-----------------------------------------------------
--
>TR|A1L2Q2|A1L2Q2_XENLA
------MDSEV---LRDGRILDLIDDAWREDKLPYEDVTIPLN----E----------LP
EPEQDNGGATES----------VKEQEMKWADLALQYLHENIP-----------SSGS--
------------------------------------------------------------
--
>SP|Q5RBV4|APC13_PONAB
------MDSEV---QRDGRILDLIDDAWREDKLPYEDVAIPLN----E----------LP
DPEQDNGGTTES----------VKEQEMKWTDLALQYLHENVP-----------PIGN--
------------------------------------------------------------
--
>SP|O74358|APC13_SCHPO
------MDSNYNYVHMNKPGVVLFASDWLKDRLPVDDVEVRVE----H----------LP
PVTEDEMTIQHSSANLILMKNKQLRHEPAWKDLELEDLVNAFA-----------FIQGS-
--SNAEGKNTIEDNFETDPFKSVKEAPMAPFLEANRRHQGEHASMRYFR-----------
--
>SP|Q12379|SWM1_YEAST
MSSSSYRDSYFQYRHLPAPHHI-LYAEWNQDILALPDEVANITMAMKDNTRTDAEEGRAP
QDGERNSNVRESAQGKALMTSEQ-NSNRYWNSFHDEDDWNLFNGMELESNGVVTFAGQAF
DHSLNGGTNSRNDGA-NEPRKET---ITGSIFD------RRITQLAYARNNGWHELALPQ
SR
````
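
To get a feel for what the logo encodes, the per-column quantities can be computed by hand from an aligned FASTA input like the one above. This is a minimal sketch, not WebLogo's own implementation: it assumes the usual information-content definition (log2(20) minus the Shannon entropy of the column, in bits) and leaves out the small-sample correction that WebLogo applies. All function names are illustrative.

```python
import math
from collections import Counter

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def read_aligned_fasta(text):
    """Parse aligned FASTA text into a {header: sequence} dict."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def column_profile(sequences, column):
    """Return (information in bits, fraction of non-gap symbols, frequencies).

    The information value corresponds to the stack height, the fraction of
    non-gap symbols to the stack width, and the frequencies to the relative
    symbol heights within the stack.
    """
    symbols = [seq[column].upper() for seq in sequences]
    valid = [s for s in symbols if s in AMINO_ACIDS]
    if not valid:
        return 0.0, 0.0, {}
    counts = Counter(valid)
    freqs = {aa: n / len(valid) for aa, n in counts.items()}
    entropy = -sum(p * math.log2(p) for p in freqs.values())
    # Maximum is log2(20) bits for proteins; no small-sample correction here.
    information = math.log2(len(AMINO_ACIDS)) - entropy
    return information, len(valid) / len(symbols), freqs
```

For example, `column_profile(list(read_aligned_fasta(text).values()), 6)` describes the seventh alignment column, where the first `M` of most sequences sits. A fully conserved, gap-free column gives the maximum of log2(20) bits under these assumptions.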
## [MEME](http://meme-suite.org/)
**About:**
**When to use:**
**What to expect:**
**Input: non-aligned raw FASTA sequences**
````
>tr|A8K3Z6|A8K3Z6_HUMAN Anaphase promoting complex subunit 13, isoform CRA_a OS=Homo sapiens GN=ANAPC13 PE=2 SV=1
MDSEVQRDGRILDLIDDAWREDKLPYEDVAIPLNELPEPEQDNGGTTESVKEQEMKWTDL
ALQYLHENVPPIGN
>sp|Q9BS18|APC13_HUMAN Anaphase-promoting complex subunit 13 OS=Homo sapiens GN=ANAPC13 PE=1 SV=1
MDSEVQRDGRILDLIDDAWREDKLPYEDVAIPLNELPEPEQDNGGTTESVKEQEMKWTDL
ALQYLHENVPPIGN
>sp|Q8R034|APC13_MOUSE Anaphase-promoting complex subunit 13 OS=Mus musculus GN=Anapc13 PE=3 SV=1
MDSEVQRDGRILDLIDDAWREDKLPYEDVAIPLSELPEPEQDNGGTTESVKEQEMKWTDL
ALQGLHENVPPAGN
>tr|D4A427|D4A427_RAT Protein Anapc13 OS=Rattus norvegicus GN=Anapc13 PE=1 SV=1
MDSEVQRDGRILDLIDDAWREDKLPYEDVAIPLSELPEPEQDNGGTTESVKEQEMKWTDL
ALQGLHENVPPTGN
>tr|A1ZAZ9|A1ZAZ9_DROME GH27216p1 OS=Drosophila melanogaster GN=fab1-RA PE=2 SV=1
MDSQAPIDDLLLDIVDNAWRMEVLPFDQILVPREKLPDPEADGGDSHLTVSEQEQKWTDL
ALGSLAPDAALIDQLNITSI
>tr|A1L2Q2|A1L2Q2_XENLA LOC100037155 protein OS=Xenopus laevis GN=anapc13.2 PE=2 SV=1
MDSEVLRDGRILDLIDDAWREDKLPYEDVTIPLNELPEPEQDNGGATESVKEQEMKWADL
ALQYLHENIPSSGS
>sp|Q5RBV4|APC13_PONAB Anaphase-promoting complex subunit 13 OS=Pongo abelii GN=ANAPC13 PE=3 SV=1
MDSEVQRDGRILDLIDDAWREDKLPYEDVAIPLNELPDPEQDNGGTTESVKEQEMKWTDL
ALQYLHENVPPIGN
>sp|O74358|APC13_SCHPO Anaphase-promoting complex subunit 13 OS=Schizosaccharomyces pombe (strain 972 / ATCC 24843) GN=apc13 PE=1 SV=1
MDSNYNYVHMNKPGVVLFASDWLKDRLPVDDVEVRVEHLPPVTEDEMTIQHSSANLILMK
NKQLRHEPAWKDLELEDLVNAFAFIQGSSNAEGKNTIEDNFETDPFKSVKEAPMAPFLEA
NRRHQGEHASMRYFR
>sp|Q12379|SWM1_YEAST Anaphase-promoting complex subunit SWM1 OS=Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GN=SWM1 PE=1 SV=1
MSSSSYRDSYFQYRHLPAPHHILYAEWNQDILALPDEVANITMAMKDNTRTDAEEGRAPQ
DGERNSNVRESAQGKALMTSEQNSNRYWNSFHDEDDWNLFNGMELESNGVVTFAGQAFDH
SLNGGTNSRNDGANEPRKETITGSIFDRRITQLAYARNNGWHELALPQSR
````
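
MEME takes non-aligned sequences, so the gapped alignment used for Weblogo is not suitable as input here. If only the aligned FASTA file is at hand, the gaps can be stripped to recover raw sequences. The helper below is a sketch with hypothetical function names; it assumes gaps are written as `-` or `.` and rewraps sequences at 60 residues per line, as in the input shown above. The shorter accession-only headers of the aligned file are kept unchanged.

```python
def degap(aligned_sequence, gap_chars="-."):
    """Remove alignment gap characters from one sequence."""
    return "".join(ch for ch in aligned_sequence if ch not in gap_chars)


def aligned_to_raw_fasta(text, width=60):
    """Convert aligned FASTA text to non-aligned FASTA text."""
    out = []
    sequence = []

    def flush():
        # Join the collected lines, drop gaps, and rewrap to the line width.
        raw = degap("".join(sequence))
        out.extend(raw[i:i + width] for i in range(0, len(raw), width))
        sequence.clear()

    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            flush()
            out.append(line)
        else:
            sequence.append(line)
    flush()
    return "\n".join(out) + "\n"
```

Applied to the aligned human record, this should reproduce the two-line sequence `MDSEVQRDGRILDLIDD...` listed above.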
[result](http://meme-suite.org/opal-jobs/appMEME_4.11.21478686878460-1748534452/meme.html)
We will use the UniProt interface for this.