Commit 0cd6842d authored by Malvika Sharan

Update EMBOSS_EBI.md

parent 95f738f8
......@@ -20,4 +20,175 @@ For alignment tools, we will use human p53 and zebrafish dp53:
Example proteins for dotmatcher:
- ZKSC7_HUMAN: [Q9P0L1](http://www.uniprot.org/uniprot/Q9P0L1.fasta)
- MPDZ_HUMAN: [O75970](http://www.uniprot.org/uniprot/O75970.fasta)
\ No newline at end of file
- MPDZ_HUMAN: [O75970](http://www.uniprot.org/uniprot/O75970.fasta)
## Quick Demo on EMBOSS tools
...but before that, reuse or rerun the Clustal Omega analysis on your set of 10 P53 sequences (or scroll down this document to use my set of sequences ;) ).
- [extractalign](http://emboss.bioinformatics.nl/cgi-bin/emboss/extractalign)
- Switch to [Mview](http://www.ebi.ac.uk/Tools/msa/mview/) to visualize consensus
- Create consensus with [cons](http://emboss.bioinformatics.nl/cgi-bin/emboss/cons)
- Also check [consambig](http://emboss.bioinformatics.nl/cgi-bin/emboss/consambig): like cons, it calculates a consensus sequence from a multiple sequence alignment, but it writes an ambiguous consensus. The amino acid residue or nucleotide at each position is compared to the possible ambiguity codes, and the consensus uses the minimum ambiguity code that matches. The ambiguity characters were designed to encode positional variations found among families of related genes. Useful for DNA sequences.
- Use Merger to merge two overlapping sequences. It uses a global alignment algorithm (Needleman & Wunsch) to optimally align the sequences. A merged sequence is generated from the alignment and written to the output file. Also useful for DNA.
- [Dotmatcher](http://emboss.bioinformatics.nl/cgi-bin/emboss/dotmatcher) generates a dotplot from two input sequences. The dotplot is an intuitive graphical representation of the regions of similarity between two sequences. All positions from the first input sequence are compared with all positions from the second input sequence using a specified substitution matrix.
- [plotcon](http://emboss.bioinformatics.nl/cgi-bin/emboss/plotcon)
- [prettyplot](http://emboss.bioinformatics.nl/cgi-bin/emboss/prettyplot): claims to present alignment with pretty formatting (?)
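
To make the dotplot idea behind Dotmatcher concrete, here is a minimal Python sketch. It is not the EMBOSS implementation: it scores windows by simple identity instead of a substitution matrix, and the function names `dotplot` and `render` as well as the `window` and `threshold` parameters are hypothetical names chosen for this illustration.

```python
def dotplot(seq_a, seq_b, window=3, threshold=3):
    """Compare every window of seq_a with every window of seq_b.

    Assumption: identity scoring (1 per matching residue) stands in for
    the substitution matrix that a real dotplot tool would use.
    Returns the (i, j) start positions whose window score >= threshold.
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            score = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            if score >= threshold:
                hits.append((i, j))
    return hits


def render(seq_a, seq_b, hits):
    """Draw the hits as a text grid: rows follow seq_a, columns follow seq_b."""
    grid = [["." for _ in seq_b] for _ in seq_a]
    for i, j in hits:
        grid[i][j] = "*"
    return "\n".join("".join(row) for row in grid)


if __name__ == "__main__":
    # Two short, overlapping toy fragments
    a = "PLSQETFS"
    b = "LPLSQETF"
    print(render(a, b, dotplot(a, b)))
```

A run of hits along a diagonal marks a region where the two sequences are similar; shifting the diagonal away from the main one indicates an offset (insertion or deletion) between them.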
### Set of P53 proteins:
Raw sequences
```````
>Mus musculus
MTAMEESQSDISLELPLSQETFSGLWKLLPPEDILPSPHCMDDLLLPQDVEEFFEGPSEALRVSGAPAAQDPVTETPGPV
APAPATPWPLSSFVPSQKTYQGNYGFHLGFLQSGTAKSVMCTYSPPLNKLFFQLAKTCPVQLWVSATPPAGSRVRAMAIY
KKSQHMTEVVRRCPHHERCSDGDGLAPPQHLIRVEGNLYPEYLEDRQTFRHSVVVPYEPPEAGSEYTTIHYKYMCNSSCM
GGMNRRPILTIITLEDSSGNLLGRDSFEVRVCACPGRDRRTEEENFRKKEVLCPELPPGSAKRALPTCTSASPPQKKKPL
DGEYFTLKIRGRKRFEMFRELNEALELKDAHATEESGDSRAHSSLQPRAFQALIKEESPNC
>Rattus norvegicus
MEDSQSDMSIELPLSQETFSCLWKLLPPDDILPTTATGSPNSMEDLFLPQDVAELLEGPEEALQVSAPAAQEPGTEAPAP
VAPASATPWPLSSSVPSQKTYQGNYGFHLGFLQSGTAKSVMCTYSISLNKLFCQLAKTCPVQLWVTSTPPPGTRVRAMAI
YKKSQHMTEVVRRCPHHERCSDGDGLAPPQHLIRVEGNPYAEYLDDRQTFRHSVVVPYEPPEVGSDYTTIHYKYMCNSSC
MGGMNRRPILTIITLEDSSGNLLGRDSFEVRVCACPGRDRRTEEENFRKKEEHCPELPPGSAKRALPTSTSSSPQQKKKP
LDGEYFTLKIRGRERFEMFRELNEALELKDARAAEESGDSRAHSSLQPRTFQALIKKESPNC
>Mastomys natalensis
LPLSQETFQRLWKLLPPEAVLSEASPNSMDNMFLSPDVVNLLEGPEEALQVSAAPAAQDPVTETPAPAAPAPATPWPLSS
FVPSQKTYQGSYGFHLGFLQSGTAKSVMCTYSPSLNKLFCQLAKTCPVQLWVSDTPPAGSRVRAMAIYKKSQHMTEVVRR
CPHHERCTDGDGLAPPQHLIRVEGNLNAEYLDDKQTFRHSVVVPYEPPEVGSDYTTIHYKYMCNSSCMGGMNRRPILTII
TLEDSSGNLLGRDSFEVRICACPGRDRRTEEENFRKKEEPCPELPLGSAKRALPTGTSASPQQKKKRLDGEYFTLKIRGR
ERFEMFRELNEALELKDARAAEELGDSRAHSSYLKTKRGQSSSHHKKPMVKKVGPDSD
>Microtus ochrogaster
MEEPQSDLSIEPPLSQETFSDLWNLLPPNNVLSTSLSVDAMEDLFLSQDVANWLEEPNEGPQMSAAASTAEDPVTEAPAP
VTPAPVTSWPLSSSVPSQKTYQGEYGFRLGFLHSGTAKSVTCTYSPSLNKLFCQLAKTCPVQLWVSSTPPPGTRVRAMAI
YKKSQHMTEVVRRCPHHERCSDGDGLAPPQHLIRVEGNLRAEYLDDRQTFRHSVVVPYEPPEVGSDCTTIHYNYMCNSSC
MGGMNRRPILTIITLEDPSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKGEPRPELPVGSTKRVLPTNTSSPQPKKKPL
DGEYFTLKIRGRERFKMFSELNEALELKDAQDANGSGDSRAHSSYLKSKKGQSTSRHKKLMIKREGPDSD
>Nannospalax galili
MEEQQSDLSIEPPLSQETFSDLWKLLPQNNVLSTPLSPNSMEDLLLSPEDVANWLDDPDEALQVPAAAITGDPVTETSAP
VAPPPATPWPLSSSVPSQKTYQGSYGFRLGFLHSGTAKSVTCTYSPPLNKLFCQLAKTCPVQLWVDSTPPPGTRVRAMAI
YKKSQHMTEVVKRCPHHERCSDSDGLAPPQHLIRVEGNLRAEYLDDKHTFRHSVVVPYEPPEVGSDCTTIHYNYMCNSSC
MGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKGELCPELPPGSTKRALPTGTSSSPQPKKKP
LDGEYFTLKIRGRERFEMFRELNEALELKDTQAEKDSGESRAHSSYLKSKKGQSTSRHKKLMIKREGPDSD
>Eospalaxbaileyi
MEEPQSDLSIEPPLSQETFSDLWKLLPQNNVLSTSLSPNSMEDLLLSAEDVANWLDDPDDALRMPAAPVTEDPATEASAP
VAPPPATPWPLSSSVPSQKTYQGNYGFRLGFLHSGTAKSVTCTYSPCLNKLFCQLAKTCPVQLWVDSTPPPGTRVRAMAI
YKKSQHMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRAEYLDDKHTFRHSVIVPYEPPEVGSDCTTIHYNYMCNSSC
MGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKGESCPELPPGSTKRALPTDTSSSPQPKKKP
LLDGEYFTLKIRGRERFEMFRELNEALELKDAQAEKESGESRAHSSYLKSKKGQSTSRHKKLMIKREGPDSD
>Eospalaxcansus
MEEPQSDLSIEPPLSQETFSDLWKLLPQNNVLSTSLSPNSMEDLLLSAEDVANWLDDPDDALRMPAAPVTEDPTTEASAP
VAPPPATPWPLSSSVPSQKTYQGSYGFRLGFLHSGTAKSVACTYSPCLNKLFCQLAKTCPVQLWVDSTPPPGTRVRAMAI
YKKSQHMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRAEYLDDKHTFRHSVVVPYEPPEVGSDCTTIHYNYMCNSSC
MGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKGESCPELPPGSTKRALPTGTSSSPQPKKKP
LLDGEYFTLKIRGRERFEMFRELNEALELKDAQAEKESGESRAHSSYLKSKKGQSTSRHKKLMIKREGPDSD
>Cricetulus griseus
MEEPQSDLSIELPLSQETFSDLWKLLPPNNVLSTLPSSDSIEELFLSENVTGWLEDSGGALQGVAAAAASTAEDPVTETP
APVASAPATPWPLSSSVPSYKTFQGDYGFRLGFLHSGTAKSVTCTYSPSLNKLFCQLAKTCPVQLWVNSTPPPGTRVRAM
AIYKKLQYMTEVVRRCPHHERSSEGDSLAPPQHLIRVEGNLHAEYLDDKQTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
SCMGGMNRRPILTIITLEDPSGNLLGRNSFEVRICACPGRDRRTEEKNFQKKGEPCPELPPKSAKRALPTNTSSSPPPKK
KTLDGEYFTLKIRGHERFKMFQELNEALELKDAQASKGSEDNGAHSSYLKSKKGQSASRLKKLMIKREGPDSD
>Oryctolagus cuniculus
MSATAQAGPGGSQEASDPAAAMEESQSDLSLEPPLSQETFSDLWKLLPENNLLTTSLNPPVDDLLSAEDVANWLNEDPEE
GLRVPAAPAPEAPAPAAPALAAPAPATSWPLSSSVPSQKTYHGNYGFRLGFLHSGTAKSVTCTYSPCLNKLFCQLAKTCP
VQLWVDSTPPPGSRVRAMAIYKKSQHMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRAEYLDDRNTFRHSVVVPYEP
PEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKGEPCPELPPG
SSKRALPTTTTDSSPQTKKKPLDGEYFILKIRGRERFEMFRELNEALELKDAQAEKEPGGSRAHSSYLKAKKGQSTSRHK
KPMFKREGPDSD
>Carlito syrichta
MEEPQSDLSIEPLSQETFSDLWKLLPENNVLSPSLSPPVDDLILSTEDIANWFSEGPDEALRTAPAPVAPTPAASTQAAP
APGTPWPLSSSVPSQKTYHGNYGFRLGFLHSGTAKSVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQ
SQYMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRVEYLDDKTTFRHSVVVPYEPPEVGSDCTTIHYNYMCNSSCMGG
MNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKGEPCSELPPGSTKRALPTSTSSPSQPKKKPLDG
EYFTLQIRGRERFEMFRELNEALELKDAQAGKEPGGSRAHTSHLKSKKGQSTSRHKKLMFKREGPDSD
```````
Aligned by Clustal Omega
```````
CLUSTAL O(1.2.3) multiple sequence alignment
Cricetulus ---------------------MEEPQSDLSIELPLSQETFSDLWKLLPPNNVLSTL--PS
Carlito ---------------------MEEPQSDLSIE-PLSQETFSDLWKLLPENNVLSPS--LS
Microtus ---------------------MEEPQSDLSIEPPLSQETFSDLWNLLPPNNVLSTS--LS
Oryctolagus MSATAQAGPGGSQEASDPAAAMEESQSDLSLEPPLSQETFSDLWKLLPENNLLTTS--LN
Nannospalax ---------------------MEEQQSDLSIEPPLSQETFSDLWKLLPQNNVLSTP--LS
Eospalaxbaileyi ---------------------MEEPQSDLSIEPPLSQETFSDLWKLLPQNNVLSTS--LS
Eospalaxcansus ---------------------MEEPQSDLSIEPPLSQETFSDLWKLLPQNNVLSTS--LS
Mastomys --------------------------------LPLSQETFQRLWKLLPPEAVLSE---AS
Mus ------------------MTAMEESQSDISLELPLSQETFSGLWKLLPPEDILPS-----
Rattus ---------------------MEDSQSDMSIELPLSQETFSCLWKLLPPDDILPTTATGS
*******. **:*** : :*
Cricetulus SDSIEELFL-SENVTGWLEDSGGALQGVAAAAASTAEDPVTETPAPVASAPATPWPLSSS
Carlito PP-VDDLILSTEDIANWFSEGPDE--ALRTAPAPV--APTPAASTQAAPAPGTPWPLSSS
Microtus VDAMEDLFL-SQDVANWLEEPNEG--PQMSAAASTAEDPVTEAPAPVTPAPVTSWPLSSS
Oryctolagus PP--VDDLLSAEDVANWLNEDPEE--GLRVPAAPAPEAPAPAAPALAAPAPATSWPLSSS
Nannospalax PNSMEDLLLSPEDVANWLD-DPDE--ALQVPAAAITGDPVTETSAPVAPPPATPWPLSSS
Eospalaxbaileyi PNSMEDLLLSAEDVANWLD-DPDD--ALRMPAAPVTEDPATEASAPVAPPPATPWPLSSS
Eospalaxcansus PNSMEDLLLSAEDVANWLD-DPDD--ALRMPAAPVTEDPTTEASAPVAPPPATPWPLSSS
Mastomys PNSMDNMFL-SPDVVNLLEGPEE---ALQVSAAPAAQDPVTETPAPAAPAPATPWPLSSF
Mus PHCMDDLLL-PQDVEEFFEGPSE---ALRVSGAPAAQDPVTETPGPVAPAPATPWPLSSF
Rattus PNSMEDLFL-PQDVAELLEGPEE---ALQVS-APAAQEPGTEAPAPVAPASATPWPLSSS
: :* :: :. * * : .: * *****
Cricetulus VPSYKTFQGDYGFRLGFLHSGTAKSVTCTYSPSLNKLFCQLAKTCPVQLWVNSTPPPGTR
Carlito VPSQKTYHGNYGFRLGFLHSGTAKSVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTR
Microtus VPSQKTYQGEYGFRLGFLHSGTAKSVTCTYSPSLNKLFCQLAKTCPVQLWVSSTPPPGTR
Oryctolagus VPSQKTYHGNYGFRLGFLHSGTAKSVTCTYSPCLNKLFCQLAKTCPVQLWVDSTPPPGSR
Nannospalax VPSQKTYQGSYGFRLGFLHSGTAKSVTCTYSPPLNKLFCQLAKTCPVQLWVDSTPPPGTR
Eospalaxbaileyi VPSQKTYQGNYGFRLGFLHSGTAKSVTCTYSPCLNKLFCQLAKTCPVQLWVDSTPPPGTR
Eospalaxcansus VPSQKTYQGSYGFRLGFLHSGTAKSVACTYSPCLNKLFCQLAKTCPVQLWVDSTPPPGTR
Mastomys VPSQKTYQGSYGFHLGFLQSGTAKSVMCTYSPSLNKLFCQLAKTCPVQLWVSDTPPAGSR
Mus VPSQKTYQGNYGFHLGFLQSGTAKSVMCTYSPPLNKLFFQLAKTCPVQLWVSATPPAGSR
Rattus VPSQKTYQGNYGFHLGFLQSGTAKSVMCTYSISLNKLFCQLAKTCPVQLWVTSTPPPGTR
*** **::*.***:****:******* **** ***:* ************ *** *:*
Cricetulus VRAMAIYKKLQYMTEVVRRCPHHERSSEGDSLAPPQHLIRVEGNLHAEYLDDKQTFRHSV
Carlito VRAMAIYKQSQYMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRVEYLDDKTTFRHSV
Microtus VRAMAIYKKSQHMTEVVRRCPHHERCSDGDGLAPPQHLIRVEGNLRAEYLDDRQTFRHSV
Oryctolagus VRAMAIYKKSQHMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRAEYLDDRNTFRHSV
Nannospalax VRAMAIYKKSQHMTEVVKRCPHHERCSDSDGLAPPQHLIRVEGNLRAEYLDDKHTFRHSV
Eospalaxbaileyi VRAMAIYKKSQHMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRAEYLDDKHTFRHSV
Eospalaxcansus VRAMAIYKKSQHMTEVVRRCPHHERCSDSDGLAPPQHLIRVEGNLRAEYLDDKHTFRHSV
Mastomys VRAMAIYKKSQHMTEVVRRCPHHERCTDGDGLAPPQHLIRVEGNLNAEYLDDKQTFRHSV
Mus VRAMAIYKKSQHMTEVVRRCPHHERCSDGDGLAPPQHLIRVEGNLYPEYLEDRQTFRHSV
Rattus VRAMAIYKKSQHMTEVVRRCPHHERCSDGDGLAPPQHLIRVEGNPYAEYLDDRQTFRHSV
********: *:*****:*******.::.*.************* ***:*: ******
Cricetulus VVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDPSGNLLGRNSFEVRICA
Carlito VVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCA
Microtus VVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDPSGNLLGRNSFEVRVCA
Oryctolagus VVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCA
Nannospalax VVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCA
Eospalaxbaileyi IVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCA
Eospalaxcansus VVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCA
Mastomys VVPYEPPEVGSDYTTIHYKYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRDSFEVRICA
Mus VVPYEPPEAGSEYTTIHYKYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRDSFEVRVCA
Rattus VVPYEPPEVGSDYTTIHYKYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRDSFEVRVCA
:*******.**: *****:************************ *******:*****:**
Cricetulus CPGRDRRTEEKNFQKKGEPCPELPPKSAKRALPTNTSSS-PPPKKKTLDGEYFTLKIRGH
Carlito CPGRDRRTEEENFRKKGEPCSELPPGSTKRALPTSTSS-PSQPKKKPLDGEYFTLQIRGR
Microtus CPGRDRRTEEENFRKKGEPRPELPVGSTKRVLPTNTS--SPQPKKKPLDGEYFTLKIRGR
Oryctolagus CPGRDRRTEEENFRKKGEPCPELPPGSSKRALPTTTTDSSPQTKKKPLDGEYFILKIRGR
Nannospalax CPGRDRRTEEENFRKKGELCPELPPGSTKRALPTGTSSSPQPKKKP-LDGEYFTLKIRGR
Eospalaxbaileyi CPGRDRRTEEENFRKKGESCPELPPGSTKRALPTDTSSSPQPKKKPLLDGEYFTLKIRGR
Eospalaxcansus CPGRDRRTEEENFRKKGESCPELPPGSTKRALPTGTSSSPQPKKKPLLDGEYFTLKIRGR
Mastomys CPGRDRRTEEENFRKKEEPCPELPLGSAKRALPTGTSAS-PQQKKKRLDGEYFTLKIRGR
Mus CPGRDRRTEEENFRKKEVLCPELPPGSAKRALPTCTSAS-PPQKKKPLDGEYFTLKIRGR
Rattus CPGRDRRTEEENFRKKEEHCPELPPGSAKRALPTSTSSS-PQQKKKPLDGEYFTLKIRGR
**********:**:** *** *:**.*** *: ** ****** *:***:
Cricetulus ERFKMFQELNEALELKDAQASKGSEDNGAHSSYLKSKKGQSASRLKKLMIKREGPDSD
Carlito ERFEMFRELNEALELKDAQAGKEPGGSRAHTSHLKSKKGQSTSRHKKLMFKREGPDSD
Microtus ERFKMFSELNEALELKDAQDANGSGDSRAHSSYLKSKKGQSTSRHKKLMIKREGPDSD
Oryctolagus ERFEMFRELNEALELKDAQAEKEPGGSRAHSSYLKAKKGQSTSRHKKPMFKREGPDSD
Nannospalax ERFEMFRELNEALELKDTQAEKDSGESRAHSSYLKSKKGQSTSRHKKLMIKREGPDSD
Eospalaxbaileyi ERFEMFRELNEALELKDAQAEKESGESRAHSSYLKSKKGQSTSRHKKLMIKREGPDSD
Eospalaxcansus ERFEMFRELNEALELKDAQAEKESGESRAHSSYLKSKKGQSTSRHKKLMIKREGPDSD
Mastomys ERFEMFRELNEALELKDARAAEELGDSRAHSSYLKTKRGQSSSHHKKPMVKKVGPDSD
Mus KRFEMFRELNEALELKDAHATEESGDSRAHSSLQPRAFQ--------ALIKEESPNC-
Rattus ERFEMFRELNEALELKDARAAEESGDSRAHSSLQPRTFQ--------ALIKKESPNC-
:**:** **********:: : . **:* :.*. .*:.
```````
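
As a rough illustration of what a consensus tool such as cons does with an alignment like this one, the Python sketch below takes a simple majority vote per column. This is an assumption-laden simplification: the real cons program uses a scoring matrix and a plurality setting, whereas the hypothetical `majority_consensus` function here only counts residues. The four rows are a 12-column excerpt copied from the alignment block (the columns starting at `VVPYEPPE`).

```python
from collections import Counter


def majority_consensus(rows, gap="-", unknown="x"):
    """Return a consensus string from equal-length aligned rows.

    Assumption: a residue is accepted only if it occurs in more than half
    of the rows; otherwise the `unknown` character is written. Columns
    containing only gaps yield a gap.
    """
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    consensus = []
    for column in zip(*rows):
        counts = Counter(residue for residue in column if residue != gap)
        if not counts:
            consensus.append(gap)
            continue
        residue, n = counts.most_common(1)[0]
        consensus.append(residue if n > len(rows) / 2 else unknown)
    return "".join(consensus)


# Excerpt of four aligned rows (Cricetulus, Eospalaxbaileyi, Mastomys, Mus)
excerpt = [
    "VVPYEPPEVGSD",
    "IVPYEPPEVGSD",
    "VVPYEPPEVGSD",
    "VVPYEPPEAGSE",
]

if __name__ == "__main__":
    print(majority_consensus(excerpt))
```

Columns where no residue reaches a majority come out as `x`, which is the kind of position where an ambiguity-aware tool such as consambig would be more informative for DNA.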