From 1665de4f092544966e74f3644dee6cc312b14254 Mon Sep 17 00:00:00 2001
From: Malvika Sharan <malvika.sharan@embl.de>
Date: Tue, 8 Nov 2016 16:03:29 +0100
Subject: [PATCH] Update EMBOSS_EBI.md

---
 TeachingMaterials/EMBOSS_EBI.md | 81 ++++++++++++++++++++++++++++++++-
 1 file changed, 79 insertions(+), 2 deletions(-)

diff --git a/TeachingMaterials/EMBOSS_EBI.md b/TeachingMaterials/EMBOSS_EBI.md
index f9a0cd2..5da3a0a 100644
--- a/TeachingMaterials/EMBOSS_EBI.md
+++ b/TeachingMaterials/EMBOSS_EBI.md
@@ -14,14 +14,91 @@ Wageningen Bioinformatics Webportal, Netherlands offers [a graphical user interf
 
 ## Example proteins
 
-For alignment tools, we will use human p53 and zebrafish dp53:
+For pairwise alignment tools, we can use human p53 and zebrafish dp53:
 - Human p53: [P04637](http://www.uniprot.org/uniprot/P04637.fasta)
 - Zebrafish tp53: [P79734](http://www.uniprot.org/uniprot/P79734.fasta)
 
-Example proteins for dotmatcher:
+````
+>P53_HUMAN|P04637| Cellular tumor antigen p53 OS=Homo sapiens GN=TP53 PE=1 SV=4
+MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP
+DEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAK
+SVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHE
+RCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
+SCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELP
+PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPG
+GSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD
+
+>P53_DANRE|P79734| Cellular tumor antigen p53 OS=Danio rerio GN=tp53 PE=1 SV=1
+MAQNDSQEFAELWEKNLIIQPPGGGSCWDIINDEEYLPGSFDPNFFENVLEEQPQPSTLP
+PTSTVPETSDYPGDHGFRLRFPQSGTAKSVTCTYSPDLNKLFCQLAKTCPVQMVVDVAPP
+QGSVVRATAIYKKSEHVAEVVRRCPHHERTPDGDNLAPAGHLIRVEGNQRANYREDNITL
+RHSVFVPYEAPQLGAEWTTVLLNYMCNSSCMGGMNRRPILTIITLETQEGQLLGRRSFEV
+RVCACPGRDRKTEESNFKKDQETKTMAKTTTGTKRSLVKESSSATLRPEGSKKAKGSSSD
+EEIFTLQVRGRERYEILKKLNDSLELSDVVPASDAEKYRQKFMTKNKKENRESSEPKQGK
+KLMVKDEGRSDSD
+
+````
+
+For dotmatcher we can use these sequences:
 - ZKSC7_HUMAN: [Q9P0L1](http://www.uniprot.org/uniprot/Q9P0L1.fasta)
 - MPDZ_HUMAN: [O75970](http://www.uniprot.org/uniprot/O75970.fasta)
 
+````
+>ZKSC7_HUMAN|Q9P0L1| Zinc finger protein with KRAB and SCAN domains 7 OS=Homo sapiens GN=ZKSCAN7 PE=1 SV=2
+MTTAGRGNLGLIPRSTAFQKQEGRLTVKQEPANQTWGQGSSLQKNYPPVCEIFRLHFRQL
+CYHEMSGPQEALSRLRELCRWWLMPEVHTKEQILELLVLEQFLSILPGELRTWVQLHHPE
+SGEEAVAVVEDFQRHLSGSEEVSAPAQKQEMHFEETTALGTTKESPPTSPLSGGSAPGAH
+LEPPYDPGTHHLPSGDFAQCTSPVPTLPQVGNSGDQAGATVLRMVRPQDTVAYEDLSVDY
+TQKKWKSLTLSQRALQWNMMPENHHSMASLAGENMMKGSELTPKQEFFKGSESSNRTSGG
+LFGVVPGAAETGDVCEDTFKELEGQTSDEEGSRLENDFLEITDEDKKKSTKDRYDKYKEV
+GEHPPLSSSPVEHEGVLKGQKSYRCDECGKAFNRSSHLIGHQRIHTGEKPYECNECGKTF
+RQTSQLIVHLRTHTGEKPYECSECGKAYRHSSHLIQHQRLHNGEKPYKCNECAKAFTQSS
+RLTDHQRTHTGEKPYECNECGEAFIRSKSLARHQVLHTGKKPYKCNECGRAFCSNRNLID
+HQRIHTGEKPYECSECGKAFSRSKCLIRHQSLHTGEKPYKCSECGKAFNQNSQLIEHERI
+HTGEKPFECSECGKAFGLSKCLIRHQRLHTGEKPYKCNECGKSFNQNSHLIIHQRIHTGE
+KPYECNECGKVFSYSSSLMVHQRTHTGEKPYKCNDCGKAFSDSSQLIVHQRVHTGEKPYE
+CSECGKAFSQRSTFNHHQRTHTGEKSSGLAWSVS
+
+>sp|O75970|MPDZ_HUMAN Multiple PDZ domain protein OS=Homo sapiens GN=MPDZ PE=1 SV=2
+MLEAIDKNRALHAAERLQTKLRERGDVANEDKLSLLKSVLQSPLFSQILSLQTSVQQLKD
+QVNIATSATSNIEYAHVPHLSPAVIPTLQNESFLLSPNNGNLEALTGPGIPHINGKPACD
+EFDQLIKNMAQGRHVEVFELLKPPSGGLGFSVVGLRSENRGELGIFVQEIQEGSVAHRDG
+RLKETDQILAINGQALDQTITHQQAISILQKAKDTVQLVIARGSLPQLVSPIVSRSPSAA
+STISAHSNPVHWQHMETIELVNDGSGLGFGIIGGKATGVIVKTILPGGVADQHGRLCSGD
+HILKIGDTDLAGMSSEQVAQVLRQCGNRVKLMIARGAIEERTAPTALGITLSSSPTSTPE
+LRVDASTQKGEESETFDVELTKNVQGLGITIAGYIGDKKLEPSGIFVKSITKSSAVEHDG
+RIQIGDQIIAVDGTNLQGFTNQQAVEVLRHTGQTVLLTLMRRGMKQEAELMSREDVTKDA
+DLSPVNASIIKENYEKDEDFLSSTRNTNILPTEEEGYPLLSAEIEEIEDAQKQEAALLTK
+WQRIMGINYEIVVAHVSKFSENSGLGISLEATVGHHFIRSVLPEGPVGHSGKLFSGDELL
+EVNGITLLGENHQDVVNILKELPIEVTMVCCRRTVPPTTQSELDSLDLCDIELTEKPHVD
+LGEFIGSSETEDPVLAMTDAGQSTEEVQAPLAMWEAGIQHIELEKGSKGLGFSILDYQDP
+IDPASTVIIIRSLVPGGIAEKDGRLLPGDRLMFVNDVNLENSSLEEAVEALKGAPSGTVR
+IGVAKPLPLSPEEGYVSAKEDSFLYPPHSCEEAGLADKPLFRADLALVGTNDADLVDEST
+FESPYSPENDSIYSTQASILSLHGSSCGDGLNYGSSLPSSPPKDVIENSCDPVLDLHMSL
+EELYTQNLLQRQDENTPSVDISMGPASGFTINDYTPANAIEQQYECENTIVWTESHLPSE
+VISSAELPSVLPDSAGKGSEYLLEQSSLACNAECVMLQNVSKESFERTINIAKGNSSLGM
+TVSANKDGLGMIVRSIIHGGAISRDGRIAIGDCILSINEESTISVTNAQARAMLRRHSLI
+GPDIKITYVPAEHLEEFKISLGQQSGRVMALDIFSSYTGRDIPELPEREEGEGEESELQN
+TAYSNWNQPRRVELWREPSKSLGISIVGGRGMGSRLSNGEVMRGIFIKHVLEDSPAGKNG
+TLKPGDRIVEVDGMDLRDASHEQAVEAIRKAGNPVVFMVQSIINRPRKSPLPSLLHNLYP
+KYNFSSTNPFADSLQINADKAPSQSESEPEKAPLCSVPPPPPSAFAEMGSDHTQSSASKI
+SQDVDKEDEFGYSWKNIRERYGTLTGELHMIELEKGHSGLGLSLAGNKDRSRMSVFIVGI
+DPNGAAGKDGRLQIADELLEINGQILYGRSHQNASSIIKCAPSKVKIIFIRNKDAVNQMA
+VCPGNAVEPLPSNSENLQNKETEPTVTTSDAAVDLSSFKNVQHLELPKDQGGLGIAISEE
+DTLSGVIIKSLTEHGVAATDGRLKVGDQILAVDDEIVVGYPIEKFISLLKTAKMTVKLTI
+HAENPDSQAVPSAAGAASGEKKNSSQSLMVPQSGSPEPESIRNTSRSSTPAIFASDPATC
+PIIPGCETTIEISKGRTGLGLSIVGGSDTLLGAIIIHEVYEEGAACKDGRLWAGDQILEV
+NGIDLRKATHDEAINVLRQTPQRVRLTLYRDEAPYKEEEVCDTLTIELQKKPGKGLGLSI
+VGKRNDTGVFVSDIVKGGIADADGRLMQGDQILMVNGEDVRNATQEAVAALLKCSLGTVT
+LEVGRIKAGPFHSERRPSQSSQVSEGSLSSFTFPLSGSSTSESLESSSKKNALASEIQGL
+RTVEMKKGPTDSLGISIAGGVGSPLGDVPIFIAMMHPTGVAAQTQKLRVGDRIVTICGTS
+TEGMTHTQAVNLLKNASGSIEMQVVAGGDVSVVTGHQQEPASSSLSFTGLTSSSIFQDDL
+GPPQCKSITLERGPDGLGFSIVGGYGSPHGDLPIYVKTVFAKGAASEDGRLKRGDQIIAV
+NGQSLEGVTHEEAVAILKRTKGTVTLMVLS
+
+````
+
+
 ## Quick Demo on EMBOSS tools
 
 ...but before that, re-use/do the Clustal Omega analysis on your set of 10 P53 sequences. (or, go down this document to use my set of sequences ;) !)
-- 
GitLab