From 1665de4f092544966e74f3644dee6cc312b14254 Mon Sep 17 00:00:00 2001
From: Malvika Sharan <malvika.sharan@embl.de>
Date: Tue, 8 Nov 2016 16:03:29 +0100
Subject: [PATCH] Update EMBOSS_EBI.md

---
 TeachingMaterials/EMBOSS_EBI.md | 81 ++++++++++++++++++++++++++++++++-
 1 file changed, 79 insertions(+), 2 deletions(-)

diff --git a/TeachingMaterials/EMBOSS_EBI.md b/TeachingMaterials/EMBOSS_EBI.md
index f9a0cd2..5da3a0a 100644
--- a/TeachingMaterials/EMBOSS_EBI.md
+++ b/TeachingMaterials/EMBOSS_EBI.md
@@ -14,14 +14,91 @@ Wageningen Bioinformatics Webportal, Netherlands offers [a graphical user interf
 
 ## Example proteins
 
-For alignment tools, we will use human p53 and zebrafish dp53:
+For pairwise alignment tools, we can use human p53 and zebrafish tp53:
 
 - Human p53: [P04637](http://www.uniprot.org/uniprot/P04637.fasta)
 - Zebrafish tp53: [P79734](http://www.uniprot.org/uniprot/P79734.fasta)
 
-Example proteins for dotmatcher:
+````
+>P53_HUMAN|P04637| Cellular tumor antigen p53 OS=Homo sapiens GN=TP53 PE=1 SV=4
+MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP
+DEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAK
+SVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHE
+RCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
+SCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELP
+PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPG
+GSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD
+
+>P53_DANRE|P79734| Cellular tumor antigen p53 OS=Danio rerio GN=tp53 PE=1 SV=1
+MAQNDSQEFAELWEKNLIIQPPGGGSCWDIINDEEYLPGSFDPNFFENVLEEQPQPSTLP
+PTSTVPETSDYPGDHGFRLRFPQSGTAKSVTCTYSPDLNKLFCQLAKTCPVQMVVDVAPP
+QGSVVRATAIYKKSEHVAEVVRRCPHHERTPDGDNLAPAGHLIRVEGNQRANYREDNITL
+RHSVFVPYEAPQLGAEWTTVLLNYMCNSSCMGGMNRRPILTIITLETQEGQLLGRRSFEV
+RVCACPGRDRKTEESNFKKDQETKTMAKTTTGTKRSLVKESSSATLRPEGSKKAKGSSSD
+EEIFTLQVRGRERYEILKKLNDSLELSDVVPASDAEKYRQKFMTKNKKENRESSEPKQGK
+KLMVKDEGRSDSD
+
+````
+
+For dotmatcher we can use these sequences:
 - ZKSC7_HUMAN: [Q9P0L1](http://www.uniprot.org/uniprot/Q9P0L1.fasta)
 - MPDZ_HUMAN: [O75970](http://www.uniprot.org/uniprot/O75970.fasta)
 
+````
+>ZKSC7_HUMAN|Q9P0L1| Zinc finger protein with KRAB and SCAN domains 7 OS=Homo sapiens GN=ZKSCAN7 PE=1 SV=2
+MTTAGRGNLGLIPRSTAFQKQEGRLTVKQEPANQTWGQGSSLQKNYPPVCEIFRLHFRQL
+CYHEMSGPQEALSRLRELCRWWLMPEVHTKEQILELLVLEQFLSILPGELRTWVQLHHPE
+SGEEAVAVVEDFQRHLSGSEEVSAPAQKQEMHFEETTALGTTKESPPTSPLSGGSAPGAH
+LEPPYDPGTHHLPSGDFAQCTSPVPTLPQVGNSGDQAGATVLRMVRPQDTVAYEDLSVDY
+TQKKWKSLTLSQRALQWNMMPENHHSMASLAGENMMKGSELTPKQEFFKGSESSNRTSGG
+LFGVVPGAAETGDVCEDTFKELEGQTSDEEGSRLENDFLEITDEDKKKSTKDRYDKYKEV
+GEHPPLSSSPVEHEGVLKGQKSYRCDECGKAFNRSSHLIGHQRIHTGEKPYECNECGKTF
+RQTSQLIVHLRTHTGEKPYECSECGKAYRHSSHLIQHQRLHNGEKPYKCNECAKAFTQSS
+RLTDHQRTHTGEKPYECNECGEAFIRSKSLARHQVLHTGKKPYKCNECGRAFCSNRNLID
+HQRIHTGEKPYECSECGKAFSRSKCLIRHQSLHTGEKPYKCSECGKAFNQNSQLIEHERI
+HTGEKPFECSECGKAFGLSKCLIRHQRLHTGEKPYKCNECGKSFNQNSHLIIHQRIHTGE
+KPYECNECGKVFSYSSSLMVHQRTHTGEKPYKCNDCGKAFSDSSQLIVHQRVHTGEKPYE
+CSECGKAFSQRSTFNHHQRTHTGEKSSGLAWSVS
+
+>sp|O75970|MPDZ_HUMAN Multiple PDZ domain protein OS=Homo sapiens GN=MPDZ PE=1 SV=2
+MLEAIDKNRALHAAERLQTKLRERGDVANEDKLSLLKSVLQSPLFSQILSLQTSVQQLKD
+QVNIATSATSNIEYAHVPHLSPAVIPTLQNESFLLSPNNGNLEALTGPGIPHINGKPACD
+EFDQLIKNMAQGRHVEVFELLKPPSGGLGFSVVGLRSENRGELGIFVQEIQEGSVAHRDG
+RLKETDQILAINGQALDQTITHQQAISILQKAKDTVQLVIARGSLPQLVSPIVSRSPSAA
+STISAHSNPVHWQHMETIELVNDGSGLGFGIIGGKATGVIVKTILPGGVADQHGRLCSGD
+HILKIGDTDLAGMSSEQVAQVLRQCGNRVKLMIARGAIEERTAPTALGITLSSSPTSTPE
+LRVDASTQKGEESETFDVELTKNVQGLGITIAGYIGDKKLEPSGIFVKSITKSSAVEHDG
+RIQIGDQIIAVDGTNLQGFTNQQAVEVLRHTGQTVLLTLMRRGMKQEAELMSREDVTKDA
+DLSPVNASIIKENYEKDEDFLSSTRNTNILPTEEEGYPLLSAEIEEIEDAQKQEAALLTK
+WQRIMGINYEIVVAHVSKFSENSGLGISLEATVGHHFIRSVLPEGPVGHSGKLFSGDELL
+EVNGITLLGENHQDVVNILKELPIEVTMVCCRRTVPPTTQSELDSLDLCDIELTEKPHVD
+LGEFIGSSETEDPVLAMTDAGQSTEEVQAPLAMWEAGIQHIELEKGSKGLGFSILDYQDP
+IDPASTVIIIRSLVPGGIAEKDGRLLPGDRLMFVNDVNLENSSLEEAVEALKGAPSGTVR
+IGVAKPLPLSPEEGYVSAKEDSFLYPPHSCEEAGLADKPLFRADLALVGTNDADLVDEST
+FESPYSPENDSIYSTQASILSLHGSSCGDGLNYGSSLPSSPPKDVIENSCDPVLDLHMSL
+EELYTQNLLQRQDENTPSVDISMGPASGFTINDYTPANAIEQQYECENTIVWTESHLPSE
+VISSAELPSVLPDSAGKGSEYLLEQSSLACNAECVMLQNVSKESFERTINIAKGNSSLGM
+TVSANKDGLGMIVRSIIHGGAISRDGRIAIGDCILSINEESTISVTNAQARAMLRRHSLI
+GPDIKITYVPAEHLEEFKISLGQQSGRVMALDIFSSYTGRDIPELPEREEGEGEESELQN
+TAYSNWNQPRRVELWREPSKSLGISIVGGRGMGSRLSNGEVMRGIFIKHVLEDSPAGKNG
+TLKPGDRIVEVDGMDLRDASHEQAVEAIRKAGNPVVFMVQSIINRPRKSPLPSLLHNLYP
+KYNFSSTNPFADSLQINADKAPSQSESEPEKAPLCSVPPPPPSAFAEMGSDHTQSSASKI
+SQDVDKEDEFGYSWKNIRERYGTLTGELHMIELEKGHSGLGLSLAGNKDRSRMSVFIVGI
+DPNGAAGKDGRLQIADELLEINGQILYGRSHQNASSIIKCAPSKVKIIFIRNKDAVNQMA
+VCPGNAVEPLPSNSENLQNKETEPTVTTSDAAVDLSSFKNVQHLELPKDQGGLGIAISEE
+DTLSGVIIKSLTEHGVAATDGRLKVGDQILAVDDEIVVGYPIEKFISLLKTAKMTVKLTI
+HAENPDSQAVPSAAGAASGEKKNSSQSLMVPQSGSPEPESIRNTSRSSTPAIFASDPATC
+PIIPGCETTIEISKGRTGLGLSIVGGSDTLLGAIIIHEVYEEGAACKDGRLWAGDQILEV
+NGIDLRKATHDEAINVLRQTPQRVRLTLYRDEAPYKEEEVCDTLTIELQKKPGKGLGLSI
+VGKRNDTGVFVSDIVKGGIADADGRLMQGDQILMVNGEDVRNATQEAVAALLKCSLGTVT
+LEVGRIKAGPFHSERRPSQSSQVSEGSLSSFTFPLSGSSTSESLESSSKKNALASEIQGL
+RTVEMKKGPTDSLGISIAGGVGSPLGDVPIFIAMMHPTGVAAQTQKLRVGDRIVTICGTS
+TEGMTHTQAVNLLKNASGSIEMQVVAGGDVSVVTGHQQEPASSSLSFTGLTSSSIFQDDL
+GPPQCKSITLERGPDGLGFSIVGGYGSPHGDLPIYVKTVFAKGAASEDGRLKRGDQIIAV
+NGQSLEGVTHEEAVAILKRTKGTVTLMVLS
+
+````
+
+
 ## Quick Demo on EMBOSS tools
 
 ...but before that, re-use/do the Clustal Omega analysis on your set of 10 P53 sequences. (or, go down this document to use my set of sequences ;) !)
-- 
GitLab