From abf691476973f968da098a45ac5cd825110f5999 Mon Sep 17 00:00:00 2001
From: Malvika Sharan <malvika.sharan@embl.de>
Date: Sat, 5 Nov 2016 12:42:36 +0100
Subject: [PATCH] Add new file

---
 TeachingMaterials/protein_database.md | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)
 create mode 100644 TeachingMaterials/protein_database.md

diff --git a/TeachingMaterials/protein_database.md b/TeachingMaterials/protein_database.md
new file mode 100644
index 0000000..043224c
--- /dev/null
+++ b/TeachingMaterials/protein_database.md
@@ -0,0 +1,24 @@
+# Proteins
+
+## Introduction
+
+Proteins are macromolecules made of chains of amino acid residues of varying length, whose sequences are encoded by the nucleotide sequences of the corresponding genes. Proteins are building blocks of the body and are involved in a wide range of biological functions within organisms, including DNA replication, catalysis of metabolic reactions, response to stimuli, and interaction with other biomolecules for pathway regulation, stability, transport, localization or degradation.
+
+## Protein databases
+
+A biological database is an organized collection of a particular type of data, compiled from a large number of scientific publications and discoveries: for example, biological sequences, different -omics data (transcriptomics, proteomics, metagenomics), specific types of annotation, structural data, chemical compounds or biological pathways.
+
+Protein databases contain an entry for each protein sequence in the known proteome sets. A few well-known protein databases are the National Center for Biotechnology Information (NCBI) Reference Sequence project, UniProtKB/Swiss-Prot and the DNA Data Bank of Japan Amino Acid Sequence Database.
+
+Protein records are available mainly as text: sequence entries in FASTA format and their corresponding annotations in XML format. Protein entries are generally linked to external resources, allowing users to find related data such as literature (PubMed), genes (NCBI GenBank database), biological pathways (KEGG database), structures (PDB database), corresponding DNA/RNA sequences, sequence homologs, and expression and variation data.
+
+## Hands-on sessions on protein databases
+
+#### 1. [National Center for Biotechnology Information - NCBI](https://www.ncbi.nlm.nih.gov/)
+#### 2. [UniProt Knowledgebase](https://www.ebi.ac.uk/uniprot)
+##### Swiss-Prot and TrEMBL
+##### Cross-reference
+##### Other resources for proteins
+
+
+
-- 
GitLab