Commit ea9303a7 authored by Hugo Carlos

Upload New File

parent 993ec2d3
# Proteins
## Introduction
Proteins are macromolecules made of long chains of amino acid residues. The chains vary in length, and their sequences are usually inferred from the nucleotide sequences of the corresponding genes. Proteins are the building blocks of the body and are involved in a wide range of biological functions within organisms, including DNA replication, catalysis of metabolic reactions, response to stimuli, and interaction with other biomolecules for pathway regulation, stability, transport, localization, or degradation.
## Protein databases
A biological database is an organized collection of a particular type of data compiled from a large number of scientific publications and discoveries: for example, biological sequences, -omics data (transcriptomics, proteomics, metagenomics), specific types of annotation, structural data, chemical compounds, or biological pathways.
Protein databases contain an entry for each protein sequence from all the known proteome sets. Well-known protein databases include the National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) project, UniProtKB/Swiss-Prot, and the DNA Data Bank of Japan (DDBJ) Amino Acid Sequence Database.
Protein records are available mainly in text formats: sequence entries as FASTA and their corresponding annotations as XML. Protein entries are generally linked to external resources, allowing users to find relevant data such as literature (PubMed), genes (NCBI, GenBank database), biological pathways (KEGG database), structures (PDB database), corresponding DNA/RNA sequences, sequence homologs, and expression and variation data.
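To make the FASTA format concrete, here is a minimal Python sketch of a reader. It assumes the usual layout: a header line starting with `>` followed by one or more sequence lines. The function name `parse_fasta` and the two records are invented for illustration and do not come from any database.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines are joined into one string per record.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical records, invented for illustration only.
example = """>demo_protein_1 hypothetical example
ACDEFGHIKL
MNPQRSTVWY
>demo_protein_2 hypothetical example
MKT
"""

records = parse_fasta(example)
for header, sequence in records:
    print(header, len(sequence))
```

In practice a library such as Biopython is usually a better choice; the sketch only shows how little structure the format has.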
## Hands-on sessions on protein databases
#### 1. [National Center for Biotechnology Information - NCBI](
The NCBI interface provides access to several journals and bioinformatics resources.
In this course, we will use several protein-related resources of NCBI.
###### Example proteins:
* **Tumor protein P53**: a tumor suppressor protein in humans, the absence of which allows many cancers to proliferate.
###### Search method:
* Text/term search in [All fields] (simply type in your query)
* Limiting the search using [filters]
  - Organism [ORGN]
  - Source database
  - Genetic component
  - Bio-chemical/physical properties etc.
* Combining multiple search criteria with the Boolean operators AND, OR, NOT
* Browsing by taxonomy (right side of the screen)
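The same kind of search can also be written as a query string and sent to NCBI's E-utilities `esearch` endpoint. The sketch below only builds the query and the URL; it makes no network call. The helper names `build_term` and `build_esearch_url` are hypothetical, and the `retmax` value is an arbitrary choice.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_term(*clauses, operator="AND"):
    """Join field-tagged clauses with a Boolean operator (AND, OR, NOT)."""
    return f" {operator} ".join(f"({clause})" for clause in clauses)


def build_esearch_url(term, db="protein", retmax=20):
    """Return an esearch URL for the given query term (no request is sent)."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Free-text term in all fields, limited to one organism with the [ORGN] tag.
term = build_term("p53[All Fields]", "Homo sapiens[ORGN]")
url = build_esearch_url(term)
print(term)
print(url)
```

The printed term can be pasted into the search box of the web interface, which is a quick way to check that a filter behaves as intended before scripting it.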
###### Select one record of your choice
* Browse the GenPept entry
  - Identical proteins
  - FASTA entry
  - Graphical representation of the features
  - Other linked data
  - Articles
  - Pathways
  - Reference sequences
  - Homologs
  - Related information
  - Link-outs
  - Analysis options (we will explore these later)
  - Domains
  - Sequence features
  - Regular expression
  - Tertiary structure
  - Multiple alignment by COBALT
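The GenPept and FASTA views of a record can also be requested through the E-utilities `efetch` endpoint. The following sketch builds the URL for either view without downloading anything. The helper name `build_efetch_url` is hypothetical, and the accession `NP_000537.3` is used here on the assumption that it is a RefSeq record for human p53; replace it with the accession of the record you selected.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities fetch endpoint.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, rettype="fasta"):
    """Return an efetch URL for a protein record.

    rettype "fasta" requests the FASTA entry, "gp" the GenPept flat file.
    """
    if rettype not in ("fasta", "gp"):
        raise ValueError("rettype must be 'fasta' or 'gp'")
    params = {
        "db": "protein",
        "id": accession,
        "rettype": rettype,
        "retmode": "text",
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


# Assumed example accession; check it against the record chosen in the browser.
fasta_url = build_efetch_url("NP_000537.3", rettype="fasta")
genpept_url = build_efetch_url("NP_000537.3", rettype="gp")
print(fasta_url)
print(genpept_url)
```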
#### 2. [UniProt Knowledgebase](
- Swiss-Prot and TrEMBL
- Cross-references
- Other resources for proteins