Outline Proteins Bioinformatics pioneers Outline

Chapter 4: Using protein sequence databases
Outline
• Protein maturation
• SWISS-PROT entry
– Protein function
• Protein-related sources
Mankind is a catalyzing enzyme for the transition from a
carbon-based to a silicon-based intelligence.
Gerard Bricogne
Proteins
• For proteins the central resource is SWISS-PROT, NOT GenBank
• Swiss Institute of Bioinformatics (SIB) and
European Bioinformatics Institute (EBI)
• One man show in the beginning – Amos
Bairoch
• Only the most current version of data is
available
• Manual curation lags behind sequencing – the backlog is
covered by automated TrEMBL (TRanslation of EMBL
nucleotide sequences)
Bioinformatics pioneers
• Amos Bairoch (Swiss-Prot)
• Elvin Kabat (immunoglobulin sequences)
• Richard J. Roberts (restriction enzymes)
• Victor McKusick (OMIM)
Protein maturation
• From translated ORF to mature protein
• WYSIWYG
• Posttranslational modifications:
– cut
– removal
– chemical modification (methylation)
– addition of lipid (myristylation)
– addition of sugar (glycosylation)
• Question: what are other posttranslational
modifications?
Final destination of protein
• Attached to cell membrane
• Secreted outside the cell
• Transported into the periplasm (region
near a bacterial cell wall, outside the
plasma membrane)
• Transported into organelle
• Transported into nucleus
Combination of folds and functions
• Compact and stable 3-D structure
• Several „independent“ domains (rigid
bricks) – few out of thousands
• Domains are identifiable by scaffold
sequence signatures
Swiss-Prot
background knowledge
Swiss-Prot
• www.expasy.ch/sprot/
• Proteins are shorter than genes.
• Proteins are defined on a single strand,
with a clear beginning and ending.
• Posttranslational modification does NOT
change the amino acid order.
– Question: are there any proteins without
corresponding nucleotide sequences?
Swiss-Prot entry sections
• Information part
• Bibliographic part
• Functional part
• Feature table
• Sequence part
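A minimal sketch of how the five parts could be separated in a flat-file entry, assuming the two-letter line codes of the Swiss-Prot text format. The assignment of line codes to the slide's parts (for example CC, DR and KW to the functional part) is an assumption made for illustration, and the entry below is a toy with placeholder values, not a real database record.

```python
# Assumed mapping from flat-file line codes to the parts named on the slide.
SECTION_OF = {}
for codes, name in [
    (("ID", "AC", "DT", "DE", "GN", "OS", "OG", "OC"), "information"),
    (("RN", "RP", "RC", "RX", "RA", "RT", "RL"), "bibliographic"),
    (("CC", "DR", "KW"), "functional"),
    (("FT",), "feature table"),
    (("SQ", "  "), "sequence"),  # sequence data lines start with blanks
]:
    for code in codes:
        SECTION_OF[code] = name

# Toy entry: every value is a placeholder, not taken from a real record.
TOY_ENTRY = """\
ID   TOY_HUMAN      STANDARD;      PRT;    12 AA.
AC   P00000;
DE   Hypothetical toy protein.
OS   Homo sapiens (Human).
RN   [1]
RA   Doe J.;
CC   -!- FUNCTION: Placeholder comment.
DR   PDB; 0XXX; X-ray.
KW   Hypothetical keyword.
FT   CHAIN         1     12       Toy chain.
SQ   SEQUENCE   12 AA;
     MKTAYIAKQR QI
//
"""


def split_sections(text):
    """Group the lines of one entry by the part they belong to."""
    sections = {}
    for line in text.splitlines():
        code = line[:2]
        if code == "//":  # end-of-entry marker
            break
        name = SECTION_OF.get(code)
        if name is None:  # unknown or empty line: ignore in this sketch
            continue
        sections.setdefault(name, []).append(line[5:])
    return sections


sections = split_sections(TOY_ENTRY)
protein_sequence = "".join(sections["sequence"][1:]).replace(" ", "")
```

Here `sections` maps each part to its lines in order of appearance, and `protein_sequence` joins the data lines that follow the SQ header.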
Tip
Example: EGF-R
• Epidermal growth factor receptor
Access to Swiss-Prot manual
• Entry name – sometimes difficult to
decipher (e.g. LYSA_DROME), therefore
the primary accession number is the
identifier
• Secondary accession numbers – historic
identifiers
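The difference between entry name and accession numbers can be sketched as follows. The helper names are hypothetical, the accession numbers in the usage line are placeholders, and the sketch assumes a single AC line with the primary accession listed first.

```python
def split_entry_name(entry_name):
    """Split an entry name such as LYSA_DROME into protein and species codes."""
    protein, _, species = entry_name.partition("_")
    return protein, species


def parse_ac_line(line):
    """Return (primary, [secondary, ...]) from one 'AC   ...' line.

    Assumes the first accession on the line is the primary one; real
    entries may continue over several AC lines, which this sketch ignores.
    """
    accessions = [a.strip() for a in line[5:].split(";") if a.strip()]
    return accessions[0], accessions[1:]


protein_code, species_code = split_entry_name("LYSA_DROME")
# Placeholder accession numbers, for illustration only.
primary, secondary = parse_ac_line("AC   P00000; Q00001; Q00002;")
```

The entry name is only a mnemonic and may change; a script that stores identifiers should keep the primary accession number.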
Name and the origin of protein
References
Comments
• Does your BLAST match make sense?
Cross-reference
SWISS-2DPAGE – proteomics link
Cross-references
• Links to other databases
• Open in a new browser window
• PIR - Protein Information Resource of
Georgetown University
• PDB/HSSP – link to X-ray structure of
protein/its homologue
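Cross-references can also be collected by a script instead of being followed by hand. The sketch below assumes DR lines of the form `DR   DATABASE; IDENTIFIER; ...` and uses placeholder identifiers; the function name is hypothetical.

```python
from collections import defaultdict


def cross_references(lines):
    """Map each cross-referenced database to the identifiers listed for it."""
    refs = defaultdict(list)
    for line in lines:
        if not line.startswith("DR   "):
            continue  # not a cross-reference line
        fields = [f.strip() for f in line[5:].rstrip(".").split(";")]
        if len(fields) < 2:
            continue  # malformed line: skip in this sketch
        database, identifier = fields[0], fields[1]
        refs[database].append(identifier)
    return dict(refs)


# Toy lines with placeholder identifiers, not taken from a real entry.
toy_lines = [
    "DR   PIR; X00000; XXXXXX.",
    "DR   PDB; 0XXX; X-ray.",
    "DR   PDB; 0YYY; X-ray.",
    "CC   -!- FUNCTION: Placeholder comment.",
]
links = cross_references(toy_lines)
```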
PDB: Protein Data Bank
Glycosylation patterns
Gene nomenclature
Profiles/domains
InterPro
Database of Interacting Proteins (DIP)
Tip
• Do NOT rely on free services of for-profit companies.
Features
• SIGNAL – signal peptide
• CHAIN – mature peptidic chain
• DOMAIN – extracellular domain
• TRANSMEM – transmembrane domain
• DOMAIN – intracellular domain
• REPEAT – motif repeated elsewhere in the protein
• NP_BIND – nucleotide phosphate binding
• BINDING – precise binding site for ATP
• VARSPLIC – alternative splicing
• CONFLICT – errors or polymorphisms
• ACT_SITE, DISULFID, MOD_RES, CARBOHYD…
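Feature positions make it possible to cut the mature chain out of the translated sequence. The following is a sketch under stated assumptions: FT lines carry a key followed by 1-based, inclusive start and end positions, and the toy sequence and coordinates are invented for illustration. Lines with non-numeric positions (for example uncertain ends) are skipped.

```python
def parse_features(lines):
    """Return (key, start, end) tuples from 'FT' lines with numeric positions."""
    features = []
    for line in lines:
        if not line.startswith("FT   "):
            continue
        fields = line[5:].split()
        if len(fields) < 3:
            continue
        key, start, end = fields[0], fields[1], fields[2]
        if start.isdigit() and end.isdigit():
            features.append((key, int(start), int(end)))
    return features


def feature_sequence(sequence, start, end):
    """Extract a feature; positions are 1-based and inclusive."""
    return sequence[start - 1:end]


# Toy data: invented sequence and coordinates, not a real entry.
toy_sequence = "MKTAYIAKQRQI"
toy_ft = [
    "FT   SIGNAL        1      4",
    "FT   CHAIN         5     12       Toy mature chain.",
    "FT   TRANSMEM      7     10       Potential.",
]
features = parse_features(toy_ft)
chain = next(f for f in features if f[0] == "CHAIN")
mature = feature_sequence(toy_sequence, chain[1], chain[2])
```

In this toy case `mature` is the translated sequence without its first four residues, which the SIGNAL line marks as the signal peptide.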
Modified amino acids
• PIR-RESID database, part of iProClass
• http://pir.georgetown.edu/
Glycan Structure Database
• https://tmat.proteomesystems.com/glycosuite/glycodb
Lipid Bank
• http://lipidbank.jp/
ChemIDPlus
• Molecular query
• http://chem.sis.nlm.nih.gov/chemidplus/
Biochemical pathways
• www.expasy.ch/cgi-bin/search-biochem-index
KEGG
• http://www.genome.ad.jp/kegg/kegg2.html
• Kyoto Encyclopedia of Genes and Genomes
Enzymes
• http://www.brenda.uni-koeln.de/
E.coli genes and metabolism
• http://www.ecocyc.org/
Enzyme nomenclature
• http://www.chem.qmw.ac.uk/iubmb/
Protein structure: PDB
• PDB http://www.rcsb.org/pdb/Welcome.do
• Requires JavaScript
Protein structure: MMDB
• MMDB MacroMolecular 3D DataBase
http://www.ncbi.nlm.nih.gov/Structure/
Protein structure: SCOP
• SCOP Structural Classification of Proteins
http://scop.mrc-lmb.cam.ac.uk/scop/
Protein structure: Swiss-Model
• Protein structure homology modeling
http://www.expasy.org/swissmod/SWISSMODEL.html
Specialized protein databases
• Immunogenetics: http://imgt.cines.fr/
• Restriction enzymes:
http://rebase.neb.com/rebase/rebase.html
• Carbohydrate-active enzymes: http://afmb.cnrs-mrs.fr/CAZY/
• Proteases: http://merops.sanger.ac.uk/cgi-bin/merops.cgi
• Kinases: http://pkr.sdsc.edu/html/index.shtml
• Nuclear receptors: http://nrr.georgetown.edu/
• Brain: http://senselab.med.yale.edu/senselab/
• Orthologues:
http://www.ncbi.nlm.nih.gov/COG/new/
Reverse translation
• http://www.bioinformatics.vg/bioinformatics_tools/reversetranslate.shtml
• V = A, C or G = Not T (each code is the letter after the
excluded base; V follows U, the RNA counterpart of T)
• D = A, G or T = Not C
• H = A, C or T = Not G
• B = C, G or T = Not A
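The ambiguity codes above can be put to work in a small reverse-translation sketch. It assumes the standard genetic code and the IUPAC nucleotide codes, and it collapses the codons of each amino acid position by position. This is a simplification: for amino acids with six codons (Leu, Ser, Arg) the resulting degenerate codon also matches codons of other amino acids. The function names are hypothetical and do not reproduce the tool behind the URL above.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset(bases): code
    for code, bases in {
        "A": "A", "C": "C", "G": "G", "T": "T",
        "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
        "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    }.items()
}


def degenerate_codon(amino_acid):
    """Collapse all codons of one amino acid into one degenerate codon."""
    codons = [c for c, aa in CODON_TABLE.items() if aa == amino_acid]
    if not codons:
        raise ValueError(f"unknown amino acid: {amino_acid!r}")
    return "".join(
        IUPAC[frozenset(codon[i] for codon in codons)] for i in range(3)
    )


def reverse_translate(peptide):
    """Reverse-translate a one-letter peptide into a degenerate DNA string."""
    return "".join(degenerate_codon(aa) for aa in peptide.upper())


example = reverse_translate("MI")  # Met has one codon, Ile has ATT/ATC/ATA
```

For isoleucine the third position covers A, C and T, so the codon becomes ATH, using the "not G" code from the list above.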