Chapter 4: Using protein sequence databases

Outline
• Protein maturation
• SWISS-PROT entry
  – Protein function
• Protein-related sources

"Mankind is a catalyzing enzyme for the transition from a carbon-based to a silicon-based intelligence." (Gerard Bricogne)

Proteins
• SWISS-PROT, not GenBank, is the central resource
• Maintained by the Swiss Institute of Bioinformatics (SIB) and the European Bioinformatics Institute (EBI)
• A one-man show in the beginning: Amos Bairoch
• Only the most current version of the data is available
• Curation lags behind, hence the automated TrEMBL (TRanslation of EMBL nucleotide sequences)

Bioinformatics pioneers
• Amos Bairoch (Swiss-Prot)
• Elvin Kabat (immunoglobulin sequences)
• Richard J. Roberts (restriction enzymes)
• Victor McKusick (OMIM)

Protein maturation
• From translated ORF to mature protein
• WYSIWYG
• Posttranslational modifications:
  – cut
  – removal
  – chemical modification (methylation)
  – addition of lipid (myristylation)
  – addition of sugar (glycosylation)
• Question: what are other posttranslational modifications?

Final destination of protein
• Attached to the cell membrane
• Secreted outside the cell
• Transported into the periplasm (region near a bacterial cell wall, outside the plasma membrane)
• Transported into an organelle
• Transported into the nucleus

Combination of folds and functions
• Compact and stable 3-D structure
• Several "independent" domains (rigid bricks)
  – few out of thousands
• Domains are identifiable by scaffold sequence signatures

Swiss-Prot background knowledge

SwissProt
• www.expasy.ch/sprot/
• Proteins are shorter than genes.
• Proteins are defined on a single strand, with a clear beginning and ending.
• Posttranslational modification does NOT change the amino acid order.
  – Question: are there any proteins without corresponding nucleotide sequences?
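The slides above describe TrEMBL as a translation of EMBL nucleotide sequences and note that a protein is read from a single strand with a clear beginning and ending. The following is a minimal Python sketch of that idea, not the actual TrEMBL pipeline. It assumes the standard genetic code, a start at the first ATG and an end at the first in-frame stop codon. The function names `translate_orf` and `mature_chain`, and the cut position passed to `mature_chain`, are hypothetical and only for illustration.

```python
# Standard genetic code, written compactly with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_orf(dna: str) -> str:
    """Translate one strand from the first ATG to the first in-frame stop codon."""
    dna = dna.upper().replace("U", "T")
    start = dna.find("ATG")
    if start == -1:
        return ""  # no start codon on this strand
    protein = []
    for pos in range(start, len(dna) - 2, 3):
        # Codons containing anything other than A, C, G, T become "X".
        residue = CODON_TABLE.get(dna[pos:pos + 3], "X")
        if residue == "*":  # stop codon: the protein ends here
            break
        protein.append(residue)
    return "".join(protein)


def mature_chain(protein: str, cut_after: int) -> str:
    """Illustrative 'cut' step: drop the first cut_after residues.

    The order of the remaining amino acids is unchanged.
    """
    return protein[cut_after:]


if __name__ == "__main__":
    precursor = translate_orf("CCATGGCTTGGTAAGG")
    print(precursor)                   # translated ORF
    print(mature_chain(precursor, 1))  # after removing the first residue
```

The sketch only scans the strand it is given. A real annotation would also consider the reverse complement, alternative genetic codes, and experimentally supported cut positions.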
Swiss-Prot entry sections
• Information part
• Bibliographic part
• Functional part
• Feature table
• Sequence part

Tip
• Access to the Swiss-Prot manual

Example: EGF-R
• Epidermal growth factor receptor
• Entry name: sometimes difficult to decipher (e.g. LYSA_DROME), therefore the primary accession number is the identifier
• Secondary accession numbers: historic identifiers

Name and the origin of protein

References

Comments
• Does your BLAST match make sense?

Cross-reference
• Swiss2D page: proteomics link

Cross-references
• Links to other databases
• Open in a new browser window
• PIR: Protein Information Resource of Georgetown University
• PDB/HSSP: link to the X-ray structure of the protein or its homologue

PDB (Protein Data Bank)

Glycosylation patterns

Gene nomenclature

Profiles/domains

InterPro

Database of Interacting Proteins (DIP)

Tip
• Do NOT rely on free services of for-profit companies.

Features
• SIGNAL: signal peptide
• CHAIN: mature peptide chain
• DOMAIN: extracellular domain
• TRANSMEM: transmembrane domain
• DOMAIN: intracellular domain
• REPEAT: motif repeated elsewhere in the protein
• NP_BIND: nucleotide phosphate binding
• BINDING: precise binding site for ATP
• VARSPLIC: alternative splicing
• CONFLICT: errors or polymorphisms
• ACT_SITE, DISULFID, MOD_RES, CARBOHYD…

Modified amino acids
• PIR-RESID database, part of iProClass
• http://pir.georgetown.edu/

Glycan Structure Database
• https://tmat.proteomesystems.com/glycosuite/glycodb

Lipid Bank
• http://lipidbank.jp/

ChemIDPlus
• Molecular query
• http://chem.sis.nlm.nih.gov/chemidplus/

Biochemical pathways
• www.expasy.ch/cgi-bin/search-biochem-index

KEGG
• http://www.genome.ad.jp/kegg/kegg2.html
• Kyoto Encyclopedia of Genes and Genomes

Enzymes
• http://www.brenda.uni-koeln.de/

E. coli genes and metabolism
• http://www.ecocyc.org/

Enzyme nomenclature
• http://www.chem.qmw.ac.uk/iubmb/

Protein structure: PDB
• PDB: http://www.rcsb.org/pdb/Welcome.do
• Requires JavaScript

Protein structure: MMDB
• MMDB (Molecular Modeling Database): http://www.ncbi.nlm.nih.gov/Structure/

Protein structure: SCOP
• SCOP (Structural Classification of Proteins): http://scop.mrc-lmb.cam.ac.uk/scop/

Protein structure: Swiss-Model
• Protein structure homology modeling: http://www.expasy.org/swissmod/SWISSMODEL.html

Specialized protein databases
• Immunogenetics: http://imgt.cines.fr/
• Restriction enzymes: http://rebase.neb.com/rebase/rebase.html
• Carbohydrate-active enzymes: http://afmb.cnrs-mrs.fr/CAZY/
• Proteases: http://merops.sanger.ac.uk/cgi-bin/merops.cgi
• Kinases: http://pkr.sdsc.edu/html/index.shtml
• Nuclear receptors: http://nrr.georgetown.edu/
• Brain: http://senselab.med.yale.edu/senselab/
• Orthologues: http://www.ncbi.nlm.nih.gov/COG/new/

Reverse translation
• http://www.bioinformatics.vg/bioinformatics_tools/reversetranslate.shtml
• V = A, C or G = not T (mnemonic: the next letter after the excluded base, here V after T/U)
• D = A, G or T = not C
• H = A, C or T = not G
• B = C, G or T = not A