Next and third generation of high-throughput sequencing
NEXT AND THIRD GENERATION OF HIGH-THROUGHPUT SEQUENCING
Next generation (HTS-NG): PCR amplification to boost the signal
- Sequencing with amplification on beads (Roche/454 FLX)
- Sequencing by synthesis (Illumina/Solexa Genome Analyzer)
- Sequencing by ligation (Applied Biosystems SOLiD System)
Third generation: sequencing without amplification
- HeliScope (tSMS, true single molecule sequencing, 2007)
- SMRT (single molecule real time sequencer)
- RNAP (single molecule real time sequencer)
- Nanopore DNA sequencer
- Ion Torrent DNA sequencer
DNA SEQUENCING
http://universe-review.ca/R11-16-DNAsequencing.htm
NEXT-GENERATION SEQUENCING (NG-HTS)
Common features of NG-HTS:
- Complexity of the enzymatic reactions, chemistry, software and hardware, optics, etc.
- Straightforward preparation of libraries (samples) before sequencing.
- Preparation of DNA fragment libraries by attaching platform-specific extensions (linkers), followed by PCR amplification.
- PCR amplification of single-stranded library fragments and sequencing of the amplified fragments (Roche: emulsion beads; ABI: bead clones; Illumina: bridge clones).
General concepts for clonal-array generation and sequencing
a | Bead-chips. Genomic DNA is fragmented and adaptors are ligated to create an insert library that is flanked by two universal priming sites. Because of the random fragmentation, the complexity of this signature sequence library is equivalent to the genome. This library is cloned on beads using emulsion PCR technology. A water-in-oil emulsion is created from a PCR mix that contains a limiting dilution of DNA and beads. The emulsion creates micro-compartments with, on average, a single bead and a single DNA template each. After PCR, beads with clones are affinity selected and assembled onto a planar substrate. A subsequent cycle-sequencing reaction is used to read out the sequence on the clones (Roche).
b | Sequencing by synthesis (SBS).
A common anchor primer is annealed to a constant sequence (universal priming site) that is contained within the library clones that are located on the polony (clonal bead) array (the orientation of the immobilized target might vary depending on the platform that is used). The sequence is read out by polymerase extension in a base-by-base fashion using either reversible terminators or sequential nucleotide addition (pyrosequencing). After incorporation of a single base or base type, the incorporated base is identified by fluorescence (laser) or chemiluminescence (no laser required) (Illumina-Solexa).
c | Sequencing by ligation. The polony array set-up is similar to SBS, in which a common primer is annealed to an arrayed polony library and used to read out the sequence through a stepwise ligation of random oligomers. The labelled oligomers are designed to have random bases inserted at every site except the query site. The query site has one of four base substitutions, each matched to a particular fluorescent label on the oligonucleotide. After read-out of each ligation event, the primer and the ligated oligomer are stripped, a new primer is reannealed and the process is repeated with an oligomer that contains a query base at a different position (ABI).
Fan et al. Nature Reviews Genetics 7, 632–644 (August 2006) | doi:10.1038/nrg1901
COMPARISON OF NG-HTS PLATFORMS
General concepts of clonal-array preparation and sequencing: bead chips, sequencing by synthesis, sequencing by ligation
Fan et al. Nature Reviews Genetics 7, 632–644 (August 2006) | doi:10.1038/nrg1901
Roche/454 FLX Pyrosequencer
Library fragments are mixed with agarose beads that carry oligos complementary to the adapter sequences on the library. Each bead is associated with a single fragment. Each fragment-bead complex is isolated into an individual oil:water micelle together with PCR mixture. Thermal cycling of the micelles (emulsion PCR) produces amplified unique sequences on the bead surface.
“En masse” sequencing of PCR products on picotiter plates (PTP) with a single bead in each picowell. Enzyme/substrate-containing beads for the pyrosequencing reaction are added to the wells, which act as flow cells for the addition of individual pure nucleotide solutions. The CCD camera records the light emitted at each bead.
Mardis E.R. Annual Review of Genomics and Human Genetics 9: 387-403 (2008).
Principles of Pyrosequencing
Watson and Crick readout after pyrosequencing
Timeline of the pyrosequencing development
- October 2005: Release of the Genome Sequencer 20, the first next-generation sequencing system on the market
- October 2005: Collaboration agreement signed with Roche Diagnostics
- January 2007: Release of the Genome Sequencer FLX System
- March 2007: Roche Diagnostics completes integration with 454 Life Sciences
- May 2007: Complete sequence of Jim Watson published in Nature. First genome to be sequenced for less than $1 million.
- November 2007: Announcement of the 100th peer-reviewed publication enabled by 454 Sequencing
- June 2008: 454 joins the 1000 Genome Project, an international effort to build the most detailed map to date of human genetic variation as a tool for medical research
- September 2008: Announcement of the 250th peer-reviewed publication enabled by 454 Sequencing
- October 2008: Release of Genome Sequencer FLX Titanium Series reagents, featuring 1 million reads at 400 base pairs in length
Illumina sequencing technology is based on arrays of randomly assembled glass (silica) beads. The beads have oligonucleotides covalently attached to the surface; each bead has about one million oligos on its surface, and all oligos on a given bead have the same sequence.
Attached DNA fragments are extended and bridge amplified to create an ultra-high-density sequencing flow cell with 80-100 million clusters, each containing ~1,000 copies of the same template.
These templates are sequenced using a robust four-color DNA sequencing-by-synthesis technology that employs reversible terminators with removable fluorescent dyes. This novel approach ensures high accuracy and true base-by-base sequencing, eliminating sequence-context-specific errors and enabling sequencing through homopolymers and repetitive sequences.
The beads are randomly assembled on the arrays, and the location of a particular probe is initially unknown; a process called decoding is used to find the location of each bead.
Illumina sequencing by synthesis
ABI: ligation-mediated sequencing
Mardis E.R. Annual Review of Genomics and Human Genetics 9: 387-403 (2008).
Structure of detector oligonucleotides
- Detector oligonucleotides (DO) are 8-mers fluorescently labeled on 3' end. DOs can't be too short, otherwise T4 ligase would not recognize them as a substrate.
- The first two nucleotides determine the colour of the fluorophore. The colour table shows the relationship between dinucleotides and fluorophores. Four different dinucleotides (256 different oligonucleotides) correspond to each fluorophore. If the first or the second nucleotide of the dinucleotide is known, the colour unambiguously identifies the other nucleotide.
- The next three positions are degenerate nucleotides: 64 different versions for each particular dinucleotide. When ligated to the sequencing primer, only one of these 64 versions fits the position.
- The last three positions are universal bases; they are the same for all detector oligonucleotides.
- Altogether, there are 1024 different detector oligos: 4^(2 + 3) = 4^5 = 1024 (dinucleotide plus 3 degenerate positions).
- Dark oligonucleotides have the same internal structure, but have no fluorophores.
seq.molbiol.ru/sch_seq_ligase.html
The principles of 2-base encoding/decoding
Mardis E.R. Annual Review of Genomics and Human Genetics 9: 387-403 (2008).
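The 2-base encoding described above can be sketched in a few lines of Python. This is a minimal illustration, not the vendor's software: it assumes the commonly described colour-space scheme in which each base gets a 2-bit code and the colour of a dinucleotide is the XOR of the two codes. The colours are numbered 0-3 here rather than named after dyes, because the slide's colour table is not reproduced in the transcript; the function names are hypothetical.

```python
# Assumed scheme: 2-bit base codes, colour of a dinucleotide = XOR of its codes.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_colours(seq):
    """Translate a base sequence into colours, one per overlapping dinucleotide."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(seq, seq[1:])]


def decode_colours(first_base, colours):
    """Recover the bases from the colours, given the known first base.

    Knowing one base of a dinucleotide plus its colour fixes the other base.
    """
    bases = [first_base]
    for colour in colours:
        bases.append(BITS_TO_BASE[BASE_TO_BITS[bases[-1]] ^ colour])
    return "".join(bases)


def dinucleotides_per_colour():
    """Group the 16 dinucleotides by colour (four per colour)."""
    table = {colour: [] for colour in range(4)}
    for a in "ACGT":
        for b in "ACGT":
            table[BASE_TO_BITS[a] ^ BASE_TO_BITS[b]].append(a + b)
    return table


if __name__ == "__main__":
    colours = encode_colours("ATGGC")
    print(colours)                       # colour calls for the read
    print(decode_colours("A", colours))  # bases recovered from the colours
    print(dinucleotides_per_colour())
```

Because every base after the first is derived from the previous one, a single wrong colour call changes all downstream bases during decoding, which is why the known first base (from the primer) and the overlapping dinucleotide reads matter.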
THIRD-GENERATION SEQUENCING
- HeliScope (tSMS, true single molecule sequencing, 2007)
- SMRT (single molecule real time sequencer)
- RNAP (single molecule real time sequencer)
- Nanopore DNA sequencer
- Ion Torrent DNA sequencer
THIRD-GENERATION SEQUENCING: tSMS (HeliScope), TRUE SINGLE-MOLECULE SEQUENCING
- Library preparation by random fragmentation of DNA
- Addition of poly-A tails
- Hybridization to oligo-dT in flow cells
- Sequencing by adding individual fluorescent nucleotides
- Signal readout
- Reads are 55 bp long; 8 days for 28 Gb in a single run
tSMS (HeliScope): SIGNAL READOUT IN SINGLE-MOLECULE SEQUENCING
An image produced by the HeliScope single-molecule sequencer. The magnified view shows a DNA molecule that incorporated a “G” nucleotide in this cycle.
http://www.helicosbio.com/Technology/TrueSingleMoleculeSequencing/tabid/64/Default.aspx
SMRT (single molecule real time sequencer)
SMRT cells each contain thousands of zero-mode waveguides (ZMWs). A single DNA polymerase molecule is attached to the bottom of each ZMW, which makes it possible to observe polymerization by an individual DNA polymerase molecule. The synthesis of the nascent DNA molecule is observed using fluorescently labelled dNTPs.
NANOPORE SEQUENCING
No dye labelling of nucleotides and no CCD detection. The method exploits the translocation of DNA through nanopores and the associated electrical signal readouts. A nucleotide blocks the ionic current through the nanopore, and each nucleotide blocks it for a different length of time.
Nanopore: alpha-hemolysin, covalently bound to a cyclodextrin molecule
ION TORRENT SEQUENCING TECHNOLOGY
PostLight™ sequencing technology on semiconductor chips. No optical instruments are needed for detection. A direct link between chemical and digital information.
MECHANISM OF NUCLEOTIDE INCORPORATION IN ION TORRENT SEQUENCING
In nature, the incorporation of each nucleotide into a DNA strand releases a hydrogen ion (H+).
Ion Torrent uses a high-density array of micro-machined wells in which nucleotide incorporation takes place in a massively parallel fashion. Each well contains its own DNA molecule. Beneath the well is an ion-sensitive layer (sensor) that records the pH changes caused by the released protons. Because the nucleotides are added sequentially, we know which nucleotide caused the pH change.
SIGNAL RECORDING DURING NUCLEOTIDE INCORPORATION
APPLICATIONS OF NEXT-GENERATION SEQUENCING
Since 2005, HTS-NG has brought a revolution in genome research (including plant, animal and human genomes).
In 2003, the NHGRI (National Human Genome Research Institute) projected a 100-fold reduction in cost per bp within 5 years and a 10,000-fold reduction in cost per bp within 10 years, which would lead to the “$1,000 genome”.
SNP genotyping (Illumina)
CNV genotyping
Fan et al. Nature Reviews Genetics 7, 632–644 (August 2006) | doi:10.1038/nrg1901
RNA-seq
- Sequencing of cDNA with NG-HTS
- Direct sequencing of RNA with third-generation methods
ChIP-seq
Summary
• Next-generation high-throughput sequencing makes it possible to identify DNA sequences at the level of the whole genome, with single-base-pair resolution.
• From each sample, an adaptor-ligated library is prepared that contains all DNA or RNA (cDNA) fragments present in the sample.
• All platforms are based on adaptor ligation and amplification, but use different sequencing approaches:
- Pyrosequencing (Roche-Nimblegen)
- Sequencing by synthesis (Illumina-Solexa)
- Sequencing by ligation (ABI)
• Methods that do not require amplification before sequencing have also been developed.
Bioinformatics
www.gwumc.edu
http://bioinformatics.ubc.ca/about/what_is_bioinformatics/images/computer.gif
Bioinformatics (Wikipedia)
Making sense of the huge amounts of DNA data produced by gene sequencing projects. Bioinformatics and computational biology involve the use of techniques from applied mathematics, informatics, statistics, and computer science to solve biological problems.
Research in computational biology often overlaps with systems biology. Major research efforts in the field include sequence alignment, gene finding, genome assembly, protein structure alignment, protein structure prediction, prediction of gene expression and protein-protein interactions, and modeling. The terms bioinformatics and computational biology are often used interchangeably, although the former typically focuses on algorithm development and specific computational methods, while the latter focuses more on hypothesis testing and discovery in the biological domain.
Bioinformatics
More hypothesis-driven research in computational biology. More technique-driven research in bioinformatics. A common thread in projects in bioinformatics and computational biology is the use of mathematical tools to extract useful information from noisy data produced by high-throughput biological techniques. A representative problem in bioinformatics is the assembly of high-quality DNA sequences from fragmentary "shotgun" DNA sequencing. In computational biology, a representative problem might be statistical testing of a hypothesis of common gene regulation using data from mRNA microarrays or mass spectrometry.
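To make the shotgun-assembly problem mentioned above concrete, here is a toy sketch in Python. It assumes error-free reads from a single strand and uses a simple greedy merge of the best suffix-prefix overlap; the function names and the `min_len` threshold are hypothetical, and real assemblers use far more elaborate graph-based methods and must cope with sequencing errors, repeats and reverse complements.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap.

    Toy illustration only: no error tolerance, no reverse complements,
    and reads fully contained in other reads are not treated specially.
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge; return the remaining contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACG", "TACGGA"]))
```

The function returns a list of contigs rather than a single string, since reads without sufficient overlap cannot be joined.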