qPCR measures quantities of target sequences
qPCR stands for quantitative PCR
Targets can be DNA, cDNA or RNA:
- DNA: is the target present in a sample? How many copies are in a sample? -> copy number analysis
  ! Plasmids need to be linearized for qPCR
- cDNA: quantify expression levels -> gene expression analysis
  expression = amount of mRNA in a sample
  mRNA is reverse transcribed into cDNA to prevent degradation
- RNA: quantify expression levels of small non-coding RNA -> miRNA profiling
  http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0009545

The qPCR protocol
Growing the samples -> RNA extraction -> DNase treatment -> cDNA synthesis -> Primer design -> qPCR

Starting with good samples: RNA extraction
Phenol/chloroform:
- High yield
- Better with troublesome tissues
- High gDNA contamination
- Residual chemicals may inhibit PCR
Spin columns:
- Lower yield
- Low gDNA contamination
- Clean
- Small RNAs are not retained
mRNA or miRNA extraction? -> different kits
Recommended: use a kit that can extract both, e.g. Qiagen miRNeasy, and use the same samples for mRNA and miRNA profiling

Quantifying RNA
- Spectrometry: UV spec or Nanodrop
  Highly variable (don't trust absorptions < 0.1)
  No distinction between RNA and DNA
- Microfluidic analytics: Agilent Bioanalyzer, BioRad's Experion
  More accurate
  Specifically measures RNA
- Fluorescent dye detection: RiboGreen dye
  Very sensitive
  Expensive
  Specifically measures RNA
! Always use the same method of quantification

Checking RNA quality
RNA quality:
- Integrity: RNA breaks down easily
- Purity: RNA can contain inhibitors of the qPCR reaction
! Don't aim for high quality, aim for equal quality between samples
Checking RNA integrity by microfluidic electrophoresis
- Evaluates integrity of 18S and 28S rRNA
- You look at rRNA while you're interested in mRNA
- If you see the 18S and 28S peaks there's no degradation
http://www.iata.csic.es/IATA/usct/geno/Doc/Bibliografia/Quality%20assessment%20of%20total%20RNA.pdf

Checking RNA integrity by 5'-3' mRNA ratio
- Perform reverse transcription on RNA
- Use qPCR to measure HPRT1 in cDNA using two primer sets: one at the 5' end (Cq 5') and one at the 3' end (Cq 3')
- No degradation: Cq 5' = Cq 3'
- Degradation: Cq 5' <<< Cq 3'
- HPRT1 is a reference gene with stable, low expression levels
- HPRT1 is expressed in all organisms
[Figure: HPRT1 mRNA from 5' end to poly(A) tail at the 3' end, with the two amplicons]

Inhibitors distort the qPCR reaction
- No inhibitors: normal amplification curve
- Inhibitors (e.g. phenol, ethanol...): shift Cq to the right

Checking RNA purity by SPUD assay
- SPUD = potato gene with no homology to any known sequence
- Add equal amounts of SPUD to each RNA sample
- Create controls:
  positive: SPUD + heparin (a known inhibitor)
  negative: SPUD + water
- Perform qPCR to measure SPUD

  Reaction         Cq
  SPUD + water     22
  SPUD + heparin   27
  SPUD + RNA1      22
  SPUD + RNA2      26
  SPUD + RNA3      22

- Clean samples have the same Cq as water
- Samples with inhibitors have a higher Cq
- ΔCq > 1: presence of inhibitors

Starting with good samples: DNase treatment
- To remove genomic DNA contamination from mRNA samples
- Recommended but never 100% efficient
- Solutions:
  • Intron or exon-exon junction spanning primers (not always feasible): gDNA will generate a longer or no PCR product
  • No RT control to detect gDNA contamination

Starting with good samples: cDNA synthesis
- To transform RNA into more stable cDNA
- Prepare all cDNAs that you are going to use in a single batch
- Primers: Biogazelle uses a mix of random and oligo-dT primers
- Validation: create an RNA dilution series, synthesize cDNA from each dilution, run qPCR

  RNA dilution   1    1/4   1/16   1/64   1/256
  Cq of cDNA     18   20    22     24     26

  Cqs follow the dilution series: good cDNA synthesis
- cDNA synthesis + DNase treatment directly on cells: Ambion Cells-to-CT kit

Measuring the cDNA concentration
Quantification of cDNA is not required
- After cDNA synthesis you don't have pure cDNA but a mixture of RNA, cDNA and dNTPs
- A spectrometry-based method will measure the mixture, not the cDNA
- Better to quantify RNA after purification (fewer contaminants)
- Use equal amounts of RNA for cDNA synthesis

Different ways to measure amount of PCR product
- Fluorescent dyes: e.g. SYBR Green, EvaGreen...
- Fluorophore-containing DNA probes: e.g. TaqMan...

Fluorescent dye binds to ds DNA
- Denaturation: dye is released and fluorescence decreases
- Polymerization: primers anneal and ds PCR product is formed
- Dye binds to the ds product and fluorescence increases
IDT pros and cons of both methods
http://www.slideshare.net/idtdna/technical-tips-for-qpcr

Plot of the fluorescent signal (Rn) during PCR
- Each PCR cycle: PCR product x2 => signal x2
[Figure: Rn versus cycle. Baseline: signal of unbound dye, not enough product to detect signal. Exponential phase: signal becomes detectable. Plateau: reagents run out.]
Complete introduction to qPCR
http://pathmicro.med.sc.edu/pcr/realtime-home.htm

Threshold for the transition of baseline to exponential phase
- ∆Rn = Rn - baseline
- Cq = cycle at which ∆Rn crosses the threshold
- ABI instruments use different thresholds for different plates: not OK
! Use the same threshold on every plate

Cq depends on initial amount of target in sample
Signal becomes detectable at 32 copies of PCR product

  Sample   Initial   After     After     After     After     After
  ID       amount    cycle 1   cycle 2   cycle 3   cycle 4   cycle 5
  A        1         2         4         8         16        32
  B        2         4         8         16        32        64
  C        4         8         16        32        64        128
  D        8         16        32        64        128       256

Cq(A) = 5, Cq(B) = 4, Cq(C) = 3, Cq(D) = 2
- B/A = 2: two-fold difference in initial amount of target => Cq(A) - Cq(B) = 1
- C/A = 4: four-fold difference in initial amount of target => Cq(A) - Cq(C) = 2
- D/A = 8: eight-fold difference in initial amount of target => Cq(A) - Cq(D) = 3

qPCR generates Relative Quantities (RQ)
- ∆Cq = difference between 2 Cqs
- Low Cq = more DNA/RNA, high Cq = less DNA/RNA
[Figure: Cq axis running from more DNA/RNA (low Cq) to less DNA/RNA (high Cq)]
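The doubling table above can be reproduced with a few lines of Python. This is only a sketch under the slide's own simplifying assumptions: the product exactly doubles each cycle and the signal becomes detectable at 32 copies. The function names are made up for this illustration.

```python
def cycles_to_threshold(initial_copies, threshold_copies=32, efficiency=2.0):
    """Count PCR cycles until the product reaches the detection threshold.

    Assumes ideal amplification: the product is multiplied by `efficiency`
    (2 = perfect doubling) in every cycle.
    """
    copies = initial_copies
    cycles = 0
    while copies < threshold_copies:
        copies *= efficiency
        cycles += 1
    return cycles


def fold_difference(delta_cq, efficiency=2.0):
    """Convert a Cq difference into a fold difference in starting amount."""
    return efficiency ** delta_cq


# Initial amounts of samples A-D, taken from the table above.
samples = {"A": 1, "B": 2, "C": 4, "D": 8}
cq = {name: cycles_to_threshold(amount) for name, amount in samples.items()}
print(cq)  # {'A': 5, 'B': 4, 'C': 3, 'D': 2}

# A Cq difference of 3 between A and D corresponds to an 8-fold difference.
print(fold_difference(cq["A"] - cq["D"]))  # 8.0
```

Real instruments report fractional Cq values and real assays rarely double perfectly, which is why amplification efficiencies are estimated per primer pair later in the course.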
Check on qPCR instrument
- Detection of abnormal amplification: visual inspection of the curves
http://www.slideshare.net/idtdna/troubleshooting-qpcr-what-are-my-amplification-curves-telling-me-33646672

Generation of melting curves
- After qPCR: T gradually ↑ => PCR product melts => dye is released => signal ↓

Inspection of melting curves (-dF/dT)
- OK: 1 sharp peak
- Primer dimer: broad flat peak at ↓T
- Product with 2 regions: GC-rich / AT-rich
- Nonspecific product

Visualization of PCR products on gel

What to do when you see abnormal curves?
You see no amplification
- Possible cause: gene is not expressed in this sample. How to check? Are other genes expressed in this sample?
- Possible cause: sample was degraded. How to check? Same for all genes in this sample?
- Possible cause: wrong instrument settings. How to check? Same for all wells?
Low PCR efficiency (low curves with delayed Cq values)
- Possible cause: bad primer design (SNP). How to check? Same for all wells that measure this gene?
- Possible cause: presence of inhibitors. How to check? Same for all wells that include this sample?
High PCR efficiency (efficiency values > 2.1)
- Possible cause: primer dimers or nonspecific products. How to check? Check melt curves + put products on gel
- Possible cause: contamination with gDNA. How to check? Same for all wells that include this sample?
Biogazelle's design guidelines regarding PCR products
- Length: 50-80 bp for degraded material, 80-150 bp for good material
- Fluorescent dyes: amount of signal depends on the length of the PCR product (longer products generate more signal), so PCR products should all have ± the same length
- Avoid repeats and domains (high conservation: no specific primers)
- Take into account splice variants and SNPs
IDT guide for designing primers to differentiate between similar genes
http://www.slideshare.net/idtdna/qpcr-design-strategies-for-specific-applications

Biogazelle's qPCR primer design guidelines
- Intron or exon-exon junction spanning
- Length: 9-30 nt (ideally: 20 nt)
- Melting temperature Tm: 58-60 °C (ideally: 59 °C)
- Maximum Tm difference between primers: 2 °C
- GC content: 30-80% (ideally: 50%)
- Fewer than 3 G or C in the 5 nucleotides at the 3' end
- Avoid runs of more than 3 identical nucleotides
- Primers should not bind themselves / each other
- The same primers ordered from different vendors can give different results -> variation in the amount of inhibitors in the primers
  Recommended: IDT primers

qPCR primer design software
- Primer design
  Primer3Plus: http://primer3plus.com/cgi-bin/dev/primer3plus.cgi
  PrimerQuest: https://eu.idtdna.com/PrimerQuest/Home/Index
  Primer-BLAST: http://blast.ncbi.nlm.nih.gov/
- Checking primer dimerization properties
  OligoAnalyzer: https://eu.idtdna.com/analyzer/Applications/OligoAnalyzer/
- Checking primer specificity
  BLAST: http://blast.ncbi.nlm.nih.gov/
  BiSearch: http://bisearch.enzim.hu/
- Checking secondary structures in the PCR product
  UNAfold: http://mfold.rna.albany.edu/ (commercial -> not free)
- Checking location of SNPs
  UCSC in silico PCR + SNPs track: http://genome.ucsc.edu/cgi-bin/hgPcr

User interface of Primer-BLAST (primer range fields From / To)
- Primers cannot lie in a region, e.g. 100-1000 bp: impossible unless the excluded region is at the start/end of the input sequence
- Primers must surround a region closely, e.g.
100-1000 bp: forward primer To = 100, reverse primer From = 1000 and To = 1100 (also see the intron parameters below)
- Primers must lie in a region, e.g. 100-1000 bp: forward primer From = 100, reverse primer To = 1000

Primer-BLAST offers fewer but better parameters
- Exon junction containing primers
- Intron spanning primers
- Primer-BLAST performs BLAST during primer design so that it returns only specific primers
- Direct link to BLAST

Parameters of the BLAST search
- It is important that the 3' end of a primer is specific since extension starts here

Results of the Primer-BLAST search
- Self complementarity: score of the local alignment of the primer to itself
  Scoring system: +1 for a match, -1 for a mismatch and -2 for a gap
  Lower score => more mismatches => better primers
- Self 3' complementarity: score of the global alignment of the primer to itself
  Try binding the 3' end of the primer to an identical primer and score the best binding
  Lower score => more mismatches => better primers

Tips for improving Primer-BLAST searches
- Enter RefSeq accessions rather than sequences as template
- If you want to measure just a part of the sequence, still use accessions and specify a range
- Accessions give Primer-BLAST access to all info of the RefSeq record -> allows better distinction between intended template and off-targets
- Choose a non-redundant database: RefSeq or Genome database (not nr!)
- Specify your organism for the database to search in
- Exclude predicted transcripts from the database search if you are not concerned about these

OligoAnalyzer to check primer characteristics
https://eu.idtdna.com/analyzer/Applications/OligoAnalyzer/
- The settings have a large impact on Tm (± 10 °C): use realistic settings for PCR
- General characteristics, e.g. Tm, GC content…
- Can primers bind themselves?
- Can primers bind each other?
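Some of Biogazelle's primer design guidelines listed earlier (length, GC content, G/C count at the 3' end, runs of identical nucleotides) are simple enough to check in code. The sketch below is an illustration only: the function names and the example sequences are hypothetical, and Tm and dimerization are deliberately left out because they depend on reaction settings and are better checked in a tool such as OligoAnalyzer.

```python
def gc_content(seq):
    """Return the GC content of a non-empty sequence as a percentage."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def longest_run(seq):
    """Return the length of the longest run of identical nucleotides."""
    longest = current = 1
    for previous, base in zip(seq, seq[1:]):
        current = current + 1 if base == previous else 1
        longest = max(longest, current)
    return longest


def check_primer(seq):
    """Return a list of guideline violations (an empty list means none found)."""
    seq = seq.upper()
    if not seq:
        return ["empty sequence"]
    problems = []
    if not 9 <= len(seq) <= 30:
        problems.append("length outside 9-30 nt")
    if not 30 <= gc_content(seq) <= 80:
        problems.append("GC content outside 30-80%")
    # Guideline: fewer than 3 G or C in the last 5 nucleotides at the 3' end.
    if sum(base in "GC" for base in seq[-5:]) >= 3:
        problems.append("3 or more G/C in the last 5 nt at the 3' end")
    # Guideline: avoid runs of more than 3 identical nucleotides.
    if longest_run(seq) > 3:
        problems.append("run of more than 3 identical nucleotides")
    return problems


# Hypothetical example sequences, not real primers.
print(check_primer("AGCTTGACCATGCAGTTCAT"))
print(check_primer("AAAAGCGCGCGGGGC"))
```

A primer that passes these checks still needs a specificity check (Primer-BLAST) and a dimerization check before ordering.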
Assess effect of SNPs on Tm
- SNPs can have a large impact on Tm
- Tm lowers because of the mismatch between primer and template
- Same annealing temperature in PCR => amount of PCR product significantly reduced
- If the SNP occurs at the final base at the 3' end: no PCR product will be formed
Paper on effect of SNPs on PCR
http://www.ncbi.nlm.nih.gov/pubmed/24014836

UNAfold secondary structure prediction of PCR product
- Secondary structure overlapping the primer annealing site: do not use these primers!

Checking primer / probe specificity
- Primer specificity
  Always: inspection of melting curves
  First time use: gel analysis
- Probe specificity
  Negative control (sample in which the gene is not expressed)

Wiki: Exercises on primer design

Take home message: overview of recommendations
- Different location for pre- and post-PCR work
- 96 well versus 384 well
  96 well: cheaper, easier to pipet
  384 well: more data, smaller volumes
- Manual versus robotic pipetting
  Manual: 96 well
  Robotic: 384 well
  Multichannel pipets: calibrate regularly!
- Never use volumes < 1 µl (pipetting errors too high)
- Check quality of pipets: pilot experiment with the same sample-target combination in all wells -> differences in Cq should be < 0.2

Sample maximization
- In most qPCR experiments you want to compare expression levels between samples
  For each gene you test if it's differentially expressed (DE) in one group of samples compared to another group
  You have to put all the samples for the same gene on the same plate
- In the few cases that you compare genes
  You have to put all the genes for the same sample on the same plate

Qbase+
- Hierarchy: Projects < Experiments < Runs (= data files coming from the qPCR instrument) < Cq values for targets in samples
- Genes = Targets
- Housekeeping genes = Reference Targets
- Ct = Cq

Difference between qbase+ and classical ΔΔCt method
- Classic method: you assume the amount of PCR product doubles each PCR cycle
  Qbase+: 1. Gene-specific amplification efficiencies
- Classic method: one reference gene
  Qbase+: 2. Multiple reference genes
- Classic method: difficult to combine data from separate runs
  Qbase+: 3. Inter-run calibration
- Qbase+: 4. Rescale expression values
! Error propagation is handled by the software

Qbase+ demonstration
- Gene expression analysis: exercise on BITS wiki page
- Analysis of geNorm pilot experiment: exercise on BITS wiki page

Replicates
Technical replicates:
- same cDNA in different wells
- same RNA, different cDNA preparations
- same samples, different RNA purifications
Not independent -> averaging at the start of the analysis
Biological replicates:
- different samples, same treatment
Independent -> used in statistical analysis
Technical replicates are expected to have similar Cq values
Choose biological replicates over technical replicates
At least 4 biological replicates

Technical versus biological replicates
- Technical replicates: repeated measurements on the same sample
  e.g. you take 1 control and 1 treated mouse and perform 3 replicate measurements: n=1
  You cannot do statistics here, you can only report the average of the replicates
- Biological replicates: independently performed experiments on different samples
  e.g. you take 3 control and 3 treated mice and do measurements on each mouse: n=3
  Now you can compare control and treated mice

Technical or biological replicates?
- What if the mice come from the same litter?
  You can treat them as biological replicates but the conclusions that you draw will only apply to mice of that litter
  Mice from other parents might behave differently
- What if the mice come from the same strain?
  You can treat them as biological replicates but the conclusions that you draw will only apply to mice of that strain
  Mice from other strains might behave differently
=> The definition of biological replication depends on what you want to prove
Choosing the strategy for handling technical replicates
- Qbase+ automatically recognizes technical replicates: same run, same target, same sample
- First step in the analysis: averaging Cq values of technical replicates
- Only use when you have at least 3 replicates
- Settings -> Calculation parameters

Negative controls
- No template: no cDNA but primers are present
  signal -> primer dimers or contamination of reagents with cDNA
  include in every run
- No RT: DNase treated RNA instead of cDNA
  signal -> gDNA contamination
  include for every RNA extraction
- Biological control: sample in which the gene is not expressed
Cq of samples of interest should be well below Cq of negative controls

Positive controls
- Biological control: sample in which the gene is expressed
- Synthetic templates, e.g. Ultramer oligonucleotides

Amplification efficiencies
- Efficiencies of primer pairs vary
- Each time you use a new primer pair you should check its efficiency
- How? qPCR on a serial dilution of a representative sample
  = mix of all samples used in the study, in similar concentrations as used in later qPCRs
  => 1 efficiency for each target
- Efficiencies in qbase+ are expressed between 1 and 2: 100% = 2, 80% = 1.8 ...
- If the efficiencies of all primer pairs are between 1.9 and 2.1 => use 2 for all targets
- Otherwise use target-specific efficiencies or design other primers

Setting the amplification efficiencies
- You have to include a dilution series only once
- The next time you use these primers you can fill in the efficiencies

Calculation of amplification efficiencies
- Linear regression to fit a line through the data of the dilution series (from most diluted to least diluted)
- The slope of the line is used to calculate the efficiency E
- The deviation of the data from the line is used to calculate the error SE(E)

Normalization
- Variability between samples with no biological cause:
  differences in RNA integrity
  different amounts of cDNA
  different enzyme efficiencies
  presence of inhibitors
- Remove this variability using housekeeping genes = genes with the same expression in all samples
- Variability of housekeeping genes is a measure of technical variability
- Expression values of housekeeping genes are adjusted to remove variability
- Same adjustment is done on all samples => 1 normalization factor for each sample

Choosing the normalization strategy
- Use with > 50 genes: NF based on all genes instead of HK only
- Single cell PCR

Defining how stable the reference genes should be
- Settings -> Quality control settings
- M and CV are measures for the stability of the expression of genes
- The lower they are, the more stable the expression
- Thresholds based on an experiment done by Biogazelle: qPCR on 85 human samples from ≠ tissues to check expression of published HK genes
- Fly / plants: you may increase thresholds to 1 and 0.5

Checking if the reference genes are stable
- Quality control -> Reference target stability
- Green: reference genes pass the thresholds that you have set for M and CV
- Red: reference genes do not pass the thresholds that you have set for M and CV
- What to do? Remove the most unstable reference gene (top)
- How? Define it as a target of interest
! 3 housekeeping genes: you have a backup if one of them fails
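The efficiency calculation described above (a regression line through a dilution series, with the slope converted to an efficiency E) can be sketched as follows. The conversion E = 10^(-1/slope), with Cq regressed on log10 of the relative input, is the commonly used formula and is an assumption here, since the slides only state that the slope is used. The dilution series is illustrative (4-fold dilutions with Cq rising by 2 per step), the function names are hypothetical, and the error SE(E) is not computed.

```python
import math


def amplification_efficiency(relative_inputs, cqs):
    """Estimate the amplification efficiency E from a dilution series.

    Fits Cq = slope * log10(relative input) + intercept by least squares
    and converts the slope with E = 10 ** (-1 / slope), so that E = 2
    corresponds to perfect doubling (100%).
    """
    xs = [math.log10(q) for q in relative_inputs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cqs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cqs))
    slope = sxy / sxx
    return 10 ** (-1.0 / slope), slope


def efficiency_to_use(efficiency):
    """Apply the rule of thumb above: between 1.9 and 2.1, simply use 2."""
    return 2.0 if 1.9 <= efficiency <= 2.1 else efficiency


# Illustrative 4-fold dilution series: Cq increases by 2 per dilution step.
dilutions = [1, 1 / 4, 1 / 16, 1 / 64, 1 / 256]
cq_values = [18, 20, 22, 24, 26]
efficiency, slope = amplification_efficiency(dilutions, cq_values)
print(round(slope, 3), round(efficiency, 3))
```

A slope near -3.32 corresponds to E near 2; shallower or steeper slopes give the target-specific efficiencies that qbase+ lets you fill in per primer pair.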
Normalization
- Why 2 housekeeping genes? You can check their stability
- Why 3 housekeeping genes? If something goes wrong with one of them you still have 2 left
- How to determine the number of housekeeping genes and which ones to use?
  GeneVestigator
  geNorm pilot experiment: qPCR on at least 8 candidates (from GeneVestigator, from different pathways) in at least 10 samples (from the study)
! No need to repeat housekeeping genes on each plate

Choosing the scaling strategy
- Choose a reference: one sample / control group of samples / average of all samples
- Set the expression level of the reference to 1
- Scale all other expression levels accordingly
- Avoid references that lead to large error bars
- Applies to gene expression analysis and copy number analysis
! You just change the scale of the expression values, not the proportions
- Helpful for interpretation of results on a bar chart: is my gene DE?

What are the bars on the bar chart of individual samples?
- Scaled
- Averaged (over technical replicates)
- Normalized (using reference targets)
- Calibrated (if inter-run calibration was performed)
- Relative Quantities
= NCRQs (also written CNRQs)

What are the error bars on the bar chart of individual samples?
Technical variation, from all propagated errors:
- technical replicates
- deviations from the regression line of the standard series
- variation between reference targets

Scaling settings to create bar chart of group averages
- Set mean NCRQ of untreated controls = 1
- Scale mean NCRQ of treated group accordingly

Chart settings to create bar chart of group averages

Bar chart of group averages
- Bars = mean NCRQ of a group of samples
- Error bars = biological variation

How to interpret these NCRQs?
- Upregulation: NCRQ = fold change
  e.g. NCRQ = 3.5: expression in treated samples is 3.5-fold upregulated compared to that in controls
- Downregulation: fold change = $-\frac{1}{\mathrm{NCRQ}}$
  e.g. NCRQ = 0.25: expression in treated samples is 4-fold downregulated compared to that in controls

What is the link between CNRQ and ΔΔCt?
$\Delta\Delta C_t = \log_2(\mathrm{CNRQ})$
So ΔΔCt = 2 => CNRQ = 4: 4-fold upregulation

Statistical analysis
- Mean comparison: comparison of gene expression between two or more groups of samples: which genes are DE?
- Correlation: check if genes show similar expression patterns
- Survival analysis: determine the influence of gene expression on survival of a disease

Characteristics of your data that determine which test to use
- Is your data drawn from a normal distribution?
  > 25 samples in each group: yes
  < 25 samples in each group: check in GraphPad or choose "don't know"
  You choose yes but in reality it's no => false positives
  You choose no/don't know but in reality it's yes => false negatives
  No => log transformation can improve the normality of your data
  Qbase+ automatically log transforms the data during statistical analyses and transforms them back before showing the results
- Paired or unpaired data?
  Paired: same/related individuals used in the different groups of samples
  Unpaired: different individuals used in the different groups of samples
- One-sided or two-sided test?
  Two-sided: you test if the groups are different
  One-sided: you test if expression in one group is higher/lower than in the other group
  Only use one-sided if you know before the experiment that only one side is relevant
- Multiple testing correction
  If you test multiple genes / groups of samples you have to correct for this

How to choose the best reference genes?
First select 10 candidate reference genes in Genevestigator RefGenes tool (free!)
Select the 10 genes with the most stable expression levels, according to public microarray data:
- in the same organism as in your qPCR experiment
- in the same tissue as in your qPCR experiment

3 steps in RefGenes selection of candidate reference genes
- Select independent studies grouping at least 50 arrays
- Select specific probe sets for the genes of interest
- Retrieve the most stable genes with similar expression values

Check these 10 candidates in your samples using geNorm
- geNorm checks the stability of the candidates in your samples
- geNorm pilot experiment: M-values
[Figure: M-value chart ranking candidates from least stable to most stable]

How to choose the ideal number of reference genes?
- geNorm pilot experiment: V-values = value of adding another reference gene
- Threshold determined by Biogazelle
- V > threshold: it is worthwhile to add another reference gene
- e.g. suppose you use YWHAZ and UBC: what is the value of adding GAPD as a reference gene?

Common reference genes versus Genevestigator selection (4x DE)
- Common reference genes: error bars overlap a lot
- Genevestigator selection: error bars slightly overlap
- Statistical analysis with common reference genes: no DE
- Statistical analysis with GeneVestigator selection: DE

Inter-run calibration
- Needed if it's impossible to put all samples of the same gene on the same plate, e.g.
  you have too many samples
  qPCR runs are spread over time: now a few samples, next month a few more
=> You have to use at least three inter-run calibrators (IRCs) = same sample repeated in each run
- Same principle as housekeeping genes, applied to technical variability between runs:
  Variability of inter-run calibrators is a measure of technical variability
  Used to adjust all expression values to remove inter-run variability
- 1 calibration factor for each gene in each run
- IRCs must have different names in different runs in qbase+

Wiki: Exercises on qPCR design – Exercises on qbase+

Use of qPCR for absolute quantification
- Compare Cq of unknown sample to calibration curve
- Calibration curves are based on known concentrations of DNA standard molecules, e.g.
recombinant plasmid, genomic DNA, commercial oligos…

Absolute quantification by calibration curves is not trivial
! Standards must be accurate
Not easy:
- Producing standards is very time consuming
- Exact determination of standard concentration (precision)
- Accurate pipetting during dilution
- Stability during storage (reproducibility)
- Length of standard should be comparable to that of unknown
  Rate of unspecific cleavage / proofreading during PCR depends on template length
- Standards are subject to the PCR step only while unknown samples go through the entire protocol
- If unknown = mRNA then the standard should be subjected to RT also
  Even then variation may arise because of the absence of rRNA/tRNA in the standard
- Copy number variation has to be taken into account!
Quantification strategies in RT-PCR
http://www.gene-quantification.de/strategy.html

Replicates are essential for absolute quantification
Technical replicates are essential to estimate:
- precision: replicates in the same run
- reproducibility: replicates in different runs

Normalization can be performed
To normalize using reference targets:
- Create standard curves for targets of interest and for reference targets
- Use these curves to determine the amount of each target in each sample
- Normalize by calculating: $\frac{\text{amount}_{\text{target of interest}}}{\text{amount}_{\text{reference target}}}$
- If using multiple reference targets: divide by their geometric mean

Digital PCR for absolute quantification without standards
- Sample is divided into many individual RT-PCR reactions
- Some of these reactions contain the target (positive) while others do not (negative)
- $\frac{\text{positives}}{\text{total}}$ = absolute quantification of target without standards or reference targets
Review on digital PCR
http://www.nature.com/nmeth/journal/v9/n6/pdf/nmeth.2027.pdf

Take home message: overview of recommendations
Replicates
- Choose biological replicates over technical replicates
- At least 4 biological replicates
Controls
- Negative
controls: no template, no RT, biological control
- Positive control: biological control
Reference genes
- Minimum 3
! Validate primers, kits and reference genes before the experiment
Sample maximization
- Put all samples of the same gene on the same plate
- No need to repeat reference genes on each plate
- Sample maximization applies when you want to compare one gene over different conditions; when you want to compare genes, use gene maximization
- If samples of the same gene are to be spread over different plates: use IRCs
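As a closing illustration of the calibration-curve approach to absolute quantification covered earlier, the sketch below fits a standard curve, reads the amount of an unknown from its Cq, and normalizes by the geometric mean of reference targets. All numbers are made up for the example and the function names are hypothetical; real standards need the accuracy checks listed in the section on absolute quantification.

```python
import math
from statistics import geometric_mean


def fit_standard_curve(standard_copies, standard_cqs):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in standard_copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(standard_cqs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, standard_cqs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_cq(cq, slope, intercept):
    """Read the amount of an unknown sample off the calibration curve."""
    return 10 ** ((cq - intercept) / slope)


def normalize(target_amount, reference_amounts):
    """Divide the target amount by the geometric mean of the reference amounts."""
    return target_amount / geometric_mean(reference_amounts)


# Made-up standards: 10-fold dilutions with Cq rising by 3.32 per step.
copies = [1e6, 1e5, 1e4, 1e3]
cqs = [15.0, 18.32, 21.64, 24.96]
slope, intercept = fit_standard_curve(copies, cqs)

unknown = copies_from_cq(21.64, slope, intercept)
print(round(unknown))

# Made-up amounts for a target of interest and two reference targets.
print(normalize(200.0, [50.0, 200.0]))
```

In practice each target of interest and each reference target gets its own standard curve, as the normalization steps for absolute quantification describe.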