qPCR measures quantities of target sequences

qPCR stands for quantitative PCR
Targets can be DNA, cDNA or RNA
DNA: present in a sample ?
number of copies in a sample ? -> copy number analysis
! Plasmids need to be linearized for qPCR
cDNA: quantify expression levels -> gene expression analysis
expression = amount of mRNA in a sample
mRNA is reverse transcribed into cDNA to prevent degradation
RNA: quantify expression levels of small non-coding RNA -> miRNA profiling
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0009545
The qPCR protocol
Growing the samples
RNA extraction
DNase treatment
cDNA synthesis
Primer design
qPCR
Starting with good samples: RNA extraction
Phenol/chloroform:
- High yield
- Better with troublesome tissues
- High gDNA contamination
- Residual chemicals may inhibit PCR
Spin columns:
- Lower yield
- Low gDNA contamination
- Clean
- Small RNAs are not retained
mRNA or miRNA extraction ?
-> different kits
recommended: use a kit that can extract both e.g. Qiagen miRNeasy
use same samples for mRNA and miRNA profiling
Quantifying RNA
Spectrometry: UV spec or Nanodrop
Highly variable (don’t trust absorbance values < 0.1)
No distinction RNA/DNA
Microfluidic analytics: Agilent Bioanalyzer, BioRad’s Experion
More accurate
Specifically measure RNA
Fluorescent dye detection: RiboGreen Dye
Very sensitive
Expensive
Specifically measure RNA
Always use the same method of quantification !
Checking RNA quality
RNA quality
- Integrity: RNA breaks down easily
- Purity: RNA can contain inhibitors of qPCR reaction
! Don’t aim for high quality, aim for equal quality between samples !
Checking RNA integrity by microfluidic electrophoresis
Evaluates integrity of 18S and 28S rRNA
You look at rRNA while you’re interested in mRNA
If you see these peaks there’s no degradation
http://www.iata.csic.es/IATA/usct/geno/Doc/Bibliografia/Quality%20assessment%20of%20total%20RNA.pdf
Checking RNA integrity by 5’-3’ mRNA ratio
Perform reverse transcription on RNA
Use qPCR to measure HPRT1 in cDNA using two primer sets
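[Figure: HPRT1 mRNA from the 5’ end to the 3’ poly(A) tail, with one primer set near each end giving Cq 5’ and Cq 3’]
No degradation: Cq 5’ = Cq 3’
Degradation: Cq 5’ >> Cq 3’ (the 5’ end is under-represented in the cDNA)
HPRT1 is a reference gene with stable, low expression levels
HPRT1 is expressed in all organisms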
Inhibitors distort the qPCR reaction
No inhibitors
Inhibitors: shift Cq to right
e.g. phenol, ethanol...
Checking RNA purity by SPUD assay
SPUD = potato gene with no homology to any known sequence
Add equal amounts SPUD to each RNA sample
Create controls:
positive: SPUD + heparin (a known inhibitor)
negative: SPUD + water
Perform qPCR to measure SPUD
Reaction          Cq
SPUD + water      22
SPUD + heparin    27
SPUD + RNA1       22
SPUD + RNA2       26
SPUD + RNA3       22
Clean samples have same Cq as water
Samples with inhibitors have higher Cq
ΔCq > 1: presence of inhibitors
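The ΔCq rule above can be sketched in a few lines of Python. This is only an illustration using the example Cq values from the slide; the function name `flag_inhibited` and its parameters are hypothetical, not part of any qPCR software.

```python
def flag_inhibited(cq_by_sample, cq_water, max_delta_cq=1.0):
    """Return {sample: delta_cq} for samples whose SPUD Cq exceeds
    the SPUD + water control by more than max_delta_cq cycles."""
    flagged = {}
    for sample, cq in cq_by_sample.items():
        delta_cq = cq - cq_water
        if delta_cq > max_delta_cq:
            flagged[sample] = delta_cq
    return flagged


# Example values from the slide: water control Cq = 22
spud_cq = {"RNA1": 22.0, "RNA2": 26.0, "RNA3": 22.0}
print(flag_inhibited(spud_cq, cq_water=22.0))  # {'RNA2': 4.0}
```

Running the heparin control through the same function is a quick sanity check that the assay can detect inhibition at all.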
Starting with good samples: DNase treatment
To remove genomic DNA contamination from mRNA samples
Recommended but never 100% efficient
Solutions:
• Intron or exon-exon junction spanning primers: not always feasible
gDNA will generate a longer or no PCR product
• No RT control to detect gDNA contamination
Starting with good samples: cDNA synthesis
To transform RNA into more stable cDNA
Prepare all cDNAs that you are going to use in a single batch
Primers: Biogazelle uses a mix of random and oligodT primers
Validation:
RNA dilution    undiluted   1/4    1/16   1/64   1/256
                (cDNA is synthesized from each dilution)
Cq              18          20     22     24     26
Create RNA dilution series
Synthesize cDNA
qPCR
Cqs follow dilution series: good cDNA synthesis
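Whether the Cqs "follow the dilution series" can be checked numerically. A minimal sketch, assuming 100% efficiency (product doubles each cycle), so a 4-fold dilution should shift Cq by log2(4) = 2 cycles. The function name and the 0.5-cycle tolerance are assumptions chosen for illustration, not a published acceptance criterion.

```python
import math


def cq_steps_match_dilution(cqs, dilution_factor, tolerance=0.5, efficiency=2.0):
    """Check that consecutive Cq values rise by the step expected
    for the given dilution factor (within a tolerance in cycles)."""
    expected_step = math.log(dilution_factor, efficiency)
    steps = [later - earlier for earlier, later in zip(cqs, cqs[1:])]
    return all(abs(step - expected_step) <= tolerance for step in steps)


# Example values from the slide: 4-fold RNA dilutions
print(cq_steps_match_dilution([18, 20, 22, 24, 26], dilution_factor=4))  # True
```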
cDNA synthesis + DNase treatment directly on cells: Ambion Cells-to-CT kit
Measuring the cDNA concentration
Quantification of cDNA is not required
After cDNA synthesis you don’t have pure cDNA but a mixture of RNA, cDNA, dNTPs
Spectrometry based method will measure the mixture not the cDNA
Better to quantify RNA after purification (less contaminants)
Use equal amounts of RNA for cDNA synthesis
Different ways to measure amount of PCR product
Fluorescent dyes:
e.g. Sybr Green, EvaGreen...
Fluorophore-containing DNA probes:
e.g. Taqman...
Fluorescent dye binds to ds DNA
Denaturation: dye is released and fluorescence reduces
Polymerization: primers anneal and ds PCR product is formed
Dye binds to ds product and fluorescence increases
IDT: pros and cons of both methods http://www.slideshare.net/idtdna/technical-tips-for-qpcr
Plot of the fluorescent signal (Rn) during PCR
Each PCR cycle: PCR product x2 => signal x2
[Plot: Rn versus cycle]
Baseline: signal of unbound dye, not enough product to detect signal
Exponential phase: signal becomes detectable
Plateau: reagents run out
Complete introduction to qPCR http://pathmicro.med.sc.edu/pcr/realtime-home.htm
Threshold for the transition of baseline to exponential phase
∆Rn = Rn - baseline
Cq = cycle at which ∆Rn crosses the threshold
ABI instruments use different thresholds for different plates
! Not Ok: Use the same threshold on every plate !
Cq depends on initial amount of target in sample
Signal becomes detectable at 32 copies of PCR product
Copies of target per sample:
Sample ID   Initial amount   After cycle 1   After cycle 2   After cycle 3   After cycle 4   After cycle 5
A           1                2               4               8               16              32
B           2                4               8               16              32              64
C           4                8               16              32              64              128
D           8                16              32              64              128             256
Cq(D) = 2
Cq(C) = 3
Cq(B) = 4
Cq(A) = 5
B/A = 2 : two-fold difference in initial amount of target => Cq(A) – Cq(B) = 1
C/A = 4 : four-fold difference in initial amount of target => Cq(A) – Cq(C) = 2
D/A = 8 : eight-fold difference in initial amount of target => Cq(A) – Cq(D) = 3
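The doubling table and the resulting Cq differences can be reproduced with a small simulation. This is a sketch under the slide's own assumptions (perfect doubling, detection at 32 copies); the function names are made up for illustration.

```python
def cycles_to_threshold(initial_copies, threshold_copies=32, efficiency=2.0):
    """Count PCR cycles until the product reaches the detection threshold."""
    copies = initial_copies
    cycles = 0
    while copies < threshold_copies:
        copies *= efficiency  # each cycle multiplies the product by the efficiency
        cycles += 1
    return cycles


def relative_quantity(cq_reference, cq_sample, efficiency=2.0):
    """Fold difference in starting amount implied by a Cq difference."""
    return efficiency ** (cq_reference - cq_sample)


initial = {"A": 1, "B": 2, "C": 4, "D": 8}
cq = {sample: cycles_to_threshold(n) for sample, n in initial.items()}
print(cq)                                   # {'A': 5, 'B': 4, 'C': 3, 'D': 2}
print(relative_quantity(cq["A"], cq["D"]))  # 8.0
```

Each cycle of Cq difference corresponds to a factor 2 in starting amount, which is why qPCR naturally reports relative quantities.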
qPCR generates Relative Quantities (RQ)
∆Cq = difference between 2 Cq’s
[Figure: Cq axis (0, 1, 2, 3, 4 ... 10, 11 ...): lower Cq = more DNA/RNA, higher Cq = less DNA/RNA]
Check on qPCR instrument
Detection of abnormal amplification: visual inspection of the curves
http://www.slideshare.net/idtdna/troubleshooting-qpcr-what-are-my-amplification-curves-telling-me-33646672
Generation of melting curves
After qPCR:
T gradually ↑ => PCR products melt
dye is released
signal ↓
Inspection of melting curves
-dF/dT
OK: 1 sharp peak
Primer dimer: broad flat peak at ↓T
Product with 2 regions: GC/AT rich
Nonspecific product
Visualization of PCR products on gel
What to do when you see abnormal curves?
Possible cause -> How to check?
You see no amplification
- Gene is not expressed in this sample -> Are other genes expressed in this sample?
- Sample was degraded -> Same for all genes in this sample?
- Wrong instrument settings -> Same for all wells?
Low PCR efficiency (low curves with delayed Cq values)
- Bad primer design (SNP) -> Same for all wells that measure this gene?
- Presence of inhibitors -> Same for all wells that include this sample?
High PCR efficiency (efficiency values > 2.1)
- Primer dimers or nonspecific products -> Check melt curves + put products on gel
- Contamination with gDNA -> Same for all wells that include this sample?
Biogazelle’s design guidelines regarding PCR products
degraded material: 50-80 bp
good material: 80-150 bp
Fluorescent dyes: amount of signal depends on length of PCR product
longer products generate more signal
PCR products should all have ± same length
Avoid repeats and domains (high conservation: no specific primers)
Take into account splice variants and SNPs
IDT guide for designing primers to differentiate between similar genes
http://www.slideshare.net/idtdna/qpcr-design-strategies-for-specific-applications
Biogazelle’s qPCR primer design guidelines
Intron or exon-exon junction spanning
length: 9-30 bp (ideally: 20 bp)
melting temperature Tm: 58-60 °C (ideally: 59°C)
maximum Tm difference between primers: 2°C
GC content: 30-80% (ideally: 50%)
last 5 nucleotides at 3’ end: fewer than 3 G or C
avoid runs > 3 identical nucleotides
Primers should not bind themselves / each other
Same primers ordered from different vendors will give different results
-> variation in amount of inhibitors in the primers
Recommended: IDT primers
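The sequence-based guidelines above (length, GC content, 3’ end, runs) are easy to screen automatically. A minimal sketch: `check_primer` is a hypothetical helper, the example sequences are invented, and the 3’ rule is read as "fewer than 3 G or C among the last 5 nucleotides". Tm is deliberately left out because it depends on salt and oligo concentrations and is better taken from a dedicated tool.

```python
def longest_run(seq):
    """Length of the longest stretch of identical nucleotides."""
    best = run = 1
    for previous, current in zip(seq, seq[1:]):
        run = run + 1 if current == previous else 1
        best = max(best, run)
    return best


def check_primer(seq):
    """Screen one primer against the sequence-based design guidelines."""
    seq = seq.upper()
    gc_fraction = sum(base in "GC" for base in seq) / len(seq)
    gc_in_last_five = sum(base in "GC" for base in seq[-5:])
    return {
        "length_ok": 9 <= len(seq) <= 30,
        "gc_content_ok": 0.30 <= gc_fraction <= 0.80,
        "three_prime_ok": gc_in_last_five < 3,
        "no_long_runs": longest_run(seq) <= 3,
    }


# Invented example sequences, not real primers
print(check_primer("ACGTACGTACGTACGTATAT"))
print(check_primer("GGGGCCCCGGGGCCCC"))
```

Primers that pass this screen still need the dimer, specificity and SNP checks that such a simple filter does not cover.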
qPCR primer design software
Primer Design
Primer3Plus: http://primer3plus.com/cgi-bin/dev/primer3plus.cgi
PrimerQuest: https://eu.idtdna.com/PrimerQuest/Home/Index
Primer-BLAST: http://blast.ncbi.nlm.nih.gov/
Checking primer dimerization properties
Oligo Analyzer: https://eu.idtdna.com/analyzer/Applications/OligoAnalyzer/
Checking primer specificity
BLAST: http://blast.ncbi.nlm.nih.gov/
BiSearch: http://bisearch.enzim.hu/
Checking secondary structures in the PCR product
UNAfold: http://mfold.rna.albany.edu/
Commercial -> not free
Checking location of SNPs
UCSC in silico PCR + SNPs track: http://genome.ucsc.edu/cgi-bin/hgPcr
User interface of Primer-BLAST: primer range settings (From / To)
Primers must lie in region, e.g. 100-1000bp
- Forward primer: From 100
- Reverse primer: To 1000
Primers must surround region closely, e.g. 100-1000bp
- Forward primer: To 100
- Reverse primer: From 1000, To 1100
- Also see intron parameters below
Primers cannot lie in region, e.g. 100-1000bp
- Impossible unless excluded region is at the start/end of the input sequence
PrimerBLAST offers fewer but better parameters
Exon junction containing primers
Intron spanning primers
PrimerBLAST performs BLAST during primer design so that it
returns only specific primers
Direct link to BLAST
Parameters of the BLAST search
It is important that the 3’ end of a primer is specific since extension is done here
Results of the Primer-BLAST search
Self complementary: score of local alignment of primer to itself
Scoring system: +1 for a match, -1 for a mismatch and -2 for a gap
Lower score => more mismatches => better primers
Self 3'complementarity scores: score of global alignment of primer to itself
Try binding the 3'-end of the primer to an identical primer and score the best binding
Lower score => more mismatches => better primers
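To make the scoring system concrete, the self-complementarity score can be approximated with a small local alignment of a primer against an antiparallel copy of itself, where a "match" means the two bases can pair. This is a simplified sketch using the +1 / -1 / -2 scheme quoted above; the actual Primer-BLAST implementation may differ in details, and the function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def self_complementarity(primer, match=1, mismatch=-1, gap=-2):
    """Best local alignment score of a primer against an antiparallel
    copy of itself (Smith-Waterman style, base pairing counts as a match)."""
    primer = primer.upper()
    partner = primer[::-1]  # second copy, read in the opposite direction
    rows, cols = len(primer) + 1, len(partner) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pairs = COMPLEMENT.get(primer[i - 1]) == partner[j - 1]
            diagonal = score[i - 1][j - 1] + (match if pairs else mismatch)
            score[i][j] = max(0, diagonal,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            best = max(best, score[i][j])
    return best


print(self_complementarity("GAATTC"))  # palindromic site: pairs with itself
print(self_complementarity("AAAAAA"))  # cannot pair with itself
```

A fully palindromic sequence scores its own length, while a primer that cannot pair with itself scores 0, which is the "lower is better" behaviour described above.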
Tips for improving Primer-BLAST searches
Enter RefSeq accessions rather than sequences as template
If you want to measure just a part of the sequence still use accessions and specify a range
Accessions give PrimerBLAST access to all info of the RefSeq record
-> Allows better distinction between intended template and off-targets
Choose a non-redundant database: RefSeq or Genome database (not nr !)
Specify your organism for the database to search in
Exclude predicted transcripts from database search if you are not concerned about these.
OligoAnalyzer to check primer characteristics
https://eu.idtdna.com/analyzer/Applications/OligoAnalyzer/
Settings have a large impact on Tm (± 10°C): use realistic settings for PCR
General characteristics e.g. Tm, GC content…
Can primers bind themselves ?
Can primers bind each other ?
Assess effect of SNPs on Tm
SNPs can have large impact on Tm
[Figure: primer annealed to template with a mismatch at the SNP]
Tm lowers because of mismatch
Same annealing temperature in PCR => amount of PCR product significantly reduced
If the SNP occurs at the final base at the 3’ end: no PCR product will be formed
Paper on effect of SNPs on PCR http://www.ncbi.nlm.nih.gov/pubmed/24014836
UNAfold secondary structure prediction of PCR product
Secondary structure overlapping primer annealing site
Do not use these primers !
Checking primer / probe specificity
Primer specificity
Always: inspection of melting curves
First time use: gel analysis
Probe specificity
Negative control (sample in which gene is not expressed)
Wiki: Exercises on primer design
Take home message: overview of recommendations
Different location pre- and post PCR
96 well versus 384 well
cheaper
more data
easier to pipet smaller volumes
Manual versus robotic pipetting
96 well
384 well
multichannel pipets: calibrate regularly
! Never use volumes < 1 µl (pipetting errors too high)
Check quality of pipets: pilot experiment with same sample-target in all wells
-> differences in Cq should be < 0.2
Sample maximization
In most qPCR experiments you want to compare expression levels between samples
For each gene you test if it’s differentially expressed (DE) in one group of samples compared to another group
=> You have to put all the samples for the same gene on the same plate
In the few cases that you compare genes
You have to put all the genes for the same sample on the same plate
Qbase+
Hierarchy: Projects > Experiments > Runs > Cq values for targets in samples
Runs = data files coming from qPCR instrument
Genes = Targets
Housekeeping genes = Reference Targets
Ct = Cq
Difference between qbase+ and classical ΔΔCt method
Classic method:
- You assume the amount of PCR product doubles each PCR cycle
- One reference gene
- Difficult to combine data from separate runs
Qbase+:
1. Gene-specific amplification efficiencies
2. Multiple reference genes
3. Inter-run calibration
4. Rescale expression values
! Error propagation is handled by the software !
Qbase+ demonstration
Gene expression analysis
Exercise on BITS wiki page
Analysis of geNorm pilot experiment
Exercise on BITS wiki page
Replicates
Technical replicates: - same cDNA in different wells
- same RNA different cDNA preparations
- same samples different RNA purifications
Not independent -> averaging at the start of the analysis
Biological replicates: - different samples, same treatment
Independent -> Used in statistical analysis
Technical replicates are expected to have similar Cq values
Choose biological replicates over technical replicates
At least 4 biological replicates
Technical versus biological replicates
Technical replicates: repeated measurements on same sample
e.g. You take 1 control and 1 treated mouse and perform 3 replicate measurements
n=1
You cannot do statistics here, you can only report the average of the replicates
Biological replicates: independently performed experiments on different samples
e.g. You take 3 control and 3 treated mice and do measurements on each mouse
n=3
Now you can compare control and treated mice
What if the mice come from the same litter ? Technical or biological replicates ?
You can treat them as biological replicates but the conclusions that you draw will
only apply to mice of that litter
Mice from other parents might behave differently
What if the mice come from the same strain ?
You can treat them as biological replicates but the conclusions that you draw will
only apply to mice of that strain
Mice from other strains might behave differently
=> The definition of biological replication depends on what you want to prove
Choosing the strategy for handling technical replicates
Qbase+ automatically recognizes technical replicates: same run, same target, same sample
First step in the analysis: averaging Cq values of technical replicates
Only use when you have at least 3 replicates
Settings -> Calculation parameters
Negative controls
No template:
no cDNA but primers are present
signal -> primer dimers
contamination of reagents with cDNA
include in every run
No RT: DNase treated RNA instead of cDNA
signal: gDNA contamination
include for every RNA extraction
Biological control: sample in which gene is not expressed
Cq of samples of interest should be well below Cq of negative controls
Positive controls
Biological control: sample in which gene is expressed
synthetic templates e.g. Ultramer oligonucleotides
Amplification efficiencies
Efficiencies of primer pairs vary
Each time you use a new primer pair you should check efficiency
How? qPCR on serial dilution of representative sample
= mix of all samples used in study
in similar concentrations as used in later qPCRs
=> 1 efficiency for each target
Efficiencies in qbase+ expressed between 1 and 2: 100% = 2, 80% = 1.8 ...
If efficiencies of all primer pairs between 1.9 and 2.1 => use 2 for all targets
Otherwise use target-specific efficiencies or design other primers
Setting the amplification efficiencies
You have to include the dilution series only once
The next time you use these primers you can fill in the efficiencies
Calculation of amplification efficiencies
[Plot: standard curve of the dilution series, from least diluted to most diluted]
Linear regression to fit a line through the data
Slope of the line is used to calculate the efficiency E
Deviation of data from the line is used to calculate error SE(E)
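The step from slope to efficiency can be written out explicitly. A minimal sketch, assuming Cq is regressed on log10 of the relative concentration, in which case E = 10^(-1/slope) and E = 2 corresponds to 100%. The function names are hypothetical, the data are the example dilution series from earlier in the slides, and the error SE(E) is not computed here.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amplification_efficiency(relative_concentrations, cqs):
    """Efficiency E (2 = 100%) from a serial dilution."""
    log_concentrations = [math.log10(c) for c in relative_concentrations]
    slope, _ = fit_line(log_concentrations, cqs)
    return 10 ** (-1 / slope)


# 4-fold dilution series with Cq rising by 2 cycles per step
dilutions = [1, 1 / 4, 1 / 16, 1 / 64, 1 / 256]
cqs = [18, 20, 22, 24, 26]
print(round(amplification_efficiency(dilutions, cqs), 3))  # 2.0
```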
Normalization
Variability between samples with no biological cause: differences in RNA integrity
different amounts of cDNA
different enzyme efficiencies
presence of inhibitors
Remove this variability using housekeeping genes
= genes with same expression in all samples
Variability of housekeeping genes is measure of technical variability
Expression values of housekeeping genes are adjusted to remove variability
Same adjustment is done on all samples
=> 1 normalization factor for each sample
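As an illustration of "1 normalization factor for each sample", the sketch below takes the normalization factor to be the geometric mean of the reference gene quantities in that sample, which is the approach the slides recommend elsewhere for multiple reference targets. Function names and numbers are invented for the example.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize(target_quantity, reference_quantities):
    """Divide a target's relative quantity by the sample's normalization factor."""
    normalization_factor = geometric_mean(reference_quantities)
    return target_quantity / normalization_factor


# Hypothetical sample: target RQ = 8, two reference genes with RQ 2 and 8
print(round(normalize(8.0, [2.0, 8.0]), 3))  # 2.0
```

The geometric mean is used rather than the arithmetic mean so that one highly expressed reference gene does not dominate the factor.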
Choosing the normalization strategy
Use with > 50 genes
NF based on all genes instead of HK only
Single cell PCR
Defining how stable the reference genes should be
Settings -> Quality control settings
M and CV are measures for the stability of the expression of genes
The lower they are, the more stable the expression
Thresholds based on experiment done by Biogazelle:
qPCR on 85 human samples from different tissues to check expression of published HK genes
Fly / plants: you may increase thresholds to 1 and 0.5
Checking if the reference genes are stable
Quality control -> Reference target stability
Green: reference genes pass the thresholds that you have set for M and CV
Red: reference genes do not pass the thresholds that you have set for M and CV
What to do ?
remove the most unstable reference gene (top)
How ?
define it as a target of interest
3 Housekeeping genes: you have a backup if one of them fails !
Normalization
Why 2 housekeeping genes ?
You can check their stability
Why 3 housekeeping genes ?
If something goes wrong with one of them you still have 2 left
How to determine number and which housekeeping genes ?
GeneVestigator
geNorm pilot experiment: qPCR on at least 8 candidates (from GeneVestigator, different pathways) in at least 10 samples (from study)
No need to repeat housekeeping genes on each plate !
Choosing the scaling strategy
OK
Avoid: large error bars
Gene expression analysis
Copy number analysis
Choose a reference: one sample / control group of samples / average of all samples
Set expression level of reference to 1
Scale all other expression levels accordingly
You just change the scale of the expression values, not the proportions !
Helpful for interpretation of results on bar chart: is my gene DE ?
What are the bars on the bar chart of individual samples ?
Scaled
Averaged (over technical replicates)
Normalized (using reference targets)
Calibrated (if inter-run calibration was performed)
Relative Quantities
= NCRQs
What are the error bars on the bar chart of individual samples?
Technical variation, from all propagated errors:
technical replicates
deviations from regression line of standard series
variation between reference targets
Scaling settings to create bar chart of group averages
Set mean NCRQ of untreated controls = 1
Scale mean NCRQ of treated group accordingly
Chart settings to create bar chart of group averages
Bar chart of group averages
Bars = mean NCRQ of a group of samples
Error bars = biological variation
How to interpret these NCRQs ?
Upregulation: NCRQ = fold change
Expression in treated samples is 3.5 fold upregulated compared to that in controls
How to interpret these NCRQs ?
Downregulation: fold change = -1 / NCRQ
Expression in treated samples is 4 fold downregulated compared to that in controls
What is the link between CNRQ and ΔΔCt ?
ΔΔCt = log2(CNRQ)
So ΔΔCt = 2 => CNRQ = 4
4-fold upregulation
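The two interpretation rules (fold change for up- and downregulation, and the link with ΔΔCt) fit in a few lines. A minimal sketch following the conventions of the slides; the function names are hypothetical.

```python
import math


def fold_change(cnrq):
    """Signed fold change: CNRQ itself when >= 1, otherwise -1 / CNRQ."""
    if cnrq <= 0:
        raise ValueError("relative quantities must be positive")
    return cnrq if cnrq >= 1 else -1 / cnrq


def delta_delta_ct(cnrq):
    """ΔΔCt as defined on the slide: log2 of the relative quantity."""
    return math.log2(cnrq)


print(fold_change(3.5))     # 3.5  -> 3.5-fold upregulated
print(fold_change(0.25))    # -4.0 -> 4-fold downregulated
print(delta_delta_ct(4.0))  # 2.0
```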
Statistical analysis
Mean comparison:
comparison of gene expression between two or more groups of samples
which genes are DE ?
Correlation:
check if genes show similar expression patterns
Survival analysis:
determine influence of gene expression on survival to a disease
Characteristics of your data that determine which test to use
Is your data drawn from a normal distribution ?
> 25 samples in each group: yes
< 25 samples in each group: check in GraphPad or choose don’t know
You choose yes but in reality it’s no => false positives
You choose no/don’t know but in reality it’s yes => false negatives
No => log transformation can improve normality of your data
Qbase+ automatically log transforms the data during statistical analyses and transforms them back before showing the results
Characteristics of your data that determine which test to use
Paired or unpaired data ?
Paired: same/related individuals used in all groups of samples
Unpaired: different individuals used in all groups of samples
One sided or two sided test ?
Two sided: you test if the groups are different
One sided: you test if expression in one group is higher/lower than in the other group
Only use one-sided if you know before the experiment that only one side is relevant
Multiple testing correction
If you test multiple genes / groups of samples you have to correct for this
How to choose the best reference genes ?
First select 10 candidate reference genes in Genevestigator
RefGenes tool (free!)
Select 10 genes with the most stable expression levels
in the same organism as in your qPCR experiment
in the same tissue as in your qPCR experiment
According to public microarray data
3 steps in RefGenes selection of candidate reference genes
Select independent studies grouping at least 50 arrays
Select specific probe sets for the genes of interest
The most stable genes with similar expression values
Check these 10 candidates in your samples using GeNorm
GeNorm checks the stability of the candidates in your samples
geNorm pilot experiment: M-values
Least stable
Most stable
How to choose the ideal number of reference genes ?
geNorm pilot experiment: V-values = value of adding another reference gene
Threshold determined by Biogazelle
V > threshold: it is worthwhile to add another reference gene
Suppose you use YWHAZ and UBC
What is the value of adding GAPD as a reference gene ?
Common reference genes
Error bars overlap a lot
4xDE
Genevestigator selection
Error bars slightly overlap
4xDE
Statistical analysis with common reference genes
No DE
Statistical analysis with GeneVestigator selection
DE
Inter-run calibration
If it’s impossible to put all samples of the same gene on the same plate
e.g. you have too many samples
qPCR runs are spread over time: now a few samples, next month a few more
=> You have to use at least three inter-run calibrators (IRCs)
= same sample repeated in each run
Same principle as housekeeping genes:
Technical variability between runs
Variability of inter-run calibrators is measure of technical variability
Used to adjust all expression values to remove inter-run variability
=> 1 calibration factor for each gene in each run
IRCs must have different names in different runs in qbase+
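The idea of "1 calibration factor for each gene in each run" can be sketched as follows. This assumes the calibration factor is the geometric mean of the inter-run calibrators' quantities for that gene in that run, by analogy with normalization against reference genes; whether qbase+ computes it exactly this way is an assumption, and all names and numbers are invented.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def calibrate_run(quantities_by_sample, calibrator_names):
    """Divide all quantities of one gene in one run by that run's
    calibration factor (geometric mean of the inter-run calibrators)."""
    factor = geometric_mean([quantities_by_sample[name] for name in calibrator_names])
    return {sample: q / factor for sample, q in quantities_by_sample.items()}


calibrators = ["IRC1", "IRC2", "IRC3"]
# Hypothetical run in which everything reads 2x higher than in a reference run
run2 = {"IRC1": 2.0, "IRC2": 2.0, "IRC3": 2.0, "sample_B": 4.0}
calibrated = calibrate_run(run2, calibrators)
print(round(calibrated["sample_B"], 3))  # 2.0
```

After calibration the calibrators have the same value in every run, so remaining differences between samples are no longer driven by the run they were measured in.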
Wiki: Exercises on qPCR design – Exercises on qbase+
Use of qPCR for absolute quantification
Compare Cq of unknown sample to calibration curve
Calibration curves are based on known concentrations of DNA standard molecules
e.g. recombinant plasmid, genomic DNA, commercial oligos…
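Reading an unknown off a calibration curve amounts to fitting Cq against log10(copies) for the standards and inverting the line. A minimal sketch with invented standard values; the function names are hypothetical and no error estimate is included.

```python
import math


def fit_standard_curve(standard_copies, standard_cqs):
    """Least-squares fit of Cq versus log10(copies); returns (slope, intercept)."""
    xs = [math.log10(c) for c in standard_copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(standard_cqs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, standard_cqs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate copies in an unknown sample."""
    return 10 ** ((cq - intercept) / slope)


# Invented standards: 10-fold dilutions with a slope of -3.32 cycles per decade
copies = [1e2, 1e3, 1e4, 1e5]
cqs = [31.36, 28.04, 24.72, 21.40]
slope, intercept = fit_standard_curve(copies, cqs)
print(round(copies_from_cq(26.38, slope, intercept)))  # about 3162 copies
```

Unknowns should fall inside the range covered by the standards, since the fit says nothing about behaviour outside it.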
Absolute quantification by calibration curves is not trivial
Standards must be accurate !
Not easy:
- Producing standards is very time consuming
- Exact determination of standard concentration (precision)
- Accurate pipetting during dilution
- Stability during storage (reproducibility)
- Length of standard should be comparable to that of unknown
Rate of unspecific cleavage / proofreading during PCR depends on template length
- Subject to PCR step only while unknown samples go through entire protocol
- If unknown = mRNA then standard should be subjected to RT also
even then variation may arise because of absence of rRNA/tRNA in standard
- Copy number variation has to be taken into account !!
Quantification strategies in RT-PCR http://www.gene-quantification.de/strategy.html
Replicates are essential for absolute quantification
Technical replicates are essential to estimate:
- precision: replicates in the same run
- reproducibility: replicates in different runs
Normalization can be performed
To normalize using reference targets:
- Create standard curves for targets of interest and for reference targets
- Use these curves to determine amount of each target in each sample
- Normalize by calculating:
  amount of target of interest / amount of reference target
If using multiple reference targets: divide by their geometric mean
Quantification strategies in RT-PCR http://www.gene-quantification.de/strategy.html
Digital PCR for absolute quantification without standards
Sample is divided into many individual RT-PCR reactions
Some of these reactions contain the target (positive) while others do not (negative)
positives / total = absolute quantification of target without standards or reference targets
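In practice the fraction of positive reactions is usually converted to a copy number with a Poisson correction, because a positive reaction may have started with more than one copy. A minimal sketch of that standard calculation; the function names, the example counts and the partition volume are hypothetical.

```python
import math


def copies_per_partition(positives, total):
    """Mean number of target copies per reaction, Poisson-corrected:
    lambda = -ln(1 - positives / total)."""
    if not 0 <= positives < total:
        raise ValueError("need 0 <= positives < total")
    fraction_positive = positives / total
    return -math.log(1 - fraction_positive)


def copies_per_microliter(positives, total, partition_volume_ul):
    """Concentration in the partitioned sample, given the partition volume."""
    return copies_per_partition(positives, total) / partition_volume_ul


# Hypothetical run: 5000 of 20000 partitions positive, 0.001 µl per partition
print(round(copies_per_partition(5000, 20000), 4))          # 0.2877
print(round(copies_per_microliter(5000, 20000, 0.001), 1))  # 287.7
```

When very few partitions are positive, the corrected value is close to the plain positives / total ratio; the correction matters more as the fraction of positives grows.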
Review on digital PCR http://www.nature.com/nmeth/journal/v9/n6/pdf/nmeth.2027.pdf
Take home message: overview of recommendations
Replicates
Choose biological replicates over technical replicates
At least 4 biological replicates
Controls
Negative controls: no template, no RT, biological control
Positive control: biological control
Reference genes
Minimum 3
Validate primers, kits and reference genes before the experiment !
Sample maximization
When you want to compare one gene over different conditions: put all samples of the same gene on the same plate
No need to repeat reference genes on each plate
<-> when you want to compare genes: gene maximization
If samples of same gene are to be spread over different plates: use IRCs