Al Parker

Transcription

Al Parker
Linear and non-linear Bayesian
models of 3-D Confocal
Microscope Movies of Biofilms
Al Parker
MUQ, Missoula, MT
June 26, 2015
Talk Outline:
1. Describe the problem of estimating biofilm characteristics from confocal scanning
laser microscope (CSLM) movies using a linear statistical model of biofilm
heights.
2. Apply polynomial accelerated iterative samplers to
solve this large (> 108-dim) Bayesian linear inverse
problem.
3. Overcome some limitations of the linear approach
by constructing a non-linear model.
Center for Biofilm Engineering
Betsey
Pitts
www.biofilm.montana.edu
Phil
Stewart
In vitro methods to grow relevant biofilms
Rotating Disk
Reactor
Moderate shear
- CSTR ASTM
Method E2196
Drip Flow
Biofilm Reactor
Low shear
- Plug flow ASTM
Method E2647
CDC Biofilm
Reactor
MBEC Assay
High shear
- CSTR ASTM
Method E2562
Gentle shear
- Batch ASTM
Method E2799
“Traditional way” to quantify microbial abundances
http://www.hypertextbookshop.com/biofilmbook/v004/r003/
Colony forming units (CFU)
Microbiologists in my lab usually want to kill biofilms:
3
Michaelis-Menten,
Monod,
lethality model:
𝑉
𝑐
∆abundance = log⁡(CFU⁡ratio) = 𝐾𝑚𝑎𝑥
+𝑐
𝑚
2
1
Lauchnor et al., JAM, 2015
Inference via the posterior
Check model fit via residuals
Using confocal images to quantify microbial abundances
CDC Biofilm Reactor
After growing a Staph. aureus
biofilm that has been
genetically modified to express
a green fluorescent protein …
… apply real-time,
microscopy-based analysis of
biofilm accumulation.
High shear
- CSTR ASTM
Method E2562
The confocal microscope
Cartoon from A. Constans, The Confocal Microscope, The
Scientist, November 22, 2004, http://www.thescientist.com/?articles.view/articleNo/16043/title/TheConfocal-Microscope/
The data:
The biofilm movies, y є R4.5 x 10^8, are stored as uint8 (i.e., 0255), with dimension calculated by:
(
planar slice
) x (depth)
x (time)
= (620μm x 620μm) x (119μm)
x (4 f/m over hours)
=(
512 pixels
)2 x (17 z-slices) x (100’s)
= 5122 x 17 x 100
= 4.5 x 108
Linear model assumptions:
1. The height of the biofilm is modeled as a surface over a 2-D domain at each
time point (Sheppard & Shotton 1997), i.e.,
z є R512^2 = R2.6 x 10^5
Independent
cryosectioning of biofilm
suggests that there is
biofilm all the way through
Linear model assumptions:
1. The height of the biofilm is modeled as a surface over a 2-D domain at each
time point (Sheppard & Shotton 1997), i.e.,
z є R512^2 = R2.6 x 10^5
Linear model assumptions:
2. The biofilm’s height changes smoothly: each point on the biofilm surface is
assumed to be dependent on its neighbors (1st order), and conditionally independent
the rest of the surface
Figure 14 from D. Higdon, A primer on space-time
modelling from a Bayesian perspective, in Statistical
Methods for Spatio-Temporal Systems, B.
Finkenstadt, L. Held, and V. Isham, eds., Chapman
& Hall/CRC, Boca Raton, FL, 2007, pp. 217–279.
3. Assume a locally linear structure:
z(i) | (neighbors) ~ N(mean of the neighbors, 1/ [λz(# of neighbors)])
The linear statistical model:
The assumptions 1-3 are encapsulated by the prior in the following
locally linear Gauss-Markov Random Field:
y = Az + ε, , A = I, ε ~ N(0,λyI);
likelihood:
priors:
y|z
z
λy
λz
~ N(z, λyI);
~ N(0, λzW-1);
~ Gamma(ay, by)
~ Gamma(az, bz)
W is a Laplacian (a sparse singular precision matrix)
with bandwidth b = 512:
There are 3 parameters to estimate at each time t:
* true image:
* SD of the data:
* SD of the prior:
z є R512^2 = R2.6 x 10^5
1/sqrt(λy)
1/sqrt(λz)
W=
Estimation procedure:
The posterior is straightforward to calculate:
Page 32 from D. Higdon, A primer on space-time
modelling from a Bayesian perspective, in Statistical
Methods for Spatio-Temporal Systems, B.
Finkenstadt, L. Held, and V. Isham, eds., Chapman
& Hall/CRC, Boca Raton, FL, 2007, pp. 217–279.
To get a sample from the posterior, after sampling λy and λz, it is expensive to sample this
R2.6 x 10^5 Gaussian (this is an expensive step for any linear Bayesian model y = Az + ε) :
Apply iterative samplers from a MASSIVE Gaussian N(µ, Λ-1)
Solving Λx=b:
Gauss-Seidel
Chebyshev-GS
CG
Sampling y ~ N(µ, Λ-1):
Gibbs
Chebyshev-Gibbs
CG-Lanczos sampler
The sampler error decreases according to a polynomial,
(E(yk) - µ)= Pk(I-G) (E(y0) – µ)
(Λ -1 - Var(yk)) = Pk(I-G) (Λ-1 - Var(y0)) Pk(I-G)T
Gibbs
Pk(I-G) =
for any Krylov vector v
CG-Lanczos
Chebyshev-Gibbs
Gk,
with error reduction
factor
p(G)2
(Λ-1 - Var(yk))v = 0
Pk(I-G)
Var(yk) is the kth
order
CG polynomial
kth order Chebyshev
polyomial,
optimal variance asymptotic
average reduction factor is
 1  cond ( I  G ) 
2

  

 1  cond ( I  G ) 
2
converges in a finite
number of steps* in a
Krylov space
depending on eig(I-G)
In theory and on a computer (finite precision),
a Chebyshev accelerated sampler is faster than a Gibbs sampler
but slower than a Cholesky sampler
Example: In 10x10 domain, N(µ,
Covariance
matrix
convergence,
||A-1 – Var(yk)||2 /||A-1 ||2
Benchmark for cost in
finite precision is the cost
of a Cholesky
factorization
Benchmark for
convergence in
finite precision is
105 Cholesky samples
-1
) in R100
… but in R2.6 x 10^5,
iterative sampling is much faster than Cholesky sampling
Each Gaussian sample of a single movie frame costs:
1. 6.9x1010 flops for a Cholesky factorization
[O(b2n) = O(5124)]
2. 9.7x107 flops for an iterative Chebyshev sampler
[O(#iters x 2n) = O(185x2x5122)]
3. 2.2x107 flops for an iterative CG sampler
[O(#iters x n) = O(85x5122)]
These iterative samplers of
N(0, Λ-1) are of the form:
1. Split Λ = M - N for M invertible.
2. Sample ck ~ N(0, (2-vk)/vk ( (2 – uk)/ uk M + N)
3. xk+1 = (1- vk) xk-1 + vk xk + vk uk M-1 (b- Λ xk)
4. yk+1 = (1- vk) yk-1 + vk yk + vk uk M-1 (ck - Λyk)
5. Check for convergence:
Quit if ||b - Λ xk+1 || is small.
Otherwise, update linear solver parameters vk and uk, go to step 2.
Examples of matrix splittings …
yk+1 = (1- vk) yk-1 + vk yk + vk uk M-1 (ck - Λ yk)
ck ~ N(0, (2-vk)/vk ( (2 – uk)/ uk MT + N)
Gibbs
MGS = D + L,
vk = uk = 1
Chebyshev-Gibbs
M = MGS D MTGS,
vk and uk are functions
of the 2 extreme
eigenvalues of
I-G=M-1Λ
CG-Lanczos
M = I,
vk , uk are
functions of
the residuals
b - Λ xk
My attempt at the historical development of
iterative Gaussian samplers:
Type
Stationary
(vk = uk = 1)
Sampler
Matrix Splittings
Gibbs (Gauss-Seidel)
Literature
Adler 1981,
Goodman & Sokal 1989,
Amit & Grenander 1991
BF (SOR)
Barone & Frigessi 1990
REGS (SSOR)
Roberts & Sahu 1997
Chen & Oliver 1996
Bardsley, Solonen, Haario & Laine 2014
RML or linear RTO
Generalized
Non-stationary
Multi-Grid
Lanczos Krylov
subspace
Krylov sampling
CD Sampler
with conjugate
directions
CG Sampler
Krylov sampling
with Lanczos
Lanczos sampler
vectors
Chebyshev
Fox & P 2014
Goodman & Sokal 1989
Liu & Sabatti 2000
Schneider & Wilsky 2003
Fox 2007
Ceriotti, Bussi & Parrinello 2007
P & Fox 2012
Simpson, Turner, & Pettitt 2008
Aune, Eidsvik, & Pokern 2014
Chow & Saad 2014
Fox & P 2014
Posterior based
on linear model:
Mean(z)
SD(z)
med(λy-1/2) med(λz-1/2)
Cholesky sampler
0.3717
0.1
Chebyshev sampler
0.3715
0.1
0.7130
10
1. extreme e-vals of M-1Λ
2. CG sample
CG sampler
Bayesian UQ with iterative samplers using a linear model of heights
Salt water added
here
25-27%
reduction
based on 99% credible interval
Bayesian UQ with iterative samplers using a linear model of heights
These blips in
volume are due to
peristaltic pump
rate (2mL/m)
en.wikipedia.org/wiki/File:Peristal
tic_pump.gif
Model assessment and tweaking
Joint work with Colin Fox
“Assess and Tweak” approach
resembles Doug Nichka’s (director of
Computational & Information System Lab's (CISL)
Institute for Mathematics Applied to Geosciences (IMAGe)
at the National Center for Atmospheric Research (NCAR))
SIAMUQ2014 workshop
Hierarchical models of tropical ocean winds given high-resolution satellite
scatterometer observations and low-resolution assimilated model output.
(Wikle, Milliff, Nychka, Berliner, JASA, 2001)
Assessing the Linear Model’s Image Reconstruction:
Linear model yields samples z є R512^2 over 2-D domain that do not:
 capture holes or over-hanging biofilm features
 allow an accurate reconstruction of the CSLM images
Instead, consider a non-linear model 3D binary τ є {0,1}17x512^2 over 3-D
domain (Lewandowski & Beyenal 2014; Higdon 2002)
 Let τ(𝑖, 𝑗, 𝑘) =
0⁡if⁡no⁡biofilm
1⁡if⁡biofilm
 l (τ⁡) = min(255, 𝐹(l0τ 𝑟 𝑑 ))⁡= blurred attenuated light profile
𝐹⁡implements⁡smoothing⁡in⁡the⁡z⁡direction,
l0⁡⁡is⁡luminescense⁡of⁡top⁡of⁡the⁡biofilm,
⁡⁡⁡⁡⁡⁡⁡⁡𝑟⁡ < 1⁡models⁡attentation⁡of⁡light⁡at⁡biofilm⁡depth⁡𝑑
Example: A 2D vertical slice from CSLM
The CSLM user typically thresholds images
before quantitation,
i.e., only bright pixels are included
Example: A 2D vertical slice from CSLM
One might ‘fill in’ below the bright pixels to
estimate where the biofilm is ….
Nonlinear Bayesian model with 1 Gaussian layer:
 Let τ(𝑖, 𝑗, 𝑘) =
0⁡if⁡no⁡biofilm
1⁡if⁡biofilm
 l (τ⁡) = min(255, 𝐹(l0τ 𝑟 𝑑 ))⁡= blurred attenuated light profile
𝐹⁡implements⁡smoothing⁡in⁡the⁡𝑧⁡direction,
l0⁡⁡is⁡luminescense⁡of⁡top⁡of⁡the⁡biofilm,
⁡⁡⁡⁡⁡⁡⁡⁡𝑟⁡ < 1⁡models⁡attentation⁡of⁡light⁡at⁡biofilm⁡depth⁡𝑑
 y = l (τ⁡) + ε, ε ~ N(0,λyI)
likelihood:
y|τ ~ N(l (τ⁡) , λyI)
prior:
τ ~ Ising(J)
hyperparameters: ⁡⁡⁡⁡⁡⁡⁡⁡l0⁡, r, λy, J
 Apply MCMC to estimate p(τ|y)
Example: A 2D vertical slice from CSLM
An MCMC sample from the posterior using 1 Gaussian layer ...
y
τ|y
Nonlinear Bayesian model with 2 Gaussian layers:
 Let τ(𝑖, 𝑗, 𝑘) =
0⁡if⁡no⁡biofilm
1⁡if⁡biofilm
 l (𝑥|τ) = min(255, 𝐹(𝑥 𝑟 𝑑 ))⁡= blurred attenuated light profile
𝐹⁡implements⁡smoothing⁡in⁡the⁡z⁡direction,
𝑥⁡⁡is⁡luminescense⁡throughout⁡the⁡biofilm,
⁡⁡⁡⁡⁡⁡⁡⁡𝑟⁡ < 1⁡models⁡attentation⁡of⁡light⁡at⁡biofilm⁡depth⁡𝑑
 y = l (𝑥|τ) + ε, ε ~ N(0,λyI);
likelihood:
y|τ, 𝑥 ~ N(l (𝑥|τ) , λyI)
x|τ ~ N(l0τ, λxI)
priors:
l0
~ Unif(0, L0)
⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡⁡τ
~ Ising(J)
hyperparameters: ⁡⁡⁡⁡⁡⁡⁡⁡r, λy, λx, L0, J
Example: A 2D vertical slice from CSLM
An MCMC sample from the posterior using 2 Gaussian layers ...
y
τ
x|τ
Example: A 2D vertical slice from CSLM
An MCMC sample from the posterior using 2 Gaussian layers ...
y
Example: A 2D vertical slice from CSLM
Assess residuals to further fine-tune the model ...
Next steps:
 MCMC implementation:
 Devise efficient schemes for sampling
hyperparameters, esp. variance components
 Can be slow due to single or paired pixel moves in proposal
 Model correlation across time
 Alternative to 3D binary model is to model each of N cells by points or
ellipsoids, z є R3N or R10N (Al-Awadhi, Hurn & Jennison 2011)
 Construct posterior for θ|… and model y = l (θ) + ε, where the vector of
unknowns θ are directly related to biofilm and confocal microscope
parameters like limiting nutrients, wavelength of laser light
 Devise sampling schemes over time (trade-off with spatial resolution)
 Compare with other estimation methods
(Lewandowski & Beyenal 2014; Jonkman & Stelzer 2002; COMSTAT (Heydorn et al. 2000); Errington & White 1999)
References:
 Iterative Sampling
Fox & P. Convergence in variance of Chebyshev accelerated Gibbs samplers. SIAM Journal on Scientific Computing, 36(1):A124-A147, 2014.
Bardsley, Solonen, Haario, & Laine. Randomize-then-Optimize: a method for sampling from posterior distributions in nonlinear
inverse problems," SIAM Journal on Scientific Computing 36(4): A1359-C399, 2014.
P & Fox. Sampling Gaussian Distributions in Krylov Spaces with Conjugate Directions. SIAM Journal on Scientific Computing, 34(3), 2012.

Confocal image analysis
Lewandowski & Beyenal. Fundamentals of Biofilm Research, 2nd ed. 2014.
Jonkman & Stelzer. Resolution and Contrast in Confocal and Two-Photon Microscopy. Confocal and Two-Photon Microscopy, Diaspro ed., 2002.
Heydorn, Nielsen, Hentzer, Sternberg, Givskov, Ersboll & Molin. Quantification of biofilm structures by the novel computer program COMSTAT,
Microbiology 146, 2000. (MATLAB package)
Rodenacker, Bruhl, Hausner, Kuhn, Liebscher, Wagner, Winkler & Wuertz. Quantification of biofilms in multi-spectral digital volumes from confocal
laser-scanning microscopes. Image Anal. Stereo 19:151-156, 2000. (math morphology, watershed algorithm)
Errington & White. Measuring Dynamic Volume In Situ by Confocal Microscopy. Confocal Microscopy, Paddock ed., 1999
Lewandowski, Webb, Hamilton & Harkin. Quantifying biofilm structure. Water Sci Tech., 1999.
Sheppard & Shotton, Microscopy Handbooks 38: Confocal Laser Scanning Microscopy, 1997. (surface representation)
Van Der Voort & Strasters, Restoration of confocal images for quantitative image analysis. J. Microscopy 178: 1995. (PSF
individual cells)
estimation, volumes of
Bloem, Veninga & Shephard. Fully Automated Determination of Soil Bacterium Numbers, cell volumes, and frequencies of dividing cells by confocal
laser scanning microscopy and image analysis. AEM. 1995. (math morphology)
Stewart, Peyton, Drury & Murga. Quantitative Observations of heterogeneities in Pseudomonas aeruginosa biofilms. AEM 59: 1993.

Bayesian image analysis
Al-Awadhi, Hurn & Jennison. Three-dimensional Bayesian analysis and confocal microscopy. Journal of Applied Statistics, 38(1), 2011 (applied to
cartilage cells).
Higdon. A primer on space-time modelling from a Bayesian perspective, in Statistical Methods for Spatio-Temporal Systems, B. Finkenstadt, L. Held, &
V. Isham, eds., 2007.
Geman & Geman. Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. Advances in Applied Statistics, 1993.