NeuroImage 21 (2004) 1428 – 1442
Evaluation of atlas selection strategies for atlas-based
image segmentation with application to confocal microscopy
images of bee brains
Torsten Rohlfing, a,* Robert Brandt, b,c Randolf Menzel, b and Calvin R. Maurer Jr. a
a Image Guidance Laboratories, Department of Neurosurgery, Stanford University, Stanford, CA 94305-5327, USA
b Institut für Neurobiologie, Freie Universität Berlin, Berlin, Germany
c Indeed – Visual Concepts GmbH, Berlin, Germany
Received 17 July 2003; revised 3 November 2003; accepted 4 November 2003
This paper evaluates strategies for atlas selection in atlas-based
segmentation of three-dimensional biomedical images. Segmentation
by intensity-based nonrigid registration to atlas images is applied to
confocal microscopy images acquired from the brains of 20 bees. This
paper evaluates and compares four different approaches for atlas
image selection: registration to an individual atlas image (IND),
registration to an average-shape atlas image (AVG), registration to the
most similar image from a database of individual atlas images (SIM),
and registration to all images from a database of individual atlas
images with subsequent multi-classifier decision fusion (MUL). The
MUL strategy is a novel application of multi-classifier techniques,
which are common in pattern recognition, to atlas-based segmentation.
For each atlas selection strategy, the segmentation performance of the
algorithm was quantified by the similarity index (SI) between the
automatic segmentation result and a manually generated gold standard. The best segmentation accuracy was achieved using the MUL
paradigm, which resulted in a mean similarity index value between
manual and automatic segmentation of 0.86 (AVG, 0.84; SIM, 0.82;
IND, 0.81). The superiority of the MUL strategy over the other three
methods is statistically significant (two-sided paired t test, P < 0.001).
Both the MUL and AVG strategies performed better than the best
possible SIM and IND strategies with optimal a posteriori atlas
selection (mean similarity index for optimal SIM, 0.83; for optimal
IND, 0.81). Our findings show that atlas selection is an important issue
in atlas-based segmentation and that, in particular, multi-classifier
techniques can substantially increase the segmentation accuracy.
© 2004 Elsevier Inc. All rights reserved.
Keywords: Atlas-based segmentation; Atlas selection; Nonrigid image
registration; Bee brain; Confocal microscopy imaging
* Corresponding author. Image Guidance Laboratories, Department of
Neurosurgery, Stanford University, MC 5327, Room S-012, 300 Pasteur
Drive, Stanford, CA 94305-5327. Fax: +1-650-724-4846.
E-mail address: [email protected] (T. Rohlfing).
doi:10.1016/j.neuroimage.2003.11.010
Introduction
Segmentation of biomedical images, that is, the assignment of a
tissue classification or label to each image voxel, is for many
applications still mostly a manual or at best a semiautomatic task
involving substantial user interaction. One promising approach to
perform fully automatic segmentation of an image from an unknown subject is to compute an anatomically correct coordinate
transformation (registration) between the image and an already
segmented atlas image (Baillard et al., 2001; Collins et al., 1995;
Crum et al., 2001; Dawant et al., 1999; Gee et al., 1993; Hartmann
et al., 1999; Iosifescu et al., 1997; Miller et al., 1993).
The more accurately the registration transformation maps the
atlas onto the image to be segmented, the more accurate the result
of the segmentation. There is typically considerable inter-individual variability in the shapes of anatomical structures in the brains of
humans and animals, and thus an effective registration-based
segmentation method requires a registration algorithm with a large
number of parameters or degrees of freedom (i.e., a nonrigid
registration algorithm).
Many different nonrigid registration methods have been used
for atlas-based segmentation. Most previously reported approaches used an optical flow registration algorithm (Baillard et
al., 2001; Dawant et al., 1999; Hartmann et al., 1999), or fluid
registration (Christensen et al., 1996; Crum et al., 2001). Both
types of algorithms effectively compute the deformation between
image and atlas based on local intensity gradients. Miller et al.
(1993) used an algorithm with an elastic deformable solid model,
which was later extended by Christensen and Johnson (2001) to
preserve consistency between forward and backward transformations and to accept an initialization using a landmark-based
nonrigid transformation.
Very little attention, however, has been paid to the influence of
the atlas image on the result of the atlas-based segmentation. The
majority of published works use a single segmented individual
image, usually randomly selected, as the atlas (Baillard et al., 2001;
Dawant et al., 1999; Hartmann et al., 1999; Iosifescu et al., 1997).
Often, the criteria used for atlas selection are not even mentioned.
More than a single atlas is used by Thompson and Toga (1997)
who use a database of atlases to generate a probabilistic segmentation of a new subject. Recently, atlases generated by averaging
multiple subjects from a population have become increasingly
popular. For the human heart, an average atlas derived from cardiac
MR images by Rao et al. (2003) has been used for atlas-based
segmentation (Lorenzo-Valdes et al., 2002). Similarly, an average
atlas of the lung has been derived from CT images by Li et al.
(2003).
The present paper explicitly focuses on atlas selection and compares different atlas selection strategies. In particular, we
compute the accuracies of segmentations generated using (1) a
single individual atlas, (2) an average-shape atlas, (3) the best
individual atlas from a database, and (4) all atlases from a database,
combined using multi-classifier decision fusion. For comparison, all atlas-based segmentations are evaluated by computing their accuracy with respect to a manually generated gold standard segmentation.
The registration-based segmentation technique is applied to
confocal microscopy images acquired from the brains of 20 bees
(Fig. 1). Various artifacts in the images complicate the successful
application of optical flow and fluid registration methods commonly used for atlas-based segmentation. Substantial intensity variations are caused by a combination of variability in the dissection,
fixation, and staining process; temporal laser power fluctuation; and
spatial distribution of the fluorescently labeled synapse proteins. Spurious image edges are furthermore introduced by tiled
image acquisition and subsequent merging (see Imaging for a brief summary of the imaging process).
Table 1
The 22 anatomical structures of the bee brain that are labeled in this study, with assigned abbreviations

Abbreviation   Anatomical structure
PL-SOG         protocerebral lobes (including subesophageal ganglion)
CB             central body
l-Med          left medulla
r-Med          right medulla
l-Lob          left lobula
r-Lob          right lobula
l-AL           left antennal lobe
r-AL           right antennal lobe
l-vMB          left ventral mushroom body
r-vMB          right ventral mushroom body
l-medBR        left medial basal ring
r-medBR        right medial basal ring
l-latBR        left lateral basal ring
r-latBR        right lateral basal ring
l-medColl      left medial collar
r-medColl      right medial collar
l-latColl      left lateral collar
r-latColl      right lateral collar
l-medLip       left medial lip
r-medLip       right medial lip
l-latLip       left lateral lip
r-latLip       right lateral lip
For the above reasons, we have
chosen to apply a nonrigid registration algorithm by Rueckert et al.
(1999) that we have found to be reliable and efficient in previous
applications (Rohlfing and Maurer, 2003; Rohlfing et al., 2003a,b).
The algorithm uses a B-spline-based free-form deformation as the
transformation model and a global image similarity measure with a
penalty term to constrain the deformation to be smooth. A brief
review of the registration method is provided in Image registration
algorithm.
This work is part of a larger project that aims to quantify the
anatomy of the honeybee brain. In this project, the atlas-based
segmentation is intended to facilitate volumetric measurements of
certain brain compartments during development. We are also
developing and applying nonrigid registration algorithms using
free-form deformation and information-theoretic similarity measures to create a reference atlas of the honeybee brain used to
integrate functional and structural data coming from different
individuals (Brandt et al., submitted for publication; Rohlfing et
al., 2001). Thus, we hope to use these registration methods to
integrate into the atlases a variety of neurons, including optic and
olfactory neurons, which are imaged after single cell injections. The
atlases and shape models will be compared to look for gross volume
and shape differences, and to compare the densities and type of
projections. This methodology is also useful for medical image
processing, and we are currently applying it to the construction of
statistical shape models of human bones from CT images.
Materials and methods
Imaging
Fig. 1. Example of bee brain confocal microscopy (top) and corresponding
label image as defined by manual segmentation (bottom). Every gray level
in the label image represents a different anatomical structure. Due to
limitations of reproduction, different gray levels may look alike. The
correspondence between anatomical structures and abbreviations is listed in
Table 1. Note that two structures, the left and right antennal lobes (l-AL and
r-AL), are not visible in this slice.
For this study, 20 brains from adult, foraging honeybees served
as subjects. Staining followed an adapted immunohistological
protocol. Dissected and fixated brains were treated with antisera
raised against synapse proteins (nc46, SYNORF1) (Klagges et al.,
1996; Reichmuth et al., 1995) and fluorescently labeled using a Cy3-conjugated secondary antibody. After dehydration and clearing, the
specimens were mounted in double-sided custom slides.
Whole mounts were imaged with a confocal laser scanning
microscope (Leica TCS 4D) using a Leica HC PL APO 10×/0.4 dry
lens. The fluorescence was excited with the 568-nm line of an
ArKr laser, detected using a 590-nm long-pass filter, and quantized
with a resolution of 8 bits. Due to the size of the dissected and
embedded brain, the specimens were scanned sequentially in 2 × 3 partially overlapping single scans, each using 512 × 512 pixels
laterally. Stacks were combined and resampled laterally to half of
the original dimensions so that the final image volume contained
84 – 114 slices (sections) with thickness of 8 μm, and each slice had
610 – 749 pixels in x direction and 379 – 496 pixels in y direction
with pixel size of 3.8 μm.
Subsequently, a gold standard for the automatic segmentation
was created by manually tracing the neuropil areas of interest on
each slice. This task was performed with the Amira 3-D scientific
visualization and data analysis package (ZIB, Berlin, Germany;
Indeed – Visual Concepts GmbH, Berlin, Germany; TGS Inc., San
Diego, CA). We distinguished 22 major compartments of the bee
brain, 20 of which are bilaterally symmetric on either brain
hemisphere. The paired structures we labeled were medulla; lobula;
the antennal lobe; the ventral mushroom body consisting of
peduncle, α- and β-lobe; and medial and lateral lip, collar, and
basal ring neuropil. The unpaired structures we considered were
the central body with its upper and lower division and the
protocerebral lobes including the subesophageal ganglion (see also
Mobbs, 1985). An example of a confocal microscopy image and the
corresponding label image is shown in Fig. 1. The abbreviations
for all anatomical structures are listed in Table 1.
The manual segmentation was performed by two experts, each
of whom segmented a different subset of the available bee brains.
Repeated segmentation of the same individual by several experts
was not feasible due to the large amount of data and limited
resources. There is therefore no problem-specific estimate of the
inter-observer segmentation variability for human experts, and no
information regarding the accuracy of the gold standard. However,
the segmentation problem was posed in a way to facilitate human
segmentation, for example, by not separating substructures that
cannot be visually distinguished. The protocerebral lobes and the
subesophageal ganglion, for example, are treated as one structure
for the purpose of segmentation.
Image registration algorithm
Atlas-based segmentation requires the computation of an accurate coordinate transformation between the image to be segmented
and an already segmented atlas image. An initial alignment of both
images is first achieved using an affine registration method with 9
degrees of freedom (DOFs). The method we use is an implementation of the technique for rigid and affine registration described by
Studholme et al. (1997). It uses normalized mutual information
(NMI) as the similarity measure (Studholme et al., 1999). In the
first step, this method is employed directly for finding an initial
affine transformation to capture the global displacement of both
images. This transformation is then used as the initial estimate for
the nonrigid registration.
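For illustration, the NMI measure can be computed from a joint intensity histogram of the two images. The following minimal sketch (in Python, with an arbitrarily chosen bin count; it is not the implementation used in this work) shows the quantity being maximized during registration:

```python
import numpy as np

def normalized_mutual_information(reference, floating, bins=64):
    """Normalized mutual information, (H(A) + H(B)) / H(A, B) (Studholme et al., 1999)."""
    # Joint histogram over corresponding voxel pairs.
    joint, _, _ = np.histogram2d(reference.ravel(), floating.ravel(), bins=bins)
    pab = joint / joint.sum()
    pa = pab.sum(axis=1)          # marginal distribution of the reference image
    pb = pab.sum(axis=0)          # marginal distribution of the floating image

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    return (entropy(pa) + entropy(pb)) / entropy(pab.ravel())
```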
The nonrigid algorithm is a modified implementation of the
technique introduced by Rueckert et al. (1999). It uses the same NMI
similarity measure as the rigid registration. However, a different
optimization technique is used to address the problem of the high
dimensionality of the search space in the nonrigid case. Using
adaptive grid refinement (Rohlfing and Maurer, 2001) and a parallel
multiprocessor implementation (Rohlfing and Maurer, 2003), we are
able to keep computation times within reasonable bounds.
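The transformation model is the cubic B-spline free-form deformation of Rueckert et al. (1999). As a rough sketch of how such a deformation is evaluated at a point (the lattice layout and border handling below are simplifying assumptions, not the authors' implementation):

```python
import numpy as np

def cubic_bspline_weights(u):
    """Four cubic B-spline basis functions evaluated at fractional offset u in [0, 1)."""
    return np.array([(1 - u) ** 3 / 6.0,
                     (3 * u ** 3 - 6 * u ** 2 + 4) / 6.0,
                     (-3 * u ** 3 + 3 * u ** 2 + 3 * u + 1) / 6.0,
                     u ** 3 / 6.0])

def ffd_displacement(point, phi, spacing):
    """Displacement at a 3-D point for control-point displacements phi of shape
    (nx, ny, nz, 3) on a regular lattice with the given spacing per axis."""
    p = np.asarray(point, dtype=float) / np.asarray(spacing, dtype=float)
    base = np.floor(p).astype(int) - 1           # first of the 4x4x4 supporting control points
    w = [cubic_bspline_weights(f) for f in p - np.floor(p)]
    disp = np.zeros(3)
    for l in range(4):
        for m in range(4):
            for n in range(4):
                disp += w[0][l] * w[1][m] * w[2][n] * phi[base[0] + l, base[1] + m, base[2] + n]
    return disp                                   # deformed position is T(x) = x + disp
```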
The confocal microscopy images in the present study suffer
from substantial intensity variations, which frequently cause problems for intensity-based registration methods. The same anatomical
structure may be imaged with a near-constant intensity in one
subject, but cover a large range of intensities in an image from
another subject. The nonrigid registration algorithm has a tendency
to align homogeneous substructures in one image with homogeneous structures in the other image. In the presence of the
aforementioned intensity distribution differences, the result is the
mapping of unrelated image regions, producing a grossly incorrect
coordinate transformation (Fig. 2).
Analogously to Rueckert et al. (1999), we incorporate a
regularizing penalty term in addition to the NMI similarity measure
to constrain the deformation of the coordinate space, drive the
registration process in areas of inconsistent image intensities, and
thus prevent grossly incorrect deformations. An illustrative example is shown in Fig. 2. In detail, we force the deformation to be
smooth by adding a biharmonic penalty term, which is based on the
energy of a thin plate of metal that is subjected to bending
deformations (Bookstein, 1989; Wahba, 1990). The penalty term
is composed of second-order derivatives of the deformation,
integrated over the domain D of the transformation T as follows:
$$E_{\mathrm{constraint}} = \int_{D} \left[ \left(\frac{\partial^{2}T}{\partial x^{2}}\right)^{2} + \left(\frac{\partial^{2}T}{\partial y^{2}}\right)^{2} + \left(\frac{\partial^{2}T}{\partial z^{2}}\right)^{2} + 2\left(\frac{\partial^{2}T}{\partial x\,\partial y}\right)^{2} + 2\left(\frac{\partial^{2}T}{\partial y\,\partial z}\right)^{2} + 2\left(\frac{\partial^{2}T}{\partial z\,\partial x}\right)^{2} \right] d\mathbf{x}. \tag{1}$$
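In a discrete setting, this penalty can be approximated by finite differences of the deformation sampled on the voxel grid. A possible numerical approximation (a sketch only, not the authors' code) is:

```python
import numpy as np

def bending_energy(field, spacing=(1.0, 1.0, 1.0)):
    """Finite-difference approximation of Eq. (1) for a dense vector field of
    shape (X, Y, Z, 3); the transformation T and its displacement have the same
    second derivatives, so either may be passed."""
    energy = 0.0
    for c in range(3):                                 # each component of T
        g = field[..., c]
        first = [np.gradient(g, spacing[a], axis=a) for a in range(3)]
        for a in range(3):
            daa = np.gradient(first[a], spacing[a], axis=a)
            energy += np.sum(daa ** 2)                 # pure second derivatives
        for a, b in ((0, 1), (1, 2), (2, 0)):
            dab = np.gradient(first[a], spacing[b], axis=b)
            energy += 2.0 * np.sum(dab ** 2)           # mixed second derivatives, weight 2
    return energy * np.prod(spacing)                   # approximate the integral over D
```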
With the constraint term incorporated, the total optimization
function becomes a weighted sum of the data-dependent image
similarity and the regularization constraint term:
$$E_{\mathrm{total}} = (1 - w)\,E_{\mathrm{NMI}} + w\,E_{\mathrm{constraint}}. \tag{2}$$
An important issue with constrained nonrigid registration methods is the relative weighting of the image similarity measure and
the deformation constraint penalty term in the cost function. Since
the two terms represent fundamentally unrelated quantities, there is
no obvious way to determine the correct weight w a priori. It would
also be desirable in many cases to choose different weights for
different images, or even for different regions within the same
image, making w a function of location. Again, since there is no
formal way to determine the weight globally, there is also no
solution for the harder problem of selecting the weight locally.
In the present study, a single global weight (w = 0.1) was
chosen for all individuals by repeating the registrations with
different values and choosing the one that produced the highest
median segmentation accuracy. Importantly, the segmentation
results turned out to be relatively insensitive to the value of the
weight over a fairly wide range of values. Both properties of the
relationship between registration accuracy and smoothness constraint weight, the peak at w = 0.1 and the relative insensitivity to
the value of the weight, are nicely illustrated in Fig. 3.
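The weight selection described above amounts to a simple grid search. The sketch below only makes that selection rule explicit; the registration/segmentation and scoring routines are caller-supplied placeholders, not part of this work:

```python
import numpy as np

def choose_constraint_weight(image_pairs, segment_fn, score_fn, candidate_weights):
    """Pick the smoothness weight that maximizes the median segmentation accuracy.

    image_pairs: list of (raw_image, manual_labels); segment_fn(raw, w) returns an
    automatic label volume; score_fn(auto, manual) returns a scalar accuracy.
    """
    best_w, best_median = None, -np.inf
    for w in candidate_weights:
        scores = [score_fn(segment_fn(raw, w), manual) for raw, manual in image_pairs]
        median = np.median(scores)
        if median > best_median:
            best_w, best_median = w, median
    return best_w
```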
Atlas selection strategies
A major point of this paper is the investigation of the influence of the choice of the reference atlas on the outcome of the registration-based segmentation.
Fig. 2. Illustration of registration process and importance of constraining nonrigid registration. These microscopy images are magnified in the area of the left
lobula (Fig. 1 shows a complete image). In the first step of the registration process, a floating image is initially globally aligned (b) to the reference image (a)
using a rigid registration algorithm. The rigid transformation is then used as the initial estimate for the nonrigid registration. In the reference image (a), the
lobula appears substantially darker on the lateral side (ellipse), while in the floating image (b) from another bee, the lobula has a more homogeneous intensity.
An unconstrained intensity-based nonrigid registration (c) computes a grossly incorrect deformation (arrows). A constrained nonrigid registration (d) does not
have this problem.
In particular, we evaluate four different strategies, which are described in detail
below.
We refer to the already segmented image as the atlas image and
the image to be segmented as the raw image. The coordinates of
the atlas image are mapped by way of nonrigid registration onto
those of the raw image and thereby provide a segmentation of the
latter. In the context of nonrigid registration, the atlas image is to
be deformed while the raw image remains fixed. The correspondence between the common terms for both images in image
registration and in the present context is such that the atlas image
acts as the floating image during registration while the raw image
acts as the reference image.
For the present study, 20 manually segmented confocal microscopy images were available as candidates for use as both the raw
and the atlas image. For each registration-based segmentation
performed in this study, one of the images was used as the raw
image. This image is automatically segmented using a registration-based method. The manual segmentation in this case is used only
for validation (see Validation study design). The remaining 19
images were then available as atlas images. We refer to each of
these atlas images as an individual atlas, since it corresponds to an actual individual subject.
Fig. 3. Percentage of correctly labeled voxels (median over all segmented images) vs. smoothness constraint weight used for nonrigid registration. Note that to
visually separate the plotted lines, the vertical axis only covers the top 10% range between 90% and 100%. For a description of the four segmentation methods,
see Atlas selection strategies.
In addition, we consider below an
average atlas, which is a segmented image generated by averaging
the shapes of a population of subjects (Rohlfing et al., 2001).
We evaluate registration-based segmentation results obtained
using four different choices of the atlas image(s):
Individual atlas image (IND). One of the 20 manually
segmented images was chosen to serve as an individual atlas
and registered to each of the 19 remaining images. The choice
of the atlas was based on visual assessment of image quality
and intensity uniformity. To compare segmentation results for
the same set of raw images for all strategies and to avoid
potential bias of the validation, the image used as the atlas for
this paradigm is excluded from evaluation of the three other
strategies described below; thus, a total of 19 eligible raw
images are used for validation of all methods.
Average-shape atlas image (AVG). An average-shape atlas
image is registered to each of the 19 eligible raw images. The
average-shape atlas image was generated from all 20 individual
atlases using a method outlined in Appendix A. The motivation
for using an average atlas is that, since it represents the average
shape of a population, it should require less deformation than a
randomly selected individual atlas when nonrigidly registered
to a given raw image. Thus, it might provide higher
segmentation accuracy.
Most similar atlas image from a database (SIM). Each of the
eligible raw images is registered to the remaining 19 atlas
images. The most ‘‘similar’’ atlas image out of these 19 is then
used as the actual atlas image for segmentation. In Appendix B,
we compare four different criteria for selection of the most
similar atlas image. Based on the results described there, the
criterion chosen was the value of NMI after nonrigid registration.
All atlas images from a database with multi-classifier decision
fusion (MUL). Each of the eligible raw images is registered
nonrigidly to the remaining 19 atlas images. Each registration
produces a segmentation, a total of 19 segmentations per raw
image. All segmentations are in the coordinate system of the raw
image and can easily be combined into a final segmentation by
assigning to each voxel the label that received the most ‘‘votes’’
from the individual atlases (Rohlfing et al., 2001). This technique
is equivalent to decision fusion using the ''Vote Rule'' in a multi-classifier system (Xu et al., 1992). In more detail, we are applying
partial volume interpolation (PVI) as described by Maes et al.
(1997) to interpolate labels in the deformed atlas images. The
classifications from all atlases are then combined using the ‘‘Sum
Rule,’’ which is generally considered to be superior to the Vote
Rule (Kittler and Alkoot, 2003; Kittler et al., 1998).
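As a minimal illustration of this fusion step (a sketch only, not the authors' code), assume the 19 atlas label volumes have already been deformed into the coordinate system of the raw image; the vote-rule variant operates on hard labels, while the sum-rule variant operates on per-label weights such as the fractional weights produced by partial volume interpolation:

```python
import numpy as np

def vote_rule_fusion(label_volumes):
    """Per-voxel majority vote over label volumes propagated from the atlases."""
    stack = np.stack(label_volumes)                  # (n_atlases, X, Y, Z), integer labels
    n_labels = int(stack.max()) + 1
    votes = np.zeros((n_labels,) + stack.shape[1:], dtype=np.int32)
    for label in range(n_labels):
        votes[label] = np.sum(stack == label, axis=0)
    return np.argmax(votes, axis=0)

def sum_rule_fusion(label_weights):
    """Sum rule: each element of label_weights has shape (n_labels, X, Y, Z),
    e.g., fractional label weights from partial volume interpolation."""
    return np.argmax(np.sum(np.stack(label_weights), axis=0), axis=0)
```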
Table 2 provides an overview of the conceptual differences between the four atlas selection strategies.

Table 2
Overview of atlas selection strategies

Strategy   No. of atlases per raw image   Type of atlas   Assignment of atlas to raw image
IND        one                            individual      fixed
SIM        one                            individual      variable
AVG        one                            average         fixed
MUL        multiple                       individual      fixed

The four strategies evaluated in this paper can be categorized according to the number of atlases used per raw image (one or multiple), the type of atlas used (individual or average), and the assignment of atlases to raw images (fixed, i.e., same atlas(es) for all raw images, or variable, i.e., different atlas image selected for each raw image). See text for details.

Validation study design

For every registration, the registration-based segmentation is compared with the manual segmentation. As a measure of segmentation quality, we compute the similarity index (SI) (Zijdenbos et al., 1994). For a structure s, the SI is defined as

$$\mathrm{SI}^{(s)} = \frac{2\,\bigl|V^{(s)}_{\mathrm{manual}} \cap V^{(s)}_{\mathrm{atlas}}\bigr|}{\bigl|V^{(s)}_{\mathrm{manual}}\bigr| + \bigl|V^{(s)}_{\mathrm{atlas}}\bigr|}, \tag{3}$$

where V_manual^(s) and V_atlas^(s) denote the sets of voxels labeled as belonging to structure s by manual and atlas-based segmentation, respectively. For perfect mutual overlap of manual and atlas-based segmentation, the SI has a value of 1. Lesser overlap results in smaller values of SI. No overlap between the segmentations results in an SI value of 0. Note that the exact value of SI for a segmentation error of one voxel, for example, depends on parameters of the segmented object, such as its volume and its shape characteristics.
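As an illustration, the SI of a single structure can be computed from two label volumes with a few array operations (a sketch; the integer label encoding is an assumption):

```python
import numpy as np

def similarity_index(automatic, manual, structure):
    """Similarity index of Eq. (3) for one structure label."""
    a = (automatic == structure)
    m = (manual == structure)
    denominator = a.sum() + m.sum()
    # By convention, two empty segmentations are treated as perfect agreement.
    return 2.0 * np.logical_and(a, m).sum() / denominator if denominator else 1.0
```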
Results
Comparison of atlas selection strategies
To enable comparison of the different atlas selection strategies,
Fig. 4 shows the percentage of the segmentations using each
method that achieved an SI value greater than given thresholds
between 0.70 and 0.95. It is easy to see that the MUL paradigm
produced segmentation results superior to the ones produced by the
other methods. The MUL strategy achieved SI values of 0.7 or
higher for 97% of all segmentations. The AVG paradigm produced
SI values that are consistently lower than MUL, but with the
exception of the 0.7 threshold also consistently higher than both
IND and SIM. Finally, the SIM strategy produced slightly better
results than the IND strategy. The mean SI values of registration-based segmentations produced by the four atlas selection strategies
are: MUL, 0.86; AVG, 0.84; SIM, 0.82; IND, 0.81. Each of these
values is the mean of 418 segmentation SI values, one for each of
22 anatomical structures in each of 19 segmented brains.

Fig. 4. Percentage of registration-based segmentations with similarity index SI better than given threshold, plotted by atlas selection strategy.
These observations are supported by statistical tests that were
performed as follows: First, the mean SI value over all anatomical
structures was computed for each segmented raw image. This was
done because all anatomical objects in one raw image are segmented using the same nonrigid registration transformation and
thus the SI values for all structures in one segmentation cannot be
considered to be statistically independent. Then the four sets (one
for each atlas selection strategy) of 19 mean SI values (one for each
segmented image) were compared using two-tailed paired t tests
for all combinations of atlas selection strategies. The results are
listed in Table 3. Segmentations produced using the MUL paradigm have SI values that are significantly better than segmentations
generated using the other three paradigms.

Table 3
Results of paired t tests between the atlas selection strategies with respect to SI values over all 19 segmented brains

        IND             SIM             AVG             MUL
IND     –               NS              − (P < 0.01)    − (P < 0.001)
SIM     NS              –               NS              − (P < 0.001)
AVG     + (P < 0.01)    NS              –               − (P < 0.001)
MUL     + (P < 0.001)   + (P < 0.001)   + (P < 0.001)   –

The row strategies are compared to the column strategies. A table entry ''+'' denotes statistically significant superiority of the former over the latter; ''−'' denotes inferiority; ''NS'' denotes a statistically insignificant difference (P > 0.05). For significant differences, the respective confidence levels are given in parentheses.
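The statistical comparison itself is a standard paired test on the 19 per-brain mean SI values; for instance (using SciPy, which is not necessarily what was used for the published analysis):

```python
from scipy import stats

def compare_strategies(mean_si_per_brain, strategy_a, strategy_b):
    """Two-tailed paired t test between two atlas selection strategies.

    mean_si_per_brain maps a strategy name (e.g., "MUL") to the 19 per-brain mean
    SI values, each averaged over the 22 structures of one segmented brain.
    """
    t, p = stats.ttest_rel(mean_si_per_brain[strategy_a], mean_si_per_brain[strategy_b])
    return t, p
```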
Upper and lower boundaries for IND and SIM strategies
Given the ground truth segmentations, we computed the upper
and lower boundaries of all possible atlas choices for the IND and
SIM strategies. The results are shown in Fig. 5. The meanings of
the columns are as follows: ‘‘Best IND’’ represents the best
possible result that can be achieved by using a single fixed
individual atlas to segment all raw images. ‘‘Worst IND’’ is the
opposite, which is the worst possible result with a single fixed
atlas. Likewise, ‘‘Best SIM’’ is the result achieved using the best
possible choice of atlas for each raw image, where we allow the use
of different atlases for different raw images. Similarly, ‘‘Worst
SIM’’ is the worst possible outcome of such a strategy. Appendix B
provides some more details on the computation of performance
bounds, in particular ‘‘Best SIM’’.
The center columns of Fig. 5, labeled ‘‘IND’’ and ‘‘SIM,’’ show
the results achieved using the actual IND and SIM strategies. For
IND, this is using the atlas selected based on visual assessment,
and for SIM it is using the most similar atlas based on the NMI
similarity measure after nonrigid registration. As Fig. 5 nicely
shows, the actual IND strategy achieves results in between ‘‘Worst
IND’’ and ‘‘Best IND.’’ Likewise, the performance of the actual
SIM strategy is within the bounds provided by ‘‘Worst SIM’’ and
‘‘Best SIM.’’ The figure also shows that both IND and SIM
perform close to the respective upper boundaries, indicating
reasonable criteria for atlas selection within each strategy. Finally,
it is interesting to note that the results of ‘‘Worst SIM’’ are worse
than those of ''Worst IND,'' while those of ''Best SIM'' are better than those of ''Best IND.''
Fig. 5. Upper and lower accuracy boundaries of all possible atlas selections
for the IND and SIM strategies.
This is easily explained by the
observation that, in fact, any given IND strategy is a special case
of a general SIM strategy. The SIM strategy therefore, in theory,
provides more freedom of choice to improve the result (or make it
worse, when looking for the lower quality bound).
We note that the MUL and AVG atlas selection strategies
produce segmentation accuracies superior even to selection of
the best possible atlas based on knowledge of the a posteriori SI
values (Fig. 6). This strategy, which is obviously only available in
a validation study with known ground truth, represents the upper
limit of segmentation accuracy with a single individual atlas image.
In other words, in our study, any strategy that selects a single
individual atlas, even if one allows for a different atlas to be used for each raw image, will produce results inferior to those achieved using a combination of multiple individual atlases or an average atlas.
Fig. 6. Upper and lower accuracy boundaries for the IND and SIM
strategies compared to AVG and MUL strategies.
Overall segmentation quality
Fig. 7 shows typical atlas-based segmentations of two frontal
(axial) slices of one bee brain. Atlas-based segmentation produced
results that are generally within two voxels of the manually
assigned segmentation and often within one voxel. This is illustrated by the segmentation error images, in which voxels with
different labels assigned by automatic and manual segmentation
are shown in black. There are occasionally a few isolated areas
where the differences between the automatic and manual segmentations are more substantial.
The structures most and least accurately segmented (with the
MUL method) are shown in Figs. 8 and 9, respectively. For spatial
orientation purposes, these structures are also marked in a three-dimensional rendering of a segmented bee brain in Fig. 10. The
most accurately segmented structure (SI = 0.97) was a left lobula
(Fig. 8). It was bright and well delineated. This structure was
therefore easy to register correctly to the atlas images. The manual
and automatic segmentation differ by no more than one voxel in
most regions, which is nicely shown by the difference images.
The least accurately segmented structure (SI = 0.55) was a right
medial lip (Fig. 9). This structure was hard to segment due to its
complex shape and faint boundaries. Furthermore, the right medial
lip has the shape of a torus in most individuals. In the subject
shown in Fig. 9, however, the torus is not closed. This deviation
from the majority of atlases represents an additional challenge for
the registration-based segmentation. Note that the large area of
disagreement between automatic and manual segmentation in the
horizontal slices (bottom row of Fig. 9) was mostly due to an out-of-plane misalignment along the edge of the segmented structure.
Similarity index vs. object size and shape
To appreciate the SI values computed in this study and to
compare them with other published values, we investigated the
dependence of SI values on object size. We performed a numerical
simulation in which discretely sampled spheres of various radii
were dilated by one or two voxels and the SI values between the
original and dilated spheres were computed. The resulting SI
values are plotted vs. object radius in Fig. 11.
Fig. 7. Results of segmentation using nonrigid image registration (MUL atlas selection paradigm). The two columns show frontal (i.e., axial) images at two
different slice locations. Top row: microscopy images. Center row: overlays of segmentation contours (shown in white) after nonrigid image registration. To
clearly show the contours, the dynamic range of the underlying microscopy image was reduced. Bottom row: segmentation error images. Voxels with different
labels assigned by manual and automatic segmentation are shown in black.
Fig. 8. Most accurately segmented structure (out of 418): left lobula (SI = 0.97). Columns from left to right: microscopy image, contour from manual
segmentation, contour from automatic segmentation (MUL paradigm), difference image between manual and automatic segmentation, and perspective surface
rendering of the isolated structure. The white pixels in the difference image show where manual and automatic segmentation disagree. Rows from top to
bottom: frontal, sagittal, and horizontal slice through the left lobula. The scale in the top left image represents 100 μm, or 25 voxels in x and y direction (12.5
voxels in z).
Fig. 9. Least accurately segmented structure (out of 418): right medial lip (SI = 0.55). See Fig. 8 for row and column descriptions.
Fig. 10. Three-dimensional surface rendering of an individual segmented bee brain with marked structures corresponding to those shown in Fig. 8 (left lobula),
Fig. 9 (right medial lip), and Fig. 14 (right lateral basal ring). Note the considerable shape difference between the near-spherical lobula and the torus-like medial
lip.
It is also easy to derive a closed-form expression for the continuous case. The SI
between two concentric spheres, one with radius R and the other
dilated by d, that is, with a radius of R + d, is
$$\mathrm{SI} = \frac{2(R/d)^{3}}{2(R/d)^{3} + 3(R/d)^{2} + 3(R/d) + 1}. \tag{4}$$
The SI values for the discrete and continuous cases are almost
identical (Fig. 11). The SI value between a sphere and a concentric
dilated sphere approximates the SI value for a segmentation error
consisting of a uniform thickness misclassification on the perimeter
of a spherical object. Inspection of Fig. 11 and Eq. (4) shows that
SI depends strongly on object size and is smaller for smaller
objects. A one-voxel-thick misclassification on the perimeter of a
spherical object with a radius of 50 voxels has an SI value of 0.97,
but for a radius of 10 voxels the SI value is only 0.86.
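The discrete part of this simulation can be reproduced with a few lines of code; the sketch below assumes dilation with SciPy's default structuring element, which may differ in detail from the exact simulation used for Fig. 11:

```python
import numpy as np
from scipy import ndimage

def sphere_dilation_si(radius, dilation=1, margin=4):
    """SI between a discretely sampled sphere and the same sphere dilated by
    `dilation` voxels."""
    n = 2 * (radius + dilation + margin) + 1
    c = n // 2
    z, y, x = np.ogrid[:n, :n, :n]
    sphere = (x - c) ** 2 + (y - c) ** 2 + (z - c) ** 2 <= radius ** 2
    dilated = ndimage.binary_dilation(sphere, iterations=dilation)
    return 2.0 * np.logical_and(sphere, dilated).sum() / (sphere.sum() + dilated.sum())

# For example, sphere_dilation_si(50) is approximately 0.97 and
# sphere_dilation_si(10) approximately 0.86, matching the values quoted above.
```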
In Fig. 12, the average volumes of the anatomical structures in
the bee brain images under consideration are shown with the
segmentation accuracies achieved for them using the MUL paradigm. It is easy to see that the larger a structure, the more
accurately it was segmented by the atlas-based segmentation. This
confirms the theoretical treatment above and illustrates the varying
bias of the SI metric when segmenting structures of different sizes.
Structure volume is not the only geometric characteristic that
causes a bias of SI values. Structure shape is also important. A
characteristic shape parameter of structures that is relevant in this
context is the surface-to-volume ratio (SVR). In the discrete case,
this ratio can be determined by computing the fraction q of voxels
on the surface of the structure relative to its total number of voxels.
Given q we can, for example, compute the SI between a structure
and the structure after erosion by one voxel, which corresponds to
a misclassification of all surface voxels, as
$$\mathrm{SI}_{q} = \frac{2V(1-q)}{V + (1-q)V} = \frac{1-q}{1 - q/2}. \tag{5}$$

Fig. 11. Dependence of SI values on size for spherical objects. Discretely sampled spheres of various radii were dilated by one or two voxels and the SI values between the original and dilated spheres were computed. The SI value between a sphere and a concentric dilated sphere approximates the SI value for a segmentation error consisting of a uniform thickness misclassification on the perimeter of a spherical object. The squares show SI values computed from discrete numerical simulation of dilation by one voxel. The solid line shows SI values for the continuous case (Eq. (4)). The triangles show SI values computed from discrete numerical simulation of dilation by two voxels. The broken line shows SI values for the continuous case. Note that while the units on the horizontal axis are voxels for the discrete case, they are arbitrary units for the continuous case.
In Fig. 13, this formula is plotted in comparison to the actual
values in our study resulting from segmentation using the MUL
strategy. For most structures, the worst segmentation over all
individuals is near the simulated one voxel erosion. The average
values over all individuals are consistently better than the one
voxel erosion line. In general, the larger the value of the SVR q,
the lower the value of the SI metric. This is consistent with the
prediction of the theoretical treatment above, which suggests that
better SI values are easier to achieve on structures with a smaller
SVR. Most importantly, Fig. 13 shows that a substantial fraction of
the structures in the bee brain have SVRs near 0.5, which means
that for the purpose of interpreting SI values, they cannot be treated
as spherical objects by considering their size alone. In total, 389
out of 418 structures (93%) were segmented with an SI value better
than the theoretical one-voxel-erosion threshold determined from
their respective SVR.
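Both quantities, the surface-to-volume ratio q and the corresponding one-voxel-erosion SI of Eq. (5), are straightforward to compute for a binary structure mask; a sketch (using a simple 6-connected erosion, which is an assumption):

```python
import numpy as np
from scipy import ndimage

def surface_to_volume_ratio(mask):
    """Fraction q of voxels that lie on the surface of a binary structure."""
    eroded = ndimage.binary_erosion(mask)
    surface = np.logical_and(mask, np.logical_not(eroded))
    return surface.sum() / mask.sum()

def one_voxel_erosion_si(q):
    """Predicted SI if all surface voxels are misclassified (Eq. (5))."""
    return (1.0 - q) / (1.0 - q / 2.0)
```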
When applying the above criteria to segmentations of individual structures, we also find the theoretical predictions confirmed.
Fig. 12. Volumes of anatomical structures and corresponding segmentation accuracies. The gray bars show the volumes (in numbers of voxels) of the 22
anatomical structures, averaged over the 20 subjects in the present study. The black vertical lines represent the ranges of SI values achieved by the automatic
segmentation (MUL paradigm) over all subjects. The diamonds show the respective medians over all subjects.
The structure that was most accurately segmented in our study,
shown in Fig. 8, was fairly large (122,000 voxels) and near
spherical (SVR q = 0.17). There was therefore no substantial
negative bias from volume and shape, resulting in a high SI value
of 0.97. The least accurately segmented structure, shown in Fig. 9,
on the other hand, was rather small (29,000 voxels), and torus-shaped (SVR q = 0.51), resulting in a strong negative bias of the SI
measure. An example of a structure segmented with SI = 0.70, a
right lateral basal ring, is shown in Fig. 14. The volume of this
structure was 16,000 voxels with SVR q = 0.58. This example
illustrates that, for small structures with a complex shape, an SI
value of 0.70, achieved for 97% of all structures using the MUL
strategy, still indicates satisfactory segmentation accuracy.
Computational performance
We did not perform a detailed formal analysis of the
computation times required to obtain the nonrigid image registrations as part of the present study. Our computing resources
were very heterogeneous. Thus, there was no meaningful way of
Fig. 13. SI values vs. surface-to-volume ratio. The curved lines show the numerical simulations of erosion by 1 voxel and 1/2 voxel, respectively, according to
Eq. (5). Each dot represents a single segmented structure from one image. In addition, the mean SI value over all subjects (using the MUL strategy) is also marked for each structure.
Fig. 14. Example of a structure segmented with accuracy SI = 0.70: right lateral basal ring. See Fig. 8 for row and column descriptions.
comparing computation times for the four approaches we investigated using different choices of the segmented reference atlas.
Nonetheless, we can summarize the approximate execution time
for a single nonrigid image registration on the various computers
we used in this study. The time to complete one registration is
approximately 3 h on a several-year-old workstation (Sun workstation with a single UltraSparc II processor at a clock speed
between 300 and 440 MHz) and less than an hour on a current
PC (about 20 min with a 3.0-GHz Intel Pentium 4 processor).
We have implemented a parallel version of the nonrigid image
registration algorithm that takes advantage of shared-memory
multiprocessor computer architectures using multi-threaded
programming (Rohlfing and Maurer, 2003). The time to complete one registration is approximately 10 min on a two-processor PC. Using 48 processors on a 128-processor SGI Origin
3800 supercomputer (MIPS R12K processors running at 400
MHz), the computation time per registration is about 1 min
(Rohlfing and Maurer, 2003).
Discussion
The results presented in this paper indicate that the accuracy of
atlas-based image segmentation is strongly influenced by the
strategies employed for selection of the atlas image(s). The MUL
strategy, which is a novel application of multi-classifier techniques
to atlas-based segmentation, produced segmentations that are
significantly better than those produced by the other three choices
of single atlas images. This finding confirms the belief in the
pattern recognition community that multiple-classifier systems are
generally superior to single classifiers (see, e.g., Xu et al., 1992), if
their misclassifications are somewhat independent of each other.
More importantly, therefore, our results also demonstrate that
multiple classifiers, generated in a straightforward way by using
multiple atlases, can be sufficiently independent in real-world
applications for decision fusion to benefit from their complementary behavior.
The AVG paradigm produced the second best segmentations.
This may be related to the observation that registration to the
average atlas for most raw images required smaller deformations
than registration to individual atlas images (Fig. 15). The IND
and SIM approaches produced almost identical results, both
clearly inferior to those of the MUL and AVG paradigms. It is
somewhat surprising that the SIM paradigm, that is, segmentation
using the most ‘‘similar’’ individual atlas image, performed so
poorly. It is important to note that this is not an effect of the criterion
used to select the most similar atlas image, but rather a general
weakness of the approach itself. As we showed by using a
posteriori SI values to select the best possible out of 19 atlases
for each raw image (‘‘Best SIM’’ results), at least in our study, no
strategy that uses a single individual atlas could outperform the
MUL and AVG paradigms.
It needs to be pointed out that the average-shape atlas was
generated from the same individuals that were segmented in the
present study. As a result, there is most likely some bias in the
evaluation of the AVG paradigm. However, the use of an average-shape atlas necessarily assumes that it is possible to create a
meaningful shape average of a population. If a population is
sufficiently homogeneous to allow generation of a stable, meaningful average, then we have good reason to believe that the results
of our study apply to an independently generated average, since the
difference between an independent and a dependent average shape
can be expected to be small.
While it produces better segmentations, the obvious disadvantage of the MUL approach relative to the three other strategies we
investigated is that it requires the computationally expensive
nonrigid registration algorithm to be applied to many atlas images
instead of just one.
Fig. 15. Comparison of deformation magnitudes between subjects vs. between a subject and the average-shape atlas. The diamonds show the average
deformation (over all foreground voxels) in micrometers when registering the respective raw image to the average-shape atlas. The vertical lines show the range
of average deformations when registering the respective raw image to the remaining 19 individual atlas images. The boxes show the 25th and 75th percentile of
the respective distributions.
However, because our implementation of a nonrigid registration algorithm is relatively fast (relative to other
published algorithms), because registrations with different atlas
images can be run independently on different computers, and
because our implementation takes full advantage of multiprocessor
computer systems (Rohlfing and Maurer, 2003), computation time
is not too serious an issue.
The automatically generated segmentations are generally very
good, but they are not yet sufficiently good to completely replace
manual segmentation. The registration-based segmentations could
be manually refined. Our method could also be used as the first
step to initialize a subsequent segmentation method such as a
deformable model (McInerney and Terzopoulos, 1996), active
contours, or level sets (Malladi et al., 1995). The suitability for
this purpose is illustrated by the fact that the automatic segmentations are generally within two voxels of the manual segmentations, and often within one voxel. Thus, although registration-based segmentation is not perfect, it is extremely well suited for
generating initial solutions that can then be refined by other, more
locally operating techniques. Also, up to one voxel of error can
be due to interpolation error. A possible solution to this problem
is to use splines to construct better models of the structures of
interest.
In comparison with results published by others, however, the segmentation accuracies reported in this paper are already very
encouraging. For example, Dawant et al. (1999) reported mean SI
values of 0.96 for segmentation of the human brain from MR
images and mean SI values of only 0.85 for segmentation of
smaller brain structures such as the caudate. The mean SI value of
segmentations produced using the MUL method in this study is
0.86, which given the small size and complex shape of most of the
structures in the bee brains considered in this study is comparable
to the values reported by Dawant et al. and supports the visual
assessment observation (Fig. 7) that the automatic segmentations
of many structures in the present study differ from manual
segmentations on average by approximately one voxel. In fact,
Zijdenbos et al. (1994) state that ‘‘SI > 0.7 indicates excellent
agreement’’ between two segmentations. This criterion (SI > 0.7) is
satisfied by the vast majority (97%) of all contours generated by
our segmentations using the MUL method (Fig. 4).
How do our results translate to other segmentation problems,
for example, segmenting structures of the human brain? In
several ways, the image quality of confocal microscopy images
is inferior to clinical MR images due to imaging and technical
limitations such as the tiled acquisition. On the other hand, the
bee brain has a less complex shape than, for example, the human
cortex, therefore posing fewer problems for the mathematical
treatment of the coordinate transformation. So the problem of
segmenting confocal microscopy images of the bee brain is at the
same time both harder and easier than segmenting a human brain.
Overall, we believe that both problems are sufficiently similar for
our results to be relevant. However, applying the evaluation of
atlas selection strategies to human brain data constitutes an
important next step in our work.
Another interesting question is what influence the atlas selection has in the presence of abnormal data. A fundamental problem
with atlas-based segmentation methods is their inability to segment
objects that are not present in the atlas, for example, tumors in
clinical images. When using a single atlas, it may or may not be
from the correct population. The AVG strategy has limitations as
the average atlas could either be generated for one population, thus
being inappropriate for another, or cover several populations, thus
being unspecific for each of them. The strategies most likely to be
successful are the SIM and the MUL strategies, since the database
of atlases used for both methods can easily be built to contain
multiple samples of each population.
As we showed recently (Rohlfing et al., 2003b), the accuracy
of segmentation with multiple atlases can be further improved by
applying more sophisticated methods for combining the individual segmentations. This includes, but is not limited to, extensions
of an expectation maximization method for estimating expert
performance parameters originally proposed by Warfield et al.
(2002). The general idea of such methods is to estimate the
accuracy of each individual segmentation. Using these estimates,
the atlases believed to be more accurate can then be assigned a
higher weight in the decision fusion. When using atlases from
different populations, this would, at least in theory, automatically
select the appropriate atlases for a given subject and disregard the
inappropriate ones. Testing the effectiveness of this approach in
practice will be another interesting direction for future work in
the field.
Conclusion
The accuracy of atlas-based image segmentation can be substantially improved over that obtained using a single individual
atlas. This paper presented two promising approaches, the use of an
average-shape atlas, and especially the application of multi-classifier techniques based on integrating multiple segmentations generated using different independent individual atlases.
Acknowledgments
Torsten Rohlfing was supported by the National Science
Foundation under Grant No. EIA-0104114. Robert Brandt was
supported by the BMBF under Grant No. 0310961. Most
computations were performed on an SGI Origin 3800 supercomputer in the Stanford University Bio-X core facility for
Biomedical Computation. Some computations were performed on
a dual-processor PC with AMD Athlon processors courtesy of
Dolf Pfefferbaum at SRI International, Menlo Park, CA. The
authors thank Andreas Steege and Charlotte Kaps for manually
tracing the microscopy images. The authors thank the anonymous
reviewers for their numerous helpful comments and suggestions
that we feel have substantially improved this paper.
Appendix A. Generation of the average-shape atlas
The average-shape atlas used for segmentation with the AVG
paradigm was generated from all 20 bee brains using a method
suggested by Ashburner (2000) and applied to the bee brain by
Rohlfing et al. (2001). The algorithm is a simple iterative procedure that, unlike other methods, does not require inverting nonrigid
coordinate transformations.
The first iteration selects one arbitrary individual image as a
reference and registers each of the remaining images to the
reference using an affine transformation. Using these transformations, an average image is computed. In the second iteration, all
individuals including the initial reference are registered to the
average image by nonrigid transformations. A new average image
is generated using the new transformations and used as the
reference for the following registration iteration. The procedure
is repeated until convergence.
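In outline, the procedure can be written as the following loop; the affine and nonrigid registration routines and the reformatting (resampling) step are supplied externally and serve only as placeholders here, and all images are assumed to be resampled onto a common grid:

```python
import numpy as np

def build_average_atlas(images, affine_register, nonrigid_register, reformat,
                        n_iterations=5):
    """Iterative shape averaging (control-flow sketch of the Appendix A procedure)."""
    reference = images[0]                        # arbitrary initial reference image
    # Iteration 1: affine registration of all remaining images to the reference.
    transforms = [affine_register(reference, img) for img in images[1:]]
    warped = [reference] + [reformat(img, t) for img, t in zip(images[1:], transforms)]
    average = np.mean(warped, axis=0)
    # Subsequent iterations: nonrigid registration of all images to the average,
    # which then becomes the reference for the next iteration.
    for _ in range(n_iterations - 1):
        transforms = [nonrigid_register(average, img) for img in images]
        average = np.mean([reformat(img, t) for img, t in zip(images, transforms)], axis=0)
    return average
```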
Because the first iteration of the algorithm is an affine registration only, the shape of the arbitrary reference image does not
predetermine the shape of the resulting final average image. There
is admittedly little strict mathematical foundation for the algorithm,
but for the purpose of this paper, we are only interested in a rather
operational definition of ‘‘average shape’’: the average-shape atlas
should minimize the overall deformations required to match all
individuals of the population to it.
For the average-shape atlas generated as described above and
used in the present study, Fig. 15 illustrates that, in fact, the
differences between a raw image and an individual atlas are on
average substantially larger than the differences between a raw
image and the average atlas. Most raw images are more similar in
shape to the average-shape atlas than to any (or at least the
majority) of the remaining 19 individual atlas images.
Appendix B. Criteria for selecting the most similar atlas
We compare four different criteria for selecting the individual
atlas image that is ‘‘most similar’’ to a given raw image, that is,
which is expected to provide the best segmentation accuracy out of
all individual atlases for this particular raw image. These criteria
are:
Value of NMI after affine registration (NMIaffine). The
similarity between the raw image and each individual atlas
image is quantified by the final value of the NMI image
similarity measure after completing the affine registration. The
atlas image with the highest NMI value after registration is
selected and used for segmentation of the respective raw image.
This criterion requires only an affine registration to be
computed between the raw image and each of the individual
atlases. It is therefore considerably less computationally
expensive than the remaining three criteria described below.
Value of NMI after nonrigid registration (NMInonrigid). This
criterion compares the NMI image similarities between the raw
image and all individual atlases after nonrigid registration.
Again, the atlas with the highest NMI value after registration is
selected and used for segmentation.
Average deformation of the atlas over all voxels (DEFavg).
After nonrigid registration, the magnitude of the deformation
between the raw image and each individual atlas is computed
and averaged over all voxels. The atlas with the smallest
average deformation is selected and used for segmentation.
Whereas the above criteria are based on intensity similarity, this
criterion is based on geometric (i.e., shape) similarity.
Maximum deformation of the atlas over all voxels (DEFmax).
This criterion is identical to the previous one, except that it uses the maximum deformation over all voxels rather than the average. This criterion pays more attention to outliers. The idea is that atlases that match well overall may be substantially inaccurate in some regions (a selection sketch is given after this list).
Fig. 16. Plot of percentages of structures segmented with accuracy equal to
or better than given SI thresholds. Each column represents one criterion for
choosing the most ‘‘similar’’ individual atlas for a given raw image. The
stacked bars show the percentages of structures that were segmented with
SI better than 0.95 through 0.70 from bottom to top. For comparison, the
leftmost column shows the results when the atlas with the best a posteriori
segmentation result is used for each raw image. This is the upper bound for
the accuracy achievable with any criterion for selection of the most similar
atlas.
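For any of these criteria, the selection step amounts to a simple arg-min or arg-max over the candidate atlases; a sketch (the deformation-based criteria assume a dense displacement field per atlas):

```python
import numpy as np

def deformation_criteria(displacement):
    """DEFavg and DEFmax for one displacement field of shape (X, Y, Z, 3)."""
    magnitude = np.linalg.norm(displacement, axis=-1)
    return magnitude.mean(), magnitude.max()

def select_most_similar_atlas(criterion_values, higher_is_better=True):
    """Pick the atlas index by criterion value, e.g., NMI after nonrigid registration
    (higher is better) or average/maximum deformation magnitude (lower is better)."""
    values = np.asarray(criterion_values)
    return int(np.argmax(values) if higher_is_better else np.argmin(values))
```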
Segmentations were generated for each of the criteria above.
The accuracy of a segmentation was computed as the SI between
the segmentation and the manual gold standard. Fig. 16 shows a
graph of the percentages of structures segmented with varying
levels of accuracy. For comparison, this graph includes results
achieved when using the best atlas according to the a posteriori SI
values for each raw image (leftmost column, Best SIM). In other
words, this column shows the best possible result that can be
achieved using only a single individual atlas, where the selection of
this atlas is governed by the knowledge of the resulting segmentation accuracy (SI value). Obviously, this is not a strategy that
could be applied in practice. However, it provides an upper bound
for the segmentation accuracy that can be achieved using a single
individual atlas image, albeit a different one for each raw image,
chosen from the database of individual atlas images.
Among the four criteria that do not depend on a posteriori
accuracy evaluation, the NMI image similarity after nonrigid
registration performed slightly better than the others. It was
therefore selected as the criterion used for the SIM atlas selection
strategy in this paper.
We note that the selection of the most similar atlas based on
nonrigidly registered raw image and atlas depends on the registration method used, as well as to some extent on its parameterization.
This is desirable, since the best-matching atlas may in fact be
different, depending on the exact registration technique. However,
it is our experience that the atlas selection is fairly stable.
Specifically, when using different smoothness constraint weights
(see Fig. 3), we found that in 8 out of 20 cases the atlas selected
based on the final value of NMI after nonrigid registration was
identical for all six constraint weights. In another five cases, there
was one constraint weight value for which a different atlas would
have been selected.
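The stability check described above amounts to simple bookkeeping: for each raw image, record the atlas selected by the NMI_nonrigid criterion under each smoothness constraint weight and count the distinct selections. A minimal sketch is given below; the weights-by-atlases table of final NMI values and the function name are assumptions for illustration only.

    import numpy as np

    def selected_atlases(nmi_table):
        # nmi_table[w, k]: final NMI between the raw image and atlas k after
        # nonrigid registration with smoothness constraint weight w.
        # Returns the set of atlas indices selected across all weights;
        # a set of size 1 means the selection is identical for every weight.
        return {int(np.argmax(row)) for row in np.asarray(nmi_table)}

    # Hypothetical usage over all raw images:
    # n_stable = sum(len(selected_atlases(t)) == 1 for t in nmi_tables)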