Image Display Algorithms for High and Low Dynamic Range Display Devices
Erik Reinhard1,2,3
Timo Kunkel1
Yoann Marion2
Kadi Bouatouch2
Jonathan Brouillat2
Rémi Cozot2
Abstract
With interest in high dynamic range imaging mounting, techniques
for displaying such images on conventional display devices are
gaining in importance. Conversely, high dynamic range display
hardware is creating the need for display algorithms that prepare
images for such displays. In this paper, the current state-of-the-art
in dynamic range reduction and expansion is reviewed, and in particular we assess the theoretical and practical need to structure tone
reproduction as a combination of a forward and a reverse pass.
1 Introduction
Real-world environments typically contain a range of illumination
much larger than can be represented by conventional 8-bit images.
For instance, sunlight at noon may be as much as 100 million
times brighter than starlight [Spillmann and Werner 1990; Ferwerda 2001]. The human visual system is able to detect 4 or 5 log
units of illumination simultaneously, and can adapt to a range of
around 10 orders of magnitude over time [Ferwerda 2001].
On the other hand, conventional 8-bit images with values between 0 and 255 have a useful dynamic range of around 2 orders
of magnitude. Such images are represented typically by one byte
per pixel for each of the red, green and blue channels. The limited dynamic range afforded by 8-bit images is well-matched to the
display capabilities of CRTs. Their range, while being larger than
2 orders of magnitude, lies partially in the dark end where human
vision has trouble discerning very small differences under normal
viewing circumstances. Hence, CRTs have a useful dynamic range of around 2 orders of magnitude. Currently, very few display devices
have a dynamic range that significantly exceeds this range.
The notable exception is LCD displays with an LED back-panel
where each of the LEDs is separately addressable [Seetzen et al.
2003; Seetzen et al. 2004]. With the pioneering start-up BrightSide
being taken over by Dolby, and both Philips and Samsung demonstrating their own displays with spatially varying back-lighting,
hardware developments are undeniably moving towards higher dynamic ranges.
It is therefore reasonable to anticipate that the variety in display
capabilities will increase. Some displays will have a much higher
dynamic range than others, whereas differences in mean luminance
will also increase due to a greater variety in back-lighting technology. As a result, the burden on general purpose display algorithms
will change. High dynamic range (HDR) image acquisition has
already created the need for tone reproduction operators, which reduce the dynamic range of images prior to display [Reinhard et al.
2005; Reinhard 2007]. The advent of HDR display devices will
add to this the need to sometimes expand the dynamic range of images. In particular the enormous number of conventional eight-bit
1 University of Bristol, Bristol, e-mail: [email protected]
2 Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Rennes
3 University of Central Florida, USA
Figure 1: With conventional photography, some parts of the scene
may be under- or over-exposed (left). Capture of this scene with 9
exposures, and assemblage of these into one high dynamic range
image followed by tone reproduction, affords the result shown on
the right.
Figure 2: Linear scaling of high dynamic range images to fit a given
display device may cause significant detail to be lost (left). For
comparison, the right image is tone-mapped, allowing details in
both bright and dark regions to be visible.
images may have to be expanded in range prior to display on such
devices. Algorithms for dynamic range expansion are commonly
called inverse tone reproduction operators.
In this paper we survey the state-of-the-art in tone reproduction
as well as inverse tone reproduction. In addition, we will summarize desirable features of tone reproduction and inverse tone reproduction algorithms.
2 Dynamic Range Reduction
Capturing the full dynamic range of a scene implies that in many
instances the resulting high dynamic range (HDR) image cannot
be directly displayed, as its range is likely to exceed the 2 orders
of magnitude range afforded by conventional display devices. Figure 1 (left) shows a scene with a dynamic range far exceeding the
capabilities of a conventional display. By capturing the full dynamic
range of this scene, followed by tone mapping the image, an acceptable rendition of this scene may be obtained (Figure 1, right).
A simple compressive function would be to normalize an image
(see Figure 2 (left)). This constitutes a linear scaling which is sufficient only if the dynamic range of the image is slightly higher than
the dynamic range of the display. For images with a significantly
higher dynamic range, small intensity differences will be quantized
to the same display value such that visible details are lost. For comparison, the right image in Figure 2 is tone-mapped non-linearly
showing detail in both the light and dark regions.
In general linear scaling will not be appropriate for tone reproduction. The key issue in tone reproduction is then to compress an
image while at the same time preserving one or more attributes of
the image. Different tone reproduction algorithms focus on different attributes such as contrast, visible detail, brightness, or appearance. Ideally, displaying a tone-mapped image on a low dynamic
range display device would recreate the same visual response in
the observer as the original scene. Given the limitations of display
devices, this is in general not achievable, although we may approximate this goal as closely as possible.
3 Spatial Operators
In the following sections we discuss tone reproduction operators
which apply compression directly on pixels. Often global and local operators are distinguished. Tone reproduction operators in the
former class change each pixel’s luminance values according to a
compressive function which is the same for each pixel [Miller et al.
1984; Tumblin and Rushmeier 1993; Ward 1994; Ferwerda et al.
1996; Drago et al. 2003; Ward et al. 1997; Schlick 1994]. The
term global stems from the fact that many such functions need to
be anchored to some values that are computed by analyzing the full
image. In practice most operators use the geometric average L̄v to
steer the compression:
L̄v = exp( (1/N) Σ_{x,y} log(δ + Lv(x, y)) )    (1)
The small constant δ is introduced to prevent the average from becoming
zero in the presence of black pixels. The luminance of each pixel
is indicated with Lv , which can be computed from RGB values if
the color space is known. If the color space in which the image is
specified is unknown, then the second best alternative would be to
assume that the image uses sRGB primaries and white point, so that
the luminance of a pixel is given by:
Lv(x, y) = 0.2125 R(x, y) + 0.7154 G(x, y) + 0.0721 B(x, y)    (2)
The geometric average L̄v is normally mapped to a predefined display value. The main challenge faced in the design of a global
operator lies in the choice of compressive function. Many functions are possible, which are for instance based on the image’s histogram (Section 4.2) [Ward et al. 1997] or on data gathered from
psychophysics (Section 4.3).
On the other hand, local operators compress each pixel according
to a specific compression function which is modulated by information derived from a selection of neighboring pixels, rather than the
full image [Chiu et al. 1993; Jobson et al. 1995; Rahman et al. 1996;
Rahman et al. 1997; Pattanaik et al. 1998; Fairchild and Johnson
2002; Ashikhmin 2002; Reinhard et al. 2002; Pattanaik and Yee
2002; Oppenheim et al. 1968; Durand and Dorsey 2002; Choudhury and Tumblin 2003]. The rationale is that the brightness of
a pixel in a light neighborhood is different than the brightness of
a pixel in a dark neighborhood. Design challenges for local operators involve choosing the compressive function, the size of the local
neighborhood for each pixel, and the manner in which local pixel
values are used. In general, local operators are able to achieve better compression than global operators (Figure 3), albeit at a higher
computational cost.
Both global and local operators are often inspired by the human
visual system. Most operators employ one of two distinct compressive functions, which is orthogonal to the distinction between
Figure 3: A local tone reproduction operator (left) and a global
tone reproduction operator (right) [Reinhard et al. 2002]. The local
operator shows more detail, as for instance seen in the insets.
local and global operators. Display values Ld (x, y) are most commonly derived from image luminances Lv (x, y) by the following
two functional forms:
Ld(x, y) = Lv(x, y) / f(x, y)    (3a)

Ld(x, y) = Lv^n(x, y) / (Lv^n(x, y) + g^n(x, y))    (3b)
In these equations, f (x, y) and g(x, y) may either be constant, or
a function which varies per pixel. In the former case, we have a
global operator, whereas a spatially varying function results in a
local operator. The exponent n is a constant which is either fixed,
or set differently per image.
Equation 3a divides each pixel’s luminance by a value derived
from either the full image or a local neighborhood. As an example, the substitution f (x, y) = Lmax /255 in (3a) yields a linear
scaling such that values may be directly quantized into a byte, and
can therefore be displayed. A different approach would be to substitute f (x, y) = Lblur (x, y), i.e. divide each pixel by a weighted
local average, perhaps obtained by applying a Gaussian filter to the
image [Chiu et al. 1993]. While this local operator yields a displayable image, it highlights a classical problem whereby areas near
bright spots are reproduced too dark. This is often seen as halos, as
demonstrated in Figure 4.
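The two substitutions above can be sketched as follows (our own illustration; `gauss_blur` is a hypothetical helper, and the kernel size is an arbitrary choice):

```python
import numpy as np

def gauss_blur(img, sigma):
    """Separable Gaussian blur with edge padding, plain NumPy."""
    r = int(3 * sigma) + 1
    x = np.arange(-r, r + 1)
    k = np.exp(-(x / sigma) ** 2 / 2.0)
    k /= k.sum()
    p = np.pad(img, r, mode='edge')
    p = np.apply_along_axis(lambda c: np.convolve(c, k, 'valid'), 0, p)
    return np.apply_along_axis(lambda c: np.convolve(c, k, 'valid'), 1, p)

def tonemap_linear(Lv):
    # Eq. 3a with f(x, y) = Lmax / 255: values quantize directly to a byte.
    return Lv / (Lv.max() / 255.0)

def tonemap_chiu(Lv, sigma=8.0, eps=1e-6):
    # Eq. 3a with f(x, y) = Lblur(x, y): divide each pixel by a local
    # Gaussian-weighted average (prone to halos near bright spots).
    return Lv / (gauss_blur(Lv, sigma) + eps)
```

Note how the local variant maps a uniform region to a value near 1 regardless of its absolute luminance, which is exactly what makes areas next to bright spots come out too dark.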
The cause of halos stems from the fact that Gaussian filters blur
across sharp contrast edges in the same way that they blur small and
low contrast details. If there is a high contrast gradient in the neighborhood of the pixel under consideration, this causes the Gaussian
blurred pixel to be significantly different from the pixel itself. By
using a very large filter kernel in a division-based approach such
large contrasts are averaged out, and the occurrence of halos can
be minimized. However, very large filter kernels tend to compute a
local average that is not substantially different from the global average. In the limit that the size of the filter kernel tends to infinity,
the local average becomes identical to the global average and therefore limits the compressive power of the operator to be no better
than a global operator. Thus, the size of the filter kernel in divisionbased operators presents a trade-off between the ability to reduce
the dynamic range, and the visibility of artifacts.
Figure 5: Tone-mapping function created by reshaping the cumulative histogram of the image shown in Figure 6.
Figure 4: Halos are artifacts commonly associated with local
tone reproduction operators. Chiu’s operator is used here without
smoothing iterations to demonstrate the effect of division (left).
4 Global Tone Reproduction Operators
While linear scaling by itself would be sufficient to bring the dynamic range within the display’s limits, this does not typically lead
to a visually pleasant rendition. In addition to compressing the
range of values, it is therefore necessary to preserve one or more
image attributes. In the design of a tone reproduction operator, one
is free to choose which image attribute should be preserved. Some
of the more common examples are discussed in the following.
4.1 Brightness Matching
Tumblin and Rushmeier have argued that brightness, a subjective visual sensation, should be preserved [Tumblin and Rushmeier
1993]. Given the luminance L of a pixel, its brightness B can be
approximated using:
B = 0.3698 (Lv / LA)^γ    (4)
where LA is the adapting luminance, γ models the visual system’s
non-linearity and the numeric constant is due to fitting the function
to measured data. To preserve brightness, the brightness reproduced
on the display Bd should be matched to the scene brightnesses Bv .
This leads to an expression relating scene Lv and display luminances Ld :
Ld = Ld,A (Lv / Lv,A)^(γ(Lv)/γ(Ld))    (5)
where Ld,A and Lv,A are the adapting luminances for the display and the scene respectively. To account for the fact that it
may be undesirable to always map mid-range scene values to midrange display values, this expression is multiplied with a correction
term [Tumblin and Rushmeier 1993]. The γ() function is essentially the logarithm of its parameter:
γ(L) = { 2.655                                  if L > 100 cd/m²
       { 1.855 + 0.4 log10(L + 2.3 · 10⁻⁵)      otherwise       (6)
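Equation 6 is straightforward to implement (a sketch with our own function name); note that the two branches meet continuously at L = 100 cd/m²:

```python
import math

def gamma_tr(L):
    """Tumblin-Rushmeier gamma (Eq. 6); L is a luminance in cd/m^2."""
    if L > 100.0:
        return 2.655
    # Essentially a logarithm of L, offset and scaled.
    return 1.855 + 0.4 * math.log10(L + 2.3e-5)
```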
The key observation to make is that matching brightnesses between scene and display leads to a power function, with the exponent determined by the adapting luminances of the scene and the
display. This approach therefore constitutes a form of gamma correction. It is known to work well for medium dynamic range scenes,
but may cause burn-out if used to compress high dynamic range
images.
In the following, we will argue that allowing burned-out areas
can be beneficial for the overall appearance of the image. However, it is important to retain control over the number of pixels that
become over-exposed. This issue is discussed further in Section 7.
Furthermore, this operator is one of the few which essentially
comprise a forward and a backward pass. The forward pass consists
of computing brightness values derived from image luminances.
The backwards pass then computes luminance values from these
brightness values which can subsequently be displayed. This is
theoretically correct, as any tone reproduction operator should formally take luminance values as input, and produce luminance values as output. If the reverse step is omitted, this operator would
produce brightness values as output. If the human visual system is
presented with such values, it will interpret these brightness values
as luminance values, and therefore process values in the brain that
have been perceived twice.
To avoid this, it is theoretically correct to apply a forward pass to
achieve range compression, followed by a reverse pass to reconstitute luminance values, and adjust the values for the chosen display.
It should not matter whether this display is a conventional monitor
or a high dynamic range display. Furthermore, this operator exhibits the desirable property that tone-mapping an image that was
tone-mapped previously, results in an unaltered image [DiCarlo and
Wandell 2000]. These issues are discussed further in Section 8.
4.2 Histogram Adjustment
A simple but effective approach to tone reproduction is to derive
a mapping from input luminances to display luminances using the
histogram of the input image [Ward et al. 1997; Duan et al. 2005].
Histogram equalization would simply adjust the luminances so that
the probability that each display value occurs in the output image
is equal. Such a mapping is created by computing the image’s histogram, and then integrating this histogram to produce a cumulative
histogram. This function can be used directly to map input luminances to display values.
However, dependent on the shape of the histogram, it is possible
that some contrasts in the image are exaggerated, rather than attenuated. This produces unnatural results, which may be overcome by
restricting the cumulative histogram to never attain a slope that is
too large. The threshold is determined at each luminance level by a
model of human contrast sensitivity. The method is then called histogram adjustment, rather than histogram equalization [Ward et al.
1997]. An example of a display mapping generated by this method is shown in Figure 5. This mapping is derived from the image shown in Figure 6.

Figure 6: Image tone-mapped using histogram adjustment.

Figure 7: Over the middle range values, sigmoidal compression is approximately logarithmic. The choice of semi-saturation constant determines how input values are mapped to display values. [Plot of Ld against log(Lw) for Ld = Lw/(Lw + 1), Ld = 0.25 log(Lw) + 0.5, Ld = Lw/(Lw + 10), and Ld = Lw/(Lw + 0.1).]
While this method can be extended to include simulations of veiling luminance, illumination-dependent color sensitivity, and visual acuity, and although knowledge of human contrast sensitivity is incorporated into the basic operator [Ward et al. 1997], we may view this operator as a refined form of histogram equalization.
The importance of this observation is that histogram equalization is essentially rooted in engineering principles, rather than a
simulation of the human visual system. It is therefore natural to regard this operator as “unit-less” — it transforms luminance values
to different luminance values, as opposed to Tumblin and Rushmeier’s operator which transforms luminance values to brightness
values if applied in forward mode only. As a consequence there is
no theoretical need to apply this model in reverse.
This argument can be extended to other tone reproduction operators. In general, when an algorithm simulates aspects of the human
visual system, this by itself creates the theoretical need to apply
both the forward and inverse versions of the algorithm to ensure
that the output is measured in radiometric units. This requirement
does not necessarily exist for approaches based on engineering principles.
4.3 Sigmoidal Compression
Equation 3b has an S-shaped curve on a log-linear plot, and is called
a sigmoid for that reason. This functional form fits data obtained
from measuring the electrical response of photo-receptors to flashes
of light in various species [Naka and Rushton 1966]. It has also provided a good fit to other electro-physiological and psychophysical
measurements of human visual function [Kleinschmidt and Dowling 1975; Hood et al. 1979; Hood and Finkelstein 1979].
Sigmoids have several desirable properties. For very small luminance values the mapping is approximately linear, so that contrast is preserved in dark areas of the image. The function has an
asymptote at one, which means that the output mapping is always
bounded between 0 and 1. A further advantage of this function is
that for intermediate values, the function affords an approximately
logarithmic compression. This can be seen for instance in Figure 7,
where the middle section of the curve is approximately linear on
a log-linear plot. To illustrate, both Ld = Lw /(Lw + 1) and
Ld = 0.25 log(Lw) + 0.5 are plotted in this figure⁴, showing that

⁴ The constants 0.25 and 0.5 were determined by equating the values as well as the derivatives of both functions for x = 1.
Figure 8: The exponent n determines the steepness of the curve. Steeper slopes map a smaller range of input values to the display range. [Plot of Ld = Lw^n/(Lw^n + g^n) against log(Lw) for n = 0.5, 1.0, and 2.0, with g = 1.]
these functions are very similar over a range centered around 1.
In Equation 3b, the function g(x, y) may be computed as a global
constant, or as a spatially varying function. Following common
practice in electro-physiology, we call g(x, y) the semi-saturation
constant. Its value determines which values in the input image are
optimally visible after tone mapping, as shown in Figure 7. The
effect of choosing different semi-saturation constants is also shown
in this figure. The name semi-saturation constant derives from the
fact that when the input Lv reaches the same value as g(), the output
becomes 0.5.
In its simplest form, g(x, y) is set to L̄v /k, so that the geometric
average is mapped to user parameter k (which corresponds to the
key of the scene) [Reinhard et al. 2002]. In this case, a good initial
value for k is 0.18, which conforms to the photographic equivalent of middle gray (although some would argue that 0.13 rather
than 0.18 is neutral gray). For particularly light or dark scenes this
value may be raised or lowered. Alternatively, its value may be estimated from the image itself [Reinhard 2003]. A variation of this
global operator computes the semi-saturation constant by linearly
interpolating between the geometric average and each pixel's luminance [Reinhard and Devlin 2005]:
g(x, y) = a Lv(x, y) + (1 − a) L̄v    (7)
The interpolation is governed by user parameter a ∈ [0, 1] which
has the effect of varying the amount of contrast in the displayable
image.
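Putting Equations 1, 3b and 7 together, a global sigmoidal operator might look as follows. This is a sketch under our own naming; in particular, dividing the interpolated semi-saturation by the key k is our own combination of the g = L̄v/k formulation with Equation 7:

```python
import numpy as np

def sigmoid_tonemap(Lv, k=0.18, a=0.0, n=1.0, delta=1e-6):
    """Global sigmoidal compression (Eq. 3b). With a = 0 the
    semi-saturation is g = L̄v / k, so the geometric average maps to
    a display value near the key k; a > 0 interpolates toward each
    pixel's own luminance (Eq. 7)."""
    Lbar = np.exp(np.mean(np.log(delta + Lv)))     # Eq. 1
    g = (a * Lv + (1.0 - a) * Lbar) / k            # Eq. 7, scaled by key
    return Lv**n / (Lv**n + g**n)
```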
The exponent n in Equation 3b determines how pronounced
the S-shape of the sigmoid is. Steeper curves map a smaller useful range of scene values to the display range, whereas shallower
curves map a larger range of input values to the display range.
Studies in electro-physiology report values between n = 0.2 and
n = 0.9 [Hood et al. 1979].
5 Local Tone Reproduction Operators
The several different variants of sigmoidal compression shown
above are all global in nature. This has the advantage that they
are fast to compute, and they are very suitable for medium to high
dynamic range images. Their simplicity makes these operators suitable for implementation on graphics hardware as well. For very
high dynamic range images, however, it may be necessary to resort
to a local operator since this may give somewhat better compression.
5.1 Local Sigmoidal Operators
A straightforward method to extend sigmoidal compression replaces the global semi-saturation constant by a spatially varying
function, which once more can be computed in several different
ways. Thus, g(x, y) then becomes a function of a spatially localized average. Perhaps the simplest way to accomplish this is to
once more use a Gaussian blurred image. Each pixel in a blurred
image represents a locally averaged value which may be viewed as
a suitable choice for the semi-saturation constant⁵.
As with division-based operators discussed in the previous section, we have to consider haloing artifacts. If sigmoids are used
with a spatially varying semi-saturation constant, the Gaussian filter kernel is typically chosen to be very small to minimize artifacts.
In practice filter kernels of only a few pixels wide are sufficient to
suppress significant artifacts while at the same time producing more
local contrast in the tone-mapped images. Such small filter kernels
can be conveniently computed in the spatial domain without losing too much performance. There are, however, several different
approaches to compute a local average, which are discussed in the
following section.
5.2 Local Neighborhoods

In local operators, halo artifacts occur when the local average is computed over a region that contains sharp contrasts with respect to the pixel under consideration. It is therefore important that the local average is computed over pixel values that are not significantly different from the pixel that is being filtered.

This suggests a strategy whereby an image is filtered such that no blurring over such edges occurs. A simple, but computationally expensive way is to compute a stack of Gaussian blurred images with different kernel sizes, i.e. an image pyramid. For each pixel, we may choose the largest Gaussian which does not overlap with a significant gradient. The scale at which this happens can be computed as follows.

In a relatively uniform neighborhood, the value of a Gaussian blurred pixel should be the same regardless of the filter kernel size. Thus, in this case the difference between a pixel filtered with two different Gaussians should be around zero. This difference will only change significantly if the wider filter kernel overlaps with a neighborhood containing a sharp contrast step, whereas the smaller filter kernel does not. A difference of Gaussians (DoG) signal L^DoG_i(x, y) at scale i can be computed as follows:

L^DoG_i(x, y) = R_{iσ}(x, y) − R_{2iσ}(x, y)    (8)

It is now possible to find the largest neighborhood around a pixel that does not contain sharp edges by examining differences of Gaussians at different kernel sizes i [Reinhard et al. 2002]:

| L^DoG_i(x, y) / (R_{iσ}(x, y) + α) | > t,    i = 1 . . . 8    (9)

Here, the DoG filter is divided by one of the Gaussians to normalize the result, and thus enable comparison against the constant threshold t which determines if the neighborhood at scale i is considered to have significant detail. The constant α is added to avoid division by zero.

Figure 9: Scale selection mechanism. The left image shows the tone-mapped result. The image on the right encodes the selected scale for each pixel as a gray value. The darker the pixel, the smaller the scale. A total of eight different scales were used to compute this image.

For the image shown in Figure 9 (left), the scale selected for each pixel is shown in Figure 9 (right). Such a scale selection mechanism is employed by the photographic tone reproduction operator [Reinhard et al. 2002] as well as in Ashikhmin's operator [Ashikhmin 2002].

Once the appropriate neighborhood for each pixel is known, the Gaussian blurred average Lblur for this neighborhood may be used to steer the semi-saturation constant, such as for instance employed by the photographic tone reproduction operator:

Ld = Lw / (1 + Lblur)    (10)

It is instructive to compare the result of this operator with its global equivalent, which is defined as:

Ld = Lw / (1 + Lw)    (11)

⁵ Although g(x, y) is now no longer a constant, we continue to refer to it as the semi-saturation constant.
Images tone-mapped with both forms are shown in Figure 10. The
CIE94 color difference metric shown in this figure shows that the
main differences occur near (but not precisely at) high-frequency
high-contrast edges, predominantly seen in the clouds. These are
the regions where more detail is produced by the local operator.
An alternative approach includes the use of edge preserving
smoothing operators, which are designed specifically for removing
small details while keeping sharp contrasts in tact. Such filters have
the advantage that sharp discontinuities in the filtered result coincide with the same sharp discontinuities in the input image, and
may therefore help to prevent halos [DiCarlo and Wandell 2000].
Several such filters, such as the bilateral filter, trilateral filter, Susan filter, the LCIS algorithm and the mean shift algorithm are suitable [Durand and Dorsey 2002; Choudhury and Tumblin 2003; Pattanaik and Yee 2002; Tumblin and Turk 1999; Comaniciu and Meer
2002], although some of them are expensive to compute. Edge-preserving smoothing operators are discussed in Section 5.4.
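The scale selection of Equations 8 and 9 can be sketched as follows (our own illustration; the threshold t, the constant α and the base kernel size are illustrative values, and `gauss_blur` is a hypothetical helper):

```python
import numpy as np

def gauss_blur(img, sigma):
    """Separable Gaussian blur with edge padding, plain NumPy."""
    r = int(3 * sigma) + 1
    x = np.arange(-r, r + 1)
    k = np.exp(-(x / sigma) ** 2 / 2.0)
    k /= k.sum()
    p = np.pad(img, r, mode='edge')
    p = np.apply_along_axis(lambda c: np.convolve(c, k, 'valid'), 0, p)
    return np.apply_along_axis(lambda c: np.convolve(c, k, 'valid'), 1, p)

def select_scales(Lw, sigma=1.0, alpha=0.05, t=0.05, n_scales=8):
    """Per-pixel scale selection (Eqs. 8 and 9): keep growing the
    neighborhood while the normalized DoG stays below threshold t,
    i.e. while no significant detail is crossed."""
    scale = np.zeros(Lw.shape, dtype=int)
    for i in range(1, n_scales + 1):
        R1 = gauss_blur(Lw, i * sigma)
        R2 = gauss_blur(Lw, 2 * i * sigma)
        dog = np.abs((R1 - R2) / (R1 + alpha))   # normalized DoG (Eq. 9)
        uniform = (dog < t) & (scale == i - 1)   # passed all smaller scales
        scale[uniform] = i
    return scale
```

On a perfectly uniform image every scale passes the test, so the largest neighborhood is selected everywhere.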
5.3 Sub-band Systems
Image pyramids can be used directly for the purpose of tone reproduction, provided the filter bank is designed carefully [Li et al.
Figure 10: This image was tone-mapped with both global and local
versions of the photographic tone reproduction operator (top left
and right). Below, the CIE94 color difference is shown.
2005]. Here, a signal is decomposed into a set of signals that can
be summed to reconstruct the original signal. Such algorithms are
known as sub-band systems, or alternatively as wavelet techniques,
multi-scale techniques or image pyramids.
An image consisting of luminance signals Lv(x, y) is split into a stack of n band-pass signals L^DoG_i(x, y) using (8), whereby the scales i increase by factors of two. The original signal Lv can then be reconstructed by simply summing the sub-bands:

Lv(x, y) = Σ_{i=1}^{n} L^DoG_i(x, y)    (12)
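A decomposition of this kind can be sketched as follows. To make Equation 12 hold exactly in the sketch, we take differences of successively wider Gaussians and keep the final low-pass as a residual band, so the stack telescopes back to the input (an implementation choice of ours, not prescribed by the paper):

```python
import numpy as np

def gauss_blur(img, sigma):
    """Separable Gaussian blur with edge padding, plain NumPy."""
    r = int(3 * sigma) + 1
    x = np.arange(-r, r + 1)
    k = np.exp(-(x / sigma) ** 2 / 2.0)
    k /= k.sum()
    p = np.pad(img, r, mode='edge')
    p = np.apply_along_axis(lambda c: np.convolve(c, k, 'valid'), 0, p)
    return np.apply_along_axis(lambda c: np.convolve(c, k, 'valid'), 1, p)

def decompose(Lv, sigmas=(1.0, 2.0, 4.0)):
    """Band-pass stack in the spirit of Eqs. 8 and 12: differences of
    successively wider Gaussians plus a residual low-pass band, so
    that summing the sub-bands reconstructs Lv exactly."""
    blurred = [Lv] + [gauss_blur(Lv, s) for s in sigmas]
    bands = [blurred[i] - blurred[i + 1] for i in range(len(sigmas))]
    bands.append(blurred[-1])            # residual low-pass
    return bands
```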
Assuming that Lv is a high dynamic range image, a tone-mapped
image can be created by first applying a non-linearity to the bandpass signals. In its simplest form, the non-linearity applied to each
L^DoG_i(x, y) is a sigmoid [DiCarlo and Wandell 2000]. However, as argued by Li et al., summing the filtered sub-bands then leads
to distortions in the reconstructed signal [Li et al. 2005]. To limit
such distortions, either the filter bank may be modified, or the nonlinearity may be redesigned.
Although sigmoids are smooth functions, their application to a
(sub-band) signal with arbitrarily sharp discontinuities will yield
signals with potentially high frequencies. The high-frequency content of sub-band signals will cause distortions in the reconstruction
of the tone-mapped signal Lv [Li et al. 2005]. The effective gain
G(x, y) applied to each sub-band as a result of applying a sigmoid
can be expressed as a pixel-wise multiplier:
L′^DoG_i(x, y) = L^DoG_i(x, y) G(x, y)    (13)
To avoid distortions in the reconstructed image, the effective gain
should have frequencies no higher than the frequencies present in
the sub-band signal. This can be achieved by blurring the effective
gain map G(x, y) before applying it to the sub-band signal. This
approach leads to a significant reduction in artifacts, and is an important tool in the prevention of halos.
The filter bank itself may also be adjusted to limit distortions
in the reconstructed signal. In particular, to remove undesired frequencies in each of the sub-bands caused by applying a non-linear
function, a second bank of filters may be applied before summing
the sub-bands to yield the reconstructed signal. If the first filter
bank which splits the signal into sub-bands is called the analysis
filter bank, then the second bank is called the synthesis filter bank.
The non-linearity described above can then be applied in-between the two filter banks. Each of the synthesis filters should be tuned to the same frequencies as the corresponding analysis filters.

Figure 11: Tone reproduction using a sub-band architecture, computed here using a Haar filter [Li et al. 2005].
An efficient implementation of this approach, which produces
excellent artifact-free results, is described by Li et al. [Li et al.
2005]; an example image is shown in Figure 11. The image benefits
from clamping the bottom 2% and the top 1% of the pixels, which
is discussed further in Section 6.2.
Although this method produces excellent artifact-free images, it
has the tendency to over-saturate the image. This effect was ameliorated in Figure 11 by desaturating the image using the technique
described in Section 6.1 (with a value of 0.7, which is the default
value used for the sub-band approach). However, even after desaturation, the image's colors remained a little too saturated.
Further research would be required to determine the exact cause
of this effect, which is shared with gradient domain compression
(Section 5.5). Finally, it should be noted that the scene depicted in
Figure 11 is a particularly challenging image for tone reproduction.
The effects described here would be less pronounced for many high
dynamic range photographs.
5.4 Edge-Preserving Smoothing Operators
An edge-preserving smoothing operator attempts to remove details
from the image without removing high-contrast edges. An example
is the bilateral filter [Tomasi and Manduchi 1998; Paris and Durand
2006; Weiss 2006], which is a spatial Gaussian filter multiplied with
a second Gaussian operating in the intensity domain. With Lv (x, y)
the luminance at pixel (x, y), the bilateral filter LB (x, y) is defined
as:
LB(x, y) = Σ_{u,v} w(x, y, u, v) Lv(u, v) / Σ_{u,v} w(x, y, u, v)    (14)

w(x, y, u, v) = Rσ1(x − u, y − v) Rσ2(Lv(x, y), Lv(u, v))
Here, σ1 is the kernel size used for the Gaussian operating in the
spatial domain, and σ2 is the kernel size of the intensity domain Gaussian filter. The bilateral filter can be used to separate an image into
’base’ and ’detail’ layers [Durand and Dorsey 2002]. Applying the
bilateral filter to an image results in a blurred image in which sharp
edges remain present (Figure 12 left). Such an image is normally
called a ’base’ layer. This layer has a dynamic range similar to the
Figure 12: Bilateral filtering removes small details, but preserves
sharp gradients (left). The associated detail layer is shown on the
right.
Figure 14: The image on the left is tone-mapped using gradient domain compression. The magnitude of the gradients ‖∇L‖ is mapped to a grey scale in the right image (white is a gradient of 0; black is the maximum gradient in the image).
represented with image gradients (in log space):
∇L = (L(x + 1, y) − L(x, y), L(x, y + 1) − L(y))
Figure 13: An image tone-mapped using bilateral filtering. The
base and detail layers shown in Figure 12 are recombined after
compressing the base layer.
input image. The high dynamic range image may be divided pixelwise by the base layer, obtaining a ’detail’ layer LD (x, y) which
contains all the high frequency detail, but typically does not have a
very high dynamic range (Figure 12 right):
LD (x, y) = Lv (x, y)/LB (x, y)
(16)
Here, ∇L is a vector-valued gradient field. By attenuating large
gradients more than small gradients, a tone reproduction operator
may be constructed [Fattal et al. 2002]. Afterwards, an image can
be reconstructed by integrating the gradient field to form a tonemapped image. Such integration must be approximated by numerical techniques, which is achieved by solving a Poisson equation
using the Full Multi-grid Method [Press et al. 1992]. The resulting
image then needs to be linearly scaled to fit the range of the target display device. An example of gradient domain compression is
shown in Figure 14.
(15)
By compressing the base layer before recombining into a compressed image, a displayable low dynamic range image may be created (Figure 13). Compression of the base layer may be achieved by
linear scaling. Tone reproduction on the basis of bilateral filtering
is executed in the logarithmic domain.
Edge-preserving smoothing operators may be used to compute
a local adaptation level for each pixel, to be applied in a spatially
varying or local tone reproduction operator. A local operator based
on sigmoidal compression can for instance be created by substituting Lblur (x, y) = LB (x, y) in (10).
Alternatively, the semi-saturation constant g(x, y) in (3b) may
be seen as a local adaptation constant, and can therefore be locally
approximated with LB (x, y) [Ledda et al. 2004], or with any of the
other filters mentioned above. As shown in Figure 7, the choice
of semi-saturation constant shifts the curve horizontally such that
its middle portion lies over a desirable range of values. In a local
operator, this shift is determined by the values of a local neighborhood of pixels, and is thus different for each pixel. This leads to
a potentially better compression mechanism than a constant value
could afford.
5.5 Gradient-Domain Operators
Local adaptation provides a measure of how different a pixel is from
its immediate neighborhood. If a pixel is very different from its
neighborhood, it typically needs to be attenuated more. Such a difference may also be expressed in terms of contrast, which could be
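The essence of gradient-domain compression (Section 5.5) can be illustrated in one dimension, where integrating the attenuated gradient field reduces to a cumulative sum. This is a toy sketch, not the 2D Poisson solve of Fattal et al.; the attenuation exponent follows their functional form, but the clamping and parameter values are our assumptions:

```python
import numpy as np

def gradient_compress_1d(L, alpha=0.1, beta=0.85):
    """Attenuate large log-luminance gradients more than small ones,
    then integrate to recover a compressed signal (1D illustration)."""
    logL = np.log(L)
    grad = np.diff(logL)                      # forward differences
    mag = np.abs(grad) + 1e-9
    scale = (mag / alpha) ** (beta - 1.0)     # < 1 for gradients above alpha
    grad_c = grad * np.minimum(scale, 1.0)    # never amplify in this sketch
    # In 1D, "solving the Poisson equation" is just a cumulative sum.
    logL_c = np.concatenate(([logL[0]], logL[0] + np.cumsum(grad_c)))
    out = np.exp(logL_c)
    return out / out.max()                    # linear scale to display range
```

Large jumps in log luminance are scaled down before integration, so the reconstructed signal spans a smaller range while small gradients (detail) are nearly untouched.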
5.6 Lightness Perception
The theory of lightness perception provides a model for the perception of surface reflectances.6 To cast this into a computational model, the image needs to be automatically decomposed
into frameworks, i.e. regions of common illumination. As an example, the window in the right-most image of Figure 17 would
constitute a separate framework from the remainder of the interior.
The influence of each framework on the total lightness needs to
be estimated, and the anchors within each framework must be computed [Krawczyk et al. 2004; Krawczyk et al. 2005; Krawczyk et al.
2006].
It is desirable to assign a probability to each pixel of belonging to
a particular framework. This leaves the possibility of a pixel having
non-zero participation in multiple frameworks, which is somewhat
different from standard segmentation algorithms that assign a pixel
to at most one segment. To compute frameworks and probabilities
for each pixel, a standard K-means clustering algorithm may be
applied.
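One way to obtain such soft memberships is to run K-means on log luminance and convert each pixel's distance to the centroids into normalized weights. This is a sketch under our own assumptions (the function name, the 1D clustering, and the Gaussian membership width sigma are ours; Krawczyk et al. use a more elaborate scheme):

```python
import numpy as np

def soft_frameworks(logL, K=2, sigma=0.5, iters=20):
    """Cluster log-luminances into K frameworks and return, per pixel,
    a probability of belonging to each framework."""
    x = logL.ravel()
    # Plain K-means on 1D log-luminance values.
    centroids = np.linspace(x.min(), x.max(), K)
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centroids[None, :]), axis=1)
        for k in range(K):
            if np.any(labels == k):
                centroids[k] = x[labels == k].mean()
    # Soft membership: Gaussian in the distance to each centroid, normalized
    # so that every pixel's memberships sum to one.
    w = np.exp(-(x[:, None] - centroids[None, :]) ** 2 / (2 * sigma ** 2))
    P = w / w.sum(axis=1, keepdims=True)
    return P.reshape(logL.shape + (K,)), centroids
```

Unlike a hard segmentation, a pixel lying between two illumination regions receives non-zero weight in both frameworks.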
For each framework, the highest luminance rule may now be applied to find an anchor. This means that within a framework, the
pixel with the highest luminance would determine how all the pixels
in this framework are likely to be perceived. However, direct application of this rule may result in the selection of a luminance value of
a patch that is perceived as self-luminous. As the anchor should be
the highest luminance value that is not perceived as self-luminous,
selection of the highest luminance value should be preceded by filtering the area of the local framework with a large Gaussian filter.

6 Lightness is defined as relative perceived surface reflectance.

Figure 15: Per-channel gamma correction may desaturate the image. The images were desaturated with values of s = 0.2, s = 0.5, and s = 0.8.
The anchors for each framework are used to compute the net
lightness of the full image. This then constitutes a computational
model of lightness perception, which can be extended for the purpose of tone reproduction.
One of the strengths of using a computational model of lightness
perception for the purpose of tone reproduction, is that traditionally
difficult phenomena such as the Gelb effect can be handled correctly. The Gelb effect manifests itself when the brightest part of a
scene is placed next to an object that is even brighter. Whereas the
formerly brightest object was perceived as white, after the change
this object no longer appears white, but light gray.
6 Post-processing
After tone reproduction, it is possible to apply several post-processing steps to either improve the appearance of the image, adjust its saturation, or correct for the display device's gamma. Here,
we discuss two frequently applied techniques which have a relatively large impact on the overall appearance of the tone-mapped
results. These are a technique to desaturate the results, and a technique to clamp a percentage of the lightest and darkest pixels.
6.1 Color in Tone Reproduction
Tone reproduction operators normally compress luminance values,
rather than work directly on the red, green and blue components of
a color image. After these luminance values have been compressed
into display values Ld (x, y), a color image may be reconstructed
by keeping the ratios between color channels the same as before
compression (using s = 1) [Schlick 1994]:
Ir,d(x, y) = Ld(x, y) (Ir(x, y) / Lv(x, y))^s    (17a)

Ig,d(x, y) = Ld(x, y) (Ig(x, y) / Lv(x, y))^s    (17b)

Ib,d(x, y) = Ld(x, y) (Ib(x, y) / Lv(x, y))^s    (17c)
Alternatively, the saturation constant s may be chosen smaller than
one. Such per-channel gamma correction may desaturate the results
to an appropriate level, as shown in Figure 15 [Fattal et al. 2002].
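In code, equation (17) is a one-liner per channel; a sketch assuming numpy arrays (the function name is ours), where s = 1 keeps color ratios intact and s < 1 desaturates:

```python
import numpy as np

def recombine_color(I, Lv, Ld, s=1.0):
    """Reconstruct a displayable color image from compressed luminances
    (equation 17). I has shape (h, w, 3); Lv and Ld have shape (h, w)."""
    ratio = I / Lv[..., None]       # per-channel ratio to scene luminance
    return (ratio ** s) * Ld[..., None]
```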
The results of tone reproduction may sometimes appear unnatural, because human color perception is non-linear with respect to
overall luminance level. If we view an image of a bright outdoors
scene on a monitor in a dim environment, we are adapted to the dim
environment rather than the outdoors lighting. By keeping color
ratios constant, we do not take this effect into account. In addition, other phenomena such as the Stevens, Hunt, and
Bezold-Brücke effects are not accounted for. The above approach
should therefore be seen as a limited control to account for a complex phenomenon.
A more comprehensive solution is to incorporate ideas from the
field of color appearance modeling into tone reproduction operators [Pattanaik et al. 1998; Fairchild and Johnson 2004; Reinhard
and Devlin 2005]. The iCAM image appearance model is the first
color appearance model operating on images [Fairchild and Johnson 2002; Fairchild and Johnson 2004; Moroney and Tastl 2004].
It can also be used as a tone reproduction operator. It therefore
constitutes an important trend towards the incorporation of color
appearance modeling in dynamic range reduction, and vice-versa.
A rudimentary form of color appearance modeling within a
tone reproduction operator is afforded by the sigmoidal compression scheme outlined in Section 4.3 [Reinhard and Devlin 2005].
Nonetheless, we believe that further integration of tone reproduction and color appearance modeling is desirable, for the purpose of
properly accounting for the differences in adaptation between scene
and viewing environments.
6.2 Clamping
A common post-process to tone reproduction is clamping. It is for
instance part of the iCAM model, as well as the sub-band encoding
scheme. Clamping is normally applied to both very dark as well
as very light pixels. Rather than specify a hard threshold beyond
which pixels are clamped, a better way is to specify a percentile
of pixels which will be clamped. This gives better control over the
final appearance of the image.
By selecting a percentile of pixels to be clamped, detail will inevitably be lost in the dark and light areas of the image. However,
the remainder of the luminance values is spread over a larger range,
and this creates better detail visibility for large parts of the image.
The percentage of pixels clamped usually varies between 1% and 5%, depending on the image. The effect of clamping is shown in
Figure 16. The image on the left shows the results without clamping, and therefore all pixels are within the display range. In the
image on the right, the darkest 7% of the pixels were clamped, as
well as the lightest 2% of the pixels. This has resulted in an image that has reduced visible detail in the steps of the amphitheater,
as well as in the wall and the bushes in the background. However,
the overall appearance of the image has improved, and the clamped
image conveys the atmosphere of the environment better than the
directly tone-mapped image.
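Percentile-based clamping with subsequent re-normalization can be sketched as follows (a minimal version, assuming a luminance image already mapped to [0, 1]; the function name is ours):

```python
import numpy as np

def clamp_percentiles(Ld, dark_pct=1.0, light_pct=1.0):
    """Clamp the darkest/lightest percentages of pixels, then rescale
    the remaining luminance range back to [0, 1]."""
    lo = np.percentile(Ld, dark_pct)          # e.g. darkest 1% of pixels
    hi = np.percentile(Ld, 100.0 - light_pct) # e.g. lightest 1% of pixels
    out = np.clip(Ld, lo, hi)
    return (out - lo) / (hi - lo)             # spread survivors over full range
```

Because the surviving values are stretched over the full display range, detail visibility improves everywhere except in the clamped extremes.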
When preserving the appearance of the environment is the goal, the disadvantage of losing detail in the lightest and darkest areas is frequently outweighed by a better overall impression of brightness. The photograph shown in Figure 16 was taken during a very bright day, and this is not conveyed well in the unclamped image.
Finally, clamping has a relatively large effect on the results. For typical applications it is an attractive proposition
to add this technique to any tone reproduction operator. However,
as only a few tone reproduction operators incorporate this feature
as standard, it also clouds the ability to assess the quality of tone
reproduction operators. The difference between operators appears
to be of similar magnitude as the effect of clamping.
7 Mappings for HDR Displays
The light emitted by an HDR display is, by definition, spread over
a much larger range than we see on most current display devices.
As a result, many physical scenes could be captured in HDR and then displayed directly, removing the need for tone reproduction for some images. However, this is not the case
for all environments. If the image has a range much higher than
Figure 16: Example of clamping. Both images were tone-mapped
using photographic tone reproduction. The left image is not
clamped, whereas 7% of the darkest pixels and 2% of the lightest
pixels were clamped in the right image.
the HDR display can handle, then non-linear compression schemes
will continue to be a necessary pre-display step.
As opposed to conventional displays, HDR display devices emit enough light to differ significantly from average room lighting conditions. An important, yet poorly understood, issue is
that the human visual system (HVS) will adapt in part to the device and in part to the room environment. Such partial (or mixed)
adaptation is notoriously difficult to model, and is certainly not a
feature of current tone reproduction algorithms. A good tone reproduction algorithm for HDR display devices would probably have to
account for the partial adaptation of the viewer. This would include
all forms of adaptation, including, for instance, chromatic adaptation.
However, other than this fundamental issue, the distinction between HDR and LDR displays is arbitrary. As such, we would
argue that a good tone reproduction algorithm needs to be adaptable to any kind of display. Similarly, in the context of dynamic
range management we see little difference between tone reproduction operators and inverse tone reproduction operators, even though
the former is used for reducing the dynamic range of an image to
match a display with a lower range, and the latter is used for expanding the range of an image to match the dynamic range of a
display with a higher dynamic range. A good tone reproduction
operator would be able to both compress the dynamic range of an
image, as well as expand it.
Nonetheless, there are other issues related to dynamic range expansion, which will have to be taken into account. These do not
relate to the expansion of values per se, but are related to artifacts
in the source material that may be amplified to become more visible. First, by expanding the luminance range of an image,
the lossy compression applied to JPEG images may become visible. Second, the non-linear encoding of pixel values may, after
expansion, lead to visible quantization artifacts. Finally, under- and
over-exposed areas may require separate processing to salvage a
lack of data in these regions [Wang et al. 2007].
Some solutions for these problems have been proposed. For instance, Banterle et al. invert the photographic operator for dynamic range expansion [Banterle et al. 2006]. As this effectively results in an inverse sigmoid, it makes the implicit but erroneous assumption that the input image is given in units which correspond to either photoreceptor output or some perceptual quantity. Blocky artifacts, for instance those arising from JPEG encoding, are avoided by determining which pixels belong to light sources in the input image, and applying a different interpolation scheme for those pixels.
Rempel et al. have found that linear up-scaling can typically be performed up to a contrast of 5000 : 1 before the image takes on an unnatural appearance [Rempel et al. 2007]. Their solution is therefore to rely predominantly on linear up-scaling, which is consistent with the finding that, at least for relatively short exposure times, humans prefer to view linearly up-scaled images over non-linearly scaled images [Akyüz et al. 2007]. The appearance of artifacts is minimized by application of noise filtering and quantization
reduction, through the use of a bilateral filter. Pixel encodings of
235 or higher in video formats are normally assumed to indicate
light sources or highlights. Those pixels can be enhanced separately. Alternatively, highlights could be detected with a dedicated
algorithm, before being scaled separately from the remainder of the
image [Meylan et al. 2007].
In summary, the main problems associated with displaying conventional images on high dynamic range display devices revolve
around avoiding the visibility of artifacts in the capture and encoding of legacy content. So far, simple functions have proved adequate for dynamic range expansion, although we would not draw
the conclusion that this would be the case for all images and all
display conditions.
8 Tone Reproduction and Inverse Tone Reproduction
Many tone reproduction operators are modeled after some aspects
of human vision. The computed display values therefore essentially
represent perceived quantities, for instance brightness if the tone reproduction operator is based on a model of brightness perception.
If we assume that the model is an accurate representation of some
aspect of the HVS, then displaying the image and observing it will
cause the HVS to interpret these perceived values as luminance values.
The HVS thus applies a second perceptual transform on top of
the one applied by the algorithm. This is formally incorrect. A
good tone reproduction operator should follow the same common
practice as employed in color appearance modeling, and apply both
a forward and a reverse transform [Fairchild 1998]. The forward
transform can be any algorithm thought to be effective at compressing luminance values. The reverse transform will then apply the
algorithm in reverse, but with display parameters inserted. This approach compresses luminance values into perceived values, while
the reverse algorithm will convert the perceived values back into
luminance values.
Tumblin and Rushmeier’s approach correctly takes this approach, as does the Multi-Scale Observer model [Pattanaik et al.
1998], all color appearance models and gradient-domain operators, and the aforementioned sub-band system. However, several
perceptually-based operators are applied only in forward mode, including the photographic operator7 and the sigmoidal operator inspired by photo-receptor physiology [Reinhard and Devlin 2005].
While these operators are known to produce visually plausible results, we note that they are effectively not producing display luminances, but brightnesses or other equivalent perceptual attributes.
7 Although this operator can be explained as based on photographic principles, its underlying model is a perceptual model of brightness perception [Blommaert and Martens 1990].
8.1 Sigmoids Revisited
Here we discuss the implications of adding a reverse step to sigmoidal compression. Recall that equation (3b) can be rewritten as:
V(x, y) = Lv^n(x, y) / (Lv^n(x, y) + g^n(x, y))    (18)
where V is a perceived value (for instance a voltage, if this equation
is thought of as a simple model of photoreceptor physiology). The
function g continues to return either a globally or locally computed
adaptation value, which is based on the image values.
To convert these perceived values back to luminance values, this
equation then needs to be inverted, whereby g is replaced with a display adaptation value (see also Section 4.1). For instance, we could
try to replace g with the mean display luminance Ld,mean . The other
user parameter in this model is the exponent n, which for the reverse model we will replace with a display related exponent m. By
making these substitutions, we have replaced all the image-related
user parameters (n and g) with their display-related equivalents (m
and Ld,mean ). The resulting inverse equation, computing display
values Ld from previously computed perceived values V is then:
Ld(x, y) = ( V(x, y) Ld,mean^m / (1 − V(x, y)) )^(1/m)    (19)
Figure 17: Forward and reverse model with n = 0.7 and k = 0.3 for an HDR image with a relatively modest dynamic range of 2.8 log units (left), and an image with a much higher dynamic range (right; n = 0.7, k = 0.08, 8.3 log units dynamic range).
Figure 18: Forward model only with n = 0.7 and k = 0.3 (left)
and n = 0.7, k = 0.08 (right).
For a conventional display, we would set Ld,mean to 128. The exponent m is also a display related parameter and determines how display values are spread around the mean display luminance. For low
dynamic range display devices, this value can be set to 1, thereby
simplifying the above equation to:
Ld(x, y) = V(x, y) Ld,mean / (1 − V(x, y))    (20)
The computation of display values is now driven entirely by the
mean luminance of the image (through the computation of g), the
mean display luminance Ld,mean , as well as the exponent n which
specifies how large a range of values around the mean image luminance will be visualized. As a result, the inverse transform may
create display values that are outside the display range. These will
have to be clamped.
As for most display devices the peak luminance Ld,max as well
as the black level Ld,min are known, it is attractive to use these
natural boundaries to clamp the display values against. This is arguably a more natural choice than clamping against a percentile, as
discussed in Section 6.2.
Given that we will gamma correct the image afterwards, we may
assume that the display range of Ld is linear. As such, we can now
compute the mean display luminance as the average of the display’s
black level and peak luminance:
Ld,mean = (Ld,min + Ld,max) / 2    (21)
As there is no good theoretical ground for choosing any specific
value for the exponent m, we will set this parameter to 1 for now.
As a result, all display related parameters are fixed, leaving only
the image-dependent parameter n as well as the choice of semi-saturation constant g(). For our example, we will follow Reinhard et al. and set g() = L̄v/k, where the user parameter k determines
how overall light or dark the image should be reproduced (see Section 4.3) [Reinhard et al. 2002]. The exponent n can be thought of
as a measure of how much contrast there is in the image.
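The complete forward/reverse pipeline of (18) and (19), with g() = L̄v/k and clamping against the display's black level and peak luminance, can be sketched as follows (the function name and default values are illustrative, not prescribed by the text):

```python
import numpy as np

def tonemap_forward_reverse(Lv, n=0.7, k=0.3, m=1.0,
                            Ld_min=1.0, Ld_max=300.0):
    """Forward sigmoid (18) with g = mean(Lv)/k, then the reverse
    transform (19) using display parameters, clamped to the display range."""
    g = Lv.mean() / k                     # image-side semi-saturation constant
    V = Lv ** n / (Lv ** n + g ** n)      # forward: perceived values (18)
    Ld_mean = 0.5 * (Ld_min + Ld_max)     # display adaptation value (21)
    Ld = (V * Ld_mean ** m / (1.0 - V)) ** (1.0 / m)   # reverse (19)
    return np.clip(Ld, Ld_min, Ld_max)    # clamp to black level / peak
```

Values that fall outside [Ld_min, Ld_max] after the reverse transform are clamped against the display's natural boundaries, as discussed above.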
Two results using visually determined optimal parameter settings
are shown in Figure 17. The display settings are those for an average display with an assumed black level of 1 cd/m2 and a peak luminance of 300 cd/m2. As a consequence, Ld,mean was set to 150.5.
Figure 19: Photographic operator with key k = 0.3 (left) and
k = 0.08 (right). By using the photographic operator, we have
effectively changed the exponent to n = 1.
The left image is reproduced in a satisfactory manner. However, the amount of burn-out that has occurred in the window of the right image is too large, and it is difficult to find a good trade-off between having less burn-out and introducing other artifacts with this method.
For comparison, we show the forward-only transform with otherwise identical parameter settings in Figure 18. Note that the left image now looks flatter, largely because the exponent n is no longer optimal. The window in the right image now appears
more correct, as the brown glass panels are now clearly visible. We
also show the output of the photographic operator for the same pair
of images in Figure 19. The exponent n is effectively set to 1, but
the key value is the same as in the previous figures. Although these
images are computed with a forward transform only, their visual
appearance remains closer to the real environment than the images
in Figure 17.
Finally, it is desirable that a tone reproduction operator does not
alter an image that is already within the display range [DiCarlo
and Wandell 2000]. In the model proposed here this is implicitly
achieved, as for n = m and g = Ld,mean , the reverse transform is
the true inverse of the forward transform. This is borne out in the
CIE94 color difference metric, which is uniformly 0 for all pixels
after running the algorithm twice.
8.2 Combined Forward/Reverse Sigmoids
Although applying both a forward and a reverse transform is formally the correct approach to tone reproduction, there is thus a
problem for images with a very high dynamic range. For such images it is difficult, if not impossible, to find parameter settings that
lead to an acceptable compression.
To see why this is, we can plug the forward transform of (3b)
into the inverse (19):
Ld = ( [L^n / (L^n + (L̄/k)^n)] Ld,mean^m / [1 − L^n / (L^n + (L̄/k)^n)] )^(1/m)    (22)

   = L^(n/m) Ld,mean / (L̄/k)^(n/m)    (23)

   = c L^(n/m)    (24)

where

c = Ld,mean / (L̄/k)^(n/m)    (25)
is a constant. Of course, this is essentially the same result as was
obtained by matching image and display brightnesses in Tumblin
and Rushmeier’s brightness matching operator (see Section 4.1).
Hence, applying a sigmoid in forward and reverse mode amounts
to applying a power function.
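The collapse of (22) into the power function (24) is easy to verify numerically (the variable names below are ours; m is deliberately set to a value other than 1 to exercise the general case):

```python
import numpy as np

L = np.array([0.1, 1.0, 10.0, 100.0])   # example scene luminances
k, n, m, Ld_mean = 0.3, 0.7, 1.4, 150.5
g = L.mean() / k                         # semi-saturation g() = mean(L)/k

V = L ** n / (L ** n + g ** n)                      # forward sigmoid (18)
Ld = (V * Ld_mean ** m / (1.0 - V)) ** (1.0 / m)    # reverse transform (19)

c = Ld_mean / (g ** (n / m))                        # the constant of (25)
assert np.allclose(Ld, c * L ** (n / m))            # reduces to power law (24)
```

Since V / (1 − V) equals L^n / g^n, the sigmoid cancels exactly, leaving the power function, whatever values of n, m, and k are chosen.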
In our experience, this approach works very well in cases where
a medium amount of compression is required. For instance, a
medium dynamic range image can be effectively tone-mapped for
display on a low dynamic range display device. Alternatively, it
should be possible to tone-map most high dynamic range images
for display on high dynamic range display devices using this technique. However, for high compression ratios, a different approach
would be required.
A direct consequence is that we predict that color appearance
models such as CIECAM02 cannot be extended to transform data
over large ranges. It is well known that CIECAM02 was never intended for transforming between significantly different display conditions. To date, this has been attributed to the fact that the psychophysical data on which this model is based was gathered over a
limited dynamic range. The above findings suggest that in addition,
extension of CIECAM02 to accommodate large compression ratios
would require a different functional form.
Whether the inclusion of spatial processing, such as a spatially
varying semi-saturation constant, yields more satisfactory results
remains to be seen. As can be understood from (23), replacing
g() = L̄v/k with a spatially varying function means that each pixel
is divided by a spatially determined denominator. Such an approach
was pioneered in Chiu et al’s early work [Chiu et al. 1993], and
has been shown to be prone to haloing artifacts (see Section 3 and
Figure 4). To minimize the occurrence of halos in such a scheme,
the size of the averaging kernel used to compute g() must be chosen
to be very large; typically a substantial fraction of the whole image.
But in the limit that the filter kernel becomes the whole image, this
means that each pixel is divided by the same value, resulting in a
spatially invariant operator.
9 Discussion
Tone reproduction for low dynamic range display devices is nowadays a reasonably well-understood problem. The majority of images can be compressed well enough for applications in photography and entertainment, and any other application that does not
critically depend on accuracy. Recent validation studies show that
some algorithms perform well over a range of different tasks and
displayed material [Drago et al. 2002; Kuang et al. 2004; Ledda
et al. 2005; Yoshida et al. 2005; Yoshida et al. 2006; Ashikhmin
and Goral 2007].
When dealing with different displays, each having their own dynamic range, it becomes more important to consider tone reproduction operators that can be parameterized for both different types of
images and different types of display. Following common practice
in color appearance modeling, we have argued that both a forward
and a reverse transform are necessary.
However, we have identified a disconnect between theory and
practice. In particular, a reverse operation should follow a forward
transform to convert perceived values back to values amenable for
interpretation as luminances. If we apply this to the class of sigmoidal functions, of which color appearance models form a part,
then we effectively reduce the compressive function to a form of
gamma correction. It is more difficult to produce visually plausible
results this way, as less control can be exercised over the trade-off
between burn-out, contrast, and visual appearance.
To solve this problem, we would either have to find an alternative
reasoning whereby the inverse model does not have to be applied,
or instead develop new tone reproduction operators which both include a forward and reverse model and produce visually plausible
and controllable results. The backward step would have the additional benefit of being adaptable to any type of display, including
high dynamic range display devices. We anticipate that this implies
that such operators would obviate the need for dedicated inverse
tone reproduction operators, although image processing to counter
the effects of quantization, spatial compression, as well as possible
exposure artifacts will remain necessary.
References
AKYÜZ, A. O., REINHARD, E., FLEMING, R., RIECKE, B., AND BÜLTHOFF, H. 2007. Do HDR displays support LDR content? A psychophysical evaluation. ACM Transactions on Graphics 26, 3.

ASHIKHMIN, M. 2002. A tone mapping algorithm for high contrast images. In Proceedings of the 13th Eurographics Workshop on Rendering, 145–155.

ASHIKHMIN, M., AND GORAL, J. 2007. A reality check for tone mapping operators. ACM Transactions on Applied Perception 4, 1.

BANTERLE, F., LEDDA, P., DEBATTISTA, K., AND CHALMERS, A. 2006. Inverse tone mapping. In GRAPHITE '06: Proceedings of the 4th International Conference on Computer Graphics and Interactive Techniques in Australasia and Southeast Asia, 349–356.

BLOMMAERT, F. J. J., AND MARTENS, J.-B. 1990. An object-oriented model for brightness perception. Spatial Vision 5, 1, 15–41.

CHIU, K., HERF, M., SHIRLEY, P., SWAMY, S., WANG, C., AND ZIMMERMAN, K. 1993. Spatially nonuniform scaling functions for high contrast images. In Proceedings of Graphics Interface '93, 245–253.

CHOUDHURY, P., AND TUMBLIN, J. 2003. The trilateral filter for high contrast images and meshes. In Proceedings of the Eurographics Symposium on Rendering, 186–196.

COMANICIU, D., AND MEER, P. 2002. Mean shift: a robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 24, 5, 603–619.

DICARLO, J. M., AND WANDELL, B. A. 2000. Rendering high dynamic range images. In Proceedings of the SPIE Electronic Imaging 2000 Conference, vol. 3965, 392–401.

DRAGO, F., MARTENS, W. L., MYSZKOWSKI, K., AND SEIDEL, H.-P. 2002. Perceptual evaluation of tone mapping operators with regard to similarity and preference. Tech. Rep. MPI-I-2002-4-002, Max Planck Institut für Informatik.

DRAGO, F., MYSZKOWSKI, K., ANNEN, T., AND CHIBA, N. 2003. Adaptive logarithmic mapping for displaying high contrast scenes. Computer Graphics Forum 22, 3.

DUAN, J., QIU, G., AND CHEN, M. 2005. Comprehensive fast tone mapping for high dynamic range image visualization. In Proceedings of Pacific Graphics.

DURAND, F., AND DORSEY, J. 2002. Fast bilateral filtering for the display of high-dynamic-range images. ACM Transactions on Graphics 21, 3, 257–266.

FAIRCHILD, M. D. 1998. Color Appearance Models. Addison-Wesley, Reading, MA.

FAIRCHILD, M. D., AND JOHNSON, G. M. 2002. Meet iCAM: an image color appearance model. In IS&T/SID 10th Color Imaging Conference, 33–38.

FAIRCHILD, M. D., AND JOHNSON, G. M. 2004. The iCAM framework for image appearance, image differences, and image quality. Journal of Electronic Imaging.

FATTAL, R., LISCHINSKI, D., AND WERMAN, M. 2002. Gradient domain high dynamic range compression. ACM Transactions on Graphics 21, 3, 249–256.

FERWERDA, J. A. 2001. Elements of early vision for computer graphics. IEEE Computer Graphics and Applications 21, 5, 22–33.

FERWERDA, J. A., PATTANAIK, S., SHIRLEY, P., AND GREENBERG, D. P. 1996. A model of visual adaptation for realistic image synthesis. In SIGGRAPH 96 Conference Proceedings, 249–258.

HOOD, D. C., AND FINKELSTEIN, M. A. 1979. Comparison of changes in sensitivity and sensation: implications for the response-intensity function of the human photopic system. Journal of Experimental Psychology: Human Perception and Performance 5, 3, 391–405.

HOOD, D. C., FINKELSTEIN, M. A., AND BUCKINGHAM, E. 1979. Psychophysical tests of models of the response function. Vision Research 19, 4, 401–406.

JOBSON, D. J., RAHMAN, Z., AND WOODELL, G. A. 1995. Retinex image processing: Improved fidelity to direct visual observation. In Proceedings of the IS&T Fourth Color Imaging Conference: Color Science, Systems, and Applications, vol. 4, 124–125.

KLEINSCHMIDT, J., AND DOWLING, J. E. 1975. Intracellular recordings from gecko photoreceptors during light and dark adaptation. Journal of General Physiology 66, 5, 617–648.

KRAWCZYK, G., MANTIUK, R., MYSZKOWSKI, K., AND SEIDEL, H.-P. 2004. Lightness perception inspired tone mapping. In Proceedings of the 1st ACM Symposium on Applied Perception in Graphics and Visualization, 172.

KRAWCZYK, G., MYSZKOWSKI, K., AND SEIDEL, H.-P. 2005. Lightness perception in tone reproduction for high dynamic range images. Computer Graphics Forum 24, 3, 635–645.

KRAWCZYK, G., MYSZKOWSKI, K., AND SEIDEL, H.-P. 2006. Computational model of lightness perception in high dynamic range imaging. In Proceedings of IS&T/SPIE Human Vision and Electronic Imaging, vol. 6057.

PATTANAIK, S. N., AND YEE, H. 2002. Adaptive gain control for high dynamic range image display. In Proceedings of the Spring Conference in Computer Graphics (SCCG 2002), 24–27.

PATTANAIK, S. N., FERWERDA, J. A., FAIRCHILD, M. D., AND GREENBERG, D. P. 1998. A multiscale model of adaptation and spatial vision for realistic image display. In SIGGRAPH 98 Conference Proceedings, 287–298.

PRESS, W. H., TEUKOLSKY, S. A., VETTERLING, W. T., AND FLANNERY, B. P. 1992. Numerical Recipes in C: The Art of Scientific Computing, 2nd ed. Cambridge University Press.

RAHMAN, Z., JOBSON, D. J., AND WOODELL, G. A. 1996. A multiscale retinex for color rendition and dynamic range compression. In SPIE Proceedings: Applications of Digital Image Processing XIX, vol. 2847.

RAHMAN, Z., WOODELL, G. A., AND JOBSON, D. J. 1997. A comparison of the multiscale retinex with other image enhancement techniques. In IS&T's 50th Annual Conference: A Celebration of All Imaging, vol. 50, 426–431.

REINHARD, E. 2003. Parameter estimation for photographic tone reproduction. Journal of Graphics Tools 7, 1, 45–51.

REINHARD, E. 2007. Overview of dynamic range reduction. In The International Symposium of the Society for Information Display (SID 2007).

REINHARD, E., AND DEVLIN, K. 2005. Dynamic range reduction inspired by photoreceptor physiology. IEEE Transactions on Visualization and Computer Graphics 11, 1, 13–24.

REINHARD, E., STARK, M., SHIRLEY, P., AND FERWERDA, J. 2002. Photographic tone reproduction for digital images. ACM Transactions on Graphics 21, 3, 267–276.

REINHARD, E., WARD, G., PATTANAIK, S., AND DEBEVEC, P. 2005. High Dynamic Range Imaging: Acquisition, Display and Image-Based Lighting. Morgan Kaufmann Publishers, San Francisco.

REMPEL, A., TRENTACOSTE, M., SEETZEN, H., YOUNG, D., HEIDRICH, W., WHITEHEAD, L., AND WARD, G. 2007. On-the-fly reverse tone mapping of legacy video and photographs. ACM Transactions on Graphics (Proceedings of SIGGRAPH 2007) 26, 3.

SCHLICK, C. 1994. Quantization techniques for the visualization of high dynamic range pictures. In Photorealistic Rendering Techniques, Springer-Verlag, Berlin, P. Shirley, G. Sakas, and S. Müller, Eds., 7–20.

SEETZEN, H., WHITEHEAD, L. A., AND WARD, G. 2003. A high dynamic range display using low and high resolution modulators. In The Society for Information Display International Symposium.
S EETZEN , H., H EIDRICH , W., S TUER ZLINGER , W., WAR D , G., W HITEHEAD , L.,
T R ENTACOSTE , M., G HOSH , A., AND V OROZCOVS , A. 2004. High dynamic
range display systems. ACM Transactions on Graphics 23, 3, 760–768.
K UANG , J., YAM AGUCHI , H., J OHNSON , G. M., AND FAIR CHILD , M. D. 2004.
Testing HDR image rendering algorithms. In Proceedings of IS&T/SID 12th Color
Imaging Conference, 315–320.
S PILLMANN , L., AND W ER NER , J. S., Eds. 1990. Visual perception: the neurological
foundations. Academic Press, San Diego.
L EDDA , P., S ANTOS , L.-P., AND C HALMERS , A. 2004. A local model of eye adaptation for high dynamic range images. In Proceedings of ACM Afrigraph, 151–160.
L EDDA , P., C HALMERS , A., T ROSCIANKO , T., AND S EETZEN , H. 2005. Evaluation
of tone mapping operators using a high dynamic range display. ACM Transactions
on Graphics 24, 3, 640–648.
L I , Y., S HAR AN , L., AND A DELSON , E. H. 2005. Compressing and companding high
dynamic range images with subband architectures. ACM Transactions on Graphics
24, 3, 836–844.
M EYLAN , L., D ALY, S., AND S ÜSSTRUNK , S. 2007. Tone mapping for high dynamic
range displays. In Proceedings of IS&T/SPIE Electronic Imaging: Human Vision
and Electronic Imaging XII, vol. 6492.
M ILLER , N. J., N GAI , P. Y., AND M ILLER , D. D. 1984. The application of computer
graphics in lighting design. Journal of the IES 14, 6–26.
T OM ASI , C., AND M ANDUCHI , R. 1998. Bilateral filtering for gray and color images.
In Proceedings of the IEEE International Conference on Computer Vision, 836–
846.
T UM BLIN , J., AND RUSHMEIER , H. 1993. Tone reproduction for computer generated
images. IEEE Computer Graphics and Applications 13, 6, 42–48.
T UM BLIN , J., AND T UR K , G. 1999. LCIS: A boundary hierarchy for detailpreserving contrast reduction. In SIGGRAPH 1999, Computer Graphics Proceedings, A. Rockwood, Ed., 83–90.
WANG , L., W EI , L.-Y., Z HOU , K., G UO , B., AND S HUM , H.-Y. 2007. High dynamic range image hallucination. In Eurographics Symposium on Rendering.
WAR D , G., RUSHMEIER , H., AND P IATKO , C. 1997. A visibility matching tone
reproduction operator for high dynamic range scenes. IEEE Transactions on Visualization and Computer Graphics 3, 4.
WAR D , G. 1994. A contrast-based scalefactor for luminance display. In Graphics
Gems IV, P. Heckbert, Ed. Academic Press, Boston, 415–421.
M ORONEY, N., AND TASTL, I. 2004. A comparison of retinex and iCAM for scene
rendering. Journal of Electronic Imaging 13, 1.
W EISS , B. 2006. Fast median and bilateral filtering. ACM Transactions on Graphics
25, 3.
N AKA , K. I., AND RUSHTON , W. A. H. 1966. S-potentials from luminosity units in
the retina of fish (cyprinidae). Journal of Physiology (London) 185, 3, 587–599.
Y OSHIDA , A., B LANZ , V., M YSZKOWSKI , K., AND S EIDEL, H.-P. 2005. Perceptual
evaluation of tone mapping operators with real-world scenes. In Proceedings of
SPIE Human Vision and Electronic Imaging X, vol. 5666, 192–203.
O PPENHEIM , A. V., S C HAFER , R., AND S TOC KHAM , T. 1968. Nonlinear filtering
of multiplied and convolved signals. Proceedings of the IEEE 56, 8, 1264–1291.
PAR IS , S., AND D UR AND , F. 2006. A fast approximation of the bilateral filter using
a signal processing approach. In European Conference on Computer Vision.
Y OSHIDA , A., M ANTIUK , R., M YSZKOWSKI , K., AND S EIDEL , H.-P. 2006. Analysis of reproducing real-world appearance on displays of varying dynamic range.
Computer Graphics Forum 25, 3, XXX.