JULY/AUGUST 1998
Volume 42 • Number 4
The Journal of
IMAGING SCIENCE
and
TECHNOLOGY
IS&T
The Society for Imaging Science and Technology
EDITORIAL STAFF
M.R.V. Sahyun, Editor
IS&T
7003 Kilworth Lane, Springfield, VA 22151
715-836-4175
E-mail: [email protected]
FAX: 703-642-9094
Pamela Forness, Managing Editor
The Society for Imaging Science and Technology
7003 Kilworth Lane, Springfield, VA 22151
703-642-9090; FAX: 703-642-9094
E-mail: [email protected]
Vivian Walworth, Editor Emeritus
Martin Idelson, Associate Editor
Eric Hanson, Associate Editor
David R. Whitcomb, Associate Editor
Michael M. Shahin, Associate Editor
Mark Spitler, Associate Editor
David S. Weiss, Associate Editor
This publication is available in microform.
Papers published in this journal are covered in BECITM,
INSPEC, Chemical Abstracts, and Science Citation
Index.
Address remittances, orders for subscriptions and single
copies, claims for missing numbers, and notices of
change of address to IS&T at 7003 Kilworth Lane,
Springfield, VA 22151. 703-642-9090; FAX: 703-642-9094; E-mail: [email protected].
The Society is not responsible for the accuracy of
statements made by authors and does not necessarily
subscribe to their views.
Copyright © 1998, The Society for Imaging Science
and Technology. Copying of materials in this journal
for internal or personal use, or the internal or personal
use of specific clients, beyond the fair use provisions
granted by the U.S. Copyright Law is authorized by
IS&T subject to payment of copying fees. The Transactional Reporting Service base fee for this journal
should be paid directly to the Copyright Clearance
Center (CCC), Customer Service, (508) 750-8400, 222
Rosewood Drive, Danvers, MA 01923 or check CCC
Online at http://www.copyright.com. Other copying for
republication, resale, advertising or promotion, or any
form of systematic or multiple reproduction of any
material in this journal is prohibited except with permission of the publisher.
Library of Congress Catalog Card No. 59-52172
Printed in the U.S.A.
Guide for Authors
Scope. JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY is dedicated to the advancement of
knowledge in the imaging sciences, in practical applications of such knowledge, and in
related fields of study. The pages of this journal are open to reports of new theoretical
or experimental results and to comprehensive reviews.
Submission of Manuscripts. Send manuscripts to Pamela Forness, Managing Editor,
IS&T, 7003 Kilworth Lane, Springfield, VA 22151. Submit only original manuscripts
not previously published and not currently submitted for publication elsewhere. Prior
publication does not refer to conference abstracts, paper summaries or non-reviewed
proceedings.
Editorial Process. All manuscripts submitted are subject to peer review. If a manuscript
appears better suited to publication in the JOURNAL OF ELECTRONIC IMAGING, published jointly
by IS&T and SPIE, the editor will make that recommendation to both the editor of that
journal and the author. The author will receive confirmation, reviewers' reports, notification of acceptance (or rejection), and tentative date of publication from the Editor.
The author will receive page proofs and further instructions directly from IS&T. All
subsequent correspondence about the paper should be addressed to the Managing Editor.
Manuscript Preparation. Manuscripts should be typed or printed double-spaced, with
all pages numbered. The original manuscript and three duplicates are required, with a
set of illustrations to accompany each copy. The illustrations included with the original
manuscript must be of quality suitable for reproduction or available in digital form. Legible
copies of illustrations may be submitted with the duplicate manuscripts.
Title and Abstract Pages. Include on the title page, page one, the address and affiliation
of each author. Include on page two an abstract of no more than 200 words stating briefly
the objectives, methods, results, and conclusions.
Style. The journal will generally follow the style specified in the AIP Style Manual,
published by the American Institute of Physics.
Equations. Number equations consecutively, with Arabic numbers in parentheses at the
right margin.
Illustrations. Number all figures consecutively and type captions double-spaced on a
separate page or pages. Figures should be presented in such form that they will remain
legible when reduced, usually to single column width (3.3 in., 8.4 cm). Recommended
font for figure labels is Helvetica, sized to appear as 8-point type after reduction.
Recommended size for original art is 1-2 times final size. Lines must be at least 1 point.
Color. Authors may either submit color separations or be billed by the publisher for the
cost of their preparation. Digital submission of color figures should be in CMYK EPS
or TIFF format if possible. Additional costs associated with reproduction of color illustrations will be charged to the author or the author’s supporting institution.
References. Number references sequentially as the citations appear in superscript form in
the text. Type references on pages separate from the text pages, using the following format:
Journal papers: Author(s) (first and middle initials, last name), title of article (optional), journal name (in italics),
volume (bold), first page number, year (in parentheses).
Books: Author(s) (first and middle initials, last name), title (in italics), publisher, city, year, page reference.
Examples:
1. G. K. Starkweather, Printing technologies for images, gray scale, and color, Proc. SPIE 1458, 120 (1991).
2. E. M. Williams, The Physics and Technology of Xerographic Processes, John Wiley and Sons, New York, 1984, p. 30.

JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY (ISSN: 1062-3701) is published bimonthly by The Society for Imaging Science and Technology, 7003 Kilworth Lane, Springfield, VA 22151. Periodicals postage paid at Springfield, VA and at additional mailing offices. Printed at Imperial Printing Company, Saint Joseph, Michigan.
Society members may receive this journal as part of their membership. Thirty dollars ($30.00) of the membership dues is allocated to this subscription. IS&T members may refuse this subscription by written request. Subscriptions to domestic non-members of the Society, $120.00 per year; single copies, $25.00 each. Foreign subscription rate, US $135.00 per year.
POSTMASTER: Send address changes to JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY, 7003 Kilworth Lane, Springfield, VA 22151.
Page charges. To help support the expenses of publication the author’s institution will
be billed at $80 per printed page. Payment is expected from the sponsoring institution,
not from the author. Such payment is not a condition for publication, and in appropriate
circumstances page charges are waived. Requests for waivers must be made in writing
to the managing editor.
Additional charges. The author will be held responsible for the cost of extensive alterations
after typesetting and for the costs of materials if the paper is withdrawn after typesetting.
Manuscripts in Digital Form. Following acceptance of a manuscript, the author will
be encouraged to submit the final version of the text and illustrations both as hardcopy
and as digital files on diskette. The managing editor will provide specifications for preparing
and submitting digital files.
Calendar
IS&T Meetings
September 7–11, 1998—International Congress on Imaging Science; Univ. of Antwerp, Belgium.
Organized by: Int’l. Committee on
the Science of Photography (ICSP)
and Royal Flemish Chemical Society
(KVCV); IS&T cooperating society.
Contact: Jan De Roeck, c/o Agfa-Gevaert N.V. +32-3444 30 42; Fax: +32-3444 76 97; Web: http://www.ICPS98.be; or E-mail: [email protected]
October 18–23, 1998—NIP14: The
14th International Conference on
Digital Printing Technologies;
General Chair: David Dreyfuss,
Westin Harbour Square - Toronto;
Toronto, Ontario, Canada
November 17–20, 1998—IS&T/
SID’s 6th Color Imaging Conference–Color Science, Systems &
Applications, General Co-chairs:
Sabine Susstrunk (IS&T) and
Andras Lakatos (SID); The SunBurst
Resort Hotel, Scottsdale, Arizona
January 23–29, 1999—IS&T/SPIE
Electronic Imaging: Science and
Technology, General Co-chairs: Jan
P. Allebach (IS&T) and Richard N.
Ellson (SPIE); San Jose Convention
Center, San Jose, CA
April 25–28, 1999—The PIC Conference (IS&T’s 52nd Annual
Spring Conference), General
Chair: Shin Ohno; Hyatt Regency
Hotel, Savannah, Georgia
October 17–22, 1999—NIP15: The
15th International Congress on
Digital Printing Technologies,
The Caribe Royal Resort Suites,
Lake Buena Vista, Florida
November 16–19, 1999—7th Color
Imaging Conference - Color Science, Systems & Applications, cosponsored by the Society for
Information Display; The SunBurst
Resort Hotel, Scottsdale, Arizona
For more details,
contact IS&T at
703-642-9090;
FAX: 703-642-9094;
E-mail: [email protected];
or visit us at
http://www.imaging.org
Journal of Imaging Science and Technology
CODEN: JIMTEG 42(4) 295–380 (1998)
ISSN: 1062-3701
July/August 1998
Volume 42, Number 4
Journal of
IMAGING SCIENCE
and
TECHNOLOGY
Official publication of IS&T—The Society for Imaging Science and Technology
Contents
Special Section: 3-D Imaging
vi
From the Guest Editor
Vivian K. Walworth
295
Photography in the Service of Stereoscopy
Samuel Kitrosser
300
Advancements in 3-D Stereoscopic Display Technologies: Micropolarizers, Improved LC Shutters, Spectral
Multiplexing, and CLC Inks
Leonard Cardillo, David Swift, and John Merritt
307
Full-color 3-D Prints and Transparencies
J. J. Scarpetti, P. M. DuBois, R. M. Friedhoff, and V. K. Walworth
311
Stereo Matching by using a Weighted Minimum Description of Length Method Based on the Summation of
Squared Differences Method
Nobuhito Matsushiro and Kazuyo Kurabayashi
319
Diffuse Illumination as a Default Assumption for Shape-From-Shading in the Absence of Shadows
Christopher W. Tyler
325
3-D Shape Recovery from Color Information for a Non-Lambertian Surface
Wen Biao Jiang, Hai Yuan Wu and Tadayoshi Shioyama
General Papers
331
Optical Effects of Ink Spread and Penetration on Halftones Printed by Thermal Ink Jet
J. S. Arney and Michael L. Alber
Contents continued
335
Modeling the Yule–Nielsen Effect on Color Halftones
J. S. Arney, Tuo Wu and Christine Blehm
341
Optical Dot Gain: Lateral Scattering Probabilities
Geoffrey L. Rogers
346
Diffuse Transmittance Spectroscopy Study of Reduction-sensitized and Hydrogen-hypersensitized AgBr Emulsion
Coatings
Yoshiaki Oku and Mitsuo Kawasaki
349
Silver Clusters of Photographic Interest III. Formation of Reduction-Sensitization Centers in Emulsion Layers on
Storage and Mechanism for Stabilization by TAI
Tadaaki Tani, Naotsugu Muro and Atsushi Matsunaga
355
A New Crystal Nucleation Theory for Continuous Precipitation of Silver Halides
Ingo H. Leubner
364
Carrier Transport Properties in Polysilanes with Various Molecular Weights
Tomomi Nakamura, Kunio Oka, Fuminobu Hori, Ryuichiro Oshima, Hiroyoshi Naito, and Takaaki Dohmaru
370
Edge Estimation and Restoration of Gaussian Degraded Images
Ziya Telatar and Önder Tüzünalp
DEPARTMENTS
ii
Calendar
375
Business Directory
376
IS&T Honors and Awards—Call for Nominations
From the Guest Editor
This Special Section is devoted to topics that were addressed in the 3-D Session of the IS&T Annual Conference held in May of last year (1997), as well as in the
Imaging Technologies part of the Celebration of Imaging
held during the same Annual Conference. The Section includes a broad range of topics, from perception of depth in
normal vision of natural scenes and objects to methods for
creating the illusion of depth in the course of recording
and viewing two-dimensional representations of images
in depth. The 3-D Session was co-chaired by John Merritt
and myself. It was a privilege to share this responsibility
with John, who is well known as co-chairman of the annual Conference on Stereoscopic Displays and Applications
held within the Electronic Imaging Symposium in San Jose
under the cosponsorship of IS&T and SPIE.
This year’s Conference was the ninth of the series. Binocular vision is a significant human trait that enables observers to perceive depth in the natural world. The brain
processes the disparate information received by our two
eyes to produce the perception of depth. We are assisted
in this perception by many external clues, including perspective, motion, illumination, shading, and color information. Stereoscopic, or 3-D, imaging depends largely on
translating the real-world depth of objects or scenes to two-dimensional representations to be viewed with or without
the aid of various viewing devices.
The photographic representation of depth is as old as photography itself, and perception of depth was a subject of
investigation well before the birth of photography. Sir
Charles Wheatstone constructed his first mirror stereoscope
in 1832, and it was he who is credited with coining the word
stereoscope (Greek stereos, solid, skopein, to view).
Both Wheatstone and his contemporary, Sir David
Brewster, devised lenticular stereoscopic viewers. The introduction of the daguerreotype led to a surge of popular
enthusiasm for stereoscopic daguerreotypes. By the late
1800s no Victorian parlor was complete without its
Brewster–Holmes stereoscope and a selection of stereo
cards comprising side-by-side left and right-eye images.
Today such stereoscopes and stereo cards are popular collectors’ items.
The twentieth century has brought us an enormous variety of stereoscopic imaging technologies, from 3-D motion pictures, made practical by Edwin Land’s sheet
polarizers, to books of computer-generated autostereoscopic images. Amateur interest in stereophotography has
been sparked by the introduction of a variety of stereo attachments and twin-lens stereoscopic cameras. Stereoscopic image pairs have been encoded by polarization, by
color, by spatial separation, and by temporal separation.
In addition to the sporadic waves of interest in one or another of these stereoscopic imaging techniques for entertainment purposes, there has been a steady increase in
the applications of stereoscopy to technical and scientific
imaging. Aerial reconnaissance, molecular modeling, and
medical imaging are familiar examples.
With the rapid growth of 3-D capability in workstations
and desktop computers, as well as the development of sophisticated instrumentation, we are seeing a surge of stereoscopic imaging in new fields. Design engineers,
seismographers, microscopists, medical researchers, and
oceanographers are finding 3-D image information indispensable. There are new technologies in both 3-D hardcopy
and 3-D field-sequential LCD displays. Paralleling these
activities is contemporary psychophysical research on just
how the eye and brain cooperate to accomplish depth perception, both in the real world and in viewing two-dimensional representations.
We offer in this Special Section a glimpse of both theoretical and practical aspects of 3-D depth perception, stereoscopic image capture, and stereoscopic image rendition.
Each of the authors has presented new insights into this
ever-intriguing branch of imaging science and technology.
Vivian Walworth
Guest Editor
JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998
Photography in the Service of Stereoscopy*
Samuel Kitrosser†
Consultant, 23 Oakland St., Lexington, MA 02173
Our gift of binocular vision, the principles of stereoscopic viewing methods, and the creation of stereoscopic images are all closely
related. Here we recapitulate the geometry connecting these fields and provide information that can assist in generating effective
stereoscopic image pairs.
Journal of Imaging Science and Technology 42: 295–300 (1998)

Original manuscript received April 1, 1997.
* Presented in part at the IS&T 50th Annual Conference, Cambridge, MA, May 18–23, 1997.
† IS&T Fellow
© 1998, IS&T—The Society for Imaging Science and Technology
Introduction
Ever since its invention, photography has been a major
source of image pairs to be used in a variety of stereoscopic viewing systems. The classical parlor stereograms,
recent 3-D films at Disney theme parks, Imax 3-D presentations, and stereograms by scientists and engineers as
well as by a vast number of amateur and professional
stereographers rely largely on photographically produced
images.
The aim of the present work is to enable one to determine in advance the necessary conditions for creating stereoscopic image pairs having proper parallactic shift for
comfortable viewing and at the same time having correct
framing that does not require custom cropping. Those familiar with stereoscopic photography will recognize that
we refer to the lens interaxial distance and to the interaxial
spacing between the respective image apertures.
General Considerations
Principles of Stereoscopic Viewing Systems. The origins of present-day stereoscopy go back to the work of Sir
Charles Wheatstone1 and Sir David Brewster2 in mid-nineteenth century England. Devices constructed by both of
these inventors share the same basic concept. These devices introduce in front of the observer’s eyes a means of
channeling the view seen by each eye along a separate
path. At the far end of this path are located image pairs
consisting of specially prepared art work or specially prepared photographs. The distinguishing characteristic of
such an image pair is that they represent scenes or objects as seen from two different vantage points. In the process of stereoscopic viewing the two images are fused in
the mind and merged to form a single 3-D illusion.
Stereoscopic viewing devices not only display the images selectively to our eyes but also assist in superimposing the images, using optical means such as lenses, mirrors, and prisms. Some systems physically overlay the
images and rely for selective channeling on other optical
techniques, such as encoding with polarizers, with complementary color filters, or with multiplexing rasters. Some
observers can dispense with viewing devices by “commanding” their eyes to merge a pair of side-by-side images; this
technique is known as free vision. This type of viewing is
very simple; however, its success still requires a correct
pair of stereoscopic images.
The Significance of Convergence and Accommodation in Binocular Vision and Stereoscopic Viewing.
Normally we are accustomed to judging distances by subtle
mental cues received from the muscular eye controls of
convergence and accommodation. These two functions are
coupled, and they operate in synchronism as we scan each
detail of the scene observed [Fig. 1(a)]. However, in stereoscopic viewing the tie between the accommodation and
convergence functions of the observer’s eyes must be uncoupled. The viewing distance to the right- and left-eye
images remains constant, whereas the merged scene details of the 3-D image appear at the intersections of the
individual lines of sight, as shown in Fig. 1(b).
Convergence is also important in the original capture of
the stereoscopic image pair because it determines the framing of the subject matter within the apertures. One of the
practical methods of achieving convergence is to allow the
distance between the image apertures to be greater than
the lens interaxial distance. The geometry and calculation of these distances are discussed in a later section.
Awareness of the Stereoscopic Window. The stereoscopic window is a concept that is represented in practically every stereoscopic viewing system. If we look into an
empty stereoscopic slide viewer we see that the image
apertures appear as the boundaries of a single opening,
the stereoscopic window. A similar effect is observed by
looking at two blank cards through a Brewster–Holmes
parlor stereoscope. Whereas the edges of the window do
not participate in the creation of the 3-D image, they provide an enclosure for the 3-D image. Similarly, the edges
of a stereo print form a border for the 3-D image perceived
inside.
Figure 1. (a) Natural binocular vision, showing convergence at each of the three points A, B, and C; (b) viewing of a stereoscopic image
pair, showing convergence at far (F), near (N), and very near (VN) points.
The stereoscopic window also forms an interface between
the domains of coupling and uncoupling mentioned in the
previous section. Within the domain bounded by the window the accommodation of the eyes remains unchanged,
locked to the distance between the observer’s eyes and the
stereogram. In the case of lenticular stereoscopes the accommodation and focus are considerably relaxed, the focusing and convergence now being assisted by the lenses
of the stereoscope.
What is even more important is that the stereoscopic
window establishes a plane of reference for the illusionary stereo image. The image may appear in front of the
window, in back of the window, or both in front and in
back. The placement depends on the creative imagination
of the photographer in establishing the position of the 3-D
image in relation to the stereoscopic window.
Experimentation
The studies that follow originated with work by Rule,3
who served as a consultant to Polaroid Corporation during the 1940s. Further experimental work was conducted by the author in the Polaroid Research Laboratories.4
The Geometry of Stereo Image Recording
Figure 2 is a perspective drawing of a typical camera
setup, showing two lenses side-by-side, their image apertures, and the far and near points of the subject. To simplify the geometry, the far and near points, the center of
the lens, and the center of the aperture for the left-eye
camera are shown as collinear. Following is the key to the
linear variables indicated:
D = distance from lens to far point of subject or scene
d = distance from lens to near point of subject or scene
f = distance from lens to film plane
p = parallax, the shift between furthest homologous points at the film plane
T1 = lens interaxial distance
T2 = center-to-center distance between image apertures
w = image aperture width.
The parallax, p, should be known to the stereographer
from the specifications of the particular imaging format
and the intended viewing conditions. The focal length of
the lens can be used as the lens-to-film distance except in
the case of close-ups, which require the actual lens-to-film
distance. The values sought are T1 and T2.
Establishment of Parallax, p. Parallax as a linear dimension is established on the basis of limitations imposed
by the uncoupling of the convergence and accommodation
functions of our eyes during stereoscopic viewing. Because
of the tendency of favoring a level orientation during image capture, we often refer to parallax as the “horizontal
displacement” or “horizontal parallax.” We find that the
linear size of the horizontal displacement is proportional
to and limited by the viewing distance to the stereogram.
Conversely, the viewing distance is a function of the size
or width of the stereo image. Here we come to a point where
an exact mathematical model perhaps does not exist. However, a ratio of parallax to image width of 1:24 emerges
from the experience and actual practice of professional
stereographers.
As an example, in the StereoRealist format the image
width is 24 mm and the recommended parallax is 1 to 1.2
mm. For the 6 × 13 cm format the recommended parallax
is approximately 2.5 mm.5a,6 These specifications are stated
as approximations because differences exist among observers and there is great forgiveness in the human visual
system.
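As a quick aid, the 1:24 rule of thumb just described can be expressed in a few lines of code. This is only an illustrative sketch of the ratio quoted above; the function name, the 1.0 to 1.2 scaling band, and the assumed 60-mm image width for the 6 × 13 cm format are not from the original article.

```python
# Illustrative sketch of the ~1:24 parallax-to-image-width rule of thumb.
# The helper name and the 1.0-1.2 scaling band are assumptions, not from the article.

def recommended_parallax(image_width_mm: float) -> tuple[float, float]:
    """Return a (low, high) range of recommended parallax in mm."""
    nominal = image_width_mm / 24.0
    return nominal, 1.2 * nominal

# StereoRealist format, 24-mm image width -> roughly 1.0 to 1.2 mm,
# matching the figures quoted in the text.
print(recommended_parallax(24.0))   # (1.0, 1.2)

# 6 x 13 cm format, assuming a 60-mm image width -> roughly 2.5 mm.
print(recommended_parallax(60.0))   # (2.5, 3.0)
```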
Calculation of T1. For convenience we construct the line "L" in Fig. 2. Then

p/L = f/d and L = pd/f.

Similarly,

T1/D = L/(D − d),

so

T1 = DL/(D − d) = (p/f) × Dd/(D − d).   (1)
Figure 2. A typical stereoscopic camera setup, showing two side-by-side lenses of focal length f, separated by the distance T1, and the
two image apertures. The value T2 is the center-to-center distance between the camera apertures, N is the near point of the scene, and
F the far point. The value D is the distance from the lens plane to the far point, d is the distance from the lens plane to the near point,
and p is parallax.
If D = ∞, T1 = (p/f) × d.   (2)
Calculation of T2. The value of T2 can be determined from the same diagram:

T2/(d + f) = T1/d, so T2 = T1(d + f)/d.   (3)
Note that T2 is only slightly larger than T1. In some cases
the construction of the stereo camera prevents adjusting
the T2 dimension—for example, on side-by-side 70-mm
motion picture cameras—but still allows the adjustment
of the interlens distance, T1. This adjustment is usually
small enough to achieve the desired T1 and sufficient to
provide the needed convergence between the camera units.
An alternative is to use auxiliary optical devices.7a
A few custom cameras built for close-up work provide
laterally adjustable lens mounts that allow for independent adjustment of T2. A patent of Land, Batchelder, and
Wolff describes a camera with fixed interaxial distance
but with coupled convergence and focusing adjustments.8
This feature made it possible to maintain the stereo window at the near distance.
Calculation of D and d. To calculate D and d we rewrite Eq. 1 as follows:

(p/f) × (1/T1) = 1/d − 1/D.   (4)

Then, for a given value of D,

1/d = (1/T1) × (p/f) + 1/D and d = T1fD/(pD + T1f).   (5)

Similarly, for a given value of d,

1/D = 1/d − (p/f) × (1/T1) and D = T1fd/(T1f − pd).   (6)
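To make Eqs. 1 through 6 easy to evaluate, here is a minimal numeric sketch of the relationships derived above. The function names and the sample values are illustrative assumptions, not part of the original article; all lengths simply need to be in the same unit.

```python
import math

# Minimal numeric sketch of Eqs. 1-6; names and sample values are illustrative.

def lens_interaxial_T1(p, f, d, D=math.inf):
    """Eq. 1 (or Eq. 2 when D is infinite): lens interaxial distance T1."""
    if math.isinf(D):
        return (p / f) * d                   # Eq. 2
    return (p / f) * (D * d) / (D - d)       # Eq. 1

def aperture_spacing_T2(T1, d, f):
    """Eq. 3: center-to-center distance between the image apertures."""
    return T1 * (d + f) / d

def near_distance_d(p, f, T1, D):
    """Eq. 5: nearest usable distance for a given far distance D."""
    return (T1 * f * D) / (p * D + T1 * f)

def far_distance_D(p, f, T1, d):
    """Eq. 6: farthest usable distance for a given near distance d."""
    return (T1 * f * d) / (T1 * f - p * d)

# Example with invented numbers (all in mm): p = 1, f = 35,
# near point at 2 m, far point at 10 m.
T1 = lens_interaxial_T1(1.0, 35.0, 2000.0, 10000.0)   # ~71.4 mm
T2 = aperture_spacing_T2(T1, 2000.0, 35.0)            # ~72.7 mm, only slightly larger
print(round(T1, 1), round(T2, 1))
```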
The Design of a Lens Interaxial Calculator. To simplify and speed up the calculation of T1 we designed the
Polaroid Interocular Calculator.9 The reciprocal form of Eq.
1, as shown in Eq. 4, enabled us to design this calculator in
the form of a circular slide rule. An original calculator is
shown in Fig. 3. A convenience of such a calculator is the
rapid evaluation of any of the variables in cases where the
equipment does not allow full control of needed adjustments.
We provided several hundred of these calculators to stereo photographers here and abroad. The calculator has been cited in several publications on stereoscopic photography.7b,10

Figure 3. The Polaroid Interocular Calculator, a circular slide rule for determining appropriate lens interaxial distance, given the intended final width of the image to be viewed, the lens-film distance, the width of the negative image, and the near and far distances. (Reprinted with permission of Polaroid Corporation.)
Before going further in our discussion it is appropriate
to note the compatibility of the above calculations with
traditional rules and recommendations.
The 1/30 Rule. Ferwerda5b and other writers on stereoscopy11 often recommend an interaxial value of 1/30 the
near distance. The origin of this rule lies in the assumptions that the subject extends to infinity and that the focal length of the lens is the conventional 1.25 times the
image width. The rule is valid under these conditions and
it is useful, but it does not apply to medium shots and
close-ups. There it does not make full use of the available
depth effect. Following is the rationale for the 1/30 rule.
If we solve Eq. 2 for the interaxial distance, T1, given f = 1.25w and p = w/24,

T1 = (w/24) × (1/(1.25w)) × d = (1/30) × d.
Sidestep Landscapes and Aerial Photographs. Both Wing12 and Weiler13 provide guidance for producing stereo image pairs with a single handheld camera. In these cases we again assume a camera with a conventional focal-length lens. For the sidestep method described by Wing, it may be useful to assume that a 1-ft sidestep is good for a near distance of 30 ft, 2 ft for 60 ft, and so on.
In the case of cloud pictures from an airliner window, here is some arithmetic. At 800 km/h the plane travels about 220 m/s, so a rapid-sequence stereo pair taken a half-second apart gives T1 = 110 m. Multiplication by 30 gives a near distance of about 3.3 km, a fair distance for pictures of cloud formations. Oblique shots may also be practical: at an altitude of 10,000 m, 1/30 of the altitude is 333 m of travel, equivalent to an interval of about 1.5 s between shots.
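The airliner arithmetic above is easy to reproduce; the helper below is a rough sketch using the same assumptions (ground speed from the quoted 800 km/h, and the 1/30 rule for the near distance). The function name is invented for illustration.

```python
# Rough check of the airliner arithmetic above; names and values are illustrative.

def airborne_baseline(speed_kmh: float, interval_s: float) -> float:
    """Baseline T1 (in metres) swept out between two exposures."""
    return speed_kmh * 1000.0 / 3600.0 * interval_s

t1 = airborne_baseline(800.0, 0.5)     # ~111 m, close to the 110 m quoted
near_distance = 30.0 * t1              # 1/30 rule -> ~3.3 km
print(round(t1), round(near_distance / 1000.0, 1))
```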
Stereoscopic Cameras. Stereoscopic cameras follow the
pattern of our natural binocular vision by recording a pair
of images from two different vantage points. The traditional amateur stereoscopic camera is a side-by-side twolens unit with adjustable focusing. The fixed lens interaxial
distance is smaller than the aperture interaxial distance,
resulting in a convergence that establishes the stereoscopic
window at about 2 m from the camera.
Using a camera with fixed lens interaxial we can still
obtain correct parallax if we balance the subject composition within proper limits of far and near distances, D and
d, as indicated by Eqs. 5 and 6.
A number of stereo photographers use single or double
cameras on a mechanical slide bar or have two cameras
modified by permanent coupling. The greatest progress in
stereo camera construction has been made in the field of
professional motion picture equipment. Here we find single
track, side-by-side format on 70-mm film and double-track
35-mm cameras, in both cases with elaborate optical and
mechanical refinements.7c
For much of our research at Polaroid we used stereoscopic image pairs made with cameras such as the split-field 5˝ × 7˝ Speed Graphic and aerial reconnaissance
photographs from government sources.
A 5” x 7” Studio Stereo Camera. We also constructed
an experimental studio-type stereo camera (Fig. 4). We
coupled two 5˝ × 7˝ Burke James view cameras on a slidebar-type bed at 90° to one another, separated by a 45°
semitransparent mirror. The large negatives and color
transparencies were easy to evaluate, and both enlargements and reductions were made as needed. The lens
interaxial distance could be adjusted from zero to about
20 cm, and the T2 distance could be adjusted by the “sliding back” mechanism. Two 30-cm Wollensak Velostigmats
were operated simultaneously by twin cable releases,
and the whole assembly balanced well on an Ansco studio tripod.

Figure 4. The experimental 5˝ × 7˝ studio stereo camera.
Experimental Stereoscopic Images. We used the 5˝ × 7˝
camera and the Polaroid Interocular Calculator to generate
a series of studio photographs that included long shots,
medium shots, and close-ups. We also made photographs
of tabletop subjects. We obtained very gratifying results
confirming the validity and usefulness of a reference parallax-to-image width ratio of 1:24. We also found that a
ratio of 1:50 was of limited practical use because it provided insufficient depth effect. A ratio of 1:100 gave still
less parallax and approached the limits of stereo acuity.
We concluded that the maximum parallax afforded by
the ratio 1:24 was more satisfactory, even when it resulted in Lilliputism or other distortions. Our observations during recent years have been consistent with these
findings.
Discussion
Geometric Characteristics of a Stereoscopic Image
Pair. We have described the principles of stereoscopic viewing devices and the geometry of stereoscopic photography,
and we have briefly discussed stereoscopic cameras and
their use to achieve optimal stereoscopic effects. Now we
can delineate the most important characteristics of a stereo image pair—first, the parallax, which will determine
the visualized depth of the 3-D image, and second, the
framing, which will locate the image within the 3-D space.
Methods of Measurement and Evaluation. Original
stereoscopic negatives or transparencies are usually too
small for superimposition over a light table and must be
compared under magnification. Parallax is measurable if
we superimpose in register the homologous points of the
nearest detail and measure the shift of the homologous
points of the farthest detail. Assuming that the images
are presented in correct orientation, the shift will be displayed in the right-eye image. We can also observe and
evaluate the coincidence of the image borders, which will
give a clue about the correct framing.
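The measurement just described (register the nearest homologous points, then read off the residual shift of the farthest ones) amounts to a one-line calculation; the coordinate convention and names below are assumptions for illustration, not from the article.

```python
# Sketch of the parallax measurement described above. The x values are
# horizontal positions of homologous details in the left and right images
# (same units in both); names and sign convention are assumed.

def measured_parallax(near_left_x, near_right_x, far_left_x, far_right_x):
    """Register the near homologous points, then return the residual
    horizontal shift of the far homologous points (the parallax p)."""
    registration_offset = near_right_x - near_left_x
    return (far_right_x - far_left_x) - registration_offset

# Example: after registration on the near detail, the far detail is shifted
# by 1.0 mm in the right-eye image.
print(measured_parallax(10.0, 12.0, 30.0, 33.0))   # 1.0
```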
Spatial Presentation. We can separate the presentation
of the subject matter within the 3-D space into three
groups: Group A, all of the image to appear beyond the
stereo window; Group B, part of the image beyond the stereo window and part extending forward from the stereo
window; and Group C, all of the image in front of the window. We could also add a subgroup to C for images that
“float” in front of the window.
Group A. This is the most conventional case. A majority of stereo cameras are constructed with this type of presentation in mind. To have all of the image appear beyond
the stereo window and be comfortable to view, the recommended maximum parallax is approximately 1/24 of the
width of the image, as discussed earlier. With stereo prints
to be viewed directly and transparencies projected onto
screens, the 1:24 ratio will allow a maximum width of 1.5
m, at which point the lateral shift on the screen will reach
its limit of 65 mm. For very large screens viewed at appropriate distances, the eyes will tolerate greater separation. Correct framing is achieved when the near
homologous points are in register and the right and left
borders of the stereo image coincide.
Group B. To obtain the effect of a partial image forward of the stereo window, we will need a total parallax
greater than 1/24 of the image width. If we double this
amount, we can use 1/24 for the portion behind the window and another 1/24 for the portion of the stereo image
in front of the window. The borders of the image must coincide when the zero parallax homologous points are superimposed in register. For viewing comfort it is also desirable that the subject matter be composed fully within
the borders of the stereo window. When viewed on a 1.5-m
screen, such an arrangement should bring the near part
of the image halfway between the observer and the screen.
In other presentations the image’s placement will vary
according to the viewing distance.
Group C. When the entire image is created in front of
the stereoscopic window, the observer’s eyes will be in a
convergent position. This situation seems to be more easily tolerated than looking at an image beyond the window,
and the uncoupling of accommodation and convergence has
a wider tolerance. In observing drawings of geometric solids that appear placed on a tabletop, a lateral shift of 1/10
of the image width is easy to accept. If the entire image is
to appear in front of the stereo window, then the edges of
the image pair should be in coincidence when the far homologous points are in register.
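The parallax budgets quoted for the three groups can be collected into a small helper. The fractions (1/24, 2/24, 1/10) come from the text above; the helper itself and its labels are illustrative assumptions.

```python
# Illustrative summary of the parallax budgets quoted above for Groups A-C.
# The fractions come from the text; the helper and labels are assumed.

PARALLAX_FRACTION = {
    "A": 1.0 / 24.0,   # entire image behind the stereo window
    "B": 2.0 / 24.0,   # image straddling the window (1/24 behind + 1/24 in front)
    "C": 1.0 / 10.0,   # entire image in front of the window
}

def max_parallax(image_width, group):
    """Maximum recommended total parallax for a presentation group."""
    return image_width * PARALLAX_FRACTION[group]

# Example: a 1.5-m-wide Group A projection allows about 62 mm of shift,
# close to the 65-mm limit cited in the text.
print(round(max_parallax(1500.0, "A")))   # 62 (mm)
```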
Summary
Properly applied, the tools we have described facilitate
the generation of effective stereo images and permit the
full enjoyment of stereoscopic presentations. The system
may be summarized as follows:
1. Select the value of the parallax, p, according to the format of the camera and the requirements of the viewing
method, including the choice of location of the stereo
window.
2. Establish the near and far distances from lens to the
subject or scene to be photographed.
3. Note the focal length of the taking lens or, for extreme
close-ups, the lens-to-film distance.
4. Calculate T1, using Eq. 1 or using an interocular calculator such as the one shown in Fig. 3.
5. According to the location of the stereo window in relation to the reconstructed image, calculate T2, using Eq. 3.
6. Given a camera with fixed interaxial distance, calculate
D and/or d to determine suitable stereo composition.
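As a hypothetical end-to-end illustration of the six steps above, with invented numbers that are not from the article:

```python
# Hypothetical end-to-end run of the six steps above; all numbers invented.

p = 1.0                   # step 1: parallax chosen for the format (mm)
d, D = 1500.0, 6000.0     # step 2: near and far subject distances (mm)
f = 50.0                  # step 3: focal length / lens-to-film distance (mm)

T1 = (p / f) * D * d / (D - d)    # step 4, Eq. 1 -> 40.0 mm
T2 = T1 * (d + f) / d             # step 5, Eq. 3 -> ~41.3 mm
print(round(T1, 1), round(T2, 1))

# Step 6: with a fixed interaxial (say T1 = 65 mm) and the same near point,
# Eq. 6 gives the farthest distance that still keeps p within budget.
T1_fixed = 65.0
D_max = (T1_fixed * f * d) / (T1_fixed * f - p * d)   # ~2786 mm
print(round(D_max))
```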
Alternative Methods for Achieving Parallax. Stereoscopic pairs generated by means other than paired camera exposures and intended for viewing by conventional
stereo methods must follow similar guidelines for parallax and framing in the image output. The geometry of recording the stereo pairs will be specific to the technology
of the image capture source.
Conclusion
Stereoscopic imaging provides an interface between the
real-life scene and the creative presentation possibilities
offered by various stereoscopic viewing methods. Parallax and convergence have been discussed as major contributing factors in the achievement of satisfactory
stereoscopic results. We recognize the lack of suitable stereoscopic equipment for the full implementation of the material presented here. However, the information is still of
value to any stereoscopist, whether a photographer with
a conventional twin-lens stereo camera, a computer user
generating stereoscopic images, an SEM microscopist, or
a painter creating stereoscopic artwork.
Acknowledgments. The author extends deep appreciation to many friends and colleagues for helpful discussion
of the subject and especially to Vivian Walworth for her
encouragement and cooperation. I also greatly appreciate
the assistance of Jay Scarpetti and his associates at the
Rowland Institute for Science.
References
1. C. Wheatstone, On some remarkable, and hitherto unobserved, phenomena of binocular vision, Phil. Trans. 1838, 371–394 (1838).
2. D. Brewster, The Stereoscope. Its History, Theory, and Construction, London, 1856; Facsimile Edition, Morgan & Morgan, 1971.
3. J. T. Rule, The geometry of stereoscopic projection, J. Opt. Soc. Am. 31, 325 (1941).
4. S. Kitrosser, Stereoscopic photography as applied to Vectographs, in Polaroid Organic Chemical Research Seminars, Vol. 1, Polaroid Corp., Cambridge, MA, 1945.
5. J. G. Ferwerda, The World of 3-D: A Practical Guide to Stereo Photography, Netherlands Society for Stereo Photography, 1982; (a) p. 238; (b) pp. 103–104.
6. F. G. Waack, Stereo Photography: An Introduction to Stereo Photo Technology and Practical Suggestions for Stereo Photography, translation by L. Huelsbergen, Reel 3-D Enterprises, Culver City, CA, 1985, p. 13.
7. L. Lipton, Foundations of the Stereoscopic Cinema: A Study in Depth, Van Nostrand Reinhold, New York, 1982; (a) p. 169; (b) pp. 147–148; (c) pp. 49–50.
8. E. H. Land, A. J. Bachelder and O. E. Wolff, U.S. Patent 2,453,075 (Nov. 2, 1948).
9. S. Kitrosser, Polaroid Interocular Calculator, PSA J. 19B, 74 (1953).
10. W. H. Ryan, Photogr. J. 125, 473 (1985).
11. D. Burder and P. Whitehouse, Photographing in 3-D, 3rd ed., The Stereoscopic Society, London, 1992, p. 6.
12. P. Wing, Hypers by walk, water, wire, and wing, Stereo World 16, 20 (1989).
13. J. Weiler, Tips for hypers from airliners, Stereo World 16, 33 (1989).
Advancements in 3-D Stereoscopic Display Technologies: Micropolarizers,
Improved LC Shutters, Spectral Multiplexing, and CLC Inks
Leonard Cardillo and David Swift
VRex, Inc., Elmsford, New York
John Merritt
Interactive Technologies, Princeton, New Jersey
An overview of four new technologies for stereoscopic imaging developed by VRex, Inc., of Elmsford, New York, is presented. First, the
invention of µPol micropolarizers has made possible the spatial multiplexing of left and right images in a single display, such as an LCD
flat panel or photographic medium for stereoscopic viewing with cross-polarized optically passive eyewear. The µPol applications include
practical, commercially available stereoscopic panels and projectors. Second, improvements in fabrication of twisted nematic (TN) liquid
crystals and efficient synchronization circuits have increased the switching speed and decreased the power requirements of LC shutters
that temporally demultiplex left and right images presented field-sequentially by CRT devices. Practical low-power wireless stereoscopic
eyewear has resulted. Third, a new technique called spectral multiplexing generates flicker-free field-sequential stereoscopic displays at
the standard NTSC video rate of 30 Hz per frame by separating the color components of images into both fields, eliminating the dark field
interval that causes flicker. Fourth, new manufacturing techniques have improved cholesteric liquid crystal (CLC) inks that polarize in
orthogonal states by wavelength to encode left and right images for stereoscopic printers, artwork, and other 3-D hardcopy.
Journal of Imaging Science and Technology 42: 300–306 (1998)

Original manuscript received September 15, 1997.
© 1998, IS&T—The Society for Imaging Science and Technology
Introduction
Since Wheatstone (1838) first reported that binocular disparity is the cue for stereopsis, or what he called “seeing in
solid,” many new techniques and devices for producing stereoscopic views from left and right perspective flat images
have evolved.1 Beginning with the Helioth–Wheatstone
stereoscope, every new technique or device has in common
some advancement in one or more of the three necessary
conditions to simulate depth: (1) a means to capture left
and right perspective views; (2) a means to combine, or
multiplex, those views; or (3) a means to deliver each view
to the correct eye, or demultiplex. The Wheatstone stereoscope had (1) two perspective views captured by artists, or,
early in the history of photography, captured by twin photographs; (2) as a means of multiplexing, simply placing
the views side-by-side on the same stereogram viewing card;
and (3) as a means of demultiplexing, providing a viewing
aperture and convergence optics for each eye in front of the
stereogram and a septum between the viewing apertures.
A breakthrough in stereoscopy was the invention of practical polarizers by Edwin Land2 in 1932, with Land devising3 the cross-polarized multiplexing/demultiplexing
technique for stereoscopic films in 1935. Improvements in
the three necessary conditions to simulate depth came from
(1) the rapid development of dual motion picture cameras
by Zeiss-Ikon and Ciné-Kodak; (2) multiplexing by superimposing on a metallized screen a projection of the left
image through a P1 state polarizer and the right image
through a P2 state polarizer; (3) demultiplexing with polarized glasses, P1 state at the left eye and P2 state at the
right eye to pass polarized light of the same phase and
extinguish cross-polarized light.
Land’s technique has been noted because many innovations in stereoscopy, including the four to be outlined here,
are based in some way upon cross-polarization. We will
outline the following:
1. µPol micropolarizers for spatial multiplexing.
2. Improved LC shutters and electronics for temporal multiplexing.
3. Spectral multiplexing for flicker free video at standard
video rates.
4. Cholesteric liquid crystal (CLC) inks.
The µPol
The µPol, invented by Faris in 1991, provided the necessary multiplexing and demultiplexing functions for stereoscopic LCD displays and stereoscopic hardcopy
printing.4 The µPol is a passive optical element that transforms incident unpolarized light into periodic, spatially
varying (square wave form) polarized light with polarization alternating between two orthogonal states P1 and
P2 (linear or circular). The most common fabrication
method for µPol is photo-lithography to form a specific
micropattern. An additive method prints a pattern on the
PVA surface with a high-precision gravure cylinder and iodine-based dichroic ink, producing P1 and P2
micropolarizers; a subtractive method prints the desired
pattern on the PVA with photoresist, then bleaches away
exposed parts, producing a λ/2 waveplate in a pattern to be
optically coupled with a polarized source.2 When the image
source is a Thin-Film-Transistor (TFT)-LCD panel, all transmitted light is polarized in a P1 state since a P1 “analyzer”
is incorporated over the electrically controllable birefringent (ECB) “light valve” that turns each pixel on or off. With
the patterned µPol placed over the panel, active portions of
the λ/2 waveplate rotate the phase of light polarization from
P1 to P2, while ablated portions leave P1 unchanged.
As illustrated in Fig. 1, the µPol can be either a one-dimensional or two-dimensional array with half-periods
as small as 20 µm possible with current manufacturing
processes. For TFT-LCD displays, the one-dimensional pattern is used, the finest resolution to date having a half-period of 201 µm on a 15˝ diagonal 1280 × 1024 panel.

Figure 1. One- and two-dimensional µPol arrays. The one-dimensional pattern is used for TFT-LCD displays, with a half-period resolution of 201 µm used for commercially available 1280 × 1024 panels. Now µPols with half-period resolutions less than 20 µm are possible with present manufacturing techniques.
The first step in creating the stereoscopic image is to spatially multiplex the left and right perspective
views of a 3-D scene, as illustrated in Fig. 2. The left
and right images, which are represented by pixel arrays, are spatially modulated with the modulators MOD
and MOD, producing the spatial patterns that are then
combined into a spatially multiplexed image (SMI). The
multiplexing algorithm can be implemented in software,
hardware, or by optical means; the µPol itself can perform the multiplexing function when placed in front of the
CCD array of a camera or a photographic medium.

Figure 2. Spatial multiplexing. Images from digital sources are multiplexed with software, images from video sources are multiplexed with field-switching hardware, and photographic images can be multiplexed using the µPol array self-aligned to the film for both multiplexing and demultiplexing. Two-dimensional multiplexing is shown.

Figure 3. Demultiplexing a spatially multiplexed image (SMI). Right image pixels are aligned with the P1 elements of the µPol array, left image pixels with P2 elements. Through cross-polarization, only the right image pixels are transmitted through the polarized lenses to the right eye and left image pixels to the left eye.

By placing a µPol in contact with an SMI having the same spatial period, the demultiplexing step is carried out as shown in Fig. 3. The µPol codes each pixel of the right image with a polarization state P1 and each pixel of the left image with state P2, thus encoding the two images. The viewer, wearing a pair of polarized glasses (or looking through a polarized visor), is able to view the right image only with the right eye and the left image only with the left eye, fusing the two views into a stereoscopic image.
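For the one-dimensional (row-striped) case, the multiplexing into an SMI can be sketched with a few lines of array code. This is only a schematic illustration, assuming that alternate image rows are assigned to the two views so that they line up with the alternating P1/P2 stripes; it is not VRex's actual implementation.

```python
import numpy as np

# Schematic sketch of one-dimensional spatial multiplexing into an SMI:
# even rows are taken from the right view (P1 stripes) and odd rows from the
# left view (P2 stripes). This is an assumed illustration, not the vendor's
# actual algorithm.

def spatially_multiplex(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    assert left.shape == right.shape
    smi = np.empty_like(left)
    smi[0::2] = right[0::2]   # rows under P1 stripes -> right-eye pixels
    smi[1::2] = left[1::2]    # rows under P2 stripes -> left-eye pixels
    return smi

# Tiny grayscale example: alternating rows of 255 (right) and 0 (left).
left = np.zeros((4, 6), dtype=np.uint8)
right = np.full((4, 6), 255, dtype=np.uint8)
print(spatially_multiplex(left, right))
```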
Because the left and right image information is simultaneously present in a single frame, the technique is general purpose. The SMI could be displayed by conventional
devices, printed by conventional printers, recorded by photographic or video cameras, or projected by a
single slide or a single movie projector. In all cases, color
3-D stereo images can be produced. In contrast, techniques
that produce stereo images by means of two separate left
and right frames (sequential or in parallel) do not have
the µPol’s range of application and are also incapable of
producing 3-D hardcopy.
µPol Applications and Products. VRex has incorporated
µPol technology in commercially available 3-D stereoscopic
LCD panels and projectors ranging from 640 × 480 pixel
resolution to 1280 × 1024. Other µPol-based devices in
production include a 3-D LCD notebook computer, an interactive 3-D information center utilizing a touch-screen
LCD and polarized visor, and an immersive environment
consisting of wrap-around rear-screen 3-D projection.
Hardcopy has been produced using photographic medium
with a self-aligned µPol for both multiplexing during exposure and demultiplexing during viewing.
Improved Liquid Crystal Shutters
A drawback to µPol display applications is their unsuitability for CRT devices. The thickness of a CRT display would position a µPol 10 to 20 mm in front of the
image plane scanned on the phosphor screen, introducing parallax between horizontal image raster lines and
horizontal µPol lines when viewed above or below the
plane orthogonal to the CRT screen. This parallax results in a limited viewing zone, with cross-talk or
pseudoscopic images perceived outside this zone. A solution lies in finding polarized material that can be coated
inside the CRT in front of the phosphor screen and can
withstand the intense heat generated by the cathode
heater filament; until then, shutter devices are the preferred technique to demultiplex stereoscopic left and right
perspective images time-multiplexed on the CRT.
The theory of operation for stereoscopic viewing of CRTs
through shuttered eyewear is simple. Images are time-multiplexed so a left perspective image is displayed on
the CRT device when the left eye shutter is open and a
right perspective image displayed when the right eye shutter is open. At suitable repetition rates, the viewer perceives a continuously present 3-D image. In video
applications, the two interleaved fields in each frame of
an NTSC display provide a convenient multiplexing
method: the right image encoded in Field 1, the left 16 ms
later in Field 2. PC monitors driven in page-flipped or interlaced mode provide even faster repetition rates.
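The field-to-eye convention described above (right image in Field 1, left in Field 2) reduces to a simple alternation synchronized to the field rate; the sketch below is a schematic illustration of that timing convention only, not of the actual drive electronics.

```python
import time

# Schematic illustration of the field-sequential convention described above:
# right-eye image in Field 1, left-eye image in Field 2, ~16.7 ms apart at
# the NTSC 60-Hz field rate. Hardware details are intentionally omitted.

FIELD_PERIOD_S = 1.0 / 60.0

def eye_for_field(field_index: int) -> str:
    """Odd fields (1, 3, ...) -> right eye open; even fields -> left eye."""
    return "right" if field_index % 2 == 1 else "left"

for field in range(1, 7):
    print(f"field {field}: open {eye_for_field(field)} shutter")
    time.sleep(FIELD_PERIOD_S)   # stand-in for waiting on the next field sync
```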
Early time-multiplexed implementations used mechanical shutters,6,7 but these shutters were cumbersome and
obtrusive. PLZT ceramics were an interim solution for
shutters,8 but now most devices use variations of liquid
crystal (LC) shutters.9 In general, an LC shutter consists
of an electrically controllable birefringent (ECB) plate in
which molecules are in a liquid-crystalline state. When
an electric field is applied across the plate, the molecules
align in parallel, producing a 90° phase shift between the
horizontal and vertical components of linearly polarized
light passing through the plate; when the field is removed,
the molecules return to random alignment, passing the
components of the polarized light in phase. Using a second polarizer, or analyzer, on the exit side of the plate,
light is passed when a voltage is applied and the ECB plate
is phase shifted, and light is extinguished when the voltage is removed and the plate is in ordinary phase. The
reader is referred to Bos10 for an excellent review of LC
shutter material and operation.
Two basic liquid crystal types are the Π-cell and the
twisted nematic (TN) cell; to date, most shutter glass systems have used Π-cells.11 Although the Π phase-shifting material used in these cells allows extremely fast switching times (<3 ms), the cells require very high excitation voltages, 20 Vp-p minimum, which makes wireless
battery-powered operation difficult and costly to achieve.
A second characteristic of Π-cells is that the cell is not
transparent when the excitation voltage is removed; with
no power applied the cell will retain a semi-opaque color
hue. A low-level excitation voltage, approximately 8 Vp-p,
needs to be applied to the cell to achieve transition from
the full transparent to the full opaque state.
A practical time-multiplexed stereoscopic shutter system using TN liquid crystals has recently been developed.12
A major advantage of TN cells is that, unlike Π-cells, no
background excitation voltage is needed to keep the shutters in the transmissive state. However, TN cells have had
the disadvantage of slow transition time (>10 ms) from
the transmissive to the opaque state and back to transmissive as excitation voltage is applied and disconnected.13
In addition, the transmissive to opaque (turn on) time may
differ from the opaque to transmissive (turn off) time.
However, the performance of the TN cell has been optimized both in the manufacturing process of the cell itself
and in the timing of the applied excitation voltage to overcome these limitations.
A major improvement was reducing cell thickness to a
minimum for faster switching and lower excitation voltages. This was key to obtaining long battery life in wireless battery shutter glasses because no high-voltage dc–dc
converters would be required as with Π-cell shutters. During normal operation, the shutter drivers draw 130 µA
when shutters are transmissive with a DC signal, 150 µA
with a 60-Hz signal and 200 µA with a 120-Hz signal. Each
shutter can switch at frequencies in excess of 120 Hz with
no interfield cross talk. Higher frequency switching was
accomplished by synchronizing shutter transitions with
video fields. Previous devices synchronized the shutter
transition to the beginning of each video field: once a vertical reset pulse or similar signal was detected, pulse coded
information was sent to toggle the optical state of the shutters. For the shutters to change state before the first line
of displayed video, the pulse codes had to be very short,
requiring high speed circuitry in the receiver that consumed much power. To reduce the power requirements of
the present system further, the field identification information is sent prior to the vertical blanking interval so
the pulse information may be transmitted at a much slower
rate. The detection circuitry functions at a slower frequency and battery life is greatly increased.
A further improvement implemented in the stereoscopic
shutter system was the ability to synchronize the shutters to all popular display formats used by TVs and PCs:
the IR transmitter that sends driving codes to the shutters in the eyewear can detect synchronization signals,
polarities, and frequencies present in all VGA, SVGA, and
XGA computer formats, as well as NTSC and PAL video
sources. The image field rate and the mode of operation,
i.e., 2-D, 3-D interlaced, or 3-D page flipped, are determined
from these signals. Detecting carrier synchronization signals is a benefit because earlier attempts at field coding
required tagging the video content itself with black and
white markers on a horizontal scan line.14
Shutter Applications and Products. The largest commercial application of the improved LC shutters is in stereoscopic eyewear called VR Surfer. System software
enables the user to optimize the performance of the shutter system to a particular PC monitor: a basic mode of
operation is provided that encodes image identification information in the display sync signals during the vertical
reset interval. This mode will enable the display of stereoscopic images in DOS applications but does possess some
degree of perceived image flicker because the image switching rate is in the 60 to 72-Hz range. An advanced mode
detects the video card chip set driving the PC monitor and
will automatically implement the best stereoscopic display
mode at rates up to 120 Hz. In this mode, Microsoft Windows applications are supported. The system is also compatible with field-sequential stereoscopic video in NTSC
and PAL standards.
While the improved LC shutters have been used chiefly
for demultiplexing in eyewear, a device is under development that uses the shutters to multiplex stereoscopic perspective views to a single CCD recording device.
Specifically, if a beam splitter is placed in the primary viewing path of a video camera and mirrors relay a second perspective view offset from the primary, the left and right
perspectives necessary for a stereoscopic view will be imaged on the CCD. By placing a pair of LC optical shutters
in each of the viewing paths, each viewing perspective can
be alternately imaged by the camera if the shutters are
opened and closed in synchronization with the video field
output. By convention, the right perspective will be encoded in Field 1 of the video frame and the left perspective in Field 2. A promising commercial application for the
device is as an attachment for the home video camcorder.
Spectral Multiplexing
Because of the predominance of television as a display
medium, it would be beneficial if field-sequential multiplexing could operate at NTSC television standard 60-Hz
field rates with no flicker at all. In the 60-Hz standard
field-sequential LC shutter system just described, some
residual flicker is perceived because left and right images
are each coded in a single video field, so each eye alternately receives the full brightness of the image and a 16.6-ms
dark interval. The human visual system will integrate light
energy over time, reaching a critical fusion frequency (CFF)
beyond which these alternating light and dark intervals
are perceived as steady, but at normal viewing luminance,
the 60-Hz field rate is below the CFF.15,16 Flicker is more
apparent with brighter displays, the fusion threshold increasing with the logarithm of luminance according to the
Ferry–Porter law,17 and the contours defining images increase the threshold even more because contrast increases
with fast visual “off” transients generated at image offset.18 While some evidence exists that visual persistence
of stereoscopic stimuli is longer than that of mono-planar
stimuli,19,20 thus decreasing the fusion threshold, the effect is not long-lasting enough to bridge the 16-ms dark
interval. To prevent flicker, a new field-sequential system
is under development that eliminates the dark interval
within the video frame, maintaining light energy at both
eyes at all times, decreasing or eliminating luminance
modulation between video fields.21
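As a hedged quantitative aside (the functional form is standard, but the constants are illustrative and are not taken from Refs. 15–18): the Ferry–Porter law is usually written as a linear dependence of the critical fusion frequency on log luminance,

CFF \approx a + b \log_{10} L ,

with slopes on the order of 10 Hz per decade of luminance reported in the literature, which is why a bright display alternating at 60 Hz can remain below an observer's fusion threshold and appear to flicker.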
Figure 4 shows the image capture and multiplexing functions of the spectral multiplexing technique.
In Stage 1, left and right perspective views from cameras or computer graphics are captured and analyzed on a
pixel-by-pixel basis. In Stage 2, each is separated into two
spectral buffers: one for red and blue (r + b) and one for
green (g). (Note that “pixel” here refers to an individual
color component, not one of three points comprising a color.)
The luminance at each eye is of different wavelength components, but these will still summate luminance to decrease the CFF relative to light/dark stimulation.22 At this
stage, the pixel data in the buffers could be field-sequentially multiplexed, as shown in the final stage, so each eye
receives light energy during both Field 1 and Field 2 presentations, a technique similar to that described by
Street.23 However, pixels that represent a pure primary
color (r or b or g) or magenta (r + b) will not have spectral
components in one of the buffers and will still be dark
Figure 5. Eyewear for spectral demultiplexing. Left and right eye optics are identical: shown in order from the image back to the eye
are green cholesteric liquid crystal filters in a P1 state, magenta cholesteric liquid crystal filters in a P2 state, active TN cells, and a
broadband polarizer (analyzer) in a P1 state. The TN cells are shown in a state to transmit the Field 1 image; to transmit the Field 2
image, the voltage polarities are reversed.
during one field. The advancement over Street’s technique
is in maintaining some spectral luminance at the eye even
when a primary- or magenta-colored object has no spectral components in the subsequent field. Therefore, at
Stage 3 each pixel is analyzed, zero values identified, and
a “filler pixel” inserted. A pixel of a suitable minimum luminance value replaces dark r + b pixels in one buffer or g
pixels in the other buffer for each eye’s view, maintaining
energy at all pixels in both eyes during both Field 1 and
Field 2. The luminance of filler pixels from the alternate
buffer is chosen to shift the chromaticity or saturation of
the r,g,b, or magenta color a minimum perceived amount
in color space24 yet maintain enough energy during the
otherwise dark field to prevent flicker. This is possible
because the visual system does not discriminate colors
perfectly, with observers perceiving similar colors with
substantial shifts in wavelength. Attempted isomeric or
metameric matches to a given wavelength show large just
noticeable differences (JND) in chromaticity space as well
as saturation space.25 Suitably chosen color pixels can be
added to the otherwise dark pixel space in the alternate
field to summate with the luminance of the primary r,g,b
or magenta pixels, resulting in minimum shifts in perceived color yet decreasing luminance modulation, so
flicker is not perceived.
After the frame buffers are updated with “filler pixel”
data, the buffers are shifted into the NTSC field format in
Stage 4: r + b pixels from the left view and g pixels from
the right view into Field 1; g pixels from the left view and
r + b pixels from the right view into Field 2. An ordinary
CRT monitor with NTSC video input displays the field
sequential information directly or from standard recorded
video tape.
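To make Stages 1 through 4 concrete, the following minimal sketch (ours, not from the paper) separates left and right RGB frames into magenta and green buffers, inserts filler pixels, and interleaves the buffers into two fields. The fixed FILLER value, the array layout, and the simple channel merge are assumptions for illustration; in the actual system each filler pixel's luminance is chosen to minimize the perceived color shift, as described above.

import numpy as np

FILLER = 32  # assumed minimum filler luminance (8-bit scale); the paper does not fix this value

def split_and_fill(rgb):
    # Stage 2: separate an H x W x 3 frame into a magenta (r + b) buffer and a green (g) buffer.
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    zeros = np.zeros_like(g)
    magenta = np.stack([r, zeros, b], axis=-1)
    green = np.stack([zeros, g, zeros], axis=-1)

    # Stage 3: where a lit pixel leaves one buffer completely dark (pure g, or
    # pure r/b/magenta), insert a low-luminance filler pixel so that the eye
    # still receives light during the otherwise dark field.
    lit = (r > 0) | (g > 0) | (b > 0)
    need_magenta_filler = lit & (r == 0) & (b == 0)
    need_green_filler = lit & (g == 0)
    magenta[need_magenta_filler, 0] = FILLER
    magenta[need_magenta_filler, 2] = FILLER
    green[need_green_filler, 1] = FILLER
    return magenta, green

def spectral_multiplex(left_rgb, right_rgb):
    # Stage 4: field-sequential pairing. Field 1 carries the left magenta and
    # right green buffers; Field 2 carries the left green and right magenta buffers.
    left_m, left_g = split_and_fill(left_rgb)
    right_m, right_g = split_and_fill(right_rgb)
    field1 = left_m + right_g   # the two buffers occupy disjoint color channels
    field2 = left_g + right_m
    return field1, field2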
Figure 5 shows the implementation of the spectral
demultiplexing function at the eyes, with eyewear consisting of passive green and magenta CLC filters, active electronic TN cells, and passive broadband polarizers.
The CLC filters pass their respective colors circularly
polarized, right-hand polarized greens (P1) and left-hand
polarized magentas (P2). The TN cells are EBC devices
synchronized with Field 1 and Field 2 of the NTSC signal
using the same circuit techniques described previously.
During Field 1, the left eye’s TN cell is activated (V-), reversing the polarized state passed by the filters while the
right eye’s TN cell is inactive (V+), maintaining the polarization. Figure 5 shows the Field 1 state of the TN cells;
during Field 2 the right eye’s TN cell switches to V- and
the left to V+. The broadband polarizer, or analyzer, passes
P1 light and blocks P2.
Referring to Fig. 4, Stage 4, the CRT monitor displays r
+ b (magenta) information from the left eye image and g
information from the right eye image simultaneously during Field 1. During Field 2, g information from the left eye
and r + b information from the right eye is displayed simultaneously. By alternately activating and de-activating
the TN cells in synchrony with Field 1 and 2, it can be
seen that the eyewear performs the appropriate spectral
demultiplexing function, passing left and right image information to the correct eyes for field-sequential stereopsis without flicker.
3-D Printing Based on CLC Inks
A new method of printing 3-D images has been made
possible through several patented processes for manufacturing inks based on CLC materials.26,27 This CLC ink can
be made right circularly polarized (RCP) as well as left
circularly polarized (LCP) to enable polarization multiplexing of the left and right images, with circularly polarized 3-D glasses demultiplexing images. Applications for
these new inks include 3-D hardcopy from inkjet printing,
offset printing, gravure, and silk-screen processes that can
be printed on any paper, fabric, or other medium.
CLC Properties. CLC is a nematic liquid crystal with
chiral additives or polysiloxane side-chain polymers that
cause the cigar-shaped molecules to be spontaneously
aligned in an optically active structure that can be made either LCP or RCP.
Chiral additives give the CLC molecules a degree of twist
and helical structure with pitch “p.” (Pitch can be thought
of as the length of a bolt that a nut must travel to make a
360° turn, tighter pitch resulting in less travel.) The higher
the concentration of the chiral additive, the tighter the
molecules are arranged in the helix, resulting in a shorter
pitch. CLCs can be either left-handed (LH) or right-handed
(RH), each having a unique property known as selective
wavelength reflection. When light is incident upon the CLC
surface, the selective reflection is described by the following equation:
λ = λo = na p,    (1)
where λ is the reflective wavelength, na is the mean index
of refraction (approx. 1.6) of the CLC material, and p is
Figure 6. Reflective properties of left-handed CLC. The wavelength of light reflected is determined by the pitch of the CLC, and the
direction of polarization, RCP or LCP, is determined by the direction of helical twist of the CLC.
the pitch of the helical structure. Figure 6 illustrates the
selective reflection of LH CLCs.
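As a rough worked example of Eq. 1 (illustrative numbers, not values from the paper): with na ≈ 1.6, a CLC tuned to reflect green light at λ ≈ 540 nm needs a pitch p = λ/na ≈ 540 nm/1.6 ≈ 340 nm, whereas red at 650 nm needs p ≈ 406 nm; increasing the chiral additive concentration tightens the pitch and therefore shifts the selective reflection toward shorter (bluer) wavelengths.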
If the CLC is RH, then it reflects 50% of the incoming
light at the selected wavelength in RCP light and transmits 50% of that wavelength in LCP light. All other wavelengths are transmitted through the material. Similarly,
LH CLC material will reflect LCP light and transmit RCP
light. The reflected wavelength, or color, can be tuned by
changing the length of the pitch, which is dependent on
the chiral additive concentration. The polarization of the
reflected light, RCP or LCP, can be altered by using RH or
LH CLC material.
New CLC Ink Fabrication Processes. A major breakthrough in CLC ink fabrication was a process to make the
inks useable at room temperature. Formerly, room-temperature use required CLC to be in the liquid phase, encapsulated or confined to cells; in the solid phase, the CLC
inks had to be applied at very high temperature. In the
new process,26 molten CLC material above the glass temperature (for polysiloxane-based CLC polymers, about
120°C to 150°C) is deposited onto a rotating belt and aligned
using a knife edge. After a cooling stage, the CLC film is
transferred to another rotating belt coated with an adhesive. The second belt, after receiving the CLC film, goes
through an air jet stage where an ultrasonic air jet or air
jet with fine powder abrasives removes the ultrabrittle CLC
film from the adhesive. The result is tiny CLC flakes that
retain the helical structure normal to the CLC flake surface. The CLC flakes range in thickness from 1 to 20 µm
and in size from 5 to 75 µm with an average size of 25 µm.
The geometry of the flakes can be regular or irregular.
To produce the CLC inks, these CLC flakes are mixed
with a host fluid or host matrix, the carrier. The carrier
must be chosen for suitable tackiness, drying speed, adhesion to surfaces, friendliness to environment, etc., depending on the application: offset printing, ink-jet printing,
painting, drawing, xerography, or other imaging methods.
When the CLC flakes are mixed with a suitable host matrix such as wax or other sticky material that is a solid at
room temperature, crayons, pencils, or other drawing devices can be made.
Printing 3-D Images using CLC Inks. Unlike conventional pigments and dyes, CLCs work on a reflective
mechanism, with six types of crystal necessary to render
the visible spectrum in two polarization states: red, blue,
and green pitched crystals in LCP and RCP states. (In
practice, it has been found necessary to use two additional
types, namely, white pitched crystals in each polarization
state). Instead of printing or plotting on a white piece of
paper, CLCs are applied onto light absorbing or
nonreflective surfaces. When viewed by itself, the ink appears almost transparent. When the ink is applied to a
black paper or other medium, the color corresponding to
the wavelength of the CLC material can be seen. All of the
other colors are absorbed by the black medium. Moreover,
if viewed with an RCP or LCP polarizer, the CLC material
will be seen only through the same-phase polarizer. It is
the circular polarization property of the CLC inks that is
used to multiplex left and right images for 3-D stereoscopic
printing, the left image put on the black substrate using
LCP ink and the right image using RCP ink. The 3-D images are demultiplexed at the eyes by cross-polarization
through ordinary circularly polarized glasses.
Conclusion
Four new advancements in 3-D stereoscopic display technology developed by VRex, Inc., of Elmsford, New York,
have been outlined. µPol optics have been applied to 3-D
LCD displays with benefits of reduced cost, self-alignment
of images, compatibility with video and computer standards, and single projector implementation of 3-D, leading to commercial desirability over other cross-polarization
displays for multiple viewers. Improvements in LC shutter materials and electronics have led to commercially
desirable 3-D eyewear for personal viewing of stereoscopic
3-D images from TVs and PCs, while the technique of spectral multiplexing, now under development, promises
flicker-free time-multiplexed stereoscopic content from
popular, low-cost 60-Hz TV and video displays. Finally,
advancements in CLC inks have led to stereoscopic
hardcopy printable on any medium. The first practical CLC
applications are for posters and clothing using a silk screen
process, with other printing techniques, including ink jet,
under development.
References
1. C. Wheatstone, Phil. Trans. Roy. Soc. London 128, 371–394 (1838).
2. M. McCann, Ed., Edwin H. Land's Essays. Volume I: Polarizers and Instant Photography, Society for Imaging Science and Technology, Springfield, VA, 1993.
3. R. M. Hayes, 3-D Movies: A History and Filmography of Stereoscopic Cinema, McFarland, Jefferson, NC, 1989.
4. S. M. Faris, Micro-polarizer arrays applied to a new class of stereoscopic imaging, SID Dig. 38, 840–843 (1991).
5. S. M. Faris, U.S. Patent 5,327,285 (1994).
6. R. J. Beaton, R. J. DeHoff and S. T. Knox, Revisiting the display flicker problem: refresh rate requirements for field-sequential stereoscopic display systems, Dig. Tech. Pap. SID Int. Symp. 17, 150 (1986).
7. J. Lipscomb, Experience with stereoscopic display devices and output algorithms, Proc. SPIE Non-Holographic True 3-D Display Techniques 1083, 28–34 (1989).
8. J. A. Roese and A. S. Khallafalla, Stereoscopic viewing with PLZT ceramics, Ferroelec. 10, 47 (1976).
9. J. A. Roese, Liquid crystal stereoscopic viewer, U.S. Patent 4,021,846 (1977).
10. P. J. Bos, Liquid crystal shutter systems for time-multiplexed stereoscopic displays, in Stereo Computer Graphics and Other True 3-D Technologies, Princeton University Press, Princeton, NJ, 1993.
11. P. J. Bos and K. R. Koehler-Beran, The pi-cell: a fast liquid crystal optical device, Mol. Cryst. Liq. Cryst. 113, 329 (1984).
12. S. M. Faris, U.S. Patent pending.
13. M. R. Harris, A. J. Geddes and A. C. T. North, Frame-sequential stereoscopic system for use in television and computer graphics, Disp. 7(1), 12.
14. L. Lipton and J. Halnon, Universal electronic stereoscopic display, in Stereoscopic Displays and Virtual Reality Systems III, M. T. Bolas, S. S. Fisher and J. O. Merritt, Eds., Proc. SPIE 2653, 219–223 (1996).
15. L. Ganz, Temporal factors in visual perception, in Handbook of Perception, Vol. 5, E. C. Carterette and M. P. Friedman, Eds., Academic Press, NY (1975).
16. A. B. Watson, Temporal sensitivity, in Handbook of Perception and Human Performance, K. R. Boff, L. Kaufman and J. P. Thomas, Eds., Vol. 3, Wiley, NY, 1986.
17. H. de Lange, Research into the dynamic nature of the human fovea–cortex systems with intermittent and modulated light: I. Attenuation characteristics with white and colored light, J. Opt. Soc. Am. 48, 777–784 (1958).
18. R. W. Bowen, Isolation and interaction of ON and OFF pathways in human vision: contrast discrimination at pattern offset, Vision Res. 37(2), 185–198 (1997).
19. G. R. Engel, An investigation of visual responses to brief stereoscopic stimuli, Q. J. Exp. Psychol. 21, 148–166 (1970).
20. W. Skrandies, Visual persistence of stereoscopic stimuli: electrical brain activity without perceptual correlate, Vision Res. 27(12), 2109–2118 (1987).
21. S. M. Faris, U.S. Patent pending.
22. K. Uchikawa and M. Ikeda, Temporal integration of chromatic double pulses for detection of equal-luminance wavelength changes, J. Opt. Soc. Am. A 3, 2109–2115 (1986).
23. G. S. B. Street, U.S. Patent 4,641,178 (1987).
24. G. Wyszecki and W. S. Stiles, Color Science, Wiley & Sons, NY (1982).
25. W. R. J. Brown and D. L. MacAdam, Visual sensitivities to combined chromaticity and luminance differences, J. Opt. Soc. Am. 39, 808–818 (1949).
26. S. M. Faris, U.S. Patent 5,364,557 (1994).
27. S. M. Faris, U.S. Patent 5,457,554 (1995).
JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998
Full-color 3-D Prints and Transparencies*
J. J. Scarpetti,† P. M. DuBois,† R. M. Friedhoff,† and V. K. Walworth‡
The Rowland Institute for Science, Cambridge, MA 02142
We have put into practical form one of Edwin Land’s early inventions, the stereoscopic Vectograph image. We have reinterpreted the
concept to produce digital 3-D hardcopy conveniently from both photographic and digital 3-D records. Digital 3-D images may be
produced directly by digital cameras, by computers and workstations, and by various instrumental outputs or they may be acquired
by scanning and digitizing photographic image pairs. Digital 3-D polarizing images are printed conveniently with an ink-jet printer.
To produce full-color 3-D hardcopy on standard ink-jet printers we have formulated special inks and substrates. Our technique unites
two very significant growing technologies: ink-jet printing and 3-D imaging.
Journal of Imaging Science and Technology 42: 307–310 (1998)
Introduction
The most commonly used method of 3-D presentation comprises encoding left- and right-eye images in terms of polarization by mounting oppositely oriented polarizing filters over the lenses of paired projectors and superimposing the left- and right-eye images on the screen, as shown
in Fig. 1. To preserve the polarization, the screen used
must have a metallic, nondepolarizing surface. Suitable
aluminum screens are commercially available. Observers
view the composite image through 3-D polarizing glasses.
Under these circumstances each eye sees only the assigned
image and the observer perceives the composite as a single
three-dimensional image. This method is effective, but it
is difficult to achieve correct and consistent stereoscopic
registration and alignment without specialized precision
equipment.
In 1940 Edwin Land introduced the Vectograph concept.1 Instead of using polarizing filters to encode paired
images, he used images that were themselves polarizers.
He printed the respective left- and right-eye images on
opposite sides of a single transparent support, as indicated
in Fig. 2, so that once the two images had been properly
registered and printed, they could not be misaligned.
The earliest Vectograph images were in black and
white, and they were formed by staining preregistered
paired gelatin relief images with an iodine ink and transferring that ink to oppositely oriented polyvinyl alcohol
(PVA) layers laminated to opposite surfaces of a transparent film base.§ In each image area the effective density is
directly related to the amount of iodine transferred to the
PVA and thus to the degree of polarization.
Original manuscript received November 10, 1997
* Presented in part at the IS&T 50th Annual Conference, Cambridge,
MA, 18–23, 1997.
† IS&T Member
‡ IS&T Fellow
© 1998, IS&T—The Society for Imaging Science and Technology
The chemistry of this Vectograph imaging process resembles that used in the production of sheet polarizers
such as the Polaroid H-sheet. In each case the substrate
is a PVA sheet that has been heated and stretched to orient the polymeric molecules, then laminated to a support
and stained with an iodine ink. Such polarizers are defined as dichroic polarizers. Iodine may be described as a
dichroic stain, and dyes that form spectrally selective dichroic polarizers are described as dichroic dyes.2
In 1953 Land showed three-color Vectograph images
that had been formed by successively transferring cyan,
Figure 1. 3-D projection display with two projectors.
§ The black-and-white Vectograph process was used extensively for both
military and industrial imaging during World War II. For many years
Stereo Optical Co., Chicago, IL, has been producing black-and-white
Vectograph prints for use by ophthalmologists in binocular vision testing and training.
Figure 3. Color Vectograph image printing by dye-transfer process.
Figure 2. The Vectograph concept, using paired, oppositely oriented polarizing images.
magenta, and yellow dichroic dye images from paired gelatin relief images.¶ The color Vectograph printing process
required for each 3-D image a set of six flawless, perfectly
registered matrices, as indicated in Fig. 3. The process
produced excellent stereoscopic color prints and transparencies and even experimental 3-D motion pictures. However, preparation was difficult, time-consuming, and costly,
and these factors precluded widespread utilization of the
process.
In recent years 3-D computer technology has dramatically influenced scientific, engineering, and medical imaging. For example, molecules, airplanes, oil rigs, and buildings are now designed or investigated using 3-D computers, often with real-time stereoscopic renditions displayed
on monitors. Until now, however, there has been no convenient form of hardcopy for these 3-D display images.
The purpose of our investigation has been to develop
a greatly simplified process for generating full-color 3-D
prints and transparencies, using contemporary ink-jet
printing technology with dichroic inks and specially constructed substrates. New materials and techniques make
it practical to use ink-jet technology for printing full-color
3-D hardcopy images.3
Materials and Methods
Preparation of 3-D Digital Image Pairs. Figure 4 details the various paths from initial image acquisition to
the finished stereo image. If the initial stereo images are
conventional photographic negatives or positive transparencies, the images are first scanned and digitized. Image
pairs from digital cameras, CAD images, and digital images based on various instrumental data are used directly.
¶ Presentation by E. H. Land at the 38th Annual Meeting of the Optical Society of America, Rochester, NY, 1953.
Figure 4. Flow diagram illustrating the preparation of digitized
stereoscopic polarizing images. The principal steps are: (1) the
preparation of a digitized left–right image pair, (2) storage of the
two images as a left–right image file, (3) paste-layering, using
Adobe Photoshop. Following adjustment of contrast and color,
suitable cropping, and stereo registration, the two images are
printed sequentially, one on each surface of the two-sided sheet.
In addition to using an array of external sources, we can
import digital 3-D images electronically from remote instruments and computers.
The paired left- and right-eye digital images are transferred to Adobe Photoshop or a comparable program. We
adjust the contrast and color balance of each image, then
register the pair stereoscopically, as illustrated by the two
upper images in Fig. 5. Each of the two images is “pastelayered” into a new canvas, in which the dimensions match
the dimensions of the printing medium, commonly 8.5 ×
11˝ or 8.5 × 14˝. We then select the right-eye image, reduce its transparency to 50%, and move the image into
stereoscopic alignment with the left-eye image, superimposing precisely the points that are to lie in the plane of
the stereoscopic “window,” i.e., the plane of the screen or
frame, as shown in the lower left panel of Fig. 5. Finally,
we restore full density and reverse the right-eye image
right to left (Fig. 5, lower right panel), so that when the
two images are printed face to face both images will again
have the same left–right orientation.
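As a rough sketch of the same workflow in code (ours, not the authors'; it uses the Pillow library with assumed file names and an assumed 300-dpi canvas, whereas the actual registration is performed interactively in Photoshop as described above):

from PIL import Image, ImageOps

# Assumed file names; in practice the pair comes from a scanner, digital
# camera, CAD output, or Photo CD, as described in the text.
left = Image.open("left_eye.tif").convert("RGB")
right = Image.open("right_eye.tif").convert("RGB")

# "Paste-layer" each image into a canvas matching the print medium
# (8.5 x 11 in. at an assumed 300 dpi).
CANVAS = (int(8.5 * 300), int(11 * 300))

def on_canvas(img, offset=(0, 0)):
    canvas = Image.new("RGB", CANVAS, "white")
    canvas.paste(img, offset)
    return canvas

# Alignment check: overlay the right image at 50% opacity on the left and
# shift it until the points chosen to lie in the stereo "window" coincide.
shift = (12, 0)  # hypothetical offset found by inspection
check = Image.blend(on_canvas(left), on_canvas(right, shift), alpha=0.5)
check.save("registration_check.png")

# Restore full density and mirror the right-eye image left to right, since
# the two images are printed face to face on opposite surfaces of the sheet.
ImageOps.mirror(on_canvas(right, shift)).save("print_right_mirrored.png")
on_canvas(left).save("print_left.png")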
Image Dyes. Figure 6 shows the structure of a typical
dichroic azo dye suitable for forming polarizing images
Figure 5. Monitor screen views showing the steps in stereoscopic registration of image pair.
Figure 6. A typical dichroic image dye, Direct Green 27.
upon imbibition into an oriented PVA layer. The dye
shown is Direct Green 27, which we use in the cyan ink.4
The polyazo dye molecule is sufficiently flexible to align
readily with oriented molecules of PVA to form an efficient dichroic polarizer. The sulfonic acid groups confer
high solubility in aqueous solution, providing mobility of
the dye within water-permeable polymeric layers. For our
application the dyes are carefully purified and formulated
into dichroic inks that perform well in standard ink-jet
cartridges.
Sheet Material. We print the paired images on the two
surfaces of a multilayer sheet, as represented in Fig. 7.
The film base is a nondepolarizing transparent support of
cellulose triacetate or cellulose acetate butyrate. An image-receiving layer of stretched PVA is laminated to each
surface of the film base, with the stretch axes of the two
PVA layers oriented at 90° to one another and at 45° to
the edge of the sheet (Polaroid Corporation, Cambridge,
MA). A thin metering layer of a nonoriented ink-permeable polymer, such as carboxymethyl cellulose, overlies the
surface of each of the PVA layers, as indicated in Fig. 7.
Printing Equipment. Desktop ink-jet printers are characterized as drop-on-demand printers. The printhead comprises a bank of ink cartridges, one for each of three subtractive color inks and in most cases one for black ink as
well. As the printhead moves rapidly across a sheet of
paper or transparent base, microscopic nozzles eject ink
onto the sheet. In certain drop-on-demand printers, such
as those provided by Epson, electronic signals actuate piezoelectric diaphragms within the head to force the
imagewise ejection of droplets. In bubble-jet systems, such
as the Hewlett-Packard (HP) and Canon printers, heat-
Figure 7. Schematic drawing of sheet structure.
ing elements create bubbles that expand to force out the
ink droplets.
Most of our images to date have been produced on HP
printers, including Models 500C, 550C, 850C, and 820.
We are also printing images on an Epson Stylus 800
printer. In each case the image resolution is determined
by the resolution of the printer. The Hewlett-Packard
printers are rated at 300 dpi and the Epson at 1440 dpi.
To print the 3-D images, we insert standard ink-jet cartridges filled with dichroic inks into otherwise unmodified ink-jet printers. To make a transparency with the HP
printer we select the media setting “transparency” and
we set the intensity at “normal” to “darkest.” To make a
reflection print we select “glossy” as the media setting and
choose “lighter” to “normal” ink intensity. We print the
left-eye image on one surface of the sheet and the right-eye image on the opposite surface.
Finishing Stage. After transfer of the images to the oriented PVA layers has taken place, the metering layers are
removed. Transparencies are mounted for viewing by overhead projection or for direct viewing on a light box. Images prepared for viewing as reflection prints are laminated to reflective aluminized backing sheets.
Results
We have used a variety of stereoscopic images in the
course of our development work. These images represent
many applications, including molecular modeling, microscopy, data visualization, entertainment, and pictorial photography. The 3-D image pair used in Fig. 5 was taken
from a Photo CD file. Several of the images shown during the presentation of this paper originated as computergenerated or instrument-generated stereoscopic data. For
example, a model of a complex protein molecule was produced from its molecular coordinates, using the program
MOLMOL in a Silicon Graphics workstation.5,6 The image information was saved as a TIFF file and transferred
to a PowerMac in our laboratory via FTP, using NCSA
TelNet. Fig. 8 illustrates the image as a side-by-side stereo pair.
The 3-D transparencies produced in desktop equipment are convenient hardcopy of size and quality suitable
for projection in a standard overhead projector onto an
aluminum screen or for direct viewing by transmitted light.
In most situations the observers wear conventional lin-
Figure 8. Molecular structure of α-t-α, a 35-residue peptide with
a helical hairpin conformation in solution.5 Left, left-eye image;
right, right-eye image.
early polarized viewers. The same images may be projected
directly through an overlaid quarter-wave retarder for
observation with circularly polarized viewers. The transparencies may also be viewed without glasses, using tabletop autostereoscopic display apparatus.
Although all of the prints and transparencies produced
so far have been printed on desktop equipment, we are
exploring the applicability of larger format drop-on-demand printers and continuous-flow ink-jet printers.
Summary and Conclusions
We have developed a 3-D imaging system that should
be compatible with many of the color ink-jet printers now
in use. All indicators suggest that ink-jet printing will continue to be a leading technology for producing digital
hardcopy of high quality. Advances in resolution, image
quality, speed, and convenience are occurring rapidly in
the ink-jet industry, and these advances will contribute to
the utility of our process. The increasing use of 3-D information in many fields makes it desirable to have ready
access to high-quality 3-D hardcopy. We believe that our
technology offers new opportunities for modern stereoscopic imaging.
Acknowledgments. We thank David Burder for the use
of several of his 3-D images, including the pair used in
Fig. 5. We also thank John Osterhout for the image of the
molecule shown in Fig. 8.
References
1. E. H. Land, J. Opt. Soc. Am. 30, 230–238 (1940).
2. (a) W. A. Shurcliff, Polarized Light: Production and Use, Chap. 4, Harvard University Press, Cambridge, MA, 1966; (b) E. H. Land and C. D. West, “Dichroism and dichroic polarizers,” in Colloid Chemistry, J. Alexander, Ed., Vol. 6, Reinhold Publishing Corp., New York, 1946, pp. 160–190.
3. J. Scarpetti, International Patent WO 96/23663, 1996, assigned to The Rowland Institute for Science.
4. R. Bernhard, U.S. Patent 1,829,673, assigned to J. R. Geigy, S.A.
5. Y. Fezoui, P. J. Connolly and J. J. Osterhout, Solution structure of α-t-α, a helical hairpin peptide of de novo design, Prot. Sci. 1869–1877 (1997).
6. R. Koradi, M. Billeter and K. Wuthrich, MOLMOL: A program for display and analysis of macromolecular structures, J. Mol. Graph. 14, 51–55 (1996).
JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998
Stereo Matching by Using a Weighted Minimum Description of Length
Method Based on the Summation of Squared Differences Method
Nobuhito Matsushiro† and Kazuyo Kurabayashi
First Research Laboratory, OKI Data Corporation, 3-1, Futaba Town, Takasaki City, Gunma Prefecture, 370-0843, Japan
A stereo matching method for 3-D scenes is described. The problems addressed are a definition of a search space for scanlines of stereo
images and a formulation of the determination of the optimal path in the search space. The SSD (Summation of Squared Differences)
method is a measure of similarity between a right image region and a left image region and is used as a cost function of the optimal
problem. There are some problems regarding the SSD modeling. To resolve the problems we propose a stereo matching method based
on the WMDL (weighted minimum description length) criterion in which parameters of the problem are optimized. The WMDL
criterion is an information criterion proposed by us previously. The efficiency of the method is shown by using both synthetic and real
images. As the results show, the error values decreased by 3.1% to about 18.9% for the synthetic images and by 8.07% to about 12.19% for the real image in comparison with the SSD method.
Journal of Imaging Science and Technology 42: 311–318 (1998)
Introduction
This article describes a stereo matching method for 3-D
scenes. To synthesize stereoscopic view images, correspondence of one image (left or right eye image) to another
(right or left eye image) is necessary. In addition, from the
correspondence, depth information for each point can be
obtained by the principle of triangulation measurement.
The problems addressed are a definition of a search space
for each scan-line of stereo images and a formulation of the
determination of the optimal path in the search space.1–4
The SSD (summation of squared differences) method3,4 is
a measure of similarity between a right image region and
a left image region and is used as a cost function in the
optimization problem. There are some problems regarding the SSD modeling. To resolve the problems we propose a stereo matching method based on the WMDL
(weighted minimum description length), in which parameters are optimized by the WMDL criterion.
Originally, the MDL criterion5–8 was derived for model
parameter estimation of coding models. The criterion was
applied to model parameter estimation in various statistical problems, such as prediction and estimation. We have
generalized the MDL criterion for the observation system
and have derived the WMDL criterion. Stereo matching
can be formulated as a prediction problem in which the
right image is predicted by using the left image (and vice
versa), and the parameters of the prediction problem are
evaluated by the WMDL criterion.
The proposed algorithm was tested on both synthetic
and real images. Experimental results show the efficiency
of the proposed method.
SSD Method
The SSD is a measure of similarity between a right image region and a left image region and is used as a cost
function of the optimal problem. The SSD value is calculated as a summation of squared differences between two
pixel intensity values of corresponding positions in the left
image window and the right image window, as illustrated
in Fig. 1. The window size must be large enough to include enough intensity variation for reliable matching but
small enough to avoid projection distortion.
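For concreteness, a minimal sketch of the SSD cost for one candidate disparity (our illustration; the function name, the disparity sign convention, and the use of square windows are assumptions, not details from the paper):

import numpy as np

def ssd(left, right, x, y, d, half):
    # Summation of squared differences between the (2*half + 1)^2 window
    # centred at (x, y) in the left image and the window displaced by the
    # candidate disparity d in the right image (grayscale arrays).
    win_l = left[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    win_r = right[y - half:y + half + 1, x - d - half:x - d + half + 1].astype(float)
    diff = win_l - win_r
    return float(np.sum(diff * diff))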
The first problem of the SSD modeling is that the SSD
values of variable windows cannot be compared with each
other because of model differences. The optimal window
size depends on local variations of image characteristics,
and an appropriate criterion for the comparison of different models is necessary.
The second problem of the SSD modeling is the projection distortion. The effect of the projection distortion can
be explained by a simple example. Figure 2 shows the left
and the right images of a cube. In the figure, the centers
of the windows correspond. Figure 3 shows the windows
observed from the vertical direction against the surface
Original manuscript received December 3, 1997
† e-mail: [email protected]; tel: 027-328-6172; fax: 027-328-6390
© 1998, IS&T—The Society for Imaging Science and Technology
Figure 1. Correspondence between the left image and the
right image.
Figure 2. The left image and the right image of a cube.
a–b–c–d. In Fig. 3, the centers indicated by the symbol O
correspond, but the larger the distance from the center of the windows, the greater the projection distortion, which generates a disparity distortion within the windows. The symbol × indicates the intensity comparison points in the windows that lie at different positions on the real object. The larger the distance from the
center, the more the uncertainty of the correspondence.
WMDL Method
We have applied the MDL criterion5–8 framework to the
stereo matching problem9 to resolve the first problem of
the SSD modeling. Different models can be compared using the MDL criterion framework, which is based on information theory. Originally, the MDL criterion was
derived for model parameter estimation of coding models.
It has been applied to model parameter estimation in various statistical problems, such as prediction and estimation. We have generalized the MDL criterion regarding
the observation system and have derived the WMDL criterion.10 Stereo matching can be formulated as a prediction problem in which the right image is predicted by using
the left image (and vice versa), and the parameters of the
prediction problem are evaluated by the WMDL criterion.
By the generalized observation system of pixel information, the uncertainty of the correspondence caused by the
projection distortion (the second problem of the SSD modeling) can be absorbed.
The MDL Criterion. Before describing the WMDL criterion, the general MDL criterion is described.
The MDL criterion is formulated as follows:
\mathrm{MDL} = -\sum_{i=1}^{n} \log Q_{k,\theta}(i) + \frac{k}{2}\log n ,    (1)

where
    log = natural logarithm,
    n = number of observed data,
    θ = (θ1, θ2, …, θk), a model parameter vector,
    k = number of elements,
    Qk,θ(i) = a probability distribution.
The first term of Eq. 1 is the description length of data
by a model, and the second term is the description length
of the model itself. The larger the number of observed data,
the more precise the estimated model parameters. Consequently, the description length of the second term increases
because of the decrease of the description length of the
first term. But, the smaller the number of observed data,
the less precise the estimated model parameters. Consequently, the description length of the second term decreases because of the increase of the description length
of the first term. The first and the second term are in such
Figure 3. Explanation of the projection distortion.
a trade-off relationship. The effect of the parameter k is
explained in the same way. The MDL model fitting requires
a shorter description by a model without redundancy of
the model.
The WMDL Criterion (Appendix A). The WMDL criterion is formulated as follows:
\mathrm{WMDL} = -\sum_{i=1}^{n} w_i \log Q_{k,\theta}(i) + \frac{k}{2}\log\Bigl(\sum_{i=1}^{n} w_i\Bigr),    (2)

where wi (i = 1, 2, …, n) are generalized observation coefficients. By the correspondence between the second term of Eq. 1 and the second term of Eq. 2, it can be seen that \sum_{i=1}^{n} w_i is an apparent number of observed data.
In Appendix B, it is shown that the disparity distortion
can be modeled as a noise term added to the WMDL value
of the true probability distribution. The effect of the additive noise term that takes a positive value can be absorbed
by the generalized observation system. The generalized
observation system of weighted coefficients acts on both the WMDL of the true probability distribution and the additive noise term, and requires a shorter description by a model without redundancy of the model.
Probability Modeling for the Prediction Problem.
Let IL(i),IR(i) denote intensity values in the left window
and the right window indexed by i, respectively. It is assumed that prediction errors are subject to a Gaussian
distribution of zero mean and each error value is independent. It is also assumed that disparities are uniform in
each window. The probability modeling for the prediction
problem is as follows with a squared difference included
in the equation:
ε 2 (i )
Qk ,θ (i ) =
1
2
e 2σ ,
2π σ
(3)
where θ = σ, ε(i) = IL(i) − IR(i), and σ² is the mean of ε²(i).

Calculation of the WMDL Criterion. By applying Eq. 3 to Eq. 2, the following equation is derived:

\mathrm{WMDL} = -\sum_{i=1}^{n} w_i \log Q_{k,\theta}(i) + \frac{k}{2}\log\Bigl(\sum_{i=1}^{n} w_i\Bigr)
 = \Bigl(\sum_{i=1}^{n} w_i\Bigr)\log\bigl(\sqrt{2\pi}\,\sigma\bigr) + \frac{1}{2\sigma^{2}}\sum_{i=1}^{n} w_i\,\varepsilon^{2}(i) + \frac{k}{2}\log\Bigl(\sum_{i=1}^{n} w_i\Bigr).    (4)

The second term of Eq. 4 corresponds to the SSD value. The WMDL value is normalized as follows:

\mathrm{WMDL_{nrm}} = \mathrm{WMDL}\Big/\sum_{i=1}^{n} w_i .    (5)

Figure 4. The search space.

TABLE I. The SSD and the WMDL Comparison for Synthetic Images of a Square Plate Against a Background

Method                              Disparity error (%)
SSD, 3 × 3 window                   32.3 (best)
SSD, 5 × 5                          33.6
SSD, 7 × 7                          35.5
SSD, 9 × 9                          37.5
SSD, 11 × 11                        40.0
SSD, 13 × 13                        42.6
SSD, 15 × 15                        45.3
SSD, 17 × 17                        47.5 (worst)
WMDL (ξ = 1.00, 0.95, 0.90)         29.2
Determination of the Optimal Path in the Search
Space. In this article, it is assumed that the x–y axes of
two cameras are parallel to each other and the centers of
the lenses are both on the x axis. Based on this assumption, epipolar lines that restrict the range of the search
space are parallel to the scanlines and each scanline is
treated independently. In the search space illustrated in
Fig. 4, the DP (dynamic programming) is applied to search
for the optimal path that minimizes the total WMDLnrm
value on a scanline. At each point in the search space, the
window size and the generalized observation coefficient
that minimizes WMDLnrm are selected from predetermined
values. In Eq. 5, k = 3: the parameters are σ and the two coordinates of the window position.
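To make Eqs. 4 and 5 concrete, here is a minimal sketch of the normalized WMDL cost for one pair of corresponding windows (ours, not the authors' code; the Chebyshev-distance weighting, the k = 3 default, and the small variance guard are assumptions for illustration):

import numpy as np

def wmdl_nrm(left_win, right_win, xi, k=3):
    # Normalized WMDL cost (Eqs. 4 and 5) for one pair of corresponding square
    # windows with odd side length. The weights attenuate by xi per pixel of
    # distance from the window centre (Chebyshev distance assumed here).
    err = left_win.astype(float) - right_win.astype(float)
    h = left_win.shape[0] // 2
    yy, xx = np.mgrid[-h:h + 1, -h:h + 1]
    w = xi ** np.maximum(np.abs(yy), np.abs(xx))    # generalized observation coefficients
    sw = w.sum()
    sigma2 = max(np.sum(w * err ** 2) / sw, 1e-12)  # weighted mean squared prediction error
    wmdl = (sw * np.log(np.sqrt(2.0 * np.pi * sigma2))
            + np.sum(w * err ** 2) / (2.0 * sigma2)
            + 0.5 * k * np.log(sw))                 # Eq. 4
    return wmdl / sw                                # Eq. 5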
Experiments
As described above, different models can be compared
by using the WMDL criterion. The WMDL criterion is applied to model parameter estimation and the determination of the optimal path in the search space.
The predetermined window sizes are 3 × 3, 5 × 5, 7 × 7, 9 × 9, 11 × 11, 13 × 13, 15 × 15, and 17 × 17. The generalized observation system is designed so that the larger the distance from the center of a window, the less the effect of the data, including the additive noise of disparity distortion. The generalized observation coefficients satisfy ξwi = wi–1 (i = 1, 2, …, n), wn = 1, where ξ indicates an attenuation parameter for one pixel of distance from the center of a window. The predetermined attenuation parameters are 1.0, 0.95, and 0.90. In the WMDL method, the optimal window size and the optimal attenuation parameter ξ that minimize the WMDL value are selected for each window.
The stereo matching algorithm was tested on both synthetic and real images. In the synthetic images, the true disparity was known in advance.
Experiment I. In the experiments, a synthetic left and a
synthetic right image shown in Fig. 5 are used. Each of
the image sizes is 160 × 160 (pixel). The light source of the
synthetic images is a point source. The synthetic image
consists of a square plate against a background.
The comparison between the SSD method and the
WMDL method is performed by using the disparity errors.
Table I shows the disparity errors of the SSD method and
the WMDL method. The disparity error of the WMDL
method decreased by 3.1% to 18.3% in comparison with
the SSD method.
Figures 6(a) through 6(g) are the true disparity, the disparity estimated by the SSD method with a 3 × 3 window,
the disparity error (absolute value) by the SSD method with
a 3 × 3 window, the disparity (SSD, 17 × 17), the disparity
error (SSD, 17 × 17), the disparity (WMDL), and the disparity error (WMDL), respectively. By comparing Fig. 6(c)
with Fig. 6(e), it can be observed that the errors near the
edges are decreased for the small window and are increased
for the large window. In addition, it can be observed that
the errors increase for the small window on the flat plane
of few intensity variations and decrease for the large window. With the WMDL method, the window size and the
observing system are optimized with respect to local variations of image characteristics, and the uncertainty of the
assumption of disparity uniformity in a window is absorbed.
Hence, the disparity error is decreased as a whole in the
image in comparison with the SSD method.
Experiment II. In this experiment, the synthetic left and
right images shown in Fig. 7 are used. Each of the image
sizes is 160 × 160 (pixel). The light source of the synthetic
images is a point source. The synthetic image consists of
two square plates against a background.
Table II shows the disparity errors of the SSD method
and the WMDL method. The disparity error of the WMDL
method decreased by 13.5% to about 18.9% in comparison
with the SSD method.
Figure 8(a) through 8(g) are the true disparity; the disparity estimated by the SSD with a 7 × 7 window; the
Figure 5. (a) The left image; (b) the right image.
disparity error (absolute value) by the SSD with a 7 × 7
window; the disparity (SSD, 17 × 17); the disparity error
(SSD, 17 × 17); the disparity (WMDL); and the disparity
error (WMDL), respectively. The same behavior can be
observed near the edges and the flat planes in previous
experiments.
Experiment III. In the experiment the real images shown
in Fig. 9 are used. Each of the image sizes is 832 × 624
(pixel).
The comparison between the SSD method and the
WMDL method is performed by using the WMDL value, with the SSD conditions converted to WMDL values. The results are shown in Table III. With the WMDL method, the WMDL value decreased by 8.07% to about 12.19% in comparison with the SSD method.
The experimental results (I, II, and III) show the efficiency of the proposed method.
Conclusions
A stereo matching method for 3-D scenes has been described. Stereo matching can be formulated as a prediction problem in which the right image is predicted by
using the left image (and vice versa). The problems addressed have been a definition of a search space for each
scanline of stereo matching and a formulation of the determination of the optimal path in the search space. We
have proposed the WMDL criterion based SSD method
in which parameters of the problem are optimized by
the WMDL criterion to resolve the problems of the SSD
modeling.
The stereo matching algorithm has been tested on both
synthetic and real images. In the experiment using a synthetic image, the disparity error of the WMDL method
decreased by 3.1% to about 18.3% in comparison with the
SSD method. In the experiments using another synthetic
image, the disparity error of the WMDL method decreased
13.5% to about 18.9% in comparison with the SSD method.
In the experiments using a real image, the WMDL value
TABLE II. The SSD and the WMDL Comparison for Synthetic Images of Two Square Plates Against a Background

Method                              Disparity error (%)
SSD, 3 × 3 window                   43.6
SSD, 5 × 5                          42.7
SSD, 7 × 7                          42.6 (best)
SSD, 9 × 9                          42.9
SSD, 11 × 11                        43.3
SSD, 13 × 13                        44.7
SSD, 15 × 15                        46.4
SSD, 17 × 17                        48.0 (worst)
WMDL (ξ = 1.00, 0.95, 0.90)         29.1
TABLE III. The SSD and the WMDL Comparison of Real Images

Comparison methods (➀ WMDL vs. ➁ SSD)   Decreased (➀ − ➁) WMDL value (%)
WMDL vs. SSD 3 × 3                      8.07
WMDL vs. SSD 5 × 5                      8.07
WMDL vs. SSD 7 × 7                      8.84
WMDL vs. SSD 9 × 9                      9.65
WMDL vs. SSD 11 × 11                    10.38
WMDL vs. SSD 13 × 13                    11.05
WMDL vs. SSD 15 × 15                    11.65
WMDL vs. SSD 17 × 17                    12.19
decreased 8.07% to about 12.19% in comparison with the
SSD method. Experimental results have shown the efficiency of the proposed method.
In this article, only gray scale images are used in the
experiments. In the future we will apply the proposed
method to color images.
Figure 6. (a) The true disparity; (b) the
SSD (3 × 3) disparity; (c) the SSD (3 ×
3) disparity error; (d) the SSD (17 × 17)
disparity; (e) the SSD (17 × 17) disparity error; (f) the WMDL disparity; (g) the
WMDL disparity error.
Figure 7. (a) The left image; (b) the right image.
Appendix A: Derivation of the WMDL Criterion10

The WMDL criterion is derived from a weighted log likelihood \sum_{i=1}^{n} w_i \log Q_{k,\theta}(i) of a probability distribution \prod_{i=1}^{n} Q_{k,\theta}(i), instead of the log likelihood \sum_{i=1}^{n} \log Q_{k,\theta}(i) that is the basis of the minimum description length (MDL) criterion, where log indicates the natural logarithm and wi (i = 1, 2, …, n) indicates weighted coefficients that satisfy wi–1 < wi, wn = 1.

The best parameter θ̂ is the maximum likelihood estimate with respect to the weighted log likelihood, as follows:

\hat\theta = \arg\min_{\theta}\Bigl(-\sum_{i=1}^{n} w_i \log Q_{k,\theta}(i)\Bigr).    (A-1)

The description of the probability of a model is equivalent to the description of θ̂ in a finite precision. The description of θ̂ in a finite precision can be formulated as follows:

[\hat\theta] = \hat\theta + e,    (A-2)

where θ̂ is the maximum likelihood estimate of θ with respect to the weighted log likelihood, e = (e1, e2, …, ek)ᵀ is an error vector of finite precision, and [θ̂] is a finite-precision description of θ̂.

The probability distribution of a model is Qk,[θ̂], and it is necessary to discriminate reciprocal numbers of ej to describe the j'th element of [θ̂]. The log description length of the discrimination is log(1/ej), and the total description length is as follows:

l_w = -\sum_{i=1}^{n} w_i \log Q_{k,[\hat\theta]}(i) + \sum_{j=1}^{k} \log(1/e_j).    (A-3)

By applying Eq. A-2 to Eq. A-3 and by the Taylor expansion of Eq. A-3 around θ = θ̂, the following equation is derived:

l_w = -\sum_{i=1}^{n} w_i \log Q_{k,\hat\theta}(i) + \frac{1}{2}\,e^{T}\Bigl(\sum_{i=1}^{n} w_i\Bigr) M\, e + \sum_{j=1}^{k} \log(1/e_j) + R_n,    (A-4)

where Rn is the rest term and M is a k × k matrix of elements Mk1,k2 (k1, k2 = 1, 2, …, k), with

M_{k1,k2} = \frac{\partial^{2}}{\partial\theta_{k1}\,\partial\theta_{k2}}\Bigl[\Bigl(-\sum_{i=1}^{n} w_i \log Q_{k,\theta}(i)\Bigr)\Big/\sum_{i=1}^{n} w_i\Bigr]_{\theta=\hat\theta}.

In Eq. A-4, the first and second differential terms are considered. The first differential term equals zero, because -\sum_{i=1}^{n} w_i \log Q_{k,\theta}(i) takes a pole value (the minimum value) at θ = θ̂ by definition. The \bigl(\sum_{i=1}^{n} w_i \log Q_{k,\theta}(i)\bigr)\big/\sum_{i=1}^{n} w_i value in Mk1,k2 is a weighted averaged log likelihood that corresponds to the averaged log likelihood \bigl(\sum_{i=1}^{n} \log Q_{k,\theta}(i)\bigr)\big/ n in the MDL derivation procedure.

The ej value that minimizes the lw value can be derived by differentiating the summation of the second and the third terms in Eq. A-4 with respect to ej, as follows:

e_j = d_j\Big/\sqrt{\sum_{i=1}^{n} w_i},    (A-5)

where dj is a constant value depending on M.
Figure 8. (a) The true disparity; (b) the
SSD (3 × 3) disparity; (c) the SSD (3 × 3)
disparity error; (d) the SSD (17 × 17) disparity; (e) the SSD (17 × 17) disparity
error; (f) the WMDL disparity; (g) the
WMDL disparity error.
Figure 9. (a) The left image; (b) the right image.

By applying Eq. A-5 to Eq. A-4, ignoring the second term, and ignoring the constant term separated from the third term and Rn, the following equation is derived:

l_w = -\sum_{i=1}^{n} w_i \log Q_{k,\hat\theta}(i) + \frac{k}{2}\log\Bigl(\sum_{i=1}^{n} w_i\Bigr).    (A-6)

By minimizing Eq. A-6 with respect to k and indicating the optimal k as k̂, the WMDL is derived as follows:

\mathrm{WMDL} = -\sum_{i=1}^{n} w_i \log Q_{\hat k,\hat\theta}(i) + \frac{\hat k}{2}\log\Bigl(\sum_{i=1}^{n} w_i\Bigr).    (A-7)

Appendix B

Theorem 1. Under the assumption that wi–1 < wi (i = 1, 2, …, n), wn = 1, \sum_{i=1}^{n} w_i takes a finite value even if n → ∞.

Proof. There exists a value γ (0 < γ < 1) that satisfies the relation wi ≤ γ^(n–i) (i = 1, 2, …, n). Hence

\sum_{i=1}^{n} w_i \le \sum_{i=1}^{n} \gamma^{\,n-i} < \frac{1}{1-\gamma}.    (A-8)

The right side of Eq. A-8 takes a finite value, and so the left side takes a finite value even if n → ∞.

Definition. Let Q* denote the true probability distribution. A weighted divergence D between Q* and Qk,θ is defined as follows:

D(Q^{*}\,\|\,Q_{k,\theta}) = \Bigl(1\Big/\sum_{i=1}^{n} w_i\Bigr)\sum_{i=1}^{n} w_i \log\bigl(Q^{*}(i)/Q_{k,\theta}(i)\bigr).    (A-9)

Theorem 2. The weighted divergence D(Q* ∥ Qk,θ) takes a positive finite value.

Proof. Under the assumption that a weighted average is equivalent to the expectation value E{·}, the following relation is derived:

\Bigl(1\Big/\sum_{i=1}^{n} w_i\Bigr)\sum_{i=1}^{n} w_i \log\bigl(Q^{*}(i)/Q_{k,\theta}(i)\bigr)
 = E\{-\log(Q_{k,\theta}(i)/Q^{*}(i))\}
 \ge -\log E\{Q_{k,\theta}(i)/Q^{*}(i)\}
 = -\log \sum_i Q^{*}(i)\bigl(Q_{k,\theta}(i)/Q^{*}(i)\bigr)
 = -\log 1 = 0.    (A-10)

So the weighted divergence D(Q* ∥ Qk,θ) takes a positive value. The ratio Q*(i)/Qk,θ(i) (i = 1, 2, …, n) takes a finite value; let U denote the maximum value of log(Q*(i)/Qk,θ(i)). Then the following relation is derived:

0 < \sum_{i=1}^{n} w_i \log\bigl(Q^{*}(i)/Q_{k,\theta}(i)\bigr) < U \sum_{i=1}^{n} w_i.    (A-11)

By using Theorem 1, the right side of Eq. A-11 takes a finite value, and the weighted divergence D(Q* ∥ Qk,θ) takes a positive finite value.

Theorem 3. Assume that the disparity distortion can be included in θ. The disparity distortion can be modeled as a noise term added to the WMDL value of the true probability distribution.

Proof. Let θ* be the true model parameter. The first-order Taylor expansion of Eq. A-7 is as follows:

\mathrm{WMDL} = -\sum_{i=1}^{n} w_i \log Q_{\hat k,\hat\theta}(i) + \frac{\hat k}{2}\log\Bigl(\sum_{i=1}^{n} w_i\Bigr)
 = -\sum_{i=1}^{n} w_i \log Q_{\hat k,\theta^{*}}(i) + \frac{\hat k}{2}\log\Bigl(\sum_{i=1}^{n} w_i\Bigr) + O\bigl(D(Q^{*}\,\|\,Q_{k,\theta})\bigr) + R_n.    (A-12)

The third term can be seen to be noise additive to the WMDL value of the true probability distribution.
References
1. M. Tadenuma and I. Yuyama, Optimization of matching-point-detection
in stereoscopic images, in Proc. of the Institute of Television Engineers
of Japan Annual Convention, Japan, 1993.
2. J. L. Barron, D. J. Fleet and T. A. Burkitt, Performance of optical flow
techniques, in IEEE Proc. CVPR, IEEE, Piscataway, NJ, 1992, p. 236.
3. T. Kanade and M. Okutomi, A stereo matching algorithm with an adaptive window: theory and experiment, Technical Report CMU-CS-90,
School of Computer Science, C. M. U., Pittsburgh, PA, 15213 (1990).
4. T. Azuma, K. Uomori and A. Morimura, Motion estimation from different
size block correlation, in Proc. of the Institute of Television Engineers of
Japan 3-D Image Conference, Japan, 1994, p. 33.
5. J. Rissanen, Stochastic complexity and modeling, Ann. Statist. 14, 1080
(1986).
6. J. Rissanen, Modeling by shortest data description, Automatica, 14,
465 (1978).
7. J. Rissanen, A universal prior for integers and estimation by minimum
description length, Ann. Statist. 11, 416 (1983).
8. J. Rissanen, Universal coding, information, prediction and estimation,
in IEEE Trans. Inf. Theory, Vol. IT-30, 629 (1984).
9. N. Matsushiro and K. Kurabayashi, Stereo matching based on MDL-based SSD method, in Proc. of the Society for Imaging Science and Technology 50th Annual Conference, Boston US, 645 (1997).
10. N. Matsushiro, Considerations on the weighted minimum description
length criterion, J. Inst. Tele. Eng. Japan, 50 (4), 483 (1996).
JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998
Diffuse Illumination as a Default Assumption for Shape-From-Shading in
the Absence of Shadows
Christopher W. Tyler
Smith-Kettlewell Eye Research Institute, San Francisco, California 94115 USA
Sinusoidal luminance patterns appear dramatically saturated toward the brighter regions. The saturation is not perceptually logarithmic but exhibits a hyperbolic (Naka–Rushton) compression behavior at normal indoor luminance levels. The object interpretation
of the spoke patterns is not consistent with the default assumption of any unidirectional light source but implies a diffuse illumination
source (as if the object were looming out of a fog). The depth interpretation is, however, consistent with the hypothesis that the
compressed brightness profile provided the neural signal for perceived shape, as an approximation to computing the diffuse Lambertian
illumination function for this surface. The surface material of the images is perceived as non-Lambertian to varying degrees, ranging
from a chalky matte to a lustrous metallic.
Journal of Imaging Science and Technology 42: 319–325 (1998)
Introduction
It is common to assume that the perception of the shape
of an object from its shading image follows a few simple
principles based on default assumptions about the light
source and surface properties. For example, much of the
computer vision literature makes the assumption of a spatially limited (or approximately point) source of light and
surfaces of Lambertian (or uniform matte) reflectance
properties. Such assumptions are commonly supposed to
provide reasonable approximations to the typical interpretations of the human perceptual system (at least in
the absence of explicit highlight features). In fact, however, the present analysis will show that there is wide
variation in the interpreted surface quality depending on
minor variations in the luminance profile of the shading
image. Human observers do not seem to make a default
assumption about reflectance properties, but to impute
them for the particular shading image. Moreover, their
interpretation of simple shading images is not consistent
with the point-source assumption. The theoretical expectations for a variety of illuminant assumptions are examined in an attempt to determine what default assumption
is made by human observers.
Prior work in diffuse illumination includes extensive
analysis of the properties of diffuse illumination by Langer
and Zucker1 (considered in detail below) and a study of
fogging of non-Lambertian objects by Barun.2 Although
these studies consider the inverse problem of estimating
the shape of the surface from the resultant luminance in-
Original manuscript received November 10, 1997
† Email: [email protected]; Web: www.ski.org/cwt; Fax:415-561-1610
© 1998, IS&T—The Society for Imaging Science and Technology
formation, they do not address the specific ambiguities
and distortions that are the topic of the present work. It is
well known that the surface depth corresponding to a particular (submaximal) luminance value is indeterminate
for point-source illumination, because its luminance is
controlled by its angle to the surface normal. For the one-dimensional surface, this ambiguity reduces to two possible values. It is less clear from the cited studies that the
same ambiguity pertains to the diffuse illumination case,
where surfaces become darker as they lie deeper in “holes.”
This kind of issue and the processes that the human brain
may use to decode the surface shape are the topic of the
present analysis.
The focus will be on shading images based on sinusoidal and related luminance functions. As an initial demonstration of the shapes perceived from sinusoidal
shading images, Fig. 1 depicts three spoke patterns in
which there is repetitive modulation as a function of radial angle. The first pattern has a linear sinusoidal profile, the second is predistorted so as to have an
approximately sinusoidal appearance to most observers,
and the third is further distorted so as to appear as an
accelerating function with wider dark bars than light
bars. Note that, in this radial format, there is a strong
tendency to perceive these luminance profiles as deriving from three-dimensional surfaces.
What are the properties of the perceived surfaces? Although the generator function is one-dimensional, we are
able to estimate simultaneously the surface shape, the reflectance properties, and something about the illuminant
distribution. We thus parse the one-dimensional luminance
function at a particular radius in the image into three distinct functions. Such parsing can occur only if the visual
system makes default assumptions about two of the functions. The question to be addressed is what default assumptions are made?
Because the patterns are radially symmetric, the
illuminant distribution must itself be symmetric (or the
Figure 1. Depictions of sinusoidal spoke patterns with various levels of brightness distortion; (a) linear sinusoid; (b) perceptually
sinusoidal compensation, accelerating hyperbolic distortion to provide sinusoidal appearance; (c) overcompensation for perceptual
inversion, extreme hyperbolic distortion to appear as an accelerating distortion. Best approximation to intended appearance will be
obtained if viewed from a distance so that pixellation is not visible.
[Figure 2 diagram — Sequence of Operations in Shape Perception: SURFACE SHAPE ➤ INCIDENT ILLUMINATION OF SURFACE ➤ REFLECTANCE FUNCTION ➤ BRIGHTNESS COMPRESSION ➤ SHAPE RECONSTRUCTION]
Figure 2. The sequence of operations involved in the perception of the shape of a viewed object from the luminance shading information. Does the visual system reconstruct the full sequence or use the simplifying assumption that the output approximates the input?
shading on different spokes would vary with the orientation of the spokes relative to the direction of the
illuminant). Thus the only possible variation of illuminant
properties is the degree of diffusion of the illuminant from
a point source (positioned above the center of the surface).
To most observers, the surface appears to be of matte (or
Lambertian) material in Fig. 1(a) and to become progressively more lustrous in Figs. 1(b) and 1(c). Somehow, the
human visual system partitions the single function in each
image into separate shape, reflectance and illumination
functions. This study is an initial attempt to explore the
rules by which such partitioning takes place.
Compressive Brightness Distortion. Before proceeding with the analysis of surface properties, first consider
the simple compressive distortion of the brightness image. If the surface properties are ignored for the moment,
the direct brightness profile of Fig. 1 does not appear to
be sinusoidal: the dark bars look much narrower than the
bright bars (based on the perceived transition through mid-gray). This narrowing effect is far more pronounced in
high-contrast images on a linearized CRT screen than in
this printed example, which has a contrast of about 95%.
For several reasons, it is probable that the perceived distortion arises at the first layer of visual processing, the
output of the retinal cone receptors (Macleod et al.,3 Hamer
and Tyler4). However, the focus here is on the distortion’s
perceptual characteristics, not its neural origin.
It is reasonable to be skeptical of the linearity of the
reproduction of Fig. 1. A simple test of the accuracy of its
linearity is to view the figure in (very) low illumination,
after dark-adapting the eyes for a few minutes. In such
conditions, the visual system defaults to an approximately
linear range, and it can be seen that Fig. 1(a) now appears to have roughly equal widths of the bright and dark
bars.
In terms of shape-from-shading issues, the question
arises whether the depth interpretation mechanism of the
visual system “knows” that it is being fed a distorted input. The most adaptive strategy, for either genetic specification or developmental interaction with the environment,
would be for the brightness distortion to be compensated
in the depth interpretation process so that the perceived
brightness distortion does not distort the depth interpretation (Fig. 2).
However, the observed depth interpretation from these
patterns seems to follow closely the waveform of the perceived brightness profile; when the brightness is perceived
as sinusoidal [Fig. 1(b)], the surface is perceived as a
roughly sinusoidal “rosette.” When the brightness pattern
is perceived as having narrow dark bars [Fig. 1(a)], the
surface is perceived more like a ring of cones with narrow
valleys between them. Fig. 1(c) continues this trend, although a second principle of change in surface properties
now appears. The question to be addressed is: what principles is the visual system using in deriving its surface
interpretation from the luminance profile?
The direct relationship between perceived brightness
and surface depth that is the typical perception of the patterns of Fig. 1 is surprising in relation to the luminance
profiles that should be expected from geometric reflectance
considerations. For example, in Fig. 1(b) the surface appears approximately sinusoidal and peaks in phase with
the peaks of the luminance image. As the following illumination analysis will show, this interpretation is completely incompatible with point-source illumination in any
position. This incompatibility is surprising in view of the
widespread use of the point-source assumption in the field
of computer vision. Development of a diffuse illumination
analysis then provides an explanatory basis for the observed perceptual interpretations. An additional benefit
of the diffuse illumination analysis is that it shows how
the direct relationship between perceived brightness and
surface depth perception is compatible with the operation
of a compensation for early brightness compression in the
perceived brightness function.
Properties of Diffuse Illumination. Although much of
the computer vision literature has concentrated on illumination by point sources, Langer and Zucker1a,b have laid
the groundwork for the analysis of the luminance properties arising from diffuse illumination. The basic assumption is that the illuminance of any point on a surface is
the integral of the incident light at that point. This
amounts to the cross-section of the generalized cone of rays
reaching that point through the aperture formed by the
rest of the surface. Its properties are described by the sum
of a direct illumination term, a (first-order) self-illumination term of reflections from other surface points and a
residual ε encompassing higher order self-illumination
terms.
R(x) = (ρ/π) ∫ν(x) Rsrc (N(x) • u) dΩ + (ρ/π) ∫η(x)∖ν(x) R(Π(x, u)) (N(x) • u) dΩ + ε,    (1)
where x is a surface point, N(x) is the surface normal, η(x) = {u: N(x) • u > 0} is the hemisphere of outgoing unit vectors, ν(x) is the set of directions in which the diffuse source is visible from x, dΩ is an infinitesimal solid angle, and Π(x, u) is the self-projection to the surface from point x in direction u.
The properties of self-illumination by reflection to a point
from nearby surfaces have been treated for diffuse illumination by Stewart and Langer.5 Although some special
cases deviate in detail, they show that for complex surfaces the self-illumination component tends to operate as
a multiplicative copy of the direct term, so that the whole
equation for R(x) may be approximated by the first term
multiplied by a constant close to 1.
R(x) ≈ (1 + k) (ρ/π) ∫ν(x) Rsrc (N(x) • u) dΩ.    (2)
Figure 3. Lambertian reflectance profiles for a sinusoidal surface (a) under three illumination conditions; (b) point-source illumination from infinity at a grazing angle to the left-hand slopes;
(c) point-source illumination from infinity directly above the surface; and (d) diffuse illumination from all directions.
Intuitively, this simplification occurs because the maximum self-illumination generally arises from surfaces of
similar luminance to the point under consideration. This
result is particularly clear for surfaces that are symmetric with respect to the average surface normal (such as a
V-shaped valley), where the closest points across the valley are those at the same height as a chosen point. Stewart
and Langer show that even extreme departures from this
symmetry (such as an overhanging cliff) introduce only
relatively mild distortions into the net diffuse illumination function.
Illumination Analysis. The general principles of luminance profiles based on Lambertian objects are well
known, but it is instructive to consider the variety of luminance patterns that may arise from a simple object
such as a sinusoidal surface under different illumination
conditions, for comparison with human perceptual performance in the reconstruction of shape from shading
when the light source is unknown. For point sources at
infinity, the angle of incidence is a critical variable. For
the alternative assumption of diffuse illumination, the
principal factor is the acceptance angle outside which the
diffuse illumination is blocked from reaching a particular point on the surface. The assumptions for the following analysis are:
1. The surface has constant albedo (inherent reflectance).
2. The surface has Lambertian reflectance properties.
3. Secondary reflections from one part of the surface are
negligible for point-source illumination and as described
in Eq. 2 for diffuse illumination.
The Lambertian reflectance assumption is that the illumination received by the surface is proportional to the sine of the angle of incidence measured at the surface (i.e., the cosine of the angle of incidence relative to the surface normal) and that the reflectance is uniform at all viewing angles. Hence, the reflected light is assumed to follow the cosine rule of proportionality to the cosine of the angle of incidence relative to the surface normal.
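For readers who wish to reproduce the point-source profiles of the kind shown in Figs. 3(b) and 3(c), the following Python sketch evaluates the Lambertian cosine rule for a one-dimensional sinusoidal profile under a distant source. It is a minimal illustration under the assumptions listed above (constant albedo, no secondary reflections, cast shadows ignored); the function names, sampling, and source directions are ours rather than anything specified in the article.

import numpy as np

def lambertian_point_source(x, z, light_dir):
    # Luminance of a 1-D surface z(x) under a distant point source:
    # proportional to the cosine between the surface normal and the unit
    # vector toward the source, clipped at zero for self-shadowed facets.
    dzdx = np.gradient(z, x)
    normals = np.stack([-dzdx, np.ones_like(dzdx)], axis=1)
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    return np.clip(normals @ np.asarray(light_dir, dtype=float), 0.0, None)

x = np.linspace(0.0, 4.0 * np.pi, 1000)
z = np.sin(x)                                    # unit-amplitude sinusoidal surface
overhead = lambertian_point_source(x, z, (0.0, 1.0))                      # cf. Fig. 3(c)
oblique = lambertian_point_source(x, z, (-np.sqrt(0.5), np.sqrt(0.5)))    # 45-deg source grazing one set of flanks, cf. Fig. 3(b)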
Figure 3 shows (top) the profile of a sinusoidal surface,
below which are three luminance profiles for selected illumination conditions designed to illustrate the variety of
outputs. Because the surface is assumed Lambertian, the
reflected luminance is proportional to the incident illumination and hence proportional to the cosine of the angle of
the surface to the viewer.
The first luminance profile is derived from a point source
at infinity whose angle grazes (is tangential to) the left-hand descending slopes of the sinusoidal surface. Hence,
the reflected luminance is lowest at the position of the
grazing slope and highest along the opposite slope, as
shown by Fig. 3(b). Note that, in this position, the luminance profile has the same number of cycles as the original surface (though distorted rather than being a strict
derivative).
The second luminance profile is derived from a point
source at infinity directly above (normal to) the surface
[Fig. 3(c)]. Because the peaks and troughs of the surface
waveform are at the same angle, they have the same
Lambertian reflectance and hence produce a frequency-doubled luminance profile. For 100% luminance modulation, this profile is close to sinusoidal as described in the
following section. Here the point is that a quantitative shift
in the angle of incidence of the point source produces a
qualitative change in the resulting luminance profile of
the same object.
The third luminance profile [Fig. 3(d)] is derived from
the assumption of a diffuse illumination source rather than
point source. The resulting luminance profile is again very
different from the other two based on point sources. These
examples are chosen to illustrate the complexity of the
interpretation of shape from shading, because a given
shape can give rise to qualitatively different shading profiles depending on the assumed source of illumination.
When confronted with a luminance profile that is actually
sinusoidal, does the human observer assume that it is a
frequency-doubled reflection of an underlying surface of half
that frequency, the diffusely illuminated profile of a
nonsinusoidal surface, or a non-Lambertian surface, etc.?
Geometric Derivation. Developing the theoretical reflectance functions of Fig. 3 required two stages: computation
of the angle-of-incidence functions for the selected illumination conditions according to Eq. 2 and conversion to reflectance functions through the Lambertian reflectance
assumption. The sinusoidal surface profile is shown again
for reference in Fig. 4, below which are plots of the angle
of incidence for three different illumination conditions.
The first angle-of-incidence function [Fig. 4(b)] is derived
from a point source at infinity whose angle grazes (is tangential to) the left-hand descending slopes of the sinusoidal surface. Hence, the angle of incidence is zero at the
position of the grazing slope and highest along the opposite slope, as shown by Fig. 4(b). This curve will itself be
sinusoidal if (and only if) the amplitude of the surface sinusoid (top curve) is such that opposite flanks are at a 90º
angle to each other. Note that, in this position, the angle-of-incidence function has the same number of cycles as
the original surface function (though shifted in phase in
the direction of the angle of the incident light). The second angle-of-incidence function [Fig. 4(c)] is derived from
a point source directly above (normal to the mean orientation of) the surface. Because the peaks and troughs of the
surface waveform are at the same angle, they produce a
frequency-doubled luminance profile that is asymmetric
with respect to its peaks and troughs. The third angle-of-incidence function [Fig. 4(d)] is derived from the assumption of a diffuse illumination source rather than a point
Figure 4. Net angle-of-incidence profiles for a sinusoidal surface (a) under three illumination conditions; (b) point-source illumination at a grazing angle to the left-hand slopes; (c)
point-source illumination directly above the surface; and (d) diffuse illumination from all directions.
source. The light is assumed to be coming equally from all
directions but to be occluded if any part of the surface lies
in its path according to Eq. 2. The resulting luminance
profile is again very different from the other two based on
point sources.
The derivation of the diffuse illumination profile of Fig. 4(d) is illustrated in Fig. 5. For a particular point p on the surface being viewed (upper trace), the acceptance angle is bounded by the line passing through p that is tangent to the surface on the left [Fig. 5(b)] and the line through p that is tangent to the surface on the right [Fig. 5(c)]. The sum of
the two angles φL and φR defines the acceptance angle for
each point on the surface. Within this acceptance angle,
the light from all directions has to be integrated according to the Lambertian cosine rule for each direction of the
diffuse illumination relative to the orientation of the surface, as specified in Eq. 2.
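The acceptance-angle construction just described lends itself to direct numerical evaluation. The sketch below is a rough illustration with our own function names, omitting the (1 + k) self-illumination gain and the ρ/π constant of Eq. 2; it finds the left and right grazing tangents for each point of a one-dimensional profile and integrates the Lambertian cosine over the open cone.

import numpy as np

def diffuse_profile_1d(x, z, n_dirs=721):
    # For each surface point, bound the open sky by the left and right grazing
    # tangents (cf. Fig. 5) and integrate the Lambertian cosine over that cone.
    dzdx = np.gradient(z, x)
    elev = np.linspace(0.0, np.pi, n_dirs)       # sky directions; 0 = +x horizon
    acceptance = np.zeros_like(z)
    luminance = np.zeros_like(z)
    for k, (xk, zk) in enumerate(zip(x, z)):
        ray = np.arctan2(z - zk, x - xk)         # elevation of rays to other surface points
        right = ray[x > xk]
        left = ray[(x < xk) & (ray > 0.0)]
        lo = max(0.0, right.max()) if right.size else 0.0
        hi = left.min() if left.size else np.pi
        acceptance[k] = max(hi - lo, 0.0)        # net acceptance angle, cf. Fig. 4(d)
        normal = np.array([-dzdx[k], 1.0]) / np.hypot(dzdx[k], 1.0)
        dirs = np.stack([np.cos(elev), np.sin(elev)], axis=1)
        open_sky = (elev >= lo) & (elev <= hi)
        cosines = np.clip(dirs @ normal, 0.0, None)
        luminance[k] = np.trapz(np.where(open_sky, cosines, 0.0), elev)   # cf. Fig. 3(d)
    return acceptance, luminance

x = np.linspace(0.0, 4.0 * np.pi, 1500)
acceptance, luminance = diffuse_profile_1d(x, np.sin(x))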
The net result of the diffuse illumination analysis is
shown for the sinusoidal surface by the lowest curve of
Figs. 3 and 4. Note that this curve peaks at a value of π at
each peak of the waveform but drops to some lower (nonzero) value depending on the absolute depth of the sinusoidal modulation of the surface. Interestingly, the
acceptance angle is not a well-known function such as a
catenary but has marked shoulders between relatively straight regions. Note that the flatness of the lower portion implies that the trough of the sinusoid approximates
the shape of a circle, which has a constant acceptance angle
relative to a gap in its surface (as was demonstrated by
Euclid).
Discussion
The conclusion from the analysis of the three paradigm
cases in Fig. 3 is that, contrary to the appearance of the images in Fig. 1, there is no point-source illumination of a sinusoidal Lambertian surface that
would give rise to a periodic luminance profile matching
the frequency and phase of the surface waveform (as is
perceived by the human observer). The only luminance
function that has the observed frequency and phase relative to the peaks of the surface is the diffuse one, and even
it is much more cuspy than a sinusoid. It therefore seems clear that the human observer is defaulting to a diffuse illumination assumption, in contrast to the point source typically assumed for computer graphic displays.
Perception of Sinusoidal Patterns. With the analysis
in hand, we may now analyze the perception of the patterns of Fig. 1. The most important result is that these
patterns do give pronounced depth perceptions, even
though they are qualitatively incompatible with any position of point-source illumination. These reports correspond
most closely to the diffuse reflectance profile of Fig. 3 (bottom curve), i.e., a surface with peaks at the
positions of the luminance peaks. However, the case where
the brightness profile [Fig. 1(b)] looks most sinusoidal corresponds to the case where the perceived surface has the
most sinusoidal shape. This seems odd because a sinusoidal surface is predicted to have a much more peaked luminance distribution according to the diffuse illumination
assumption [Figs. 3(d) and 4(d)].
Note that typical deviations from the Lambertian and
the diffuse assumptions will both enhance the discrepancy.
If the surface had a reflectance function that is more focused than the Lambertian, it would tend to increase the
luminance in the direction of the observer and hence make
the peaks of the assumed surface brighter relative to the
rest. Similarly, if the illumination source were more focused than a pure diffuse source, it would introduce a second-harmonic component into the reflectance function
similar to Fig. 3(c), which would again enhance the peaks
and also introduce a bright band in the center of the dark
strips. Hence, the diffuse illumination function at the bottom of Fig. 3 is the least peaked function to be expected
from any single illumination source.
Role of Perceptual Response Compression. Human vision is, of course, not linear as a function of image luminance L but shows a saturating compression of the internal response R that seems to be most closely approximated by a hyperbolic function (like the Naka–Rushton equations for receptor response saturation), as described in Chan et al.6,7 and Tyler and Liu.8 The optimal equation was of the form
R = aL/(L + σ).    (3)
Figure 6 illustrates how such a brightness compression
behavior can result in an output that approximates the
original surface shape. For a sinusoidal surface [Fig. 6(a)]
the diffuse reflectance function under Lambertian assumptions is the peaky function of Fig. 6(b). The effect of a hyperbolic compression on this waveform is shown in Fig.
6(c) to result in an approximately sinusoidal output waveform. For comparison, the effect of the same hyperbolic
compression on a straightforward sinusoidal waveform is
shown in Fig. 6(d), appearing strongly asymmetric in terms
of the peak versus trough shapes. It is thus plausible that
the shape-processing system could use the compressed
brightness signal as a simple means of deriving the original surface shape from the diffuse reflectance profile.
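A compact way to see this surrogate at work is to apply the compression numerically. The sketch below assumes the hyperbolic form R = aL/(L + σ) of Eq. 3 with illustrative (not fitted) constants and reuses the luminance profile computed in the earlier diffuse-illumination sketch.

import numpy as np

def hyperbolic_compression(L, sigma=0.3, a=1.0):
    # Naka-Rushton-style saturation of Eq. 3; sigma and a are illustrative values.
    L = np.asarray(L, dtype=float)
    return a * L / (L + sigma)

# Compressing the peaky diffuse profile (normalized to unit peak) pulls it back
# toward the shape of the original sinusoid, as in Figs. 6(b) and 6(c).
compressed = hyperbolic_compression(luminance / luminance.max())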
If the visual system does indeed use its inbuilt brightness compression as a surrogate for a more elaborate reconstruction algorithm for shape from shading under
diffuse illumination assumptions, the approximation should
work for other typical surface waveforms. One example to
test this hypothesis is a cylindrical waveform corresponding to a one-dimensional version of the sphere that is used
widely in computational vision (and which corresponds to
the most-simplified form of an isolated object in the world).
A cylindrical waveform is depicted in one-dimensional
cross-section in Fig. 7(a), although the vertical axis is extended relative to a purely circular cross-section. The subsequent panels, in the same format as Fig. 6, show the
diffuse reflectance profile, the effect of brightness saturation on this profile, and a simple sinusoid with the same
degree of compression. Notice that the saturated diffuse
profile again looks similar to the surface waveform, supporting the idea that the brightness-compressed signal can
generally act as a surrogate for the back-computation of
the surface waveform. In this case, the compressed sinusoid looks somewhat similar to the surface waveform also,
which may explain why the linear sinusoid of Fig. 1(a)
resembles a ring of conical “dunce caps” (because a cone is
a version of a cylinder with a converging diameter). If the
visual system treats the brightness-compressed signal as
an approximation to the depth profile of the object under
diffuse illumination, any object that generates a similar
signal after brightness compression should appear to have
a similar shape.
Finally, some brief thoughts on the different qualities
of surface material perceived in Fig. 1. Given that the
image that appears Lambertian is the one that resembles
the ring of dunce caps with circular cross-section, it may
be that the visual system has a Bayesian constraint to
prefer a solution that corresponds to such discrete objects
rather than a continuously deformed surface. If so, shape
reconstructions that deviate from such a circular cross-section (in the absence of explicit contour cues) may tend
to be interpreted as deviations from the Lambertian assumption rather than deviations from the assumption of
circular cross-section. The present work is not intended to provide an empirical analysis of this question but merely to frame the hypothesis.
Figure 5. Derivation of diffuse illumination profile for the sinusoidal surface (a); (b) surface tangent to the left of each point
along surface; (c) surface tangent to the right of each point along
surface; and (d) net acceptance angle at each point.
Figure 6. Role of response compression in
the interpretation of depth from shading.
(a) Sinusoidal surface shape; (b) net reflectance profile assuming diffuse illumination
and Lambertian reflectance function; (c)
Perceived brightness signal after hyperbolic saturation. Note similarity to original surface waveform; (d) same degree of
hyperbolic saturation applied to a sinusoidal signal, to illustrate how much brightness distortion is perceived in Fig. 1(a)
under high illumination.
Figure 7. A second example of response compression in the interpretation of depth from
shading. (A) Cyclic surface shape. (B) Net
reflectance profile assuming diffuse illumination and Lambertian reflectance function.
(C) Perceived brightness signal after hyperbolic saturation. Note similarity to original
surface waveform. (D) Same degree of hyperbolic saturation applied to a sinusoidal
signal, to illustrate similarity of result to (A)
and (C).
Conclusion
The object interpretation of the spoke patterns of Fig. 1 is not consistent with the default assumption of any unidirectional light source but implies a diffuse illumination (as if the object were looming out of a fog). The
existence of such a default for human vision of shape from
shading has not been previously described to our knowledge. Note that similar percepts are obtained for linear
sinusoids of high contrast (such as a “stack of cigarettes”),
although the sense of shape-from-shading is weaker initially. No-one ever seems to see a linear sinusoid in a
rectangular aperture according to the predictions of Fig.
3(b) for a local illumination source, even though there is
now no orientational symmetry to force a symmetric
source illumination Thus, the default to diffuse illumination appears to be general unless specific cues imply
an oriented source (e.g., Ramachandran).9
Given default diffusion, the depth interpretation is
consistent with the hypothesis that the visual system
uses the compressed brightness profile directly as the
neural signal for perceived shape. It is shown that this
equivalence is a reasonable approximation to computing the diffuse Lambertian illumination function for this
surface. This match provides the visual system with a
rough-and-ready algorithm for shape reconstruction
without requiring elaborate back-calculation of the
brightness compression and integral angle-of-acceptance
functions through which the diffuse illumination image
was built.
Acknowledgment. Supported by NEI grant No. 7890.
References
1. (a) M. S. Langer and S. W. Zucker, Casting light on illumination: a computational model and dimensional analysis of sources, Comp. Vis. Image Understand. 65, 322–335 (1997); (b) M. S. Langer and S. W. Zucker,
Shape from shading on a cloudy day, J. Opt. Soc. Am. 11, 467–478
(1994).
2. V. V. Barun, Imaging simulation for non-Lambertian objects observed
through a light-scattering medium. J. Imaging Sci. Technol. 41, 143–
149 (1997).
3. D. I. A. Macleod, D. R. Williams and W. Makous, A visual nonlinearity fed
by single cones, Vis. Res. 32, 347–363 (1992).
4. R. D. Hamer and C. W. Tyler, Phototransduction: Modeling the primate
cone flash response, Vis. Neurosci. 12, 1063–1082 (1995).
5. A. J. Stewart and M. S. Langer, Towards accurate recovery of shape
from shading under diffuse lighting, IEEE Trans. Patt. Anal. Mach. Intell.
19, 1020–1025 (1997).
6. H. Chan and C. W. Tyler, Increment and decrement asymmetries: Implications for pattern detection and appearance. Soc. Inf. Displ. Tech.
Dig. 23, 251–254 (1991).
7. H. Chan, C. W. Tyler, P. Wenderoth, and L. Liu, Appearance of bright
and dark areas: An investigation into the nature of brightness saturation, Investigative Ophthalmology and Visual Science, Suppl. B, 1273
(1991).
8. C. W. Tyler and L. Liu, Saturation revealed by clamping the gain of the
retinal light response, Vis. Res. 36, 2553–2562 (1996).
9. V. S. Ramachandran, The perception of depth from shading, Sci. Am.
269, 76–83 (1988).
JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998
3-D Shape Recovery from Color Information for a Non-Lambertian Surface
Wen Biao Jiang, Hai Yuan Wu and Tadayoshi Shioyama†
Department of Mechanical and System Engineering, Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto 606, Japan
This article presents a method for shape recovery from a color image in the case of a non-Lambertian surface illuminated by only a
single light source. In the first step, we use the dichromatic reflection model and obtain the directions of spectral power distributions
of the light due to surface reflection and body reflection by using eigenvectors of the moment matrix of the color signals. In the second
step, the parameters determining the reflectance map are identified by using the dichromatic reflection model and information of the
maximum intensity point. Subsequently, the surface normal of the object is estimated and the 3-D shape is recovered.
Journal of Imaging Science and Technology 42: 325–330 (1998)
Introduction
The shape-from-shading approach was proposed by
Horn.1 The intensity of an object within an image depends
on the light source position, the surface normal of the object, and the viewing direction.1,2 When the object surface
is made of a material that acts as a Lambertian reflector,
the intensity varies with the cosine of the angle between
the incident ray and the surface normal. We see that it is
simple to determine the reflectance map of a Lambertian
surface, and shape recovery from image intensity (shading) is easy. For the Lambertian case, methods of classical
shape recovery from shading have been developed for the
single image by Ikeuchi and Horn,2 and for two images by
Onn and Bruckstein.3 But in non-Lambertian surface
cases, it is difficult to recover the 3-D shape of an object
only from image intensity because the reflectance map
varies depending on the type of surface material of the
object. For shape recovery in the non-Lambertian case,
photometric stereo (PMS) methods have been developed
by many researchers.4–8 The previous PMS procedures use
multiple images of an object, taken under different illumination conditions, to estimate parameters determining
the reflectance map and the surface normal. In these
works, however, color information was not used. Schlüns9
suggested using color information to obtain the direction
of spectral power distributions of the light due to surface
reflection and body reflection and applied PMS to the derived matte image. Because PMS methods are based on
multiple images of an object sequentially illuminated by
multiple light sources, it is considered impossible to apply
the methods to natural scene understanding. Furthermore,
when the observed object is moving, the PMS methods give
rise to a correspondence problem of points among multiple images because of sequential illumination.
For the purpose of natural scene understanding like that achieved by the human visual system, in our previous article10
we presented a method for shape recovery from shading
in the case of a non-Lambertian surface illuminated only
by a single light source. We estimated the parameters determining the reflectance map by using the intensity information at the occluding boundary. Because it is difficult
to observe the intensity at the occluding boundary precisely, in this article we propose another method for shape
recovery from a color image of an object with a non-Lambertian surface illuminated by a single light source.
The advantage of using color information is that we can
recover shape from shading without intensity information
at the occluding boundary. In this method, the vectors in
the (RGB) color space corresponding to the color signals
due to surface reflection and body reflection are estimated
by using eigenvectors of the moment matrix of the color
signals on the basis of the dichromatic reflection model11
for the color signals. The normal of the object surface is
estimated by an iterative method using the image-irradiance equation, and 3-D shape is recovered. Furthermore,
to evaluate the algorithm, experimental results on several real objects are shown.
Reflection Model
The method of this article deals with materials that are
optically inhomogeneous, meaning that light interacts both
with the surface matter and with particles of a colorant
that produce scattering and coloration under the surface.
Many common materials can be described this way, including plastics, most paints, varnishes, paper, etc. Metals and crystals are not included in this discussion. We
also limit the discussion to opaque surfaces that transmit
no light from one side to the other.
When light strikes a surface, some of the light is reflected at the interface producing interface reflection (or
surface reflection). The direction of such reflection is in
the “perfect specular direction” relative to the local surface normal. Most materials are optically “rough” with local surface normals that differ from the macroscopic perfect
specular direction, so that the interface reflection is somewhat scattered at the macroscopic level. The light that
325
 sr   ∫ L(λ , i, n, r) r (λ ) dλ 
÷

s ≡  s g ÷ =  ∫ L(λ , i, n, r) g (λ ) dλ ÷.
 ÷
 sb   ∫ L(λ , i, n, r)b (λ ) dλ ÷

(3)
The interval of summation is determined by the
responsivity, which is non-zero over a bounded interval of
λ. Substituting Eqs. 1 and 2 into Eq. 3, it follows that:
sr = ∫ [ c1 (λ )m1 (i, n, r) + c2 (λ ) m2 (i, n, r)]r (λ )λd ≡
≡ c1, r m1 (i, n, r) + c2, r m2 (i, n, r).
(4)
Assuming the three filters are sensitive in the spectral
channels characterized by the color names red, green, and
blue, the color values in vector notation are
 sr 
s =  sg ÷
 ÷
 sb 
Figure 1. Imaging geometry.
penetrates through the interface undergoes scattering
from the particles of a colorant and is either transmitted
through the material (if it is not opaque), absorbed, or reemitted through the same interface by which it entered,
producing “body reflection.” The above inhomogeneous materials are described well by the dichromatic reflection
model that was shown by Shafer to be a useful approximation. The dichromatic reflection model is stated as11
L(λ,i,n,r) = L1(λ,i,n,r) + L2(λ,i,n,r),
(1)
where λ is the wavelength of light, i is the unit vector
aligned with the incident light direction, n is the surface
normal, and r is the viewing direction, as illustrated in
Fig. 1. Equation 1 says that the total radiance L of reflected light is a sum of two parts: the radiance L1 of light
reflected at the interface and the radiance L2 of the light
reflected from the body. The dichromatic reflection model
makes several assumptions. The model assumes the surface is an opaque, inhomogeneous medium with one significant interface, not optically active (i.e., has no
fluorescence), and uniformly colored (the colorant is uniformly distributed).
The model assumes independence of spectral and geometrical properties, and the following simplifying separation may be used:
Lt(λ,i,n,r) = ct(λ)mt(i,n,r),
with
mt(i,n,r) ≥ 0
and
t = 1,2,
(2)
where ct(t = 1,2) is the spectral power distribution of the
reflected light, which is the product of the spectral power
distribution of the incident light and the spectral reflectance of the surface, and mt is a geometrical scaling factor. The factor m2 is modeled by the Lambertian cosine
law. Several possible models for m1 are known, for example,
the Torrance-Sparrow model.
In a color camera, the red color value sr is a summation
of the radiance L(λ,i,n,r) at each wavelength, weighted by
the responsivity of the camera combined with the red filter r (λ ), [and so on, for the green filter g(λ ), and the blue
filter b (λ ), ]. Then the color values of the radiance L(λ,i,n,r)
are given by
326
Journal of Imaging Science and Technology
 c1,r 
 c2,r 
=  c1, g ÷m1 (i, n, r) +  c2, g ÷m2 (i, n, r)

÷

÷
c ÷
c ÷
 1,b 
 2,b 
(5)
≡ c1m1 (i, n, r) + c2 m2 (i, n, r),
 c1, r 
c1 ≡  c1, g ÷,

÷
c ÷
 1, b 
 c2, r 
c2 ≡  c2, g ÷,

÷
c ÷
 2, b 
(6)
where c1,g, c2,g, c1,b and c2,b are defined in the same manner
as c1,r and c2,r. This equation defines the dichromatic plane
(DCP) in a coordinate system spanned by the primary colors called the RGB space (see Fig. 2). In this work, we
assume the surface material is uniformly colored. Then c1
and c2 become constant vectors for all image points. As
models for m1 and m2, we use the Torrance–Sparrow
model6,12 and the Lambertian law:
m1(i,n,r) = exp{−c²[cos⁻¹(ns • n)]²},    (7)
m2(i,n,r) = (i • n),    (8)
where the symbol • denotes a scalar product, ns ≡ (i + r)/‖i + r‖ is the macroscopic specular direction, ‖•‖ denotes the norm of a vector, and c is a constant that depends on the surface roughness. In this article, the value of c is set as 2.578. For a very rough surface it is shown that typical values for c are around 2.5. We adopt the value 2.578, which was experimentally determined by Tagare and deFigueiredo.7 Any vector can be expressed as the product of its norm and its unit vector. We define ĉ1 as the unit vector aligned with c1, that is, ĉ1 = c1/‖c1‖, and ĉ2 as the unit vector aligned with c2. Then the unknown c1 and c2 can be expressed by c1 = ‖c1‖ĉ1 and c2 = ‖c2‖ĉ2, respectively.
Algorithm for 3-D Shape Inference
We assume orthographic image projection for the imaging geometry shown in Fig. 1 and let the viewing
direction r be parallel to the z axis. Then, the 3-D shape of
an object can be described by its height z at coordinate
(x,y) in the image plane. We explain the procedure of 3-D
shape recovery based on the dichromatic reflection model.
We estimate ĉ 1 and ĉ 2 in the next subsection.
Figure 2. (RGB) space.
The Method for Estimating ĉ1 and ĉ2. We assume the
color signals s are normalized so that the maximum value
of their intensity E is equal to 1. When si,j denotes a column vector, which represents the color signals of the observed object, at a point with coordinate (i,j) in the image
plane, the moment matrix M is given as follows:
M = (1/N) Σ(i,j)∈Ω si,j si,j^T,    (9)

where T denotes a transposition, Ω is defined as the set of orthographic image projections of points on the observed surface of the object, and N is the total number of points in the set Ω. Because M is a real symmetric matrix, all of the eigenvalues are nonnegative and the three eigenvectors are orthogonal. Note that an eigenvector expresses a statistical characteristic of the color signals of an observed object. As illustrated in Fig. 3, the eigenvector u1 corresponding to the greatest eigenvalue expresses the direction of the centroid of all color signals si,j, (i,j) ∈ Ω, and the eigenvectors u2 and u3 corresponding to the second and the third eigenvalue represent the maximal and minimal dispersive directions of all the vectors in the set Ω, respectively. Then we can obtain the DCP normal from the minimal dispersive direction u3 and obtain the vectors ĉ1 and ĉ2 by means of the maximal dispersive direction u2. Because the number of color signals is very small near the specular point, the vector ĉ1 (near the specular point) and the vector ĉ2 can be easily distinguished.

Figure 3. The eigenvectors of the moment matrix.

When the object surface material exhibits non-Lambertian reflection, the reflectance map R(i,n,r) is given by6,12

R(i,n,r) = ρ1 exp{−c²[cos⁻¹(ns • n)]²} + ρ2(i • n)  if (i • n) > 0;  R(i,n,r) = 0  otherwise,    (10)

where ρ1 and ρ2 are parameters determining the reflectance property. Denoting the intensity by E, we obtain the image-irradiance equation

E = R(i,n,r).    (11)

From the Commission Internationale de l’Éclairage (CIE), the intensity E of an image can be obtained by the following equation:

E = 0.299sr + 0.586sg + 0.115sb = (e • s),    (12)

where e ≡ (0.299, 0.586, 0.115). From Eqs. 5, 6, and 10 through 12, we have

E = (e • s),  ρ1 = (e • c1) = ‖c1‖(e • ĉ1),  ρ2 = (e • c2) = ‖c2‖(e • ĉ2).    (13)

If ‖c1‖ and ‖c2‖ are known, the 3-D shape is recovered by the iterative method mentioned in the section Surface Normal Inference, using the image-irradiance equation Eq. 11. Next we estimate the parameters ‖c1‖ and ‖c2‖.

Estimating ‖c1‖ and ‖c2‖. In this section, we use spherical coordinates, i.e., the zenith angle θ and the azimuth angle φ, to represent a unit vector. The convention we adopt with respect to these is as follows: the zenith angle of any unit vector is measured positively down from the z axis, while the azimuth angle is measured positively counterclockwise from the x axis. The θ and φ are usually subscripted to indicate the vectors to which they belong. Thus θn and φn are the zenith and azimuth angles of the vector n, while θi and φi are the angles of i, respectively. Then we have the following relations:

n = (sinθn cosφn, sinθn sinφn, cosθn),    (14)
i = (sinθi cosφi, sinθi sinφi, cosθi),    (15)
(i • n) = cosθi cosθn + sinθi sinθn cos(φi − φn).    (16)

Because the viewing direction r is along the z axis and ns lies in the “principal plane” spanned by i and r, the zenith and azimuth angles of ns are θs = θi/2 and φs = φi. Then (ns • n) is given by

(ns • n) = cos(θi/2) cosθn + sin(θi/2) sinθn cos(φi − φn).    (17)
The mapping from unit vector to zenith and azimuth
angles is one to one; thus Eq. 11 can be written as
E = R(θn, φn) = ρ1 exp{−c²[cos⁻¹(cos(θi/2) cosθn + sin(θi/2) sinθn cos(φi − φn))]²} + ρ2[cosθi cosθn + sinθi sinθn cos(φi − φn)].    (18)
It is considered that the right-hand side of the above equation takes the maximum value Rmax at a vector in the principal plane, i.e., φn = φi. Hence, at the point (θn*, φn*) of maximum intensity, ρ1, ρ2, θn*, and φn* should satisfy
Rmax = R(θn*, φn* = φi) = 1.    (19)
Because the direction of the light is known, i.e., (θi, φi) is known with 0 < θi < π/2, substituting φn* = φi into Eq. 18, R(θn*, φn*) becomes
R(θn*, φn* = φi) = ρ1 exp{−c²(θn* − θi/2)²} + ρ2 cos(θn* − θi).    (20)

Because Eq. 13 implies that finding the unknown ρ1, ρ2, and θn* can be replaced by finding the unknown ‖c1‖, ‖c2‖, and θn*, from Eqs. 5 through 8 we have the relation for smax, defined as the s corresponding to R(θn*, φn*) = Rmax:

smax = c1 m1(i, n*, r) + c2 m2(i, n*, r), i.e.,
smax = ‖c1‖ĉ1 exp{−c²(θn* − θi/2)²} + ‖c2‖ĉ2 cos(θn* − θi).    (21)

From Eq. 19, finding the maximum value of R(θn*, φn* = φi), i.e., finding the solution θn* of ∂R(θn, φn = φi)/∂θn = 0, we have

2‖c1‖(e • ĉ1)c²(θn* − θi/2) exp{−c²(θn* − θi/2)²} + ‖c2‖(e • ĉ2) sin(θn* − θi) = 0.    (22)

Hence, ‖c1‖, ‖c2‖, and θn* should satisfy the above equation. Using Eq. 22 as a constraint equation associated with the vector Eq. 21, we can obtain the values of ‖c1‖, ‖c2‖, and θn*. An iterative algorithm can be used to solve the nonlinear simultaneous equations; we use the Marquardt method.13 The Marquardt method is a combination of the Newton method and the method of steepest descent. The above procedure for obtaining the parameters ‖c1‖, ‖c2‖, and θn* does not use the intensity at the occluding boundary, because it is not easy to observe the intensity at the occluding boundary precisely. After obtaining ‖c1‖, ‖c2‖, and θn* by the method mentioned above, we estimate the surface normal using Eq. 11 and the tangent property of the occluding boundary.

Surface Normal Inference. The unit vectors r, n, and i are described by points on the unit sphere called the Gaussian sphere (see Fig. 4).

Figure 4. Gaussian sphere.

In the stereographic projection, a point on the Gaussian sphere is projected by a ray through the point from the south pole onto the tangent plane at the north pole, which is called the stereographic plane. The coordinate (f, g) in the stereographic plane is given as

f = 2p[√(1 + p² + q²) − 1]/(p² + q²),    (23)
g = 2q[√(1 + p² + q²) − 1]/(p² + q²),    (24)

where p and q are defined as

p ≡ ∂z/∂x,  q ≡ ∂z/∂y    (25)

and are related to f and g as follows:

p = 4f/(4 − f² − g²),  q = 4g/(4 − f² − g²).    (26)

The unit vector r and the surface normal n are given by

r = (0, 0, 1),  n = (−p, −q, 1)/√(1 + p² + q²).    (27)

The vectors n and i are described in terms of f and g as

n = [−4f, −4g, 4 − f² − g²]/(4 + f² + g²),    (28)
i = [−4fi, −4gi, 4 − fi² − gi²]/(4 + fi² + gi²),    (29)
where (fi, gi) denotes the stereographic coordinate corresponding to the direction of the light.
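As a quick numerical check of Eqs. 23 through 28, the following Python sketch (the function names are ours, not the authors’) implements the gradient-to-stereographic mappings and verifies that Eq. 26 inverts Eqs. 23 and 24.

import numpy as np

def gradient_to_stereographic(p, q):
    # Eqs. 23-24: map the surface gradient (p, q) to stereographic coordinates (f, g).
    r2 = p ** 2 + q ** 2
    if r2 == 0.0:
        return 0.0, 0.0                          # the north pole maps to the origin
    s = np.sqrt(1.0 + r2)
    return 2.0 * p * (s - 1.0) / r2, 2.0 * q * (s - 1.0) / r2

def stereographic_to_gradient(f, g):
    # Eq. 26: inverse map, valid inside f^2 + g^2 < 4.
    d = 4.0 - f ** 2 - g ** 2
    return 4.0 * f / d, 4.0 * g / d

def normal_from_stereographic(f, g):
    # Eq. 28: unit surface normal n expressed in terms of (f, g).
    d = 4.0 + f ** 2 + g ** 2
    return np.array([-4.0 * f, -4.0 * g, 4.0 - f ** 2 - g ** 2]) / d

p, q = 0.7, -0.3
f, g = gradient_to_stereographic(p, q)
assert np.allclose(stereographic_to_gradient(f, g), (p, q))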
We assume that the viewing direction coincides with the
north pole of the Gaussian sphere and that only points on the northern hemisphere of the Gaussian
sphere. Therefore, the considered points (f, g) and (fi, gi) in
the stereographic plane are constrained to the following
regions: f² + g² ≤ 4, fi² + gi² ≤ 4. Then for each considered
point in the image plane, Eq. 18 is rewritten as
Eij = R(fij, gij),
(30)
where (fij, gij) denotes the stereographic coordinate corresponding to the surface normal at image plane coordinate
(i, j) and Eij the intensity at (i, j). We define the constraint
h(f, g) as
h(u) ≡ E – R(u) = 0,
(31)
u ≡ (u1, u2)T ≡ (fij, gij)T,
(32)
which is imposed on the image intensity. We again use the
following Marquardt method to estimate surface normal.
At each considered point in the image, the estimate u(ν) at the ν’th iteration is improved by ∆u in the following steps.
1. Solve the following equation with unknown vector ∆u
≡ (∆u1, ∆u2)T ≡ (∆fij, ∆gij)T,
[G(u(ν)) + γI]∆u = −∂W(u(ν))/∂u,    (33)
where I denotes a 2 × 2 unit matrix and W(u), G, and J are
defined as
W(u) ≡ (1/2)h²,  G ≡ JᵀJ,    (34)
∂W(u)/∂u = hJ,  J ≡ ∂h/∂u : Jacobian.    (35)
2. Improve the estimate u(ν) satisfying
u(ν+1) = u(ν) + ∆u.    (36)
Solving Eq. 33, ∆u is given by
∆u1(ν) ≡ ∆fij(ν) = [Eij − R(fij(ν), gij(ν))] Rf(fij(ν), gij(ν)) / [Rf(fij(ν), gij(ν))² + Rg(fij(ν), gij(ν))² + γ],    (37)
∆u2(ν) ≡ ∆gij(ν) = [Eij − R(fij(ν), gij(ν))] Rg(fij(ν), gij(ν)) / [Rf(fij(ν), gij(ν))² + Rg(fij(ν), gij(ν))² + γ],    (38)
where
Rf ≡ ∂R/∂f,  Rg ≡ ∂R/∂g.    (39)
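To make the per-point update of Eqs. 30 through 38 concrete, the following Python sketch evaluates the reflectance map of Eq. 10 through the stereographic parameterization and applies one damped correction. It is an illustrative implementation under stated assumptions: the derivatives Rf and Rg are taken here by finite differences, and the function names, step size h, and damping γ are ours rather than values from the article.

import numpy as np

C = 2.578   # surface-roughness constant c quoted in the text

def reflectance_map(f, g, fi, gi, rho1, rho2):
    # Eq. 10 written in terms of (f, g) via Eqs. 28-29, with r = (0, 0, 1).
    n = np.array([-4*f, -4*g, 4 - f**2 - g**2]) / (4 + f**2 + g**2)
    i = np.array([-4*fi, -4*gi, 4 - fi**2 - gi**2]) / (4 + fi**2 + gi**2)
    r = np.array([0.0, 0.0, 1.0])
    ns = (i + r) / np.linalg.norm(i + r)
    if np.dot(i, n) <= 0.0:
        return 0.0
    specular = rho1 * np.exp(-C**2 * np.arccos(np.clip(np.dot(ns, n), -1.0, 1.0))**2)
    return specular + rho2 * np.dot(i, n)

def marquardt_step(E, f, g, fi, gi, rho1, rho2, gamma=1e-3, h=1e-4):
    # One damped update of (f, g) per Eqs. 37-38.
    R0 = reflectance_map(f, g, fi, gi, rho1, rho2)
    Rf = (reflectance_map(f + h, g, fi, gi, rho1, rho2) - R0) / h
    Rg = (reflectance_map(f, g + h, fi, gi, rho1, rho2) - R0) / h
    denom = Rf**2 + Rg**2 + gamma
    return f + (E - R0) * Rf / denom, g + (E - R0) * Rg / denom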
In the above-mentioned algorithm, the value of u at a point on the occluding boundary is known.2 The initial value of u at a point in the region other than the occluding boundary is set as u = 0. For convenience, the region where u = 0 and u has not yet been improved is called the unknown region. The estimate u(ν) at a point in the unknown region can be obtained from the following steps:
1. When there is at least one point called the known point,
which does not belong to the unknown region, in the
eight neighboring points, we have
fij(ν) = a f̄ij(ν) + b f̂ij(ν),  gij(ν) = a ḡij(ν) + b ĝij(ν),
where
f̄ij ≡ fi+1,j + fi,j+1 + fi−1,j + fi,j−1,  ḡij ≡ gi+1,j + gi,j+1 + gi−1,j + gi,j−1,
f̂ij ≡ fi−1,j−1 + fi+1,j−1 + fi−1,j+1 + fi+1,j+1,  ĝij ≡ gi−1,j−1 + gi+1,j−1 + gi−1,j+1 + gi+1,j+1.
If there exists a known point in the four immediate
neighbors (i ± 1, j) and (i, j ± 1), (a,b) = (1/ξ,0), where ξ is
the number of known points in the four immediate neighbors, else (a,b) = (0,1/η) where η is the number of known
points in the eight neighboring points.
2. Otherwise (when no known point exists among the eight neighbors), fij(ν) = gij(ν) = 0.
Because the unknown region vanishes as the iteration proceeds, the above two steps are used only as a transient process.
After obtaining (f,g) by the above algorithm, we can get
(p,q) by Eq. 26 and obtain the height z by integrating p
and q from Eq. 25. Thus, we can reconstruct the 3-D shape.
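The final integration step can be sketched in a few lines; the simple cumulative-sum scheme below is only one possible choice (the article does not specify the integration path), with p taken along image columns and q along rows as an assumed convention.

import numpy as np

def integrate_height(p, q, dx=1.0, dy=1.0):
    # Recover z from p = dz/dx and q = dz/dy (Eq. 25) by naive path integration:
    # first down the leading column using q, then along each row using p.
    z = np.zeros_like(p, dtype=float)
    z[1:, 0] = np.cumsum(q[1:, 0]) * dy
    z[:, 1:] = z[:, :1] + np.cumsum(p[:, 1:], axis=1) * dx
    return z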
When the above method is applied to a real image, one
difficult problem is obtaining the differentiable curve of
the occluding boundary. In such a case, we use the B-spline
curve to fit the occluding boundary of the image and compute the initial value of u at a point on the occluding
boundary by the normal of the fitting curve.
Experimental Results
To evaluate the proposed algorithm, we show results of
several experiments. The materials used in experiments
are plastics, which are amenable to the analysis based on
the dichromatic model. Two balls and one bowling pin are
used as real objects. Object A is a yellow table tennis ball
made of plastic with an almost diffuse reflector surface.
Object B is a green can cap also made of polished plastic
with a hemisphere shape. Object C is a white bowling pin
made of plastic with a diffuse reflector surface. The direction of the light source is given by θi = 5° and φi = 0°. The
observed image intensities of objects A, B, and C are preprocessed by the method in the following steps and shown
in Figs. 5(a), 6(a) and 7(a), respectively. The practical experimental steps are as follows:
• Preprocess. Noise is removed by using the median
filter with 9 × 9 pixels, and contrast stretching
transformation is performed14 so that the value of image
intensity varies from 0 to 255.
• Estimating Parameters. After forming the moment
matrix of color signals, we determine DCP by
eigenvectors of the moment matrix and estimate ĉ1 and ĉ2 in the DCP (a minimal sketch of this moment-matrix step appears after this list). By Eqs. 21 and 22, the parameters ‖c1‖, ‖c2‖, and θn* are determined.
• Extracting Occluding Boundary. Because a normal
to the silhouette in the image plane is parallel to the
normal to the surface at the corresponding point on
the occluding boundary, we can regard the normal of
the silhouette in the image plane as the normal of the
surface on the occluding boundary. In this article, we
use the Laplacian–Gaussian filter method to detect
the boundary of the image and use the B-spline curve
to fit discrete edge data. Then the initial values
fij(0) and gij(0) on the occluding boundary are obtained
from the fitted curve.
• Surface Normal Inference. Using the image-irradiance equation Eq. 11 and initial values fij(0) and gij(0), we obtain the 3-D shape by the Marquardt
iterative algorithm.
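For readers wanting to reproduce the estimating-parameters step, a minimal Python sketch of the moment-matrix eigendecomposition (Eq. 9 and Fig. 3) follows. The subsequent separation of ĉ1 and ĉ2 within the dichromatic plane and the specular-point bookkeeping are not shown, and the function name and array layout are assumptions of this sketch rather than the authors’ code.

import numpy as np

def moment_matrix_eigenvectors(signals):
    # signals: (N, 3) array of RGB color signals s_{i,j} over the object region Omega.
    s = np.asarray(signals, dtype=float)
    M = s.T @ s / len(s)                   # Eq. 9
    eigvals, eigvecs = np.linalg.eigh(M)   # symmetric M: real, ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    u1, u2, u3 = (eigvecs[:, k] for k in order)
    return u1, u2, u3                      # u3 is the DCP normal; u2 spans the maximal dispersion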
Table I shows the results of estimated parameters ρ1,
ρ2, and θn* of the three real objects. Figures 5(b), 6(b), and
7(b) illustrate the 3-D shapes reconstructed by the proposed algorithm from the image intensities shown in Figs.
5(a), 6(a), and 7(a), respectively.
A limitation of the reflection model is that, in the particular case where the model is described by only the
Lambertian component and the incident light direction
coincides with the viewing direction, the model cannot
distinguish between concave and convex shapes because
of similar brightness variations.
TABLE I.
Object    ρ1       ρ2       θn*
A         0.226    0.778    3.585
B         0.371    0.602    3.236
C         0.000    1.000    4.699
Figure 5. (a) Image intensity and (b) reconstructed shape of object A.
Figure 6. (a) Image intensity and (b) reconstructed shape of object B.
Figure 7. (a) Image intensity and (b) reconstructed shape of object C.
Conclusion
We have proposed a method for shape recovery from the color signals for a non-Lambertian surface illuminated by only a single light source. In the method, parameters determining the reflectance map are identified by using color
information and subsequently the normal of the object surface is estimated and 3-D shape recovered. Our method
has been shown to produce reasonable results on several
real objects.
References
1. B. K. P. Horn, Understanding image intensities, Art. Intell. 8, 201–231
(1977).
2. K. Ikeuchi and B. K. P. Horn, Numerical shape from shading and occluding Boundaries, Art. Intell. 17, 141–184 (1981).
3. R. Onn and A. Bruckstein, Integrability disambiguates surface recovery in two-image photometric stereo, Int. J. Computer Vision, 5, 105–
113 (1990).
4. K. Ikeuchi, Determining the surface orientations of specular surfaces
by using the photometric stereo method, IEEE Trans. PAMI–3, 661–669 (1981).
5. E. N. Coleman and R. Jain, Obtaining 3-D shape of textured and specular surfaces using four-source photometry, CVGIP, 18, 309–328 (1982).
6. H. D. Tagare and R. J. P. deFigueiredo, A theory of photometric stereo
for a class of diffuse non-Lambertian surface, IEEE Trans. PAMI–13,
133–152 (1991).
7. H. D. Tagare and R. J. P. deFigueiredo, Simultaneous estimation of
shape and reflectance map from photometric stereo, CVGIP: Image
Understanding 55, 275–286 (1992).
8. F. Solomon and K. Ikeuchi, Extracting the shape and roughness of
specular lobe objects using four light photometric stereo, IEEE Trans.
PAMI-18, 449–454 (1996).
9. K. Schlüns, Photometric stereo for non-lambertian surfaces using color
information, in Proc. 5th Int. Conf. Computer Analysis of Images and
Patterns, 444–451(1993).
10. W. B. Jiang, H. Y. Wu and T. Shioyama, 3-D Shape Recovery from
Image Brightness for non-Lambertian Surface, J. Imag. Sci. Technol.
41 (4), 429–437 (1997).
11. S. A. Shafer, Using color to separate reflection components, Col. Res.
Appl. 10(4), 210–218 (1985).
12. K. E. Torrance and E. M. Sparrow, Theory for off-specular reflection
from roughened surfaces, J. Opt. Soc. Am. 56, 1105–1114 (1967).
13. J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970, p. 281.
14. A. K. Jain, Fundamentals of Digital Image Processing, Prentice-Hall,
Inc, Englewood Cliffs, NJ, 1989.
JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998
Optical Effects of Ink Spread and Penetration on Halftones Printed by
Thermal Ink Jet
J. S. Arney* and Michael L. Alber*
Rochester Institute of Technology, Rochester, New York 14623
A probability-based model of halftone imaging, which was developed in previous work to describe the Yule–Nielsen effect, is shown in
the current work to be easily modified to account for additional physical and optical effects in halftone imaging. In particular, the
effects of ink spread and ink penetration on the optics of halftone imaging with an ink-jet printer are modeled. The modified probability
model was found to fit the experimental data quite well. However, the model appears to overcompensate for the scattering associated
with ink penetration into paper.
Journal of Imaging Science and Technology 42: 331–334 (1998)
Introduction
Recent work in this laboratory has been directed at the
development of a probability model of the Yule–Nielsen
effect to relate fundamental optical properties of papers
and inks to tone reproduction in halftone printing. However, practical halftone models also need to account for
physical effects such as the lateral spread of ink on the
paper, called physical dot gain, and the penetration of ink
into the paper. The most fundamental description of the
Yule–Nielsen effect involves modeling the optical point
spread function, PSF, of light in the paper and convolving
the PSF with a geometric description of the halftone dots.
Although such models have been shown to be quite accurate in describing the Yule–Nielsen effect, they are
computationally quite intensive. Moreover, they are difficult to combine with models of physical dot spread and
especially of physical penetration of ink into the paper.
But the probability-based model is much less
computationally intensive, can be written in a closed analytical form, and is only slightly less rigorous than the
convolution approach. Moreover, the probability approach
will also be shown to be easily modified to account for ink
spread and penetration.
The Probability Model
The probability model has been described elsewhere,1,2
and here we present only the recipe for its application.
The model begins with an empirical description of the
mean probability Pp that a photon of light that enters
the paper between halftone dots will emerge under a dot.
Pp = w[1 − (1 − F)^B],    (1)
where F is the dot area fraction and w is the magnitude of
the Yule–Nielsen effect and is related quantitatively to
the optical point spread function of the paper.1,2 Both F
and w can have values from 0 to 1. The B factor is a constant characteristic of the chosen halftone pattern and the
geometric characteristics of the printer. For the printer
used in the current work, an HP 1600C thermal ink-jet, a
B factor of 2.0 was found to provide the best correlation
between the model and the experimental measurements
described below.
A second function needed to model tone reproduction is
the probability Pi that a photon that enters the paper under a halftone dot (having first passed through the dot)
then reemerges from the paper under a dot. The two probabilities have been shown to relate as follows.1
Pi = 1 − Pp(1 − F)/F.    (2)
We assume initially an ink that is transparent, with no
significant scattering. Then, as shown previously, the reflectance of the paper between the dots and of the dots is
given by Eqs. 3 and 4, with Rg the reflectance of the paper
on which the halftone pattern is printed.
Rp = Rg[1 − Pp(1 − Ti)],    (3)
Ri = Rg Ti[1 − Pi(1 − Ti)].    (4)
Note that the reflectance of the ink and of the paper
between the dots are not constant but depend on the dot
area fraction F through Eqs. 1 and 2.
With the reflectance of the ink dots and the paper between the dots, the overall reflectance of the halftone image is calculated with the Murray–Davies equation.
R(F) = FRi + (1 – F)Rp.
(5)
Figure 1. Reflectance versus dot area fraction for the paper between the dot (+), the mean image (o), and the ink (x) for the
pigmented magenta ink printed at 300 dpi with a disperse halftone pattern on a commercial gloss paper. The lines are drawn
from the model with ε = 0.060 m2/g and w = 0.75 with no physical
dot gain and no penetration.
The Yule–Nielsen “n” factor is not used in Eq. 5 because
the Yule–Nielsen effect is described by the scattering probability Pp. Thus, to model tone reproduction R versus F,
one needs (1) the transmittance of the ink Ti, (2) the reflectance of the paper Rg, (3) the scattering power of the
paper w, and (4) the geometry factor B. The value of Ti can
be determined with the Beer–Lambert equation using the
coverage of the ink within the dot c in g/m2 and the extinction coefficient ε in m2/g.
Ti = 10^(−εc).    (6)
The pigment-based ink was delivered by the printer at
c = 7.31 g/m2. This was determined by weighing the ink
cartridge before and after commanding the printer to print
a known number of ink drops at a selected area coverage
of 0.50.
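To make the recipe of Eqs. 1 through 6 concrete, the following minimal numerical sketch evaluates the transparent-ink model. The paper reflectance Rg of about 0.85 is an assumed value (it is measured, not quoted, in the text), and the function name and default parameters are illustrative only.

```python
import numpy as np

def halftone_reflectance(F, Rg=0.85, w=0.75, B=2.0, eps=0.060, c=7.31):
    """Probability model for a transparent ink, Eqs. 1-6 (sketch).

    F   : measured dot area fraction (array-like, 0..1)
    Rg  : paper reflectance (assumed value)
    w   : magnitude of the Yule-Nielsen scattering effect
    B   : halftone-geometry factor (2.0 for the printer used here)
    eps : extinction coefficient of the ink, m^2/g
    c   : ink coverage within a dot, g/m^2
    """
    F = np.clip(np.asarray(F, dtype=float), 1e-6, 1.0)
    Ti = 10.0 ** (-eps * c)                   # Eq. 6, Beer-Lambert
    Pp = w * (1.0 - (1.0 - F) ** B)           # Eq. 1
    Pi = 1.0 - Pp * (1.0 - F) / F             # Eq. 2
    Rp = Rg * (1.0 - Pp * (1.0 - Ti))         # Eq. 3, paper between dots
    Ri = Rg * Ti * (1.0 - Pi * (1.0 - Ti))    # Eq. 4, ink dots
    R = F * Ri + (1.0 - F) * Rp               # Eq. 5, Murray-Davies
    return R, Ri, Rp
```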
As a test of the model, a dispersed-dot halftone at 300
dpi addressability was printed using an HP 1600C thermal ink-jet. Figure 1 shows the measured reflectance of
the halftone image R, the ink dots Ri, and the paper between the dots Rp versus the dot area fraction F measured
by microdensitometry as described previously.1,2 The reflectance values are integral values characteristic of the
instrument spectral sensitivity. The solid lines in Fig. 1
are the model calculated as follows: The values of Rg and c
were measured independently. The values of ε and w were
used as independent variables to provide the best fit between the model and the data. For selected ε and w, Eq. 6
was applied, then Eqs. 2 through 5. The values of ε and w
were adjusted to provide a minimum rms deviation between the model and experimental values of Rp. Figure 1
shows that the model describes the paper reflectance Rp,
quite well, but the measured values of Ri are significantly
higher than expected from the model. Clearly, modification of the model to account for nonideal behavior of the
thermal ink-jet system is needed.
Figure 2. Measured ink area fraction F versus the nominal gray fraction F0 commanded by the printer. The Fmax is the ink area fraction at a nominal gray fraction of F0 = 1.00.
Dot Spread and Overlap
A deficiency of the above model is the way in which Ti is
estimated with Eq. 6. The value of c = 7.31 g/m2 was estimated from an accurate measure of ink mass, but the area
coverage was estimated as the value commanded by the
printer. However, inks can spread out and/or overlap, and
this makes the actual ink coverage differ from the commanded ink coverage. This, in turn, changes the transmittance of the ink layer on the paper. To improve the
estimate of Ti in the model, the ideal value of c0 = 7.31 g/
m2 was modified to estimate the actual ink coverage c.
This was done by measuring the actual area coverage F
determined by microdensitometry and comparing it with
the value F0 sent to the printer. The correct value of c was
calculated from Eq. 7.
c = c0 (F/F0).    (7)
To use Eq. 7 in the model, a relationship between F and
F0 is needed. However, this is a characteristic of a given
printer, and rather than model it a priori the effect was
characterized experimentally by measuring the printed ink
area fraction F as a function of the value commanded by
the printer F0. Values of F were measured by histogram
segmentation of images captured by the microdensitometer, as described previously.1,2 Figure 2 is an example,
and the data were fit empirically to Eq. 8 with Fmax = 0.79
and m = 1.05.
F = Fmax F0^m.    (8)
The model was then run by ranging F from 0 to Fmax.
At each F the ratio F/F0 was calculated using Eq. 8. Equation 7 was then applied to determine c, which was used
in Eqs. 6 and 2 through 5. The values of Rg, m, Fmax, and
c0 were measured independently, and the values of ε and
w were adjusted to provide a minimum rms deviation
between the model and experimental values of Rp, as
shown in Fig. 3. Again the fit to Rp is good, but Ri is still modeled with a reflectance that is lower than observed experimentally. Indeed, the fit appears worse than in Fig. 1, suggesting that ink spread and overlap, while clearly present in Fig. 2, is not the major perturbation in tone reproduction characteristics of the system. It was anticipated that ink penetration into the paper may have a significant effect.

Figure 3. Reflectance versus dot area fraction for the paper between the dot (+), the mean image (o), and the ink (x) for the pigmented magenta ink printed at 300 dpi with a disperse halftone pattern on a commercial gloss paper. The lines are drawn from the model with ε = 0.052 m2/g, w = 0.70, and Fmax = 0.79.
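The correction of Eqs. 7 and 8 can be sketched as a small helper that converts a measured dot fraction F into an effective ink coverage c. The inversion of Eq. 8 to recover F0 and the clipping of very small F are assumptions of this sketch, not part of the published procedure.

```python
import numpy as np

def effective_coverage(F, c0=7.31, Fmax=0.79, m=1.05):
    """Eqs. 7 and 8 (sketch): effective ink coverage for a measured dot
    fraction F, given the printer characterization F = Fmax * F0**m."""
    F = np.clip(np.asarray(F, dtype=float), 1e-9, Fmax)
    F0 = (F / Fmax) ** (1.0 / m)   # invert Eq. 8 to recover the commanded fraction
    return c0 * F / F0             # Eq. 7
```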
Ink Penetration into the Paper
The effect of ink penetration into the substrate could be
quite complex. In an a priori model in which the paper PSF
is convolved with the halftone pattern, vertical penetration
of the dot would require a 3-D convolution and a detailed
knowledge of the 3-D geometry of the ink. Such halftone
modeling has been described but is quite complex.3–5 For
the current probability model, ink penetration was approximated in a much simpler way. The major optical effect of
ink penetration was assumed to be in the increased scattering of light in the ink by the paper. To model the effect
we assume the ink behaves as if it does not actually penetrate the sheet but only increases in scattering coefficient
S. In other words, the model is identical to the case of a
nonpenetrating ink with a significant scattering coefficient.
Thus an increase in S is used as an index of the degree of
ink penetration into the paper substrate. This scattering
effect was added to the probability model as follows:
First, the ink scattering coefficient causes some light to
reflect from the ink dot without penetrating through the
dot. The Kubelka–Munk model gives this reflectance contribution as follows:6
RiK = 1/[a + b Coth(bSx)],    (9)

where a = (Sx + Kx)/Sx and b = (a² – 1)^(1/2).
The value of the product Kx is linearly related to the product εc,

Kx = 2.303 εc,    (10)

and the product Sx will be used as an independent variable in the tone reproduction model.
Second, some light penetrates the dot and enters the paper. The transmittance of the dot, according to Kubelka–Munk, is given as follows:

Ti = b/[a Sinh(bSx) + b Cosh(bSx)].    (11)
Equation 11 replaces Eq. 6 in the model.
Light that enters the paper between the halftone dots
is scattered and may emerge with probability Pp under
the dot. Equation 1 has been used to model this probability for the disperse dot halftone. However, light that encounters a dot with a significant scattering coefficient Sx
may be reflected back into the paper. A detailed description of this effect might include multiple scattered reflections between the substrate and the dot, but a simpler
approximation will be used in the current model. One approach might be to assume the effect results in a decrease
in the effective value of Ti of the dot. However, light that
fails to transmit through the dot is returned to the paper
where it can scatter and emerge between the dot. This
would not be accounted for by simply approximating a
decrease in the effective value of Ti. Alternatively, the effect can be described as a decrease in the probability factor Pp. In other words, the effect of scattering in the dot
can be modeled as a decrease in the probability that light
entering the paper between the dots will emerge from the
system after passing through the dot. The effect will be
approximated by modifying Eq. 1 with the reflectance factor from Eq. 9.
Pp = w[1 − (1 − F)^B][1 − RiK].    (12)
The value of Pp from Eq. 12 is used to determine Pi from
Eq. 2 and Ri from a modified form of Eq. 4 in which reflectance from the bulk is added to the Kubelka–Munk reflectance RiK to produce the overall ink reflectance,
Ri = Rg Ti [1 − Pi (1 − Ti)] + RiK.    (13)
The reflectance of the paper is determined from Eq. 3
as before, and the overall reflectance is determined with
Eq. 5. If the Kubelka–Munk reflectance RiK is zero (no
scattering), the model reduces exactly to the model used
in Fig. 3. If, however, the scattering Sx is adjusted as a
third independent variable, the result shown in Fig. 4 can
be achieved.
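A hedged sketch of the penetration-modified model (Eqs. 9 through 13, together with Eqs. 2, 3, and 5) follows. As above, the paper reflectance Rg is an assumed value, and the other defaults follow the Fig. 4 fit.

```python
import numpy as np

def halftone_reflectance_km(F, Rg=0.85, w=0.73, B=2.0,
                            eps=0.051, c=7.31, Sx=0.5):
    """Penetration-modified probability model, Eqs. 9-13 (sketch).
    Rg is an assumed paper reflectance; other defaults follow Fig. 4."""
    F = np.clip(np.asarray(F, dtype=float), 1e-6, 1.0)
    Kx = 2.303 * eps * c                                   # Eq. 10
    a = (Sx + Kx) / Sx
    b = np.sqrt(a * a - 1.0)
    RiK = 1.0 / (a + b / np.tanh(b * Sx))                  # Eq. 9 (Coth = 1/tanh)
    Ti = b / (a * np.sinh(b * Sx) + b * np.cosh(b * Sx))   # Eq. 11
    Pp = w * (1.0 - (1.0 - F) ** B) * (1.0 - RiK)          # Eq. 12
    Pi = 1.0 - Pp * (1.0 - F) / F                          # Eq. 2
    Rp = Rg * (1.0 - Pp * (1.0 - Ti))                      # Eq. 3
    Ri = Rg * Ti * (1.0 - Pi * (1.0 - Ti)) + RiK           # Eq. 13
    return F * Ri + (1.0 - F) * Rp, Ri, Rp                 # Eq. 5
```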
Modifying Ink Spread and Penetration
Achieving the fit of all three nonlinear sets of data in
Fig. 4 with only the three independent variables ε, w, and
Sx suggests the model is at least a reasonable approximation of the optical and physical behavior of the ink-jet system. To examine the physical impact of spread and
penetration further, the ink and halftone pattern of Fig. 4
were printed on a recycled plain paper. The experimental
data and the fit of the model are shown in Fig. 5. Evident
from this experiment are the following: First, the model is
able to fit the data quite well. Moreover, the fit is achieved
with a significantly higher value of Sx than one would
expect for the plain paper system. The ink penetrates farther into the plain paper and thus has a higher effective
scattering coefficient. However, the model may overcompensate for this scattering effect in the ink layer and, thus,
Figure 4. Reflectance versus dot area fraction for the paper between the dot (+), the mean image (o), and the ink (x) for the
pigmented magenta ink printed at 300 dpi with a disperse halftone pattern on a commercial gloss paper. The lines are drawn
from the model with ε = 0.051 m2/g and w = 0.73, measured dot
gain parameters of m = 1.05 and Fmax = 0.79, and ink penetration
modeled with Sx = 0.5.
Figure 5. Reflectance versus dot area fraction for the paper between the dot (+), the mean image (o), and the ink (x) for the
pigmented magenta ink printed at 300 dpi with a disperse halftone pattern on a recycled plain paper. The lines are drawn from
the model with ε = 0.06 m2/g and w = 0.55, measured dot gain
parameters of m = 1.05 and Fmax = 0.79, and ink penetration modeled with Sx = 1.3.
requires a slightly higher value of ε to achieve a good fit
with the data. Moreover, the value of w which fits the data
is lower for the plain paper than for the gloss-coated paper,
which is the reverse of expectation.7 The value of w is related to the mean distance light travels between scattering
events, and this is expected to be larger in plain papers
than in coated papers. Perhaps this effect also has been
overcompensated by the simplifying assumptions in modeling ink penetration.
Halftone patterns were also printed for a dye-based ink
on both the plain paper and the coated paper. The parameters used to fit the model to the data for all experiments
along with the observed values of Fmax are summarized in
Table I. In most cases the trends in the parameters are as
expected. For example, the measured values of Fmax indicate the amount of lateral spread of ink on the paper and
the lateral spread is greater for dye-based ink on the coated
paper than on the plain paper. However, the amount of lateral spread is not significantly different for the pigmented
ink on the two types of paper. But the effective increase in
light scattering within the ink dot, Sx, in going from the
coated paper to the plain paper is evident in both the pigment and the dye-based inks. In addition, the value of ε is
higher for the dye-based ink, as is typically observed, but
the value should not change when the paper is changed.
That it does in both cases suggests the simple model of ink
penetration overestimates the optical effect of scattering,
requiring a compensating adjustment of ε.
TABLE I. Summary of Modeling Parameters. Parameters Adjusted to Achieve the Minimum rms Deviation Between Model and Data for All Three Sets of Data R, Ri, and Rp versus F. Also Shown is the Value of Fmax, or the Dot Area Fraction at a Nominal Print Gray Scale of 100%.

Ink base    Paper             ε (m2/g)    w       Sx      Fmax
pigment     coated glossy     0.052       0.70    0.50    0.79
dye         coated glossy     0.099       0.75    0.88    0.84
pigment     recycled plain    0.060       0.55    1.3     0.77
dye         recycled plain    0.13        0.55    1.5     1.017
Conclusion
The success of the model described in this report indicates the advantage of the probability model for exploring and modeling the mechanism of halftone imaging.
Because the probability model can be written in closed
analytical form, it is easily modified to account for additional mechanistic effects such as ink spread. Such modifications are much more difficult to do with an a priori
model involving the convolution of ink with the paper
point spread function. The probability model does, nevertheless, maintain a reasonable connection with the fundamental parameters of the point spread function
through the empirical w parameter1,2 and through fundamental theory described by Rodgers.8 Caution should
be used, however, in applying the simplifying assumptions for ink penetration, because the model appears to
overcompensate the optics of the penetration effect and
to decrease the reliability of the w parameter as an index of the paper point spread function.
Acknowledgments. Support for this project was provided
by DuPont Corporation and is gratefully acknowledged.
Special thanks to Paul Oertel for many challenging discussions and helpful suggestions.
References
1. J. S. Arney, A probability description of the Yule–Nielsen effect, J. Imaging Sci. Technol. 41(6), 633–636 (1997).
2. J. S. Arney and M. Katsube, A probability description of the Yule–Nielsen effect II: The impact of halftone geometry, J. Imaging Sci. Technol. 41(6), 637 (1998).
3. F. Ruckdeschel and O. G. Hauser, Appl. Opt. 17, 3376 (1978).
4. S. Gustavson, Color gamut of halftone reproduction, J. Imaging Sci. Technol. 41, 283 (1997).
5. S. Gustavson, Dot gain in color halftones, Ph.D. Dissertation, Linköping University Department of Electrical Engineering, Linköping, Sweden, Fall 1997.
6. G. Wyszecki and W. S. Stiles, Color Science, 2nd ed., John Wiley & Sons, NY, 1982, p. 785.
7. J. S. Arney, C. D. Arney, and M. Katsube, An MTF analysis of paper, J. Imaging Sci. Technol. 40, 19 (1996).
8. G. L. Rogers, Optical dot gain in halftone print, J. Imaging Sci. Technol. 41(6), 643–656 (1997); Optical dot gain: Lateral scattering probabilities, J. Imaging Sci. Technol. 42(4), 341 (1998).
JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998
Modeling the Yule–Nielsen Effect on Color Halftones
J. S. Arney,* Tuo Wu and Christine Blehm*
Rochester Institute of Technology, Center for Imaging Science, Rochester, NY 14623-0887
The Neugebauer approach to modeling color cmy halftones generally has to be modified to correct for the Yule–Nielsen light scattering
effect. The most common modification involves the Yule–Nielsen n factor. A less common, but more fundamentally correct modification
of the Neugebauer model involves a convolution of the halftone geometry with the point spread function, PSF, of the paper. The
probability model described in the current report is less complex than the PSF convolution approach but is still much less empirical
than the Yule–Nielsen n model. The probability model assumes the Neugebauer equations are correct and that the Yule–Nielsen effect
manifests itself in a variation in the XYZ tristimulus values of the eight Neugebauer primary colors as a function of the amounts of c,
m, and y printed. The model describes these color shifts as a function of physical parameters of the ink and paper that can be measured
independently. The model is based on the assumption that scattering and absorption probabilities are independent, that the inks obey
Beer–Lambert optics, and that ink dots are printed randomly with perfect hold-out. Experimentally, the model is most easily tested by
measuring the shift in the color of the paper between the halftone dots, and experimental microcolorimetry is presented to verify the
model.
Journal of Imaging Science and Technology 42: 335–340 (1998)
Background
One of the conceptual advantages of halftoning is the linearity between the fractional area coverage of the ink dots,
Fk, and the overall reflectance R of the image as expressed
in the Murray–Davies equation, R = Fk Rk + (1 – Fk)Rp,
where Rk and Rp are the reflectance factors of the ink and
paper, respectively. In color halftoning this also means a
linearity between c, m, and y dot areas and the CIE XYZ
chromaticity coordinates of the image. However experimental measurements typically show a nonlinearity between R and Fk with R being less than predicted by the
Murray–Davies equation. The nonlinearity between Fk and
R is caused by two phenomena: (1) physical dot gain in
which the actual dot fraction is larger than the dot fraction commanded in the printing process, and (2) the Yule–
Nielsen effect in which the lateral scatter of light within
the paper leads to an increase in the probability of the ink
dots absorbing the light.1 Thus, to describe tone and color
reproduction in halftone images, modifications of the
Murray–Davies equation are used. In this report, F will
refer to the actual, measured dot area fraction rather than
the fraction commanded by the printer, and a modification of the Murray–Davies equation that models the Yule–
Nielsen effect on color halftones will be described.
The earliest and still most commonly used modification
of the Murray–Davies equation is the Yule–Nielsen equation, with an empirical n factor.1
Original manuscript received September 8, 1997
* IS&T Member
© 1998, IS&T—The Society for Imaging Science and Technology

R^(1/n) = Fk Rk^(1/n) + Fp Rp^(1/n).    (1)
In this expression, Fk and Fp are the area fractions of
the ink dots and the paper between the dots, respectively,
and Fp = 1 – Fk. The Murray–Davies and Yule–Nielsen
equations are often extended to describe spectral reflectance in cmy color halftoning.
R(λ) = Σ (i = 1 to 8) fi Ri(λ),    (2)

R(λ)^(1/n) = Σ (i = 1 to 8) fi Ri(λ)^(1/n).    (3)
The fi are the area fractions of the eight possible colors
(white, cyan, magenta, yellow, red, green, blue, and black)
formed by overlap between the printed ink area fractions
c, m, and y. The Ri(λ) are the reflection spectra of the eight
colors. By knowing or assuming the geometry of overlap
between ink dots, the color fraction may be determined
from the ink fractions, fi = f(c,m,y). The most common assumption regarding dot overlap is that dots are printed
randomly, which leads to the so-called Demichel equations
[f1 = (1 – c)(1 – m)(1 – y), through f8 = cmy].2 Models for
deterministic dot placement have also been published.3
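For reference, the Demichel fractions can be written out explicitly. The assignment of the overlap colors (blue = c·m, green = c·y, red = m·y) follows the usual convention and is assumed here rather than stated in the text; the sketch is for randomly placed dots only.

```python
def demichel_fractions(c, m, y):
    """Demichel equations (sketch): area fractions of the eight Neugebauer
    colors for randomly placed c, m, y dots."""
    return {
        "white":   (1 - c) * (1 - m) * (1 - y),
        "cyan":    c * (1 - m) * (1 - y),
        "magenta": (1 - c) * m * (1 - y),
        "yellow":  (1 - c) * (1 - m) * y,
        "red":     (1 - c) * m * y,
        "green":   c * (1 - m) * y,
        "blue":    c * m * (1 - y),
        "black":   c * m * y,
    }
```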
By integration, the CIE chromaticity coordinates may
be determined.
XYZ = ∫ R(λ) xyz P(λ) dλ,    (4)
where XYZ represents the X, Y, or Z chromaticity value
and xyz represents the corresponding x, y, or z color matching function. The value P(λ) is the spectral power distribution of the light used to view the image. By applying
Eq. 4 to Eq. 2 we have what is often called the Neugebauer
equation for tristimulus values,2
XYZ = Σ (i = 1 to 8) fi XYZi,    (5)
where XYZi represents a tristimulus value for the color
region i of the halftone.
The empirical modification for R(λ) may also be used to
calculate tristimulus values. However, Eq. 6 does not follow from application of Eq. 4 to Eq. 3.
XYZ^(1/n) = Σ (i = 1 to 8) fi XYZi^(1/n).    (6)
Nevertheless, Eq. 6 is occasionally used as an empirical
modification of Eq. 5. Because n is only an empirical factor, no reason exists not to use Eq. 6 if it provides a useful
description of a given halftone system.4
Work in this laboratory has explored an alternative
modification to the Murray–Davies equation in which Ri
and Rp are not constants but are described as functions of
the dot area fraction Rk(Fk) and Rp(Fp).5
R = Fk Rk(Fk) + Fp Rp(Fp).    (7)

Experimentally it has been well shown that both Rk and Rp decrease as Fk increases.5 Both empirical and theoretical models have been reported for describing Rk and Rp versus Fk for monochrome halftones.5–7
The Probability Functions
Equation 7 may be expanded to describe cmy halftone
color. Equation 2 is the appropriate expansion of Eq. 7 if
we consider the eight Ri(λ) spectra to be functions of the
color fractions fi as well as functions of wavelength λ. Then,
integration gives the tristimulus values of the color image. The problem is to describe the way in which the eight
Ri(λ) of Eq. 2 depend on the eight fi. The approach taken
in this report is to describe probability functions for the
lateral scattering of light in the paper and then to describe
the way in which the eight Ri spectra depend on the probability functions.
We begin by defining the probability function Pji. This
is the probability that if a photon enters the paper in region j, of area fraction fj, it will reemerge after scattering
in region i, of area fraction fi. In other words, if N photons
enter the paper in region i, then Pji is the fraction of these
N photons that scatters and emerges in region i, provided
no light is absorbed by the paper. To account for light absorption, we assume absorption and scattering probabilities are independent so that the final number of photons
from region j that emerges in region i is the product RgPji.
Consider the monochrome case with region j = 1 defined
as the region between the dots and region i = 2 the region
of the paper containing dots. The probability P11 is the
probability of light emerging from the region between the
dots after entering between the dots. For conventional clustered dot halftones this probability was experimentally
shown to be well described by the following function7:
P11 = 1 − (1 − f1)[1 − (1 − f1)^w + (1 − f1^w)],    (8)
where f1 is the same thing as Fp in Eq. 7 and w is a factor related to the scattering of light in paper. The w factor has been shown to be related to the scattering optics of the paper,

w = 1 − exp(−A kp ν),    (9)

where A is a constant characteristic of the halftone geometry, kp is a constant proportional to the mean distance light travels in paper before reemerging as reflected light, and ν is the halftone dot frequency. A thorough discussion of the terms in Eq. 9 was reported elsewhere.7

Figure 1. For a two-color (cyan, magenta) halftone, the light reflected back from the blue region has four origins as a result of the Yule–Nielsen effect.
Equation 8 may be generalized to describe the probability Pjj that light that enters an area of the paper marked j
will emerge from the paper at area j, with fj as the area
fraction.
Pjj = 1 − (1 − fj)[1 − (1 − fj)^w + (1 − fj^w)].    (10)
Similarly, an extension of previous work on monochrome
FM halftones leads8 to a somewhat different expression
for Pjj,
Pjj = 1 − w(1 − fj^B),    (11)
where w is again given by Eq. 9 but with ν defined as the
inverse of the dot diameter. The B factor is an empirical factor related to the particular geometry of the FM halftone.8
In addition to the Pjj probabilities, there are also all of
the Pji probabilities, as illustrated for the cy two-ink case
in Fig. 1. If we have functions to describe all of the Pji,
then for a three-color cmy halftone we would have an 8 × 8
matrix of probability functions Pji with the Pjj functions
on the diagonal of the matrix. Similarly, a monochrome
halftone would be described with a 2 × 2 matrix of probability functions. As will be shown below, the Pji can be
related to the Pjj. First, however, we examine how these
probabilities can be used to calculate color reproduction
in the halftone.
From Probability to Reflectance
The two-ink case illustrated in Fig. 1 shows how the incident irradiance, I0 = watts/area, is divided among the areas, fi, of the halftone image. Photons I0 fi strike the image in region number i. The light that then enters the paper in this region is I0 fiTi, where Ti is the Beer–Lambert transmittance of the ink layer over region i. Note that Ti = 1 for i = 1 (the paper between the dots, which carries no ink) and that T2 = Tcyan, T3 = Tcyan Tyellow, etc. Then, the number of photons from region i that emerge from region j is I0 fiTiPji.
For example, as illustrated in Fig. 1, the number of photons that strike Region 4 (the cyan-color region) and eventually emerge under Region 3 is given by the product
(I0T4RgP43). The total amount of scattered light that reaches
Dot 3 is the sum of these expressions for Regions 1, 2, and
4. Then the light passes through Dot 3 and is attenuated
by T3. Similarly, the general expression for the photon irradiance emerging from any dot i is as follows:
Ii = I0 Rg Ti Σ (j) (Tj Pji fj).    (12)
The reflectance of the dot is the ratio of the light emerging from the dot, Ii, to the light entering the dot, I0 fi. Thus,
dividing Eq. 12 by I0 fi gives the following expression for
the reflectance of dot i.
Ri = Rg Ti Σ (j) (Tj Pji fj / fi).    (13)
The spectral designation (λ) has been dropped to simplify the notation, but Ri, Rg, and all the Tj are functions of
wavelength. The spectral Ri may then be used in Eq. 2 to
determine the overall spectral reflectance of the halftone
image, and then Eq. 5 can be used to calculate the
tristimulus values of the image. The only unknown in the
model is a description of the off-diagonal probabilities Pji.
The Off Diagonal Probabilities
Everything required to model halftone color is now
known except the off-diagonal probabilities Pji. Intuitively,
the Pji must relate to the Pjj and to the color fractions fj
and fi. We can derive this relationship by assuming the
independence of the scattering probabilities Pji and the
absorption probabilities Ti and Rg. We also assume the
paper is sufficiently thick that loss of light through the
back of the paper is negligible. Under these conditions,
the photons that enter Region 3 of Fig. 1 must eventually
emerge in one of the four regions.
P13 + P23 + P33 + P43 = 1.    (14)
This expression is a special case of Eq. 13 for a two-ink
halftone and for Rg = Ti = Tj = 1. We further assume that the dots are randomly placed on the paper, so that the probability of light from Region k emerging in some other Region i ≠ k is proportional to the area fraction fi. Thus, for any two regions i ≠ k and j ≠ k, we may write the following:
Pki/Pkj = fi/fj.    (15)
For example, P31 = P31(f1/f1), P32 = P31(f2/f1), and P34 = P31(f4/f1). Combining these with Eq. 14 gives the following:

P31(f1/f1) + P31(f2/f1) + P33 + P31(f4/f1) = 1.    (16)
Note that Eq. 15 does not apply to P33, but only to i ≠ j.
Then we recognize that f1 + f2 + f4 = 1 – f3 and solve Eq. 16 for
the off-diagonal P31.
P31 = (1 − P33) f1 / (1 − f3).    (17)
We may generalize this expression for any off-diagonal
term, i ≠ j.
Pji = (1 − Pjj) fi / (1 − fj).    (18)
We now have a sufficient set of functions to model color
halftones.
The Model Recipe
To apply these probability functions to calculate the XYZ
tristimulus values of a color halftone given the c, m, and y
ink fractions delivered by a printer, the following steps
are taken: (Note c, m, and y are the actual areas not the
areas commanded by the printer. Physical dot gain is not
considered here.)
Step 1. Measure the transmittance spectra of the individual inks, Tcyan, Tmagenta, and Tyellow. Assume the
Beer–Lambert law and determine the transmittance spectra Ti of the eight colors. Also measure
the reflection spectrum of the paper, Rg.
Step 2. Begin with the ink combination (c,m,y) and calculate the eight color fractions, f1 through f8. For
randomly placed ink dots, the Demichel equations may be used. Otherwise dot geometry must
be modeled, as illustrated for dot-on-dot halftones
described subsequently.
Step 3. Use Eq. 10 to calculate the eight diagonal probabilities, Pjj, for a traditional clustered dot halftone. Eq. 11 may be used with an FM, stochastic
type of halftone. The parameters w and B may be
taken as arbitrary constants to fit the model to
data. Alternatively, w and B may be measured independently as described previously.8
Step 4. Use Eq. 18 to calculate the off-diagonal probabilities.
Step 5. Use Eq. 13 to calculate the reflection spectra of
the eight colors.
Step 6. Use Eq. 2 to calculate the reflection spectrum of
the overall halftone image.
Step 7. Use Eq. 5 and the power spectrum of the illumination light, P(λ), to calculate the XYZ tristimulus
values of the halftone image.
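The recipe can be condensed into a short numerical sketch. The measured spectra (Rg, Tcyan, Tmagenta, Tyellow, the color matching functions, and the illuminant) are assumed to be available as arrays on a common wavelength grid; the function names, the FM form of Pjj (Eq. 11), and the default w and B are illustrative rather than prescriptive.

```python
import numpy as np

def pjj_fm(f, w, B):
    """Eq. 11 (sketch): diagonal probabilities for an FM-type halftone."""
    return 1.0 - w * (1.0 - f ** B)

def halftone_xyz(c, m, y, wl, Rg, Tc, Tm, Ty, xyz_bar, P, w=0.82, B=1.2):
    """Condensed sketch of Steps 2-7 for randomly placed dots.

    wl, Rg, Tc, Tm, Ty and the illuminant P are 1-D spectral arrays on a
    common wavelength grid; xyz_bar holds the three color matching
    functions as a (3, N) array.  All names are illustrative.
    """
    # Step 2: Demichel color fractions; Step 1 gives the eight transmittances.
    f = np.array([(1-c)*(1-m)*(1-y), c*(1-m)*(1-y), (1-c)*m*(1-y),
                  (1-c)*(1-m)*y, (1-c)*m*y, c*(1-m)*y, c*m*(1-y), c*m*y])
    T = np.array([np.ones_like(wl), Tc, Tm, Ty, Tm*Ty, Tc*Ty, Tc*Tm, Tc*Tm*Ty])
    # Steps 3 and 4: diagonal (Eq. 11) and off-diagonal (Eq. 18) probabilities.
    Pjj = pjj_fm(f, w, B)
    Pji = (1.0 - Pjj)[:, None] * f[None, :] / (1.0 - f + 1e-12)[:, None]
    np.fill_diagonal(Pji, Pjj)
    # Step 5: reflection spectra of the eight colors (Eq. 13).
    R = np.zeros((8, wl.size))
    for i in range(8):
        if f[i] > 0:
            for j in range(8):
                R[i] += T[j] * Pji[j, i] * f[j] / f[i]
            R[i] *= Rg * T[i]
    # Step 6: overall spectrum (Eq. 2); Step 7: tristimulus values (Eq. 4).
    R_total = (f[:, None] * R).sum(axis=0)
    XYZ = np.trapz(xyz_bar * R_total * P, wl, axis=1)
    return XYZ, R_total
```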
Testing the Model
Color halftones were generated with an HP 1600C inkjet printer on a high-quality coated sheet to minimize ink
penetration and dot gain. Halftones were printed with an
error diffusion algorithm, and dot-on-dot was not used.
The dots from the different colors were at 300 dpi and were
randomly placed with respect to each other. A fixed amount
of magenta (dot fraction m = 0.45) was printed at different cyan dot fractions (0 < c < 1). No yellow was printed in
this experiment. A microscopic image of the dot pattern
was captured with a 2-mm field of view using a 3-chip
color CCD camera and video frame grabber. The sample
was illuminated with an incandescent light source through
fiber optics. The resulting light on the sample was measured and found to have the power distribution P(λ) of
CIE Illuminant A. The camera and optical system had been
calibrated to the ink-jet dye set so the rgb images could be
translated into XYZ space. In addition, gray-level segmentation in the original rgb images provided independent
measures of the c, m, and y dot area fractions. Using the
color microdensitometer, measurements were made of the
XYZ tristimulus values of not only the overall image but
of the space between the ink dots. The results were plotted as a function of the cyan dot area fraction c and are
shown in Figs. 2, 3, and 4. Figure 5 shows the corresponding x,y chromaticity values. The data do not go all the way
to the gamut limit because the printer, at a command of
100% ink, formed dots with very little dot gain and occupied only 90% of the paper area. The model was run over the range 0 < c < 0.9 to agree with the experiment. These experiments demonstrate that the color between the dots is, indeed, not the color of the unprinted paper but mimics the mean value color of the overall image.

Figure 2. The X tristimulus value of the paper between the dots () and of the overall, mean value of the halftone image (O) versus the ink area fraction of cyan at magenta = 0.45. Error diffusion halftone at 300 dpi.

Figure 3. The Y tristimulus value of the paper between the dots () and of the overall, mean value of the halftone image (O) versus the ink area fraction of cyan at magenta = 0.45. Error diffusion halftone at 300 dpi.

Figure 4. The Z tristimulus value of the paper between the dots () and of the overall, mean value of the halftone image (O) versus the ink area fraction of cyan at magenta = 0.45. Error diffusion halftone at 300 dpi.
The solid lines in Figs. 2 through 5 were calculated with
the model recipe described above. The transmittance spectrum of the cyan dye was determined from the reflection
spectrum, Rcyan, of a 100% cyan region (m = y = 0) and the
function, Tcyan = (Rcyan/Rg)^(1/2). Spectra for the magenta and
yellow were similarly determined. The Demichel equations
were used to determine the color area fractions fi, and Eq. 11 for FM halftones was used for the on-diagonal probabilities. The model was fit to the data by adjusting w and B. Rather than search for a statistical fit criterion, the authors simply adjusted w and B to achieve a visually acceptable agreement between the model and all of the data in Figs. 2 through 5. Values of w = 0.82 and B = 1.2 were used in this calculation and are consistent with independent estimates from earlier work.8,9

Figure 5. The xy chromaticity trajectory with Illuminant A for the paper between the dots and for the overall image for the variable cyan, fixed magenta, error diffusion halftone. The gamut of the printer at maximum c, m, and y inks is shown. The paper chromaticity (w) and the spectrum locus are shown.
A second experiment was performed using the same
inks and a traditional clustered dot halftone. However,
the clustered dot halftone was printed dot-on-dot rather
than randomly. Figures 6 through 9 show the results.

Figure 6. The X tristimulus value of the paper between the dots (O) and of the overall, mean value of the halftone image (X) versus the ink area fraction of cyan at magenta = 0.4. Clustered dot-on-dot halftone at 53 dpi.

Figure 7. The Y tristimulus value of the paper between the dots (O) and of the overall, mean value of the halftone image (X) versus the ink area fraction of cyan at magenta = 0.4. Clustered dot-on-dot halftone at 53 dpi.

Figure 8. The Z tristimulus value of the paper between the dots (O) and of the overall, mean value of the halftone image (X) versus the ink area fraction of cyan at magenta = 0.4. Clustered dot-on-dot halftone at 53 dpi.
Again the color of the paper between the dots mimics the
color of the halftone image. The solid lines in these figures were modeled by the recipe above with the following changes. First, Eq. 10 was used for the diagonal
probabilities. Second, the Demichel equations were replaced with a geometric calculation for dot-on-dot halftones. For the fixed magenta at different levels of cyan,
the functions in Table I were used. The value of w = 0.80
was found to provide an overall fit, judged visually, to
the data in Figs. 6 through 9.
Discussion
As shown by Engeldrum, the Yule–Nielsen effect manifests itself in color halftones as a change in the color of the
Figure 9. The xy chromaticity trajectory with Illuminant A for
the paper between the dots and for the overall image for the variable cyan, fixed magenta, clustered dot-on-dot halftone. The
gamut of the printer at maximum c, m, and y inks is shown. The
paper chromaticity (w ) and the spectrum locus are shown.
TABLE I. Color Fractions. Dot-On-Dot Geometric Calculation of Color Area Fractions from Ink Area Fractions for Cyan, Magenta Two-Ink Halftones

Color      If c < m    If c ≥ m
white      1 – m       1 – c
cyan       0           c – m
magenta    m – c       0
blue       c           m
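Table I translates directly into a small helper. The function below is a sketch and assumes the two-ink, cyan/magenta dot-on-dot case only.

```python
def dot_on_dot_fractions(c, m):
    """Table I (sketch): color area fractions for a cyan/magenta
    dot-on-dot halftone."""
    if c < m:
        return {"white": 1 - m, "cyan": 0.0, "magenta": m - c, "blue": c}
    return {"white": 1 - c, "cyan": c - m, "magenta": 0.0, "blue": m}
```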
paper between the halftone dots as the dot area fractions
change.10,11 The probability model appears to provide a mechanistic rationale for this phenomenon. Moreover, the model rationalizes the overall color of the halftone image. The printer
used in the project employed a default algorithm for gray
color removal, and this prevented experimental testing with
more than two of the three cmy inks. However, the fit with
the two color cases strongly supports the model. This, in turn,
indicates that the Neugebauer Eqs. 2 and 5 are correct descriptions of halftone color reproduction provided the eight
reflectance spectra Ri and the eight sets of tristimulus values XYZi are treated as continuous functions of ink fractions
cmy and not as the reflectance spectra and tristimulus values of the eight Neugebauer colors printed at 100% coverage. This point is emphasized by integrating Eq. 13 directly
to find the eight sets of Neugebauer tristimulus values to
use in Eq. 5. Integration leads to the following:
XYZi = Σ (j = 1 to 8) Pij (fj / fi) XYZij.    (19)
Note that integration leads to a matrix of 64 tristimulus
values XYZij. The eight values on the diagonal XYZjj are
the traditional Neugebauer values for the eight Neugebauer
colors printed at 100% coverage.2,11 These values may be
measured independently. However, the off-diagonal values
XYZij are tristimulus values for light that passes Dot j, scatters in the paper, and then passes through Dot i. The XYZij
tristimulus values can not be measured independently.
Unlike the Yule–Nielsen modification to the Neugebauer
equation, the probability model has a direct link with the
fundamental optical and geometric characteristics of the
halftone system via Eq. 9. While the probability model is
significantly more complex than the traditional n modified Yule–Nielsen model, it is significantly less complex
than a convolution model involving the fundamental probability function PSF of light in the paper. Gustavson12 has
demonstrated such a model, and it is fundamentally correct theoretically. However, the current probability model
is expressed with closed analytical functions and is much
more amenable to modifications for nonideal systems, as
demonstrated in previous work.9 Moreover, one should be
able to derive the mean level probabilities Pjj from the fundamental probability PSF and a knowledge of the geometry of the halftone system. Because the PSF of paper is
quite difficult to measure experimentally, it is typically
modeled empirically. In the current model, we begin by
modeling Pjj empirically. In addition, as demonstrated previously,8 it may be easier to measure the Pjj than the PSF.
Thus, one experimental approach to measuring PSF may
be to measure Pjj with several known dot geometries and
then to calculate PSF.
Appendix
A reviewer of this manuscript correctly pointed out that
Eq. 15 implies an assumption. The assumption is that the
off-diagonal probabilities Pik are proportional to the area
fractions fi so that Pik = ak fi, for i ≠ k. If a nonlinear proportionality actually applies so that Pik = akG(fi) for some function G, then Eqs. 15 through 18 become more complex.
While this may certainly be the case, it is not revealed in
the experimental data and the data are not sufficiently
noise-free to provide a guide to a more advanced estimate
of the functional form of Eq. 15. For a more rigorous analysis of this probability, the reader is directed to recent theoretical work by Rogers.13–15
Acknowledgments. The authors express their appreciation to the Hewlett-Packard Company for support of this
project. Thanks to the reviewers of the paper who offered
extremely helpful criticism. Special thanks to the students
in the 1997/98 course in Color Reproduction at RIT for
finding all the typos.
References
1. J. A. Yule and W. J. Nielsen, TAGA Proc. p. 65 (1951).
2. J. A. Yule, Principles of Color Reproduction, Chap. 10, John Wiley & Sons, NY, 1967, p. 255.
3. T. N. Pappas, Proc. IS&T 47, 468 (1994).
4. J. A. S. Viggiano, TAGA Proc. 37, 647 (1985).
5. J. S. Arney, P. G. Engeldrum and H. Zeng, An expanded Murray–Davies model of tone reproduction in halftone imaging, J. Imaging Sci. Technol. 39, 502 (1995).
6. J. S. Arney, C. D. Arney and P. Engeldrum, Modeling the Yule–Nielsen halftone effect, J. Imaging Sci. Technol. 40, 233 (1996).
7. J. S. Arney, A probability description of the Yule–Nielsen effect, J. Imaging Sci. Technol. 41, 633 (1997).
8. J. S. Arney and M. Katsube, A probability description of the Yule–Nielsen effect II: The impact of halftone geometry, J. Imaging Sci. Technol. 41, 637 (1998).
9. M. Alber, Modeling the effect of ink spread and penetration on tone reproduction, M.S. Dissertation, Rochester Inst. of Technol., Rochester, NY, 1997.
10. P. G. Engeldrum and B. Pridham, Application of turbid medium theory to paper spread function measurements, TAGA Proc. 47, 353 (1995).
11. P. G. Engeldrum, The color gamut limits of halftone printing with and without the paper spread function, J. Imaging Sci. Technol. 40, 2229 (1996).
12. S. Gustavson, Color gamut of halftone reproduction, J. Imaging Sci. Technol. 41, 283 (1997).
13. G. L. Rogers, Optical dot gain in a halftone print, J. Imaging Sci. Technol. 41, 643 (1997).
14. G. L. Rogers, Optical dot gain: Lateral scattering probabilities, J. Imaging Sci. Technol. 42(4), 336-339 (1998).
15. G. L. Rogers, The effect of light scatter on halftone color, J. Imaging Sci. Technol. 42, in press (1998).
JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998
Optical Dot Gain: Lateral Scattering Probabilities
Geoffrey L. Rogers*
Matrix Color, 26 E 33 Street, New York, New York 10016
In the development of the technology of halftone imaging there has been significant interest in physically modeling the halftone
microstructure. An important aspect of the microstructure is the scattering of light within the paper upon which the halftone image
is printed. Because of light scatter, a photon may exit the paper at a point different from the point at which it entered the paper. The
effect that this light scatter has on the perceived color of the printed image is called optical dot gain. Optical dot gain can be characterized by lateral scattering probabilities, which is the probability that a photon entering the paper through a particular type inked region
exits the paper through a similar or different type inked region. In this article we explicitly calculate these lateral scattering probabilities for the case of AM and FM halftone screening. We express these probabilities in terms of the fractional ink coverage and the
lateral scattering length, a quantity that characterizes the distance a photon travels within the paper before exiting.
Journal of Imaging Science and Technology 42: 341–345 (1998)
Introduction
Halftone imaging is a widely used technique for producing printed images. Recently there has been significant
interest in physically modeling the halftone microstructure to control the tone characteristics of the halftone image better.1–4 An important aspect of this microstructure
is the scattering of light within the paper upon which the
image is printed. This effect of scattering is called optical
dot gain, because, for achromatic images, the ink dots are
effectively larger as a result of the scattering.5 Several
authors have expressed optical dot gain in terms of lateral scattering probabilities1,2 which is the probability that
a photon having entered the paper through a particular
type inked region exits the paper through a similar or different type inked region. In this article we explicitly calculate these probabilities; in particular we calculate the
ink–ink probability, which is the probability that if a photon enters the paper through an inked region it also exits
the paper through an inked region—a conditional probability we label Pii. Knowledge of Pii allows one to calculate all the other lateral scattering probabilities.1 Although
the calculation done here involves a single array of dots,
our results are applicable to a chromatic halftone image.9
We make the calculation for the case of both AM and FM
halftone screening.2
In Ref. 1 it is shown that the ink–ink probability can be
expressed in terms of an infinite series—the Z-series—
involving the Fourier transforms of the dot shape and the
paper’s point spread function. Here, we explicitly calculate Pii and obtain a closed-form expression.
The model we construct to determine Pii is as follows: a
uniform stream of photons is incident on the paper within
an area of one dot, and we calculate the fraction of the
Original manuscript received September 15, 1997
* IS&T Member
© 1998, IS&T—The Society for Imaging Science and Technology
photons that exit the paper through this dot and through
all the other dots. This fraction is Pii. We assume that the
dots are circular with radius d and that they are arranged
in a square grid (screen) with screen period r. The origin
of the coordinate system is at the center of the dot through
which the photons enter the paper (see Fig. 2).
We define 2πR(ρ)ρ dρ as the probability that a photon,
having entered the paper through the dot centered on the
origin, exits the paper through an annulus, also centered
on the origin, with radius ρ and thickness dρ. R(ρ) is the
radial reflectance per unit area, and R(ρ) integrated over
the entire surface is the paper’s reflectance, Rp:
Rp = 2π ∫ (0 to ∞) R(ρ) ρ dρ.    (1)
We define the radial covering distribution A(ρ), as the
probability that an arbitrary point at a distance ρ from
the origin is covered by ink.
Then the ink–ink probability is:
Pii = 2π ∫ (0 to ∞) R(ρ) A(ρ) ρ dρ.    (2)
In the section “Reflectance” we calculate the reflectance
per unit area, R(x,y), for photons that have entered the
paper through the area of a single dot. In the section “Radial Covering Distribution” we calculate the covering distribution A(ρ). In the section “Ink-Ink Probability” we carry
out the integration of Eq. 2, making two approximations
to obtain a closed-form expression for Pii. The calculations
carried out in these sections are for AM halftone screening in which the number of dots within a region is constant and the size of the dots is varied. In the section “FM
Halftone Screen” we calculate Pii for FM halftone screening: the dots are of constant size and the number of dots is
varied. In the section “Ink-Ink Probability for Diffusion
PSF” we give the ink-ink probability as calculated with
the diffusion point spread function.
Reflectance
The reflectance per unit area R(x,y) is the probability
that a photon exits the paper at the point x,y after having
entered the paper through the area of a dot of radius d centered on the origin and is given by:

R(x, y) = (Rp/S0) ∫∫ H(x − x′, y − y′) I(x′, y′) dx′ dy′,    (3)

where S0 is the number of photons incident on the paper per unit time, H(x,y) is the paper’s normalized point spread function, and I(x,y) is the incident photon distribution. The value H(x − x′, y − y′) is the probability that a reflected photon having entered the paper at x′,y′ exits the paper at x,y. The value I(x,y) is the number of photons per unit area per unit time entering the paper at the point x,y and is given by:

I(x, y) = (S0/πd²) circ[√(x² + y²)/d],

where d is the radius of the dots and circ[ρ/d] is:

circ[ρ/d] = { 1,  0 ≤ ρ ≤ d
              0,  ρ > d }.

The integral Eq. 3 is a convolution and can be evaluated as the inverse Fourier transform of the product of the transforms of H(x,y) and I(x,y). The transform of H(x,y) is the paper’s modulation transfer function (MTF), labeled H̃(k), with k the spatial frequency (lines per unit length). Owing to the assumed isotropy of the point spread function, the MTF has circular symmetry.
The transform of the circ[ ] function is:

F{circ[√(x² + y²)/d]} = πd² J1(2πkd)/(πkd),

where J1 is a Bessel function.
Due to the circular symmetry, the inverse Fourier transform can be expressed as a Hankel transform, and one writes Eq. 3 as:

(πd²/Rp) R(ρ) = 2πd ∫ (0 to ∞) H̃(k) J1(2πkd) J0(2πkρ) dk,    (4)

where ρ is the polar radial coordinate.
To evaluate Eq. 4, one must choose an appropriate point spread function. A widely used PSF is6:

H(ρ) = (2π/ρ̄²) K0(2πρ/ρ̄),

where K0 is a modified Bessel function of the second kind. The parameter ρ̄ is ρ̄ = 4⟨ρ⟩ and ⟨ρ⟩ is the first moment of H, called the lateral scattering length. It is the average lateral distance a photon travels within the paper, and its inverse, ⟨ρ⟩⁻¹, is the approximate bandwidth of the paper. The MTF is:

H̃(k) = 1/[1 + (ρ̄k)²].    (5)

Integrating Eq. 4 using Eq. 5, one finds:

(πd²/Rp) R(ρ) = { 1 − (2πd/ρ̄) K1(2πd/ρ̄) I0(2πρ/ρ̄),  0 ≤ ρ ≤ d
                  (2πd/ρ̄) I1(2πd/ρ̄) K0(2πρ/ρ̄),       d < ρ }    (6)

where I0 and I1 are modified Bessel functions of the first kind. Figure 1 shows the radial reflectance Eq. 6, with d = 0.4r, for several different ρ̄.

Figure 1. Radial reflectance per unit area R(ρ) with d = 0.4r and (a) ρ̄ = 0.1r, (b) ρ̄ = 0.6r, (c) ρ̄ = 1.5r, and (d) ρ̄ = 4.0r.

Figure 2. The small circles are dots, and the large circle has radius ρ. Light is incident through the central dot. The value A(ρ) is the sum of the bold arc-lengths of the large circle divided by its radius. The value θ is the angle subtended by the bold arc-lengths.
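A brief numerical sketch of Eq. 6 follows, using the modified Bessel functions from scipy. The variable name rho_bar stands for ρ̄, and the default Rp = 1 is an assumption of the sketch.

```python
import numpy as np
from scipy.special import i0, i1, k0, k1

def radial_reflectance(rho, d, rho_bar, Rp=1.0):
    """Eq. 6 (sketch): radial reflectance per unit area R(rho) for light
    incident through a dot of radius d; rho_bar is the PSF parameter."""
    rho = np.asarray(rho, dtype=float)
    x = 2.0 * np.pi * d / rho_bar
    inside = 1.0 - x * k1(x) * i0(2.0 * np.pi * rho / rho_bar)
    outside = x * i1(x) * k0(2.0 * np.pi * np.maximum(rho, 1e-12) / rho_bar)
    return Rp / (np.pi * d * d) * np.where(rho <= d, inside, outside)
```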
Radial Covering Distribution
The radial covering distribution, A(ρ), is the probability
that an arbitrary point at a distance ρ from the origin is
covered by ink. The value A(ρ) is the fraction of the circumference of a circle, centered on the origin with radius ρ, that
lies on a dot. This is shown graphically in Fig. 2. The small
circles are the dots, with radius d, and the large circle has
a radius ρ. The variable A(ρ) is the sum of the bold arclengths of the large circle divided by its circumference. If
the dots overlap (d > r/2), then for some values of ρ the sum
of the arc-lengths is larger than the circumference—in this
case, all points of the large circle lie on a dot and A(ρ) = 1.
The value A(ρ) is calculated as follows: We define the
neighbor distribution N(s) as the number of dots whose
centers lie at a distance s from the origin. We define θ(s,ρ)
as the angle subtended by the arc-length covering a dot
whose center lies at a distance s, as shown in Fig. 2. Then,
the radial covering distribution is:
A(ρ) = (1/2π) ∫ (0 to ∞) N(s) θ(s, ρ) ds.    (7)
Both N(s) and θ(s, ρ) are derived in Ref. 2 and are given by:

θ(s, ρ) = { 2π,                                  0 ≤ ρ ≤ d − s
            2 arccos[(s² + ρ² − d²)/(2sρ)],      s − d ≤ ρ ≤ s + d
            0,                                   ρ ≤ s − d or ρ ≥ s + d }    (8)

and

N(s) = Σ (k = 0 to ∞) pk δ(xk − s),    (9)

where δ(x) is a Dirac delta function and xk is r√k with k a natural number, and pk is the number of combinations of integers n and m such that k = n² + m². The quantity xk is the distance to the kth “set” of dots, and pk is the number of dots in the “set”; i.e., the number of dots at a distance xk. The first few xk/r with nonzero pk are 0, 1, √2, 2, √5, √8; and the corresponding pk are 1, 4, 4, 4, 8, 4.
Carrying out the integration in Eq. 7 and defining:

A0(ρ) = { 1,  0 ≤ ρ ≤ d
          0,  ρ > d }

and for k ≥ 1:

Ak(ρ) = { (pk/π) arccos[(xk² + ρ² − d²)/(2xkρ)],  xk − d ≤ ρ ≤ xk + d
          0,                                       ρ < xk − d or ρ > xk + d }

one obtains:

A(ρ) = Σ (k = 0 to ∞) Ak(ρ).    (10)

Figure 3 shows A(ρ) for dot radius d = 0.4r. Equation 10 is correct for d ≤ r/2. If d > r/2, the right side of Eq. 10 is greater than 1 for some values of ρ, in which case one sets A(ρ) = 1.

Figure 3. The value A(ρ) with d = 0.4r.

Ink–Ink Probability
Inserting the expressions for R(ρ), Eq. 6, and A(ρ), Eq. 10, into Eq. 2, one obtains:

(πd²/Rp) Pii = 2π ∫ (0 to d) [1 − (2πd/ρ̄) K1(2πd/ρ̄) I0(2πρ/ρ̄)] ρ dρ
             + (2π)² (d/ρ̄) I1(2πd/ρ̄) Σ (k = 1 to ∞) ∫ (d to ∞) K0(2πρ/ρ̄) Ak(ρ) ρ dρ.    (11)

Integrating the first term and dividing by πd² one obtains:

1 − 2 K1(2πd/ρ̄) I1(2πd/ρ̄).    (12)

This expression is the probability that a reflected photon exits the paper through the same dot as that through which it entered the paper.
Integrating the second term, one obtains a sum of integrals of the form:

∫ (xk − d to xk + d) K0(2πρ/ρ̄) arccos[(xk² + ρ² − d²)/(2xkρ)] ρ dρ.    (13)

These integrals can be evaluated numerically with little trouble; however, it is possible to get a very accurate closed-form expression by making two approximations. The first is an approximation to the arccos[ ]:

arccos[(xk² + ρ² − d²)/(2xkρ)] → (1/ρ) √(d² − (xk − ρ)²).

The second approximation is:

K0(2πρ/ρ̄) → K0(2πxk/ρ̄) exp[−2π(ρ − xk)/ρ̄],

for xk − d ≤ ρ ≤ xk + d.
The errors in these approximations tend to cancel each other for all d and ρ̄ so that the expression

K0(2πxk/ρ̄) ∫ (−d to d) exp[−2πu/ρ̄] √(d² − u²) du    (14)

is a very accurate approximation to Eq. 13. The integral is easily evaluated, and one obtains for Eq. 14:

(ρ̄d/2) I1(2πd/ρ̄) K0(2πxk/ρ̄).    (15)

Inserting Eqs. 12 and 15 into Eq. 11, one obtains:

Rp⁻¹ Pii = 1 − 2 K1(2πd/ρ̄) I1(2πd/ρ̄) + 2 [I1(2πd/ρ̄)]² S(ρ̄),    (16)

where we define:

S(ρ̄) = Σ (k = 1 to ∞) pk K0(2πxk/ρ̄).    (17)

The second term in Eq. 16, 2[I1(2πd/ρ̄)]² S(ρ̄), is the probability that reflected photons exit the paper through dots other than the one through which they entered the paper.
It is convenient to express Pii in terms of the fractional ink coverage rather than the dot radius. The percent area covered by ink, µ, is:

µ = { π (d/r)²,                     0 ≤ d ≤ r/2
      (θ + cos θ)/(1 + sin θ),      r/2 ≤ d ≤ r/√2 }    (18)

where

θ = π/2 − 2 arccos(r/2d).
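The closed-form AM result of Eqs. 16 through 18 can be evaluated with a short sketch such as the one below; the truncation of the lattice sum in Eq. 17 and the brute-force counting of pk are assumptions made here for illustration. The sketch covers only 0 ≤ µ ≤ π/4; Eq. 19 below extends the result by linear extrapolation.

```python
import numpy as np
from scipy.special import i1, k0, k1

def lattice_sum(rho_bar, r=1.0, kmax=400):
    """Eq. 17 (sketch): S = sum over k >= 1 of p_k K0(2*pi*x_k/rho_bar),
    with x_k = r*sqrt(k) and p_k counted by brute force."""
    S = 0.0
    for k in range(1, kmax + 1):
        pk = sum(1 for n in range(-20, 21) for m in range(-20, 21)
                 if n * n + m * m == k)
        if pk:
            S += pk * k0(2.0 * np.pi * r * np.sqrt(k) / rho_bar)
    return S

def pii_am(mu, rho_bar, r=1.0, Rp=1.0):
    """Eqs. 16-18 (sketch): AM ink-ink probability for 0 <= mu <= pi/4."""
    d = r * np.sqrt(mu / np.pi)          # Eq. 18 inverted for d <= r/2
    x = 2.0 * np.pi * d / rho_bar
    S = lattice_sum(rho_bar, r)
    return Rp * (1.0 - 2.0 * k1(x) * i1(x) + 2.0 * i1(x) ** 2 * S)  # Eq. 16
```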
The expression Eq. 16 is correct for 0 ≤ µ ≤ π/4. Numerical integration of Eq. 2 indicates that linear extrapolation
of Eq. 16 for π/4 ≤ µ ≤ 1 is an excellent approximation.
One then obtains for the ink-ink probability:

Rp⁻¹ Pii(µ) = { 1 − ξ(µ),                              0 ≤ µ ≤ π/4
                1 − [(1 − µ)/(1 − µ0)] ξ(µ0),           π/4 ≤ µ ≤ 1 }    (19)

where
ξ(µ) = 2 I1(2r√(µπ)/ρ̄) [K1(2r√(µπ)/ρ̄) − I1(2r√(µπ)/ρ̄) S(ρ̄)]

and µ0 = π/4. Note that ξ(µ) is the probability that a reflected photon exits the paper through a nonink region after entering through an inked region.
Figure 4 shows Pii versus µ for several ρ̄ and Fig. 5 shows Pii as a function of ρ̄ for several µ. In the figures, ρ̄ is in units of r. As indicated by the curves in Fig. 5 and as can be shown by Eq. 16, if ρ̄ >> r, then Pii ≈ µ. This corresponds to the case of “complete scattering”.1 Figure 6 shows the first and second terms in Pii separately (as a function of µ) for ρ̄ = 1.5. Curve (a) is the probability that the light exits through the incident dot, (b) is the probability it exits through the other dots, and (c) is the sum of (a) and (b). For convenience, we have set the paper reflectance equal to unity in all the figures.

Figure 4. The value Pii as a function of µ for various ρ̄: (a) ρ̄ = 0.2r, (b) ρ̄ = 1.0r, (c) ρ̄ = 2.0r, and (d) ρ̄ = 6.0r.

Figure 5. The value Pii as a function of ρ̄ for (a) µ = 0.1, (b) µ = 0.4, (c) µ = 0.6, and (d) µ = 0.9.

Figure 6. Comparison of the first and second terms of Eq. 16 with ρ̄ = 1.5r. (a) Probability that photon exits incident dot. (b) Probability that photon exits any of the other dots. (c) Total probability that photon exits a dot, sum of (a) and (b).

FM Halftone Screen
In this section we calculate the ink-ink probability for an FM halftone screen. In such a method, all the dots have the same size and are square with dot area equal to a cell area and the number (or frequency) of dots is varied. There are a number of techniques for determining the exact placement of the dots.8 The calculation done here is general in that our final result depends only on the average number of dots within a given region; we assume that within a region of constant tone, the dots are uniformly distributed. For ease in notation we assume the paper reflectance Rp is unity; for Rp < 1, the final expression for Pii is multiplied by Rp.
We assume the dots are potentially located on a square grid array with period r. The dots are labeled by their coordinates n, m, with the photons entering the paper through the n = 0, m = 0 dot. We define Pnm as the probability that a photon having entered the paper through the dot 0, 0 exits the paper through the dot n, m. We also define the stochastic variable pnm as:

pnm = { 1, if there is a dot at n, m
        0, if there is no dot at n, m }    (20)
subject to the constraint:
lim (N → ∞) (1/N²) Σ′ (n, m = −N/2 to N/2) pnm = µ,    (21)
where the ′ on Σ indicates that the n = m = 0 term is excluded from the sum (p00 ≡ 1) and µ is the fractional ink
coverage. The left side of Eq. 21 is the average pnm so that:
⟨pnm⟩ = µ    (22)
(excepting the n = m = 0 term).
The ink–ink probability is obtained by first summing the
probability that a photon exits the paper through the n, m
cell, Pnm , over all cells that contain a dot (pnm = 1), then averaging over all realizations of the pnm consistent with Eq. 21:
Pii = ⟨ Σ (n,m) pnm Pnm ⟩ = Σ (n,m) ⟨pnm⟩ Pnm.    (23)
For a uniform distribution, the average over all possible
realizations of the pnm is equivalent to the average defined
by the left side of Eq. 21, so one can write:


Pii = µ [ Σ (n,m) Pnm − P00 ] + P00.    (24)
As we assume Rp = 1, the sum is unity:
Σ (n,m) Pnm = 1,    (25)
Figure 7. The value FM Pii as a function of µ for (a) ρ̄ = 0.2r, (b) ρ̄ = 1.0r, (c) ρ̄ = 2.0r, and (d) ρ̄ = 6.0r.

Figure 8. The value FM Pii as a function of ρ̄ for (a) µ = 0.1, (b) µ = 0.4, (c) µ = 0.6, and (d) µ = 0.9.
which simply states that the number of photons is conserved. The probability that the photons exit the same dot
as that through which they entered the paper, P00 , is given
by Eq. 12 (where we approximate the square dot with a
circular dot with area equal to cell area) with d = r/√π, so
the ink–ink probability is:
Pii = 1 − (1 − µ) χ,    (26)

with:

χ = 2 K1(2√π r/ρ̄) I1(2√π r/ρ̄).    (27)

Unlike with the AM halftone screen, the probability here is linear with µ for all ρ̄. The Pii is shown as a function of µ for several different ρ̄ in Fig. 7, and as a function of ρ̄ for several different µ in Fig. 8. For ρ̄ >> r, the AM Pii(µ) is equal to the FM Pii(µ).
Note that χ is the probability that a photon having entered the paper through a dot exits the paper outside the dot. The different terms of Pii can be interpreted by writing Eq. 26 as Pii = 1 – χ + µχ. In other words: [the probability that the photon exits through a dot (Pii)] = [the probability it exits within the dot through which it entered the paper (1 – χ)] + [the probability there is a dot located at an arbitrary point (µ)] × [the probability the photon exits the paper outside the dot through which it entered (χ)].

Ink–Ink Probability for Diffusion PSF
The MTF of the diffusion point spread function is:1

H̃(k) = (1/Rp) Σ (n = 1 to ∞) (qn/σn²) / [1 + (2πkt/σn)²],    (28)

where qn and σn are defined in Ref. 1 and t is the paper’s thickness. The paper’s reflectance is:

Rp = Σ (n) qn/σn²,

and the lateral scattering length is:

ρ̄ = [tπ/(2Rp)] Σ (n) qn/σn³.

The diffusion ink–ink probability for AM screening has the same form as Eq. 19 with ξ(µ) given by:

ξ(µ) = (2/Rp) Σ (n = 1 to ∞) (qn/σn²) I1(r√(µ/π) σn/t) [K1(r√(µ/π) σn/t) − I1(r√(µ/π) σn/t) Sn],    (29)

where Sn is given by:

Sn = Σ (k = 1 to ∞) pk K0(σn xk/t).

For FM halftone screening, Pii has the same form as Eq. 26 with χ given by:

χ = (2/Rp) Σ (n = 1 to ∞) (qn/σn²) I1((r/√π) σn/t) K1((r/√π) σn/t).    (30)
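For comparison with the AM sketch given earlier, the FM result of Eqs. 26 and 27 reduces to a few lines of code; the defaults r = 1 and Rp = 1 are assumptions of this sketch.

```python
import numpy as np
from scipy.special import i1, k1

def pii_fm(mu, rho_bar, r=1.0, Rp=1.0):
    """Eqs. 26-27 (sketch): FM ink-ink probability, with the square cell of
    period r replaced by a circular dot of equal area (d = r/sqrt(pi))."""
    x = 2.0 * np.sqrt(np.pi) * r / rho_bar   # 2*pi*d/rho_bar with d = r/sqrt(pi)
    chi = 2.0 * k1(x) * i1(x)                # Eq. 27
    return Rp * (1.0 - (1.0 - np.asarray(mu, dtype=float)) * chi)  # Eq. 26
```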
Conclusion
In this article, we explicitly calculate the probability that
a photon exits the paper through an inked region after
originally entering the paper through an inked region, and
we obtain a simple closed-form expression. This conditional
probability completely contains the effects of optical dot
gain; i.e., knowledge of this probability allows one to account for the effects of optical dot gain in a halftone print
completely. We calculate the probability for both AM and
FM halftone screening.
The results reported here also allow a simple calculation of the Z that appear in the theory of the multi-ink
halftone image.9
References
1. G. L. Rogers, Optical dot gain in a halftone print, J. Imaging Sci. Technol. 41, 643 (1997).
2. J. S. Arney, Probability description of the Yule–Nielsen effect: Part I and II, J. Imaging Sci. Technol. 41, 633 (1997).
3. G. L. Rogers, Neugebauer revisited: Random dots in halftone screening, Col. Res. Appl. 23, 104 (1998).
4. (a) J. S. Arney, C. D. Arney, and P. G. Engeldrum, J. Imaging Sci. Technol. 40, 233 (1996); (b) J. S. Arney, C. D. Arney, and Miako Katsube, J. Imaging Sci. Technol. 40, 19 (1996); (c) J. S. Arney, P. G. Engeldrum, and H. Zeng, J. Imaging Sci. Technol. 39, 502 (1995).
5. J. A. C. Yule and W. J. Nielsen, TAGA Proc. 3, 65 (1957).
6. J. C. Dainty and R. Shaw, Image Science, Academic Press, New York, 1974.
7. G. L. Rogers and R. Bell, Bessel function identity: dots in a circle, to be published.
8. See, for example, A. Zakhor, S. Lin, and F. Eskafi, A new class of B/W halftoning algorithms, IEEE Trans. Image Process. 2, 499 (1993); D. E. Knuth, Digital halftones by dot diffusion, ACM Trans. Graph. 6, 245 (1987); R. W. Floyd and L. Steinberg, An adaptive algorithm for spatial gray scale, SID ’75 Dig., Society for Information Display, 36 (1975).
9. G. L. Rogers, The effect of light scatter on halftone color, J. Opt. Soc. Am. 15, 1813 (1998).
Diffuse Transmittance Spectroscopy Study of Reduction-sensitized and
Hydrogen-hypersensitized AgBr Emulsion Coatings
Yoshiaki Oku and Mitsuo Kawasaki*†
Department of Molecular Engineering, Graduate School of Engineering, Kyoto University, Yoshida, Kyoto 606-8501, Japan
A diffuse transmittance spectroscopy method that utilizes a large-area photodetector in contact with the sample allowed the absorption spectra of reduction-sensitization centers and hydrogen-hypersensitization centers in AgBr emulsion to be measured using standard emulsion coatings. Both reduction-sensitization centers produced by dimethylamine borane and hydrogen-hypersensitization centers exhibited the common absorption band centered at 455 nm, shorter by ~20 nm than that previously ascribed by
other groups to similar reduction-sensitization centers in thick liquid layers of AgBr emulsions by means of diffuse reflectance
spectroscopy.
Journal of Imaging Science and Technology 42: 346–348 (1998)
Introduction
The formation and properties of small silver clusters in silver halide emulsions continue to be a critical issue in photographic science. Direct experimental characterizations of
these clusters are requisite to gain proper understanding
of their properties and photochemical behaviors, but are
not necessarily easy to achieve because of the extremely
small size and small concentration at which such clusters
generally function in photographic systems. In this context, Tani and Murofushi first demonstrated that a diffuse
reflectance spectroscopy method could be a promising technique to obtain the absorption spectra of reduction-sensitization centers in photographic emulsions.1 Usefulness of
the same method was confirmed by Hailstone and coworkers,2,3 though questions still remain about the identity of
the centers that give rise to the relevant absorption band
observed at ~475 nm.
Powerful as it is, diffuse reflectance spectroscopy requires
measured reflectance data be analyzed by the so-called
Kubelka−Munk equation4 to obtain a spectrum that approximately scales linearly to the real absorption coefficient of the absorbing species. Furthermore, to meet the
condition for the use of the Kubelka−Munk transform, derived for an ideal scattering layer with infinite thickness,
the previous measurements invariably involved thick layers of liquid emulsions. Thus the corresponding spectroscopic data may not necessarily be correlated directly with
a variety of other experimental data, most of which were
obtained by using standard emulsion coatings. From this
standpoint, we briefly introduce here a diffuse transmittance spectroscopy method, which is an alternative technique to obtain reliable spectroscopic data for small silver
Original manuscript received July 27, 1997
* IS&T Member; Corresponding Author
† e-mail: [email protected] FAX: (+81)75-753-5526
© 1998, IS&T—The Society for Imaging Science and Technology
clusters that are present in arbitrary emulsion coatings on
a clear support.
Experimental
Figure 1 shows the components and simple optical geometry of the model spectrometer constructed for the present
purpose. Automatic wavelength scan and data acquisition
are not presently available in this system so the measurement was done point-by-point at 5 to 10-nm intervals by
manual adjustment of the monochromator. The data were
then subject to interpolation in a personal computer to obtain a smoothed spectrum. In Fig. 1, a stack of sample films
is mounted in close contact with an end-on photomultiplier
tube (Hamamatsu Co., Hamamatsu, Japan, type R375) with
a large aperture size ~50 mm in diameter; a geometry that
ensures capturing of the majority of the diffuse transmitted
light by the photomultiplier. The test emulsion coatings consist of monodisperse 0.45-µm octahedral AgBr grains, coated
on a clear support at silver and gelatin coverages of 1.0 g/m2
and 2.0 g/m2, respectively. It is this comparatively low silver
coverage, giving the maximum developable density of ~0.9,
that necessitated the use of a stacked film sample to improve the signal-to-noise ratio. We limited, however, each
stack to a maximum of 10 sample films, resulting in the total sample thickness of ~2.5 mm including the film base.
For the principle of the measurement, it should be noted
first that the light scattering characteristic of the sample
is controlled by the large number of AgBr emulsion grains
regardless of the presence of a small amount of extra absorbing species. In addition, one may expect the effective
path lengths across such a highly scattering layer to be
approximately equal for all the diffuse transmitted photons at the given thickness of the sample. In this condition,
when the diffuse transmittance of a control sample with no
silver clusters is measured as T0 and that of a sample with
a small concentration of silver clusters as T, their absorptivity may be expressed simply by the diffuse absorbance,
as defined by −log T/T0. This could also be supported experimentally, as the measured absorbance proved to scale
linearly to the total number of coatings stacked together.
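A minimal sketch of the diffuse-absorbance definition −log T/T0 and of its expected linear scaling with the number of stacked coatings (assuming NumPy; the numerical values are placeholders, not measurements):

# Diffuse absorbance as defined in the text: A = -log10(T / T0), where T0 is the
# transmittance of a control stack and T that of a stack containing the extra
# absorbing species. Values below are placeholders for illustration only.
import numpy as np

def diffuse_absorbance(T, T0):
    return -np.log10(np.asarray(T, dtype=float) / np.asarray(T0, dtype=float))

# If each coating contributes equally, A should scale linearly with the number
# of coatings stacked together; this is the experimental check noted in the text.
n_coatings = np.array([2, 4, 6, 8, 10])
A_per_coating = 0.004                       # hypothetical per-coating absorbance
T_over_T0 = 10.0 ** (-A_per_coating * n_coatings)
print(diffuse_absorbance(T_over_T0, 1.0))   # proportional to n_coatings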
Figure 1. Experimental setup for diffuse transmittance measurement, with a stack of emulsion coatings mounted in close contact with
an end-on photomultiplier.
Figure 3. A series of diffuse absorbance spectra of reduction-sensitization centers produced by DMAB, measured for a stack of 10
sample coatings. The number attached to each spectrum refers to
the initial DMAB concentration in mg/mol-Ag. The inset shows
the relationship between the peak absorbance at 455 nm and the
initial DMAB concentration.
Figure 2. Sensitometric data (relative speed and fog density) for a series of (a) DMAB-sensitized and (b) hydrogen-hypersensitized samples as a function of DMAB concentration or time of hydrogen treatment. The sensitivity refers to the density, 0.2 above the fog level, obtained by 1-s blue exposure followed by 10-min development in an M-AA-1 surface developer at 20°C.
The present method has been favorably tested for a series of reduction-sensitized samples by dimethylamine borane (DMAB), as well as for hydrogen-hypersensitized5 coatings. Note that no spectroscopic information about the products of hydrogen hypersensitization has been made available in the previous diffuse reflectance works on liquid emulsions.1–3 The reduction sensitization (before coating) was carried out at 70°C for 40 min at selected DMAB concentrations ranging from 0.1 to 1.0 mg/mol-Ag. The hydrogen hypersensitization involved a 1-atm pure H2 atmosphere maintained at ~50°C, in which an unsensitized coating, evacuated by a turbo molecular pump for ~14 h in advance, was kept for 1 to 5 h.
Results and Discussion
Figure 2 shows the sensitometric data for the series of
DMAB-sensitized and hydrogen-hypersensitized coatings.
It can be seen that the hydrogen hypersensitization allowed
noticeably higher sensitivity to be reached at lower fog density as compared with the DMAB sensitization.
Figure 3 shows a series of diffuse absorbance spectra
taken for the DMAB-sensitized samples. Another series of
spectra obtained for the hydrogen-hypersensitized coatings
is presented in Fig. 4. In both cases well-defined absorption bands are clearly resolved, the peak position being invariably located at 455 nm. This is substantially shorter
by ~20 nm than that previously ascribed to similar reduction-sensitization centers in liquid phase emulsions, which
we tentatively attribute to different spectroscopic environments of the relevant silver clusters in liquid emulsions
and in emulsion coatings.
Figure 3 (see inset) suggests that the peak absorbance
at 455 nm is an approximately linear function of the initial
DMAB concentration. Because the diffuse absorbance is
Figure 4. As in Fig. 3 but for hydrogen-hypersensitization centers.
Numbers on the spectra show the time/h of hydrogen treatment.
expected to be proportional to the concentration of the extra absorbing species, as noted already, the result indicates
that the number of reduction-sensitization centers produced
by the reaction involving DMAB increases linearly with
the initial DMAB concentration. Regardless of the exact
kinetics and stoichiometry involved, this would be a reasonable relationship simply obeying the law of mass action, if the sensitizing reactions involving DMAB are
identical and complete at all sensitizer levels. According to
Fig. 4 the 455-nm absorption band also grows linearly with
the time of hydrogen hypersensitization. This is another
reasonable relationship, suggesting that the corresponding reaction rate has been maintained approximately constant over the range of reaction time examined here. Note
that the intensity of the ~455-nm absorption is considerably smaller as a whole for the hydrogen-hypersensitized
coatings than for the DMAB-sensitized ones, as opposed to
the trend in the photographic sensitivity (cf. Fig. 2); an interesting correlation requiring further pursuit.
Of course, the identity of the centers that give rise to the
455-nm absorption band (or 475 nm as observed for liquid
emulsions) may still be a controversial issue. Tani and
Murofushi have associated this absorption band exclusively
with what they referred to as P centers,1 which they claimed
to be capable of trapping photoelectrons. In contrast, Hailstone and coworkers have suggested that the same absorption band may be a more general feature associated with
reduction-sensitization centers as a whole, of which the
predominant function is hole removal.2,3 In our opinion,
some silver centers with an electron-trapping property that
somehow resembles that of the photolytic subimage center
do form at high enough levels of DMAB sensitization, where
the photographic behaviors of the sensitized film rather
resemble those of sulfur-sensitized emulsions,6 and by prolonged development the fog density gradually but steadily
increases up to the maximum developable density. (At the
highest DMAB concentration of 1.0 mg/mol-Ag, the fog density increased to ~0.4 to 0.8 at 40 to 80 min development in
the M-AA-1 developer.) Even so, such P-like centers do not
seem to be totally responsible for the 455-nm absorption
bands, which clearly showed up even at the lowest level of
DMAB sensitization and also by hydrogen hypersensitization. In this aspect our results may be in closer agreement
with the work of Hailstone and co-workers.
Conclusion
In summary, a diffuse transmittance spectroscopy method
that utilizes a large-area photodetector in contact with the
sample has proved a simple and reliable method to obtain
direct spectroscopic data for reduction-sensitization centers that are present in the standard emulsion coatings.
By this method, we were able to show that both reductionsensitization centers produced by dimethylamine borane
and hydrogen-hypersensitization centers have the common
absorption band centered at 455 nm.
Acknowledgment. We thank Dr. Judith M. Harbison (retired) and Dr. Marian Henry of the Eastman Kodak Company for supplying the series of DMAB-sensitized sample
coatings.
References
1. T. Tani and M. Murofushi, J. Imaging Sci. Technol. 38, 1 (1994).
2. S. Guo and R. K. Hailstone, J. Imaging Sci. Technol. 40, 210 (1996).
3. A. G. DiFrancesco, M. Tyne, C. Pryor, and R. K. Hailstone, J. Imaging Sci. Technol. 40, 576 (1996).
4. P. Kubelka and F. Munk, Z. Tech. Phys. 12, 593 (1931).
5. T. A. Babcock, P. M. Ferguson, W. C. Lewis, and T. H. James, Photogr. Sci. Eng. 19, 49 (1975).
6. M. Kawasaki and H. Hada, J. Imaging Sci. 33, 21 (1989).
Silver Clusters of Photographic Interest III. Formation of Reduction-Sensitization Centers in Emulsion Layers on Storage and Mechanism for Stabilization by TAI
Tadaaki Tani,* Naotsugu Muro and Atsushi Matsunaga
Ashigara Research Laboratories, Fuji Photo Film Co., Ltd., 210 Minami-Ashigara, Kanagawa, 250-0193, Japan
Comparison of the increase in sensitivity of AgBr emulsion layers caused by their storage in Ar gas at 70°C for 72 h with that caused
by reduction-sensitization of the same emulsions revealed that R centers of reduction-sensitization were formed on the emulsion
grains during the storage. The driving force for the electron transfer from gelatin to AgBr grains on storage for formation of R centers
was confirmed by ultraviolet photoelectron spectroscopy (UPS) measurement, whereby the Fermi level of gelatin was higher than that
of AgBr. It was however proposed for the formation of R centers on storage that the electron transfer took place by a two-electron
process without creating any free electron or any single silver atom in the AgBr grain. Fog centers as well as reduction-sensitization
centers were formed on sulfur-plus-gold-sensitized grains during storage. A stabilizer 4-hydroxy-6-methyl-1,3,3a,7-tetraazaindene
(TAI) depressed the sensitivity increase and fog formation in layers of sulfur-plus-gold-sensitized AgBr emulsions on storage. The
observation provided evidence for the mechanism of stabilization of emulsion layers by TAI, according to which TAI depresses sensitometric change on storage by preventing formation of reduction-sensitization centers.
Journal of Imaging Science and Technology 42: 349–354 (1998)
Introduction
The formation of reduction-sensitization centers owing to
the reduction of silver ions on silver halide grains by gelatin was observed in several phenomena including reduction-sensitization by digesting emulsions at low pAg (i.e.,
silver digestion) as reported by Wood,1 formation of silver
clusters during precipitation of silver halide emulsion
grains as suggested by Pouradier2 and confirmed by Tani
and Suzumoto3 and by Nakatsugawa and Tani,4 and reduction-sensitization by digesting emulsions at high pH
as reported by DeFrancisco and coworkers.5
The formation of reduction-sensitization centers owing
to the reduction of silver ions on silver halide grains by
gelatin is important for understanding various photographic phenomena, because photographic emulsions are
composed of suspensions of silver halide grains in gelatin
or in aqueous gelatin solutions and reduction-sensitization centers have significant influence on various photographic phenomena.6 However, little analysis has been
undertaken on the formation of reduction-sensitization
centers as a result of the reduction of silver ions on silver
halide grains by gelatin.
In this series of investigations,7–9 reduction-sensitization of fine grain AgBr emulsions was studied by digesting them in the presence of DMAB and other sensitizers
and by characterizing reduction-sensitization centers on
Original manuscript received November 12, 1997.
* IS&T Fellow
© 1998, IS&T—The Society for Imaging Science and Technology
the basis of measurements of sensitometry of emulsion
layers, photoconductivity and ionic conductivity of emulsion grains, and diffuse reflectance spectra of emulsions.
As a result of the above experiments, reduction-sensitization centers acting as positive hole traps and electron traps
could be separately prepared and characterized according
to our proposal, as dimers of silver atoms formed at electrically neutral sites and at positively charged kink sites
(i.e., R centers and P centers), respectively.
The present investigation was undertaken to confirm
formation of reduction-sensitization centers in emulsion
layers on storage by observing the phenomenon under such
a simplified condition that nothing but the interaction of
AgBr grains with gelatin could be the cause for the phenomenon taking place. The result was also compared with
the phenomena caused by normal reduction-sensitization,
which were well established in this series of investigations. The materials and experimental conditions employed
in this study were the same as those in the previous investigations so that the observed phenomena could be
analyzed on the basis of established results.7–9
We believe that the formation of reduction-sensitization
centers on storage causes the photographic instability of
emulsion layers, because it should increase their sensitivity and might cause the formation of fog centers, especially in sulfur-plus-gold-sensitized emulsions. We also
believe that a compound that depresses the formation of reduction-sensitization centers in an emulsion layer on storage photographically stabilizes the emulsion.
Birr discovered 4-hydroxy-6-methyl-1,3,3a,7-tetraazaindene (TAI) as a stabilizer, which depresses sensitometric
change of an emulsion on storage.10 It was noted that sulfur-plus-gold sensitization was realized in practice, when
Koslowski succeeded in stabilizing gold-enhanced fog using TAI.11 The stabilizing effect of TAI has been studied
mainly in relation to the change in the condition of sulfur-sensitization centers on storage.6 Although an idea for the stabilizing effect of TAI in relation to the change in the condition of reduction-sensitization on storage was noted,11a no evidence for such a change in emulsion layers on storage was described.
The results obtained in this investigation could verify
formation of reduction-sensitization centers on AgBr grains
in emulsion layers on storage and provide the grounds for
proposal of a mechanism for stabilization of emulsion layers, according to which stabilizers depress sensitometric
change on storage by limiting the formation of reductionsensitization centers.
Experimental
The emulsions used were the same as those used in the
previous investigation,7,8 composed of octahedral or cubic
AgBr grains with equivalent circular diameter of 0.2 µm
suspended in aqueous solutions of an inert and deionized
gelatin provided by Nitta Gelatin Co., Ltd., (Yao, Osaka)
and prepared at pH 2 at 75°C for 60 min by a controlled
double-jet method12,13 to minimize formation of reductionsensitization centers during precipitation.3,4 These emulsions were reduction-sensitized by digesting them at 60°C
for 60 min in the presence of various amounts of
dimethylamine borane (DMAB). They were sulfur-plus-goldsensitized by digesting them at 60°C for 60 min in the presence of various amounts of sodium thiosulfate as a sulfur
sensitizer, potassium chloroaurate as a gold sensitizer, and
potassium thiocyanate as a stabilizer for a gold sensitizer.
The above-stated emulsions were coated on triacetate
cellulose (TAC) film base with 1.74 g of AgBr/m2 and 1.27
g of gelatin/m2 and used as film samples for the measurements of sensitometry and microwave photoconductivity.
The film samples were exposed at room temperature for
10 s to a tungsten lamp (color temperature: 2856 K)
through a continuous wedge. Exposed films were developed14 at 20°C for 10 min by use of a surface developer
MAA-1, fixed, washed, dried, and subjected to the measurement of optical density. Photographic sensitivity of a
film sample was given by the reciprocal of the exposure to
give the optical density of 0.1 above fog density.
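A minimal sketch of this speed criterion (assuming NumPy and a monotonic characteristic curve; the function and the example curve are illustrative only, not measured data):

# Speed = reciprocal of the exposure giving a density 0.1 above fog, interpolated
# from a characteristic (D-log E) curve.
import numpy as np

def photographic_speed(exposure, density, fog, criterion=0.1):
    """exposure and density describe a D-log E curve with density increasing with
    exposure; returns 1/E at D = fog + criterion by log-linear interpolation."""
    log_e_at_target = np.interp(fog + criterion, density, np.log10(exposure))
    return 1.0 / (10.0 ** log_e_at_target)

E = np.logspace(-3, 0, 20)          # relative exposure
D = 0.05 + 2.0 / (1.0 + 0.05 / E)   # toy monotonic curve with fog ~ 0.05
print(photographic_speed(E, D, fog=0.05))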
To observe the formation of reduction-sensitization centers on AgBr grains in emulsion layers on storage, film
samples were evacuated, stored at a fixed temperature
for a fixed period in Ar, and subjected to the above-stated
sensitometry.
The photoconductivity of AgBr grains in the emulsion
layers was measured at -100°C by means of a 9-GHz microwave photoconductivity apparatus.15,16 Ultraviolet photoelectron spectroscopy (UPS) was also applied to an
evaporated thin AgBr layer and a thin gelatin layer to
measure the Fermi levels of AgBr and gelatin. A UPS apparatus was used with the retardation potential technique
designed under the guidance of Seki.17
Results and Discussion
Figure 1 shows the photographic sensitivity, fog density,
and microwave photoconductivity of octahedral AgBr grains
with equivalent circular diameter of 0.2 µm in emulsions,
which were reduction-sensitized by digesting them at 60°C
for 60 min in the presence of DMAB in the amount indicated on the abscissa. Sensitivity increased through two
steps with increasing amounts of DMAB. The photoconductivity of the emulsion grains was hardly influenced by the sensitization centers bringing about the sensitivity increase in the first step and was significantly decreased by the sensitization centers bringing about the sensitivity increase in the second step.
Figure 1. Photographic sensitivity (S, ), fog density (×), and photoconductivity (σ, ) of octahedral AgBr emulsions, unsensitized and reduction-sensitized at 60°C for 60 min in the presence of DMAB with the amount indicated in the abscissa.
This result provided the evidence for the
idea proposed in the series of these investigations8,9 that
the sensitivity increases in the first and second steps were
ascribed to the effects caused by hole-trapping R centers
and electron-trapping P centers, respectively.
Figure 2 shows photographic sensitivity and fog density of unsensitized and reduction-sensitized octahedral
AgBr emulsion layers stored at 70°C for 0, 72, and 96 h in
dry Ar gas. The reduction-sensitization was carried out
by digesting the emulsions at 60°C for 60 min in the presence of DMAB in the amount shown on the abscissa. In the
latter case, sensitivity increased through two steps with
increasing amounts of DMAB in accordance with the previous papers,8 and as seen in Fig. 1. The sensitivity increases in the first and second steps were thus ascribed to
the effects caused by R and P centers of reduction-sensitization, respectively.
As seen in Fig. 2, storage in dry Ar gas at 70°C for 72 and
96 h significantly increased the sensitivity of the
unsensitized emulsion, and the sensitivity increase was
detected and saturated on storage at 4 and 72 h, respectively. After storage in dry Ar gas, the sensitivity increase
caused by R centers of reduction-sensitization could hardly
be observed, whereas the sensitivity increase caused by P
centers of reduction-sensitization remained. The sensitivity achieved by storage in dry Ar gas was nearly the same
as that achieved by R centers of reduction-sensitization.
Figure 2. Photographic sensitivity (S) and fog density of unsensitized and reduction-sensitized octahedral AgBr emulsion layers stored in dry Ar gas at 70°C for 0(), 72(), and 96 h (×). The reduction-sensitization was carried out by digesting the above-stated emulsions in the presence of DMAB in the amounts indicated in the abscissa.
Figure 3. Photographic sensitivity (S) and fog density of unsensitized and reduction-sensitized cubic AgBr emulsion layers, stored in dry Ar gas at 70°C for 0(), 72(), and 96 h (×). The reduction-sensitization was carried out by digesting the above-stated emulsions in the presence of DMAB in the amounts indicated in the abscissa.
We propose from these results that R centers of reduction-sensitization were formed on the unsensitized emulsion
grains on their storage in dry Ar gas at 70°C for 72 and
96 h owing to reduction of silver ions on the grains by gelatin, because the emulsion grains were surrounded only by
gelatin in an inactive atmosphere during storage.
Following the results shown in Fig. 2, Fig. 3 shows photographic sensitivity and fog density of unsensitized and
reduction-sensitized cubic AgBr emulsion layers, which
were stored in dry Ar at 70°C for 72 and 96 h. Sensitivity
increased through two steps by reduction-sensitization
with increasing amounts of DMAB in accordance with the
results in the previous article.8 The sensitivity increases
in the first and second steps were ascribed to the effects
caused by R and P centers, respectively.
In a similar fashion to the results with octahedral AgBr
emulsions, the sensitivity of unsensitized cubic AgBr emulsions significantly increased by storage in dry Ar gas. After storage in dry Ar gas, the sensitivity increase caused
by R centers of reduction-sensitization was hardly observed, whereas the sensitivity increase caused by P centers remained. The sensitivity achieved by the storage in
dry Ar gas was nearly the same as that achieved by R
centers of reduction-sensitization. We likewise propose
that R centers of reduction-sensitization were also formed
on cubic AgBr grains in emulsion layers owing to reduction of silver ions on the grains by gelatin on storage in
dry Ar gas.
Following the results shown in Fig. 2, Fig. 4 shows photoconductivity along with photographic sensitivity of
unsensitized and reduction-sensitized octahedral AgBr
grains in emulsion layers stored in dry Ar gas at 70°C for
72 h. As seen here, storage in dry Ar gas hardly influenced
the photoconductivity of unsensitized grains in contrast
to the fact that the storage significantly increased the grain
sensitivity. This result also supports the idea that the storage of the emulsion layers in dry Ar gas caused the formation of R centers of reduction-sensitization on the grains.
Figure 5 shows photographic sensitivity and fog density of sulfur-plus-gold-sensitized emulsion layers stored
in dry Ar gas at 70°C for 0 and 18 h. The sulfur-plus-gold
sensitization was carried out by digesting the emulsions
at 60°C for 60 min in the presence of sulfur and gold sensitizers with the amounts indicated in the abscissa. As
shown, storage increased the sensitivities and fog densities of the sulfur-plus-gold-sensitized emulsions. This result suggests that storage caused the formation of
reduction-sensitization centers on the emulsion grains.
Most of the reduction-sensitization centers contributed to
the sensitivity increase, but some were converted by gold
ions into fog centers composed of gold atoms.
Following the results shown in Fig. 5, Fig. 6 shows photographic sensitivity and fog density of sulfur-plus-gold-sensitized octahedral AgBr emulsion layers without and with
TAI stored in dry Ar gas at 70°C for 18 h. The increases in
sensitivity and fog density of the sulfur-plus-gold-sensitized
emulsions caused by the storage in dry Ar gas were less
owing to the addition of TAI to the emulsions.
To obtain knowledge of the driving force for the reduction of silver ions on AgBr by gelatin in the dry and inactive condition, UPS was applied to a thin layer of gelatin
and an evaporated AgBr layer to obtain their electronic
states according to the procedure reported in the literature17 using the apparatus described in the Experimental
section. Although a dried gelatin layer in the ground state
is an insulator, it has its own Fermi level that can be determined by means of UPS because the gelatin is electronically excited and can therefore exchange electrons
with AgBr to equalize the Fermi levels during the measurement of their UPS.17 In gelatin in an emulsion layer,
there are some reducing substituents and/or impurities that
can transfer electrons to silver halide to form silver clusters during storage. It is not clear at present how those
substituents and/or impurities in a gelatin layer are related to the gelatin’s Fermi level.
Figure 4. Photographic sensitivity (S) and photoconductivity of unsensitized and reduction-sensitized octahedral AgBr grains in emulsion layers, stored in dry Ar gas at 70°C for 0() and 72 h (). The reduction-sensitization was carried out by digesting the above-stated emulsions in the presence of DMAB in the amounts indicated in the abscissa. The photoconductivity of the emulsion grains was measured at –100°C by a microwave photoconductivity apparatus.
The resulting electronic energy level diagram of gelatin
and AgBr is shown in Fig. 7. As indicated, it was found
that the Fermi level of gelatin was higher than that of
AgBr, indicating the presence of a driving force for electron transfer from gelatin to AgBr when they are in contact with each other. Based on this result, we propose that
the reduction of silver ions on AgBr grains by gelatin occurs because of the electron transfer from gelatin to AgBr
grains in emulsion layers during storage.
Discussion
Formation of Silver Clusters. As described in the previous section, reduction-sensitization centers were formed
on AgBr emulsion grains during storage of emulsion layers owing to reduction of silver ions on the grains by gelatin.
Figure 5. Photographic sensitivity (S) and fog density of sulfur-plus-gold-sensitized octahedral silver bromide emulsion layers, stored in dry Ar gas at 70°C for 0() and 18 h (). The sulfur-plus-gold sensitization was carried out by digesting the emulsions at 60°C for 60 min in the presence of sulfur and gold sensitizers with the amounts indicated in the abscissa.
This phenomenon should be very important for the sensitivity and stability of silver halide photographic materials, because silver halide grains are always surrounded
by gelatin and reduction-sensitization centers have significant influence on the sensitivity and stability of the
photographic materials.
We suggest that this phenomenon involved the formation of silver clusters brought about by reduction of silver
ions on AgBr grains as a result of electron transfer from
gelatin to the grains. It seems meaningful to compare electron transfer to AgBr by this reduction reaction to the formation of silver clusters with light-induced electron
transfer from photoexcited sensitizing dyes to AgBr. Note
that formation of reduction-sensitization centers during
storage was very slow. The sensitivity increase by the reduction-sensitization centers was detected and saturated
on storage in dry Ar gas at 70°C at 4 and 72 h, respectively. But electron transfer from sensitizing dyes in the excited state to AgBr emulsion grains takes place in several picoseconds when the energy level of the excited electrons in a dye is18 above the bottom of the conduction
band of AgBr. We therefore propose that the energy level
of the electrons transferred from gelatin to AgBr emulsion grains should be much lower than the bottom of the
conduction band of AgBr.
Figure 7. The UPS energy level diagram of thin layers of AgBr and gelatin, where broken lines indicate their Fermi levels. The UPS measurement also gave the top of the valence band of AgBr and the highest occupied electronic energy level of gelatin as –6.30 and –6.84 eV, respectively. By taking21 the band gap of AgBr as 2.60 eV, the bottom of the conduction band of AgBr was evaluated to be –3.70 eV.
Figure 6. Photographic sensitivity (S) and fog density of sulfur-plus-gold-sensitized octahedral silver bromide emulsion layers without (,) and with (, ) TAI, stored in dry Ar gas at 70°C for 0(,) and 18 hours (,). The sulfur-plus-gold sensitization was carried out by digesting the emulsions at 60°C for 60 min in the presence of sulfur and gold sensitizers with the amounts indicated in the abscissa. The amount of TAI added to the above-stated emulsions was 2 × 10–2 mole/mole AgBr.
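The energy values quoted in the caption of Fig. 7 can be cross-checked with a short calculation (illustrative only; variable names are ours):

# Bottom of the AgBr conduction band from the values in the Fig. 7 caption.
E_valence_top = -6.30        # eV, top of the AgBr valence band from UPS
E_gap = 2.60                 # eV, band gap of AgBr taken in the caption (Ref. 21)
E_conduction_bottom = E_valence_top + E_gap
print(round(E_conduction_bottom, 2))   # -3.7 eV, as quoted in the caption

E_gelatin_homo = -6.84       # eV, highest occupied electronic level of gelatin
print(E_gelatin_homo < E_valence_top)  # True: the gelatin HOMO lies below the AgBr
                                       # valence-band top; the driving force discussed
                                       # in the text is the higher Fermi level of gelatin.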
It has been shown that Lowe's hypothesis7 really takes place according to the following steps: Namely, an R center is oxidized by a positive hole to give Ag2+ (step 1), which undergoes ionic relaxation to give a silver atom and an interstitial silver ion (step 2). A silver atom then dissociates to give a free electron and an interstitial silver ion (step 3):
Ag2(R center) + h+ → Ag2+,   (1)
Ag2+ → Ag + Ag+,   (2)
Ag → Ag+ + e–.   (3)
Step 2 was originally proposed by Mitchell,18 and Step 3
was proposed according to the analysis of low-intensity
reciprocity failure.19 Accordingly, appearance of free electrons leads to the formation of a silver cluster acting as
an electron trap (i.e., P center) on a grain following the
mechanism of the formation of a latent image center.7
Namely, only one P center is formed and grown on such a
fine AgBr grain according to the concentration principle7
when free electrons take part in the formation of the silver cluster.
It was obvious in this investigation that the electrons transferred from gelatin to an AgBr grain during storage were disposed to form many R centers on the grain. We therefore suggest that the electron transfer from gelatin to an AgBr grain takes place by creating neither free electrons nor free silver atoms in the grain, which is contrary to the electron transfer in spectral sensitization that creates free electrons in an AgBr grain.20 Namely, there is an essential difference between the electron transfer processes by the reduction reaction and the light-induced electron transfer (i.e., spectral sensitization) for the formation of silver clusters.
To analyze the proposed difference, it is meaningful to compare the following two electron transfer processes:
R + Ag+ → R+ + Ag,   (4)
R + 2Ag+ → R2+ + Ag2.   (5)
The electron transfer for spectral sensitization corresponds to process 4. It is proposed in this article that the
electron transfer for the formation of a reduction-sensitization center by the reduction reaction corresponds to process 5.
In the case of spectral sensitization, R in Eq. 4 corresponds to a sensitizing dye molecule in the excited state
in which only one electron is excited to the higher molecular orbital and available for the electron transfer within
the lifetime of the excited dye molecule. Therefore, one
electron is transferred from the excited dye molecule to
the conduction band of silver halide in spectral sensitization contributing to the formation and growth of one silver cluster acting as an electron trap on a grain.
But formation of a silver cluster by thermal reduction
proceeds not through process 4, but through process 5
owing to the following reason: In process 5, R is in the
ground state and has two electrons in the highest occupied molecular orbital both of which are available for electron transfer. In addition, a dimer of two silver atoms is
much more stable than two isolated silver atoms owing to
the large binding energy of the dimer.7–9,21 It is therefore
expected that the formation energy of a silver cluster
through process 5 is smaller than that through process 4
leading directly to the formation of a silver cluster by the
reduction reaction.
No analysis could be made in this investigation on the
chemical composition of the gelatin that transferred electrons to AgBr emulsion grains during storage. However,
the difference in Fermi level between gelatin and AgBr
provides the driving force for electron transfer from gelatin to the grains. Namely, the Fermi level of AgBr is situated nearly at the middle of its forbidden band, lower than
the Fermi level of gelatin. Electron transfer from gelatin
to AgBr emulsion grains and formation of reduction-sensitization centers there could accordingly take place, raising the Fermi level of the grains until it became equal to
that of gelatin. The detailed elementary processes for
achieving process 5 could not however be clarified in this
study.
Mechanism of Stabilization by TAI. As described in the
previous section, reduction-sensitization centers were
formed on AgBr emulsion grains during storage of emulsion layers owing to two-electron reduction of silver ions on
the grains by gelatin. This phenomenon should be very
important for stability of silver halide photographic materials because formation of reduction-sensitization centers
during storage should have significant influence on both
sensitivity and fog density of photographic materials.
As Yoshida, Mifune, and Tani reported,22 it is known that
application of gold sensitization to reduction-sensitized
emulsions causes formation of fog centers by converting
reduction-sensitization centers into clusters of gold atoms
that act as fog centers owing to the development-enhancing effect of gold atoms.6 It is therefore expected that formation of reduction-sensitization centers on silver halide
grains in sulfur-plus-gold-sensitized emulsion layers during storage might cause an increase in fog density in addition to the increase in sensitivity. As shown in Fig. 5,
storage in Ar gas actually increased the sensitivity and
fog density of sulfur-plus-gold-sensitized emulsions. This
result indicates that storage caused the formation of reduction-sensitization centers on the emulsion grains, most
of which contributed to the sensitivity increase, and some
of which were converted into fog centers composed of gold
atoms.
As shown in Fig. 6, increases in the sensitivity and fog
density of the sulfur-plus-gold-sensitized emulsions by storage in dry Ar gas were smaller because of the addition of TAI to
the emulsions. Namely, TAI could stabilize sulfur-plus-goldsensitized emulsion layers by inhibiting the formation of
reduction-sensitization centers during storage. As indicated
by Eq. 5, the formation of reduction-sensitization centers
during storage depends upon the activity of silver ions. It is
well known that TAI decreases the activity of silver ions by
forming its silver salt on the grain surface.6,11a Thus TAI
could depress the formation of reduction-sensitization centers on emulsion grains during storage.
References
1. H. W. Wood, J. Photogr. Sci. 6, 33 (1958).
2. J. Pouradier, J. Soc. Photogr. Sci. Technol. Jpn. 54, 464 (1991).
3. T. Tani and T. Suzumoto, J. Imaging Sci. Technol. 40, 56 (1996).
4. H. Nakatsugawa and T. Tani, Formation of silver clusters due to reduction of silver ions by gelatin during precipitation and digestion processes of silver halide emulsion grains, in the preprint book of the Autumn meeting of Soc. Photogr. Sci. Technol. Jpn., Nov. 1996, pp. 14–16.
5. A. G. DeFrancisco, M. Tyne, C. Pryor, and R. Hailstone, J. Imaging Sci. Technol. 40, 576 (1996).
6. T. Tani, Photographic Sensitivity: Theory and Mechanisms, Oxford University Press, New York, 1995, Chap. 6.
7. T. Tani, Photographic Sensitivity: Theory and Mechanisms, Oxford University Press, New York, 1995, Chap. 4.
8. T. Tani and M. Murofushi, J. Imaging Sci. Technol. 38, 1 (1994).
9. T. Tani, J. Imaging Sci. Technol. 41, 577 (1997).
10. E. J. Birr, Z. Wissensch. Photogr. 47, 2 (1952).
11. (a) M. H. Van Doorselaer, J. Imaging Sci. Technol. 37, 524 (1993); (b) F. W. H. Mueller, J. Opt. Soc. Am. 31, 499 (1949); (c) R. Koslowski, Z. Wissensch. Photogr. 46, 65 (1951); (d) K. Meyer, Z. Wissensch. Photogr. 47, 1 (1952).
12. C. R. Berry and D. C. Skillman, Photogr. Sci. Eng. 6, 159 (1962).
13. (a) E. Klein and E. Moisar, Photogr. Wiss. 11, 3 (1962); (b) E. Klein and E. Moisar, Ber. Bunsenges. Phys. Chem. 67, 349 (1963); (c) E. Moisar and E. Klein, Ber. Bunsenges. Phys. Chem. 67, 949 (1963); (d) E. Moisar, J. Soc. Photogr. Sci. Technol. Jpn. 54, 273 (1991).
14. T. H. James, W. Vanselow and R. F. Quirk, Photogr. Sci. Technol. 19B, 170 (1953).
15. (a) L. M. Kellogg, N. B. Libert and T. H. James, Photogr. Sci. Eng. 16, 115 (1972); (b) L. M. Kellogg, Photogr. Sci. Eng. 18, 378 (1974).
16. T. Kaneda, J. Imaging Sci. 33, 115 (1989).
17. K. Seki, H. Yanagi, Y. Kobayashi, T. Ohta, and T. Tani, Phys. Rev. B49, 2760 (1994).
18. (a) J. W. Mitchell, Recent Progr. Phys. 20, 433 (1957); (b) J. W. Mitchell, J. Phys. Chem. 66, 2359 (1962); (c) J. W. Mitchell, Photogr. Sci. Eng. 25, 170 (1981).
19. (a) P. C. Burton and W. F. Berg, Photogr. J. 86B, 2 (1946); (b) P. C. Burton, Photogr. J. 86B, 62 (1946); (c) W. F. Berg and P. C. Burton, Photogr. J. 88B, 84 (1948); (d) P. C. Burton, Photogr. J. 88B, 13 (1948); (e) P. C. Burton, Photogr. J. 88B, 123 (1948); (f) W. F. Berg, Rep. Prog. Phys. 11, 248 (1948).
20. T. Tani, Photographic Sensitivity: Theory and Mechanisms, Oxford University Press, New York, 1995, Chap. 5.
21. (a) J. W. Mitchell, Photogr. Sci. Eng. 22, 1 (1978); (b) J. W. Mitchell, Imaging Sci. J. 45, 2 (1997).
22. Y. Yoshida, H. Mifune and T. Tani, J. Soc. Photogr. Sci. Technol. Jpn. 59, 541 (1996).
23. T. Tani, Photographic Sensitivity: Theory and Mechanisms, Oxford University Press, New York, 1995, Chap. 3.
A New Crystal Nucleation Theory for Continuous Precipitation of
Silver Halides*
Ingo H. Leubner†
Imaging Research and Advanced Development, Eastman Kodak Company, Rochester, New York 14650-1707
A new steady state theory of crystallization in the continuous stirred tank reactor (CSTR), or mixed-suspension mixed-productremoval (MSMPR), system was developed based on a dynamic balance between growth and nucleation. The present model was (but is
not) limited to nonseeded systems with homogeneous nucleation, diffusion controlled growth, and a nucleation model previously
confirmed for such systems in controlled double-jet batch precipitations. No assumptions of size-dependent growth were needed. The
model predicts the correlation of the average crystal size with residence time, solubility, and temperature of the system and enables
calculation of the supersaturation ratio, the maximum growth rate, the ratio of nucleation to growth, the ratio of average to critical
crystal size, and the size of the nascent nuclei. The model predicts that the average crystal size is independent of reactant addition rate
and suspension density. The average crystal size is a nonlinear function of the residence time where the crystal size has a positive
value at zero residence time (plug–flow condition). Results of continuous precipitations of silver chloride confirm the predictions of the
model. The ratio of the fraction of the input stream used for nucleation and crystal growth was calculated from the experimental
results to decrease from 4.79 to 0.12, and the size of the nascent crystals to increase from 0.194 to 0.221 µm between 0.5- and 5.0-min
residence time. The ratio of average to critical crystal size was determined to 5.73∗103 (1.02 to 1.09), the supersaturation ratio to 12.2 (0.54, average crystal size L = 0.5 µm), the supersaturation to 8.2∗10–8 (12.7∗10–9 mol/l, L = 0.5 µm), and the maximum growth rate to
4.68 A/s (1.20 to 4.25). The data in the brackets are for equivalent batch precipitations. The experiments indicate that the width of the
crystal size distribution increased with suspension density and was independent of reactant addition rate. While the present model
was developed for homogeneous nucleation, diffusion limited growth, and unseeded systems, it may be modified to model seeded
systems, systems containing ripening agents or growth restrainers, and systems where growth and nucleation are kinetically, heterogeneously, or otherwise controlled.
Journal of Imaging Science and Technology 42: 355–364 (1998)
Introduction
It is the object of this work to extend the previously reported model of crystal nucleation for precipitation of
highly insoluble compounds in controlled double-jet precipitation batch processes1–5 to precipitations in continuous suspension crystallizers.
The present proposed new theory of crystallization in continuous mixed-suspension mixed-product-removal
(MSMPR) crystallizers [here referred to as continuous
stirred tank reactor (CSTR)] differs from the previous theory
for continuous precipitation by Randolph and Larson6,7 in
that it correlates the average crystal size of the crystal population with reaction conditions like residence time, temperature, and crystal solubility. It is thus intended to
complement the Randolph–Larson theory which is concerned with the crystal population size distribution.
In addition, the new theory consists of three distinct
parts that distinguish it from the Randolph–Larson theory:
Original manuscript received October 20, 1997
* This work was presented at the AIChE Annual Conference, Chicago, IL,
November 11–15, 1996, and at the IS&T 50th Annual Conference, Boston,
MA, May 18–23, 1997.
† IS&T Fellow
© 1998, IS&T—The Society for Imaging Science and Technology
1. The new model is based on a dynamic balance between
crystal nucleation and crystal growth at steady state.
2. The new mathematical treatment of the model is based
on mass and nucleation balance. It shares certain
concepts of mass and nucleation balance with the
Randolph–Larson theory. In addition, it introduces the
concepts and equations of the crystal nucleation theory
previously proposed and experimentally supported by
the author and his coworkers. This leads to new
mathematical equations that have no arbitrary
adjustable parameters. These new equations are
significantly different from those of the Randolph–
Larson theory. Furthermore, the new theory and the
equations make distinctly new predictions. Some of
these were experimentally supported in the present
work.
3. The new theory is supported by experimental results.
As in any new endeavor, only few predictions could be
tested. The theory and the equations give ample
suggestions for further experimental and theoretical
research.
The use of continuous precipitation systems for the precipitation of silver halide dispersions has been investigated previously.8–13 Continuous streams of reactants
(silver nitrate and halide solutions) and a gelatin solution are fed to a well-stirred precipitation vessel and product is simultaneously removed while maintaining a
constant reaction volume and reaction conditions (Fig. 1).
Figure 1. Schematic diagram of the continuous suspension crystallizer for silver chloride precipitations.
Initially, the precipitation vessel contains an aqueous
solution of gelatin and halide salt. It may include silver
halide seed crystals that would affect the transient period but would not affect the present results, which are
based on spontaneous homogeneous nucleation. After a
transient time period a steady state is reached after which
the size distribution and morphology of silver halide crystals removed from the precipitation vessel remain unchanged. In Gutoff,8,9 Wey and Terwilliger,10 and Wey et
al.11,12 the crystal size distribution was investigated using the formalisms derived by Randolph and Larson for
the mixed-suspension, mixed-product-removal (MSMPR)
system [Randolph and Larson6,7].
Wey et al.12 examined the crystal size distribution of
AgBr using the population balance technique and included
both McCabe’s ∆L law (size-independent growth rate) and
a size-dependent growth model. The crystal population distribution could not be satisfactorily modeled. Using the
large grain population, nucleation and maximum growth
rates were determined using the Randolph–Larson model.
The rates were empirically related to temperature and supersaturation. Their data indicate that the nucleation of
AgBr is homogeneous and that no significant secondary
nucleation mechanism is involved in the precipitation of
AgBr crystals.
Wey, et al.13 studied the transient behavior of unseeded
silver bromide precipitations and determined that the
steady state of crystal population distribution was achieved
only at about 6 to 9 residence times (τ) after the start of
the precipitations, much later than the steady state of
suspension density (at about 4 residence times). Their results also showed that at steady state the crystal population is rather narrowly distributed around an average
crystal size.
In Fig. 2, electron micrographs of AgCl crystals (carbon replica) are shown that were obtained at steady state
for CSTR precipitations at a residence time of 5.0 min.
In Fig. 3, the crystal size distribution of AgCl at steady
state is shown for precipitations varying in residence time
from 0.5 to 5.0 min. It is apparent that the distribution
is rather narrow and can be described by an average crystal size L and a somewhat symmetrical crystal size distribution. In a log number versus size plot (R–L theory), the
curves are not symmetrical in agreement with AgBr precipitations (Wey et al.12). Clearly, the steady state crystal size distribution does not follow the linear semi-log
correlation predicted by the Randolph–Larson model (Ref.
7, p. 87).
Figure 2. Electron micrographs of silver chloride crystals (carbon replica) obtained at steady state. Residence time 5.0 min.,
60°C, pAg 6.45, 2.4% gelatin, suspension density 0.1 mol/l.
An objective of this work is to derive a model that predicts the average crystal size of these precipitations and
to support some of the model’s predictions experimentally.
This model is based on the previously derived nucleation
model for batch double-jet precipitations, the maximum
growth rate of the system, and the dynamic mass balance
between nucleation and growth. The newly developed
model explicitly describes the average crystal size as a
function of experimental variables like solubility of the
product, residence time, and temperature without the assumption of arbitrarily adjustable parameters. The fractions of the incoming reaction stream that are used for
nucleation and growth can be determined. The average
size of the nascent (newly formed) crystals at steady state
also can be calculated. The new model also leads to the
determination of the critical crystal size, which allows
calculation of the supersaturation ratio in the precipitation system.
Theory
Review. Since its first publication, the model by Randolph
and Larson6,7 has found wide application to describe the
crystal size population in continuous MSMPR crystallizers
(Eq. 1):
n = n0 exp(–Lx/Gmτ),   (1)
where n = population density at size Lx , n0 = nuclei population density, number/(volume-length), Gm = maximum
growth rate, and τ = residence time. Equation 1 describes
the expected number distribution of the crystal product at
steady state. This equation is applicable where a straight
line of a plot of the logarithm of population density n versus crystal size Lx describes the crystal size population. It
was observed in precipitations of silver bromide12 and AgCl
(Fig. 3) that the crystal size distribution is not described by
Eq. 1 but is given by13 a somewhat symmetrical distribution around an average crystal size L where only a small,
large-sized part of the crystal population obeys Eq. 1.
Mass balance: R0 = Rn + Ri, where R0 = reactant addition rate = product removal rate, Rn = fraction of R0 consumed for crystal nucleation, and Ri = fraction of R0 consumed for crystal growth.
Figure 4. Dynamic mass balance model for the CSTR/MSMPR system.
Figure 3. AgCl crystal size distribution for residence times varying from 0.5 to 5.0 min (60°C, pAg 6.45, 2.4% gelatin).
The Randolph–Larson theory (Eq. 1) also does not explicitly include other factors that affect the precipitation population such as reactant addition rate, solubility, or temperature
effects. A new theory is presented that directly correlates
the average crystal size L (cm) to the residence time τ (s),
the solubility Cs (mol/cm3), and temperature T (K), and
makes predictions about the effect on reactant addition rate
and suspension density. This new model is derived from
the nucleation theory for batch precipitation, which was
developed by this author and coworkers.1–3
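A minimal numerical sketch of Eq. 1 (assuming NumPy; the parameter values are illustrative only, chosen to be in the general range discussed later in the paper):

# Randolph-Larson population density, Eq. 1: n(Lx) = n0 * exp(-Lx / (Gm * tau)).
# On a semi-log plot, log n versus Lx is a straight line with slope -1/(Gm*tau).
import numpy as np

n0 = 1.0e12                       # nuclei population density (arbitrary units)
Gm = 1.7e-8                       # maximum growth rate, cm/s (a few angstrom/s)
tau = 300.0                       # residence time, s (5 min)
Lx = np.linspace(0.0, 2.0e-5, 6)  # crystal size, cm (0 to 0.2 um)

n = n0 * np.exp(-Lx / (Gm * tau))
print(np.log10(n))                # decreases linearly with Lx: the straight semi-log
                                  # line predicted by Eq. 1 and not followed by the
                                  # AgCl data of Fig. 3.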
New Model. In continuous MSMPR crystallizers, reactants, solvents, and other addenda are continuously added
while the product is continuously removed (Fig. 1). In the
following, this precipitation scheme will be referred to as
a continuously stirred tank reactor (CSTR) system. For
the present derivation, only the reaction-controlling reactant that leads to the crystal population will be considered. For instance, silver halides are generally precipitated
with excess halide in the reactor, which is used to control
the solubility. Thus the soluble silver salt, e.g., AgNO3, is
the reaction-controlling compound. The solubility of the
resultant silver halides (chloride, bromide, iodide, and
mixtures) is so low that their concentration in the reactor
will be neglected for the mass balance. For other precipitations where the solubility of the reaction product is significant, the solubility will need to be retained as an
explicit rate term on the right side of Eq. 2. Similarly,
unreacted material needs to be added as a rate term on
the right side of Eq. 2.
Flow rates of other necessary addenda for the precipitation, such as halide salt solutions, water, gelatin, etc., are
included into the calculation of the residence time τ and
suspension density Mt and are important to control the silver halide solubility. The present model is based on several
premises. It is apparent that modifications of the present
model can be obtained by changing some of these premises.
• The reactants react stoichiometrically to form the
crystal population, and the solubility of the resultant
product (in the present experiment, silver halide) is not
significant with regard to the mass balance. It is
straightforward to expand the model to include the
effect of significant solubility of the reaction product.
• Homogeneous nucleation is assumed. For nonhomogeneous and other nucleation processes, the proper
nucleation model must be substituted for the presently
used nucleation model.
• The input reactant stream at steady state is consumed
in a constant ratio for crystal nucleation and crystal
growth (Fig. 4).
• Crystal nucleation in the CSTR system follows the same
mechanism as in double-jet precipitation.
• Crystal growth is given by the maximum growth rate
of the crystal population. In the present derivation, a
single maximum growth rate Gm is assumed, which can
be derived from the experimental results (see below).
However, an analytical equation for Gm as a function of
crystal size and reaction conditions may be substituted
if it is known. This reduces the number of unknowns in
the final equation to the ratio of average to critical
crystal size L/ Lc.
• For the present derivation it is assumed that the
nucleation is by a diffusion-controlled mechanism.3
The following approach was successfully used previously
to describe renucleation in batch precipitations quantitatively.14
When a stream of reactant R0 [addition rate of reactant,
e.g., silver nitrate (mol/s)] is added to the reaction vessel at
steady state, a fraction of the stream will be used to
renucleate new crystals to replace crystals leaving in the
product stream (Rn, addition rate fraction used for nucleation) and the remainder to sustain the maximum growth
rate of the crystal population (Ri, addition rate fraction used
for growth = crystal size increase) (Fig. 4). These assumptions are well met by the precipitation of highly insoluble
materials such as the silver halides, which react at practically 100% conversion. The mass balance requires that
R0 = Rn + Ri.
(2)
The addition rate R0 is given by the concentration (mol/
l) and flow rate (l/s) of the reactants. The product removal
rate at steady state is by definition equal to the reactant
addition rate. In the following, Rn and Ri will be derived
and finally inserted in Eq. 2 to provide the new model for
the crystal population at steady state.
Crystal Nucleation. The number of crystals nucleated
Zn must be equal to the number of crystals leaving in the
reaction stream and is given by the mass balance:
Zn = R0Vm/kvL3,
(3)
where Vm is the molar volume of the reaction product (to
convert molar addition rate into volume cm3/mol) and kv
is the volume factor that converts from the characteristic
average crystal size L to crystal volume (see glossary). This
definition of nucleation rate is different from B0 used in
the Randolph–Larson theory, which is defined as the rate
of formation of nuclei per unit volume in the crystallizer.7
As will be shown in the new theory and in the experimental section, the nucleation rate is independent of reactor
volume for homogeneous nucleation conditions.
For homogeneous crystal nucleation under diffusion-controlled growth conditions, Eq. 4 was derived for double-jet precipitations1–3:
Zn = RnRgT/(2ksγDVmCsΨ),  (4)
where
Ψ = L/Lc – 1.0  (5)
and Rg is the gas constant, T is the temperature (K), ks is the crystal surface factor, γ is the surface energy, D is the diffusion coefficient of the reaction-controlling reactant, and Cs is the sum of the solubility with regard to the reaction-controlling reactant. The value Lc is the critical crystal size at which a crystal has equal probability to grow or to dissolve by Ostwald ripening. In previous papers,1–3,14 Eq. 4 was quoted for the specific case of spherical crystal morphology (ks = 4π). For batch processes Zn equals the total number of stable crystals formed during nucleation. Here Zn is equal to the nucleation rate (dZ/dt) at steady state. This extrapolation is in agreement1,3 with the underlying derivation of Eq. 4.
For the remainder of the derivation of the equations, the intermediate variable K is introduced:
K = RgT/(2ksγDVmCsΨ).  (6)
This leads to a simplified equation for Zn:
Zn = KRn,  (7)
which is solved for Rn:
Rn = Zn/K.  (8)
Substitution of Zn from Eq. 3 into the equation for Rn (Eq. 8) leads to
Rn = R0Vm/(KkvL³).  (9)
Thus, an analytical equation for Rn has been found, which below will be entered into Eq. 2. It remains now to develop an analytical equation for Ri, which is derived from the maximum growth rate of the crystal population.
Crystal Growth. The growth rate G is defined as the change in crystal diameter with time, dL/dt. The maximum growth rate Gm is given by the mass balance between the maximum growth of the system and the fraction of the reactant addition rate consumed for this growth, Ri. The maximum growth rate is a function of crystal size, and thus the individual size classes will grow at different absolute maximum growth rates. The maximum growth rate of the whole crystal population is then given by an average maximum growth rate Gm, which is defined to be related to the average crystal size L.
If the crystal size is difficult to determine because of a complicated crystal structure, for instance, dendritic crystals, the specific surface area Sm (e.g., surface area/mol of crystals) may be used to derive the maximum growth rate. This was not necessary under the present conditions. The use of Sm was discussed in Leubner,14 and may be transferred to the present model as desired.
Solving this mass balance for Gm results in Eq. 10:
Gm = VmRi/(3.0kvL²Zt).  (10)
Here, Zt is the total number of crystals present in the reaction vessel during steady state. The value Zt can be calculated from the average crystal size and the suspension density, which, in turn, is a function of reactant addition rate and residence time:
Zt = R0Vmτ/(kvL³).  (11)
Inserting Eq. 11 into Eq. 10 and solving for Ri leads to the desired analytical equation for Ri:
Ri = 3GmR0τ/L.  (12)
At this point, both Rn and Ri have been expressed by analytical expressions that contain only fully defined parameters and variables. It is now possible to continue to the formulation of the new theory.
Crystal Growth and Nucleation in the Continuous Crystallizer. The foundation has now been laid to combine nucleation and growth at steady state. For this purpose the expressions of Rn (Eq. 9) and Ri (Eq. 12) are inserted into Eq. 2. After simplifying, Eq. 13 is obtained:
Vm/(KkvL³) + 3Gmτ/L = 1.0.  (13)
Reinserting K into this equation and solving for zero leads to:
kvL³RgT – 2ksγDVm²CsΨ – 3kvGmRgTL²τ = 0.  (14)
Solving for L results in a very complicated solution as a
function of residence time which is of little immediate use.
But solving for the residence time τ is relatively straightforward:
τ = L/(3Gm) – 2ksγDVm²CsΨ/(3kvGmRgTL²).  (15)
These equations can be used to determine Gm and Ψ from
the experimental values of residence time τ and the average crystal size L obtained at crystal population steady
state, and by entering the other parameters that are either known or can be experimentally determined. If Gm is
known from other experiments (or an analytical equation
of Gm is available that relates it to the crystal size and
reaction conditions), this may be substituted in Eq. 15.
This would reduce the number of unknowns to Ψ.
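For readers who wish to evaluate Eq. 15 numerically, a minimal Python sketch is given below. The constants correspond to the AgCl system of this work (listed later in Table III), and the values of Gm and Ψ are those derived later in the Results section; they serve here only as example inputs.

# Minimal sketch: residence time tau as a function of average crystal size L (Eq. 15).
# Constants (CGS units) are those of Table III; Gm and psi are the values derived
# later in the Results section and are used here only as example inputs.
ks, kv  = 6.0, 1.0        # surface and volume shape factors (cubic crystals)
gamma   = 52.2            # surface energy, erg/cm^2
D       = 1.60e-5         # diffusion coefficient, cm^2/s
Vm      = 25.9            # molar volume of AgCl, cm^3/mol
Cs      = 6.2e-9          # solubility, mol/cm^3
Rg, T   = 8.3e7, 333.0    # gas constant (erg/(deg mol)) and temperature (K)
Gm      = 4.68e-8         # maximum growth rate, cm/s (= 28.1e-3 um/min)
psi     = 5.73e3          # Psi = L/Lc - 1 (Eq. 5)

def tau_of_L(L_cm):
    """Residence time (s) for a given average crystal size (cm), Eq. 15."""
    return L_cm / (3.0 * Gm) - 2.0 * ks * gamma * D * Vm**2 * Cs * psi / (
        3.0 * kv * Gm * Rg * T * L_cm**2)

# Example: L = 0.337 um should give a residence time of roughly 3 min.
print(tau_of_L(0.337e-4) / 60.0)   # about 3 minutes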
From Ψ and the average crystal size L, the critical crystal size Lc can be calculated (Eq. 5), which is related to the
supersaturation ratio by
S* = 1.0 + 2γVm/(RgTLc).  (16)
The supersaturation ratio S* is defined by Eq. 17:
S* = (Css – Cs)/Cs,  (17)
where Css is the actual (supersaturation) concentration and
Cs is the equilibrium solubility as defined above.
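A short worked example of Eqs. 5, 16, and 17 is sketched below in Python, using the constants of Table III and the value of Ψ derived later in the Results section; the output may be compared with the values reported later in Table IV for the 0.5-min residence time.

# Worked example of Eqs. 5, 16, and 17 for one steady state (tau = 0.5 min, L = 0.207 um).
# Constants from Table III (CGS units); psi = 5.73e3 is the value derived in the Results section.
gamma, Vm = 52.2, 25.9       # erg/cm^2, cm^3/mol
Rg, T     = 8.3e7, 333.0     # erg/(deg mol), K
Cs        = 6.2e-9           # mol/cm^3
psi       = 5.73e3
L         = 0.207e-4         # average crystal size, cm

Lc     = L / (psi + 1.0)                            # critical crystal size (Eq. 5 rearranged)
S_star = 1.0 + 2.0 * gamma * Vm / (Rg * T * Lc)     # supersaturation ratio, Eq. 16
Css    = Cs * (S_star + 1.0)                        # Eq. 17 rearranged: Css = Cs (S* + 1)

print(Lc * 1e4, S_star, Css)   # about 3.6e-5 um, 28, and 1.8e-7 mol/cm^3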
Special Limiting Conditions for Continuous Crystallization. Two limiting cases of Eq. 15 are of special
interest. For this purpose Eq. 15 is rewritten15:
τ = L[1.0 – 2ksγDVm²CsΨ/(kvL³RgT)]/(3Gm).  (18)
Very Large Average Crystal Size L. If the average crystal size is very large as given by the definition in Eq. 19:
L³ >> 2ksγDVm²CsΨ/(kvRgT),  (19)
then the term inside the parentheses of Eq. 18 can be set equal to one and simplified to
τ = L/(3Gm)  (20)
or
L = 3Gmτ.  (21)
This indicates that τ and L are linearly related, and the maximum growth rate Gm can be determined from the linear part of the correlation at large crystal sizes. The growth rate Gm can then be resubstituted into the original Eq. 14, 15, or 18, and Ψ can be determined.
Continuous Precipitation in a Plug–Flow Reactor, τ
→ Zero. The second limiting case of the above equation is
when τ, the residence time, approaches zero. The above
equation predicts that under this limiting condition, the
average crystal size will not become zero but reach a limiting value.
Plug–flow reactors are characterized by a short residence
time in the nucleation zone, τ → 0, followed by crystal
growth and ripening processes. The present derivation
allows us to predict the minimum crystal size for the condition where τ → 0:
L³ = 2ksγDVm²CsΨ/(kvRgT).  (22)
Substituting
Vg = kvL³,  (23)
where Vg is the average crystal volume, leads to
Vg = 2ksγDVm²CsΨ/RgT.  (24)
Thus, for residence times approaching zero, the average crystal volume Vg is only a function of solubility Cs
and temperature T. The value Ψ, which can be calculated
from Eqs. 22 or 24, may be a function of Cs and T. Note
that without the 1/L2 term, Eq. 15 would predict zero crystal size for zero residence time.
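As a numerical illustration of Eq. 22, the following short sketch evaluates the limiting crystal size with the constants of Table III and the Ψ value derived later in the Results section:

# Minimal check of Eq. 22: limiting crystal size as the residence time approaches zero.
ks, kv   = 6.0, 1.0
gamma, D = 52.2, 1.60e-5     # erg/cm^2, cm^2/s
Vm, Cs   = 25.9, 6.2e-9      # cm^3/mol, mol/cm^3
Rg, T    = 8.3e7, 333.0
psi      = 5.73e3            # value derived in the Results section

L_cubed = 2.0 * ks * gamma * D * Vm**2 * Cs * psi / (kv * Rg * T)   # cm^3, Eq. 22
print((L_cubed ** (1.0 / 3.0)) * 1e4)   # about 0.205 um, the predicted plug-flow limit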
Nucleation versus Growth, Rn/Ri. In the model used
for these derivations, the reactant addition stream R0 is
separated in the reactor into a nucleation stream Rn and
a growth stream, Ri (Fig. 4). It is now possible to derive
the ratio of the reactant streams Rn/Ri by dividing Eqs. 9
and 12. After back-substituting for K (Eq. 6), Eq. 25 is
obtained:
Rn/Ri = 2ksγDVm²CsΨ/(3kvGmRgTτL²).  (25)
Except for τ in the divisor this is equal to the second
part on the right side of Eq. 15. If we simplify to Eq. 26, it
becomes evident that the ratio Rn/Ri decreases with increasing residence time τ and with the square of the average
crystal size at steady state.
Rn/Ri ~ 1/(τL²).  (26)
It appears intuitive that the nucleation should decrease
relative to growth as the residence time and the surface
area of the crystal population (proportional to L2) increase.
The value of the ratio Rn/Ri was calculated from the experimental data. From the ratio Rn/Ri and the mass balance (Eq. 2), Rn and Ri were calculated as a fraction of the
reactant addition rate R0.
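The following sketch evaluates Eq. 25 for the experimental residence times and crystal sizes reported later in Table I, using the constants of Table III and the Gm and Ψ values derived in the Results section; the output may be compared with Table VI.

# Sketch of Eq. 25: ratio of the nucleation stream to the growth stream at steady state.
ks, kv   = 6.0, 1.0
gamma, D = 52.2, 1.60e-5
Vm, Cs   = 25.9, 6.2e-9
Rg, T    = 8.3e7, 333.0
Gm       = 4.68e-8           # cm/s, from the Results section
psi      = 5.73e3

def rn_over_ri(tau_s, L_cm):
    """Eq. 25."""
    return 2.0 * ks * gamma * D * Vm**2 * Cs * psi / (
        3.0 * kv * Gm * Rg * T * tau_s * L_cm**2)

for tau_min, L_um in [(0.5, 0.207), (1.0, 0.263), (3.0, 0.337), (5.0, 0.413)]:
    r = rn_over_ri(tau_min * 60.0, L_um * 1e-4)
    print(tau_min, round(r, 2), round(100.0 * r / (1.0 + r), 1))   # Rn/Ri and Rn as % of R0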
Nascent Nuclei Size. For the present work, the term "nascent nuclei" is defined as the stable crystals that are newly formed during steady state, which continue to grow in the reactor and which are removed from the reactor. The size of these crystals is larger than that of the critical nuclei, which have equal probability to grow or dissolve in the reaction mixture. The nascent nuclei will have a crystal size distribution that is related to the critical crystal size. The nascent nuclei population may be represented by an average nascent nuclei size Ln. With the model that has been developed up to this point, it is possible to calculate Ln for the different precipitations.
For this purpose we define Ln by:
Ln³ = RnVm/(kvZn).  (27)
Here, Zn can be calculated from Eq. 3 and Rn can be calculated from Eqs. 2 and 25. Back-substitution into Eq. 27 leads to Eq. 28:
Ln³ = (Rn/R0)L³.  (28)
This equation was used to calculate Ln values for the different residence times.
It is now possible to back-substitute further from Eq. 25, and Eq. 29 is obtained, which relates the nascent nuclei size to the precipitation conditions and the average crystal size at steady state:
Ln³ = 2ksγDVm²CsΨL³/(3kvGmRgTτL² + 2ksγDVm²CsΨ).  (29)
The terms in this equation have been defined above and are listed in the Nomenclature section. It is apparent that Ln is a complicated function of the reaction conditions and of the average crystal size, which in itself is a complicated function of the same reaction variables (Eq. 14).
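A short sketch of this calculation is given below; the Rn/Ri values used as inputs are those reported later in Table VI.

# Sketch of Eq. 28: nascent nuclei size from the nucleation fraction and the average size.
# Rn/R0 follows from the Rn/Ri ratio of Eq. 25 via the mass balance (Eq. 2).
pairs = [(4.79, 0.207), (1.48, 0.263), (0.30, 0.337), (0.12, 0.413)]  # (Rn/Ri, L in um)
for rn_ri, L_um in pairs:
    rn_r0 = rn_ri / (1.0 + rn_ri)            # Rn as a fraction of R0 (Eq. 2)
    Ln = (rn_r0 ** (1.0 / 3.0)) * L_um       # Eq. 28: Ln^3 = (Rn/R0) L^3
    print(L_um, round(Ln, 3))                # about 0.19 to 0.22 um in all cases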
Experimental
The present experiments were done before the present
theory was developed. If the theory had been known at
the time of the experiments, a wider range of experiments
would have been performed to establish the present results more completely. Unfortunately, the author is no
longer in a position to provide additional experiments.
However, the present experiments support several important predictions of the theory and might be the starting
point for more extended work in the future. It should be
remembered that the original paper by Randolph and
Larson6a did not supply any experimental results to support their model.
Silver chloride, AgCl, was precipitated in a single-stage
continuous stirred tank reactor (CSTR) system (Fig. 1). The
residence time was varied from 0.5 to 5.0 min (Table I). For
the residence time of 3.0 min the suspension density was
varied from 0.05 to 0.40 mol/l (Table II). The temperature
was held constant at 60°C. The reactor volume, flow rates,
and reactant concentrations are given in Tables I and II.
Bone-gelatin was used as the peptizing agent. The free silver ion activity {Ag+} was controlled in the reactor at pAg
6.45, where pAg = –log {Ag+}. This corresponds16 to a solubility of 6.2 × 10–6 mol Ag/l. The exit stream had the same
free Ag+ concentration as the reactor mixture. This solubility consists of the sum of concentrations of free silver
ion plus complexes of silver ions with halide ions, AgCln^(1–n) (n
equals 1 to 4). The silver chloride precipitated in cubic
morphology. A crystal growth restrainer was added to the
output material to avoid Ostwald ripening and to preserve
the crystal size distribution. The crystal size distribution
TABLE I. Effect of Residence Time on the Average Crystal Size of AgCl Precipitated in the CSTR System*

No.   τ (min)   L (µm)   d.r.   Zt × 10^12   V0 (ml)   F (ml/min)   R0 (mol/min)   Mt (mol/l)
1     0.5       0.207    2.69   73           300       50           0.050          0.083
2     1.0       0.263    2.64   36           300       25           0.025          0.083
3     3.0       0.337    2.58   41           600       20           0.020          0.100
4     5.0       0.413    2.22   37           1000      20           0.020          0.100

* Reaction conditions: 60°C, AgNO3, NaCl 1.0 mol/l, 2.4% bone gelatin. τ residence time (min), V0 reactor volume (ml), F flow rate (ml/min, AgNO3 and NaCl), R0 molar addition rate (mol/min), Mt suspension density (mol AgCl/l), L cubic edge length (µm), d.r. decade ratio (measure of size distribution), Zt total crystal number in reactor.
TABLE II. Effect of Suspension Density on Average Crystal Size for Precipitations of AgCl in the CSTR System†

Experiment   Mt (mol/l)   L (µm)   d.r.      Zt × 10^12   C (mol/l)   R0 (mol/min)
1            0.05         0.327    2.45      11           0.5         0.005
2            0.10         0.350    2.59      18           1.0         0.010
3            0.10         0.337    2.58      41           1.0         0.020
4            0.20         0.331    2.79      43           2.0         0.020
5            0.30         0.333    (3.51)a   63           3.0         0.030
6            0.40         0.344    3.25      81           4.0         0.040

† Reaction conditions: 60°C, pAg 6.45, residence time τ 3.0 min; all experiments except No. 3: AgNO3, NaCl 10 ml/min, gelatin (2.4%) 80 ml/min, V0 300 ml. Experiment No. 3: AgNO3, NaCl 20 ml/min, gelatin (2.4%) 160 ml/min, V0 600 ml. C is the concentration of reactants (AgNO3 and NaCl), R0 molar addition rate (mol/min), Mt suspension density (mol AgCl/l), L average crystal size (cubic edge length, µm), d.r. decade ratio (measure of crystal size distribution), Zt total crystal number in reactor; (a) data not reliable.
of the crystal suspensions was determined using the
Joyce–Loebl disk centrifuge.14,17 This analytical method
determines the size distribution of the crystals by their
sedimentation time using Stokes’ law and the relative frequency of the crystal sizes by light scattering. This method
determines the crystal size distribution of the material
that represents the final product of the precipitation process. In situ determinations of the crystal size population
were not done because the high concentrations did not allow precise light-scattering experiments. The original data
were determined as equivalent circular diameter (ecd) and
were converted to cubic edge length (cel), where
cel = 0.86 ∗ ecd.
(30)
The crystal size distribution curves are shown in the
ecd scale. The crystal size distribution could be fitted to
the sum of two Gaussian distributions, and thus the distribution cannot be described by a single standard deviation. Instead, the crystal size distribution is given by an
empirical measure, the decade ratio (d.r.), which is defined
by the ratio of the sizes at 90 to 10% of the experimentally
determined crystal size population.
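A minimal sketch of this empirical measure is given below, assuming that the 10% and 90% points refer to the cumulative, frequency-weighted size population; the example distribution is hypothetical and serves only to illustrate the calculation.

import numpy as np

# Sketch of the decade ratio (d.r.): ratio of the crystal sizes at the 90% and 10% points
# of the cumulative, frequency-weighted size population (assumed interpretation).
def decade_ratio(sizes, frequencies):
    order = np.argsort(sizes)
    s = np.asarray(sizes, float)[order]
    f = np.asarray(frequencies, float)[order]
    cum = np.cumsum(f) / np.sum(f)          # cumulative population fraction
    s10 = np.interp(0.10, cum, s)           # size at 10% of the population
    s90 = np.interp(0.90, cum, s)           # size at 90% of the population
    return s90 / s10

# Example with an arbitrary (hypothetical) bimodal distribution:
sizes = np.linspace(0.05, 0.8, 60)
freqs = np.exp(-((sizes - 0.2) / 0.05) ** 2) + 0.6 * np.exp(-((sizes - 0.45) / 0.10) ** 2)
print(decade_ratio(sizes, freqs))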
Electron micrographs (carbon replica) of AgCl crystals
precipitated in the CSTR system (5-min residence time)
are shown in Fig. 2. The results for the variation of crystal size with residence time are shown in Fig. 3 and Table
I. The dependence of crystal size and crystal size distribution on suspension density was studied for the 3.0-min
residence time (Table II). The constants used for the calculations are listed in Table III.
Results and Discussion
The new model makes a number of predictions that can
be tested with the experimental results.
• The average crystal size is independent of addition rate
and suspension density.
• The dependence of the average crystal size on
residence time can be modeled using the equations
given.
• The average crystal size has a finite value at zero residence time.
• The maximum growth rate, the critical crystal size, the supersaturation ratio and the supersaturation, and the ratio of nucleation to growth of the system can be determined at steady state.

TABLE III. AgCl Precipitations in the CSTR System: Constants Used in the Calculations

Constant   Value        Comment
kv         1.0          cubic
ks         6.0          cubic
γ          52.2         erg/cm² (Ref. 16)
D          1.60×10⁻⁵    cm²/s (Ref. 16)
Vm         25.9         cm³/mol AgCl
Cs         6.2×10⁻⁹     mol/cm³ (Ref. 16)
Rg         8.3×10⁷      erg/deg mol
T          333 K/60°C
Crystal Size, Addition Rate, and Suspension Density.
The model predicts that the average crystal size is independent of the reactant addition rate and by implication
of the suspension density. This is supported by the results
in Table II, which show that over a range of addition rates
from 0.005 to 0.04 mol/min and a suspension density of
0.05 to 0.40 mol/l the average crystal size did not significantly vary. At the same time, the total crystal number in
the reactor, Zt, increased proportionally to the molar addition rate.
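Equation 11 can be used to check this proportionality directly; the short sketch below reproduces the Zt column of Table I from the tabulated R0, τ, and L values and the molar volume of Table III.

# Sketch of Eq. 11: total crystal number in the reactor at steady state, checked against Table I.
Vm, kv = 25.9, 1.0                       # cm^3/mol, volume shape factor (cubic)
rows = [(0.050, 0.5, 0.207), (0.025, 1.0, 0.263), (0.020, 3.0, 0.337), (0.020, 5.0, 0.413)]
for R0_mol_min, tau_min, L_um in rows:   # (R0, tau, L) from Table I
    Zt = R0_mol_min * Vm * tau_min / (kv * (L_um * 1e-4) ** 3)
    print(round(Zt / 1e12))              # about 73, 36, 41, 37 (x10^12), cf. Table I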
It is significant that for Experiments 2 and 3, which
have the same suspension density but vary by a factor of
two in molar addition rate, the average crystal size and
the decade ratio are not significantly different. The doubling of the addition rate leads to a doubling in the total
crystal number. Similarly, Experiments 3 and 4 have the
same molar addition rate but vary in the suspension density by a factor of 2, while producing the same average
crystal size and total crystal number. This supports the
prediction of the theory that the crystal number is independent of suspension density and, by implication, independent of the population density.
The results in Table II show that the width of the crystal size distribution as measured by d.r. increased with
increasing suspension density. This is indicated by Experiments 2 and 3 where for Experiment 3 the molar addition
rate (and reactor volume V0) was doubled while the suspension density was held constant. The value of d.r. is the
same, indicating that the suspension density, not the molar addition rate affects the width of the crystal size distribution.
Experiments 3 and 4 have the same addition rate, but
Experiment 4 has half the reactor volume and flow rate,
so that it has twice the suspension density of Experiment
3. The experiment with the higher suspension density
(Experiment 4) has the wider crystal size distribution
which reinforces the direct relationship between suspension density and width of crystal size distribution. The
unusually wide size distribution of Experiment 5 is probably due to some undetermined experimental deviation.
The crystal size distribution is governed by two different reactions:
For crystals larger than the stable crystal size, growth
is dominated by the maximum growth rate. This part of
the crystal population can probably be described by the
Randolph–Larson model, which is based on the maximum
growth rate of the crystals.
The crystals smaller than the stable crystal size also
grow at maximum growth rate, but also disappear at some
rate by Ostwald ripening. Ostwald ripening is the process
by which larger crystals increase in size (ripen) at the expense of the dissolution of smaller ones. The critical crystal size at which a crystal has equal probability to grow or
dissolve by Ostwald ripening can be determined by the
present model and experiments.
In addition, it was determined that in controlled double-jet precipitations the maximum growth rate nonlinearly
decreases with increasing crystal size.16
It was shown that the maximum growth rate increases
under crowded conditions when the crystal population
density is very high and where the diffusion layers of the
crystals overlap.14,18,19 This effect may depend on the crystal size and the state of supersaturation in the reactor
and thus may contribute to the increase in crystal size
distribution with increasing suspension density.
Because two different reaction mechanisms, growth and the Ostwald ripening effect, act with different effectiveness on the different fractions of the crystal size population, it
can be anticipated that the shape of the resultant crystal
population might not be symmetrical. This idea is supported by the crystal size distributions shown in Fig. 3.
The result that the average crystal size is independent
of molar addition rate and suspension density allows us
to add the result from Table II to those of Table I for the
correlation of crystal size with residence time.
Crystal Size and Residence Time. The results from
Tables I and II were combined and plotted in Fig. 5. A
linear least-squares evaluation using L and 1/L², based on the size/τ correlation predicted in Eq. 18, results in Eq. 31:
τ = 11.88 × L – 0.1026/L²,  (31)
where τ is in minutes and L is in µm.
Figure 5. Crystal size of silver chloride crystals (µm cubic edge
length) as a function of residence time (min), 60°C, pAg 6.45,
2.4% gelatin.
The standard error of estimate is 0.77 and 0.0214 for
the first and second constant, respectively. The correlation coefficient is 0.9844.
In Fig. 5, the solid line is given by Eq. 31. The linear
correlation (dashed line) represents the L/τ correlation for
large crystal sizes as defined by Eq. 21.
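The evaluation can be reproduced approximately with the short sketch below, which fits τ = aL + c/L² to the data of Tables I and II and then extracts Gm (via the Eq. 21 limit) and Ψ (from the 1/L² term of Eq. 15) using the constants of Table III; the fitted constants should come out close to those of Eq. 31.

import numpy as np

# Least-squares evaluation behind Eq. 31: tau = a*L + c/L^2 (tau in min, L in um),
# fitted to the residence time / crystal size data of Tables I and II.
tau = np.array([0.5, 1.0, 3.0, 5.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0])
L   = np.array([0.207, 0.263, 0.337, 0.413, 0.327, 0.350, 0.337, 0.331, 0.333, 0.344])

A = np.column_stack([L, 1.0 / L**2])             # regressors L and 1/L^2, no intercept
sol, *_ = np.linalg.lstsq(A, tau, rcond=None)
a, c = sol
print(a, c)                                       # roughly 11.9 and -0.103 (cf. Eq. 31)

# Maximum growth rate from the linear term (Eq. 21 limit): Gm = 1/(3a).
Gm_um_min = 1.0 / (3.0 * a)
print(Gm_um_min, Gm_um_min * 1.0e4 / 60.0)        # ~0.028 um/min, ~4.7 Angstrom/s

# Psi from the 1/L^2 term of Eq. 15, with the Table III constants (CGS units).
ks, kv, gamma, D = 6.0, 1.0, 52.2, 1.60e-5
Vm, Cs, Rg, T    = 25.9, 6.2e-9, 8.3e7, 333.0
Gm_cm_s = Gm_um_min * 1.0e-4 / 60.0
c_cgs   = -c * 60.0 * 1.0e-8                      # min*um^2 -> s*cm^2
psi = 3.0 * kv * Gm_cm_s * Rg * T * c_cgs / (2.0 * ks * gamma * D * Vm**2 * Cs)
print(psi)                                        # ~5.7e3 (cf. the Results section)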
Maximum Growth Rate, Gm. From Eqs. 31 and 18 the
maximum growth rate was calculated to be Gm = 28.1×10⁻³ µm/min, or 4.68 Å/s. This is in good agreement with the results of Strong and Wey,16 who determined the maximum growth rate of AgCl in controlled double-jet precipitations to be between 1.20 and 4.25 Å/s for grain sizes
between 0.209 and 0.700 µm. This suggests that the growth rate Gm may be obtained independently of the continuous precipitations and used for the calculations in the present theory. It was determined in the experiments by Strong and Wey16 that the maximum growth rate decreased with increasing crystal size. In the double-jet precipitations, Gm can be determined as a function of
crystal size because the crystal size distribution is very
narrow. In the present precipitations, Gm is the average
of the maximum growth of a relatively wide crystal size
distribution.
Minimum Crystal Size. From the experimental results, the crystal size for zero residence time was estimated to be 0.205 µm. This value is essentially the same as that determined for τ = 0.5 min, 0.207 µm.
Ψ, L/Lc, S*, and Css. The parameters in Table III were used to calculate Ψ (5.73×10³) and L/Lc (≈5.73×10³, Eq. 5), the ratio of average to critical crystal size. The accuracy of the calculated results is affected by the constants in Table III, especially the values of the surface energy γ, the diffusion coefficient D, and the solubility Cs. The calculated results thus should be considered estimates and must be reevaluated when more reliable data become available for the constants used.
TABLE IV. Critical Crystal Size, Supersaturation, and Supersaturation Ratio as a Function of Residence Time

τ (min)   Cubic edge length L (µm)   Lc (10⁻⁵ µm)   S*     Css (10⁻⁸ mol/cm³)
0.5       0.207                      3.61           28.1   18.0
1.0       0.263                      4.59           22.3   14.5
3.0       0.337                      5.88           17.6   11.6
5.0       0.413                      7.21           14.6   9.65

Ψ = 5.73 × 10³. For experimental conditions see Table I.
TABLE V. Calculated Experimental Constants and Variables for CSTR and Batch Precipitations of AgCl

Variable                    Symbol   CSTR        Batch        Unit/Note         Reference
max. Growth Rate            Gm       4.68        1.20-4.25    Å/s               (b)
Aver./Critical Size ratio   L/Lc     5.73×10³    1.85         —                 (a)
Supersaturation ratio       S*       12.2        1.02-1.09    for L = 0.5 µm    (a,b)
Supersaturation             Css      8.2×10⁻⁸    12.7×10⁻⁹    mol/cm³           —

Ψ = 5.73 × 10³. References for batch precipitations: (a) I. H. Leubner,2 (b) R. W. Strong and J. S. Wey.16
TABLE VI. AgCl Crystal Nucleation and Crystal Growth Reactant Rates and Nascent Nuclei Sizes

Residence time (min)   Cubic edge length L (µm)   Rn/Ri   Rn (% of R0)   Ri (% of R0)   Ln (µm)   %Ln (% of L)
0.5                    0.207                      4.79    82.7           17.3           0.194     93.9
1.0                    0.263                      1.48    59.7           40.3           0.221     84.2
3.0                    0.337                      0.30    23.1           76.9           0.207     61.4
5.0                    0.413                      0.12    10.7           89.3           0.196     47.5

Rn = fraction of reactant input rate consumed for crystal nucleation, Ri = fraction consumed for crystal growth at steady state. Ln is the size of the nascent nuclei; %Ln = nascent nuclei size as a percentage of the average size. Reaction conditions: AgCl, 60°C, pAg 6.45, 2.4% gelatin.
In Table IV, the critical crystal size Lc , the supersaturation ratio S*, and the supersaturation Css during steady
state were calculated as a function of residence time τ.
The data indicate that the critical crystal sizes increase
with residence time, while the supersaturation and supersaturation ratio decrease. A theoretical supersaturation ratio of 1.60∗105 is obtained if one adds 1.0 mol/l silver
nitrate to a solution where the solubility is 6.2∗10–6 mol/l
as in the present experiments.3 In stop–flow experiments,
Tanaka and Iwasaki20 estimated that the size of the primary nuclei formed in AgCl precipitations was about (AgCl)8. Thus, the difference between the theoretical
(1.60∗105) and actual supersaturation ratio (14.6 to 28.1)
in this system indicates that Ostwald ripening involving
metastable nanoclusters may play a significant role in the
nucleation/growth mechanism.
The values for the present CSTR system are substantially higher than those reported for controlled batch
double-jet precipitations under the same conditions (Table
V) as reported by Leubner2 and Strong and Wey,16 who
reported values of S* of approximately 1.02 and 1.09,
and values for L/Lc of 1.85 and 4.2. For comparison of the
systems, an average crystal size of 0.5 µm was used.
The higher values of L/Lc and S* for the CSTR versus the
batch system indicate that during steady state the balance
between maximum growth and renucleation stabilizes much
smaller critical crystal sizes than the batch double-jet precipitation. This is also supported by the wider crystal size
distribution in the CSTR system. However, the absence of
very small crystals in the product suggests that upon removal of the high supersaturation in the product stream,
the small initial crystals rapidly dissolve by Ostwald ripening to produce the observed crystal size distributions. This
is in agreement with simulations of the effect of Ostwald
ripening on the crystal size distribution in batch precipitations by Tavare.21 Unfortunately, Tavare did not provide
experimental evidence for his simulations. The effect of
Ostwald ripening and growth restraining agents on the crystal nucleation in batch precipitations was modeled and experimentally supported by this author.22,23
Nucleation versus Growth, Rn/Ri. Using Eqs. 2 and 25,
Rn/Ri and Rn and Ri (as a percent of R0) were calculated as
a function of residence time τ and average crystal size L.
The results are listed in Table VI. The data show that at
short residence times, nucleation (high Rn) is dominant,
while at long residence times, growth (high Ri) dominates
the reaction.
Nascent Nuclei Size. The sizes of the nascent nuclei Ln were calculated using Eq. 28. The results are shown in
Table VI. Note that the size of the nascent nuclei is relatively independent of residence time and of the final crystal
size, ranging between 0.194 and 0.221 µm. Interestingly, this
size range is of the same order of magnitude as calculated
for the limiting crystal size for the plug–flow reactor, 0.205
µm. As a consequence of the relatively stable size of the
nascent nuclei, their size relative to the steady state average crystal sizes decreases with increasing residence time.
Thus, at the 0.5-min residence time, the size of the nascent nuclei is about 93.9% of the final crystal size, while
for 5.0-min residence time it is only about 47.5% of the
final crystal size.
Conclusions
A new theory of crystallization is proposed for the CSTR
or MSMPR system. The model is based on nonseeded systems with homogeneous nucleation, diffusion controlled
growth, and the nucleation model previously derived for
such systems in controlled double-jet batch precipitations.
It does not need any assumptions about size-dependent
growth (McCabe’s law).
The model predicts the correlation between the average
crystal size and the residence time, solubility, and temperature of the reaction system and allows determination
of useful factors that are experimentally hard to determine such as L/Lc the ratio of average to critical crystal
size; the supersaturation ratio S*; the supersaturation Css,
the maximum growth rate Gm, and the ratio of nucleation
to growth Rn/Ri.
Results of continuous precipitations of silver chloride
were chosen to support the predictions of the model. Silver chloride precipitations in batch double-jet precipitations indicate that the crystal growth is mainly determined
by a diffusion-controlled mechanism.2,16
The model predicts that the average crystal size is independent of reactant addition rate and suspension density, which was supported with experiments where the
molar addition rate was varied from 0.005 to 0.04 mol/min
and the suspension density from 0.05 to 0.40 mol/l. The width
of the crystal size distribution (given by the decade ratio)
increased with suspension density from 2.45 to 3.25 and
was independent of reactant addition rate. The crystal size
distribution is a factor of maximum growth rate (small and
large crystals), Ostwald ripening (small crystals), and possibly of crowded growth conditions. These insights may
be applied to modify the Randolph–Larson model to describe the size distribution of the crystal population.
The model also predicts that the average crystal size
and residence time are linearly related when the average
crystal size is significantly larger than a certain limiting
size, which can be derived from the experimental correlations. At smaller crystal sizes, the average crystal size
is larger than predicted by the linear part of the correlation. These predictions were confirmed by the experimental results.
The model further predicts that when the residence time
approaches zero, the average crystal size approaches a limiting value larger than zero. For the present AgCl precipitations the limiting value was calculated to about 0.205 µm,
while the average crystal size varied from 0.207 to 0.413 µm
between residence times from 0.5 to 5.0 min. The condition where the residence time approaches zero is similar
to that obtained during nucleation in plug–flow reactors
and thus predicts a lower limit of average crystal size for
the CSTR and plug–flow systems of about 0.205 µm.
The model also allowed us to calculate the fractions of the input reactant stream R0 that are used for nucleation (Rn) and for growth (Ri). The ratio Rn/Ri decreased with increasing residence time from 4.79 to 0.12.
The average crystal size of the nascent (newly formed)
crystals Ln was determined for the different residence
times to vary between 0.194 to 0.221 µm. This size range
is in the range calculated for the plug–flow condition (τ →
0), 0.205 µm.
Experiments are needed to confirm the model as a function of solubility and temperature, where care has to be taken to consider that Ψ (= L/Lc – 1.0) may be a function of solubility and temperature, as shown for AgBr and AgCl,1–4 and of addition rate.5
The present model may be expanded to include the effect of Ostwald ripening agents and growth restrainers
using the formalism previously applied to precipitations
in batch precipitations.22,23
While the present model was developed for homogeneous
nucleation under diffusion-limited growth conditions and
unseeded systems, it may be easily modified to model
seeded systems and systems where growth and nucleation
are kinetically, heterogeneously, or otherwise controlled. The
present work also suggests many additional experiments
and new approaches to the evaluation of the results of continuous precipitations.
Acknowledgments. I am indebted to J. A. Budz, J. Z.
Mydlarz, J. P. Terwilliger, and J. S. Wey for reading the
manuscript and for helpful discussions and suggestions,
and to K. Marsh for editing the manuscript.
Nomenclature:
cel = cubic edge length
C = concentration (mol/l)
Css = supersaturation (mol/cm³)
Cs = solubility (mol/cm³)
ecd = equivalent circular diameter
Ψ = L/Lc – 1.0
D = diffusion constant (cm²/s)
F = flow rate (ml/min)
Ft = total flow rate (l/min)
γ = surface energy (erg/cm²)
Gm = maximum growth rate of crystal population (cm/s, Å/s, µm/s)
L = average crystal size (µm, cm)
Lc = critical crystal size (µm, cm)
Ln = nascent (newly formed) crystal size (µm, cm)
Lx = crystal size (cm)
kv = volume shape factor (if L is the edge length of a cubic crystal, kv = 1.0; if L is the radius of a spherical crystal, kv = 4π/3)
ks = surface shape factor (if L is the edge length of a cubic crystal, ks = 6.0; if L is the radius of a spherical crystal, ks = 4π)
N = population density, number/(volume·length)
n0 = nuclei population density, number/(volume·length)
Mt = suspension density (mol/l)
Rg = gas constant (8.3 × 10⁷ erg/deg mol)
R0 = addition rate of reactants (mol/s)
Rn = addition rate fraction used for nucleation
Ri = addition rate fraction used for growth (crystal size increase)
Sm = characteristic surface (e.g., area/mol silver halide)
τ = residence time (s, min)
T = temperature (K)
V0 = reaction volume
Vg = average crystal volume (cm³)
Vm = molar volume (cm³/mol crystals)
Zn = number of crystals nucleated at steady state
Zr = number of crystals/ml in the reactor during steady state
Zt = total number of crystals in reactor during steady state
References
1. I. H. Leubner, R. Jagannathan, and J. S. Wey, Photogr. Sci. Eng. 24, 103 (1980).
2. I. H. Leubner, J. Imaging Sci. 29, 219 (1985).
3. I. H. Leubner, J. Phys. Chem. 91, 6069 (1987).
4. I. H. Leubner, in Final Program and Advance Printing of Paper Summaries, 45th Annual Conference of the Society for Imaging Science and Technology, East Rutherford, NJ, p. 13 (1992).
5. I. H. Leubner, J. Imaging Sci. Technol. 37, 68 (1993).
6. A. D. Randolph and M. A. Larson, AIChE J. 8, 639 (1962); H. S. Bransom, W. J. Dunning, and B. Millard, Disc. Faraday Soc. 5, 83 (1949).
7. A. D. Randolph and M. A. Larson, Theory of Particulate Processes, Analysis, and Techniques of Continuous Crystallization, 2nd ed., Academic Press, San Diego, CA, 1991.
8. E. B. Gutoff, Photogr. Sci. Eng. 14, 248 (1970).
9. E. B. Gutoff, Photogr. Sci. Eng. 15, 189 (1971).
10. J. S. Wey and J. P. Terwilliger, AIChE J. 20, 1219 (1974).
11. J. S. Wey, J. P. Terwilliger, and A. D. Gingello, Res. Discl. 14987 (1976).
12. J. S. Wey, J. P. Terwilliger, and A. D. Gingello, AIChE Symposium Series No. 193, 76, 34 (1980).
13. J. S. Wey, I. H. Leubner, and J. P. Terwilliger, Photogr. Sci. Eng. 27, 35 (1983).
14. I. H. Leubner, J. Imaging Sci. Technol. 37, 510 (1993).
15. J. P. Terwilliger, Eastman Kodak Company, suggested this approach to determine the limiting conditions for the continuous precipitation model, private communication, 1996.
16. R. W. Strong and J. S. Wey, Photogr. Sci. Eng. 23, 344 (1979).
17. T. W. King, S. M. Shor, and D. A. Pitt, Photogr. Sci. Eng. 25, 70 (1981).
18. J. S. Wey and R. Jagannathan, AIChE J. 28, 697 (1982).
19. R. Jagannathan, J. Imaging Sci. 32, 100 (1988).
20. T. Tanaka and M. Iwasaki, J. Imaging Sci. 29, 20 (1985).
21. N. S. Tavare, AIChE J. 33, 152 (1987).
22. I. H. Leubner, J. Imaging Sci. 31, 145 (1987).
23. I. H. Leubner, J. Cryst. Growth 84, 496 (1987).
Carrier Transport Properties in Polysilanes with Various Molecular Weights
Tomomi Nakamura, Kunio Oka, Fuminobu Hori, Ryuichiro Oshima, Hiroyoshi Naito, and Takaaki Dohmaru*†
College of Engineering, Osaka Prefecture University, 1-1 Gakuencho, Sakai, Osaka 599-8531, Japan
† Research Institute for Advanced Science and Technology, Osaka Prefecture University, 1-2 Gakuencho, Sakai, Osaka 599-8570, Japan
Carrier transport properties in poly(methylphenylsilane) films were studied with interest focusing on the variation of Bässler’s disorder parameters with changing molecular weights. There was no apparent correlation between the charge transport properties and the
molecular density and free volume of the samples, the latter being measured by a positron annihilation technique. This result led us
to a model explaining the variation in Σ with changing molecular weights, a model which is based on a hypothesis that ca. 10 silylene
units at the end of a polymer chain do not participate in forming a σ-conjugated domain. The validity of this model was demonstrated
by UV absorption measurements. The variation of µ0 was interpreted in terms of the partial orientation of the main chains induced by
the mechanical force applied by a bar-coater. It is discussed that µ0 is sensitive to even the slightest orientation of the main chains
while the other disorder parameters are not.
Journal of Imaging Science and Technology 42: 364–369 (1998)
Introduction
Organic polysilanes with σ-conjugated Si backbones have
been extensively investigated because of their unique
physical and chemical properties.1-8 High hole drift mobilities of ~10⁻⁴ cm²/Vs at room temperature are one of
their most remarkable properties.2 In addition, it has now
been generally accepted that the carrier transport in organic polysilanes occurs via hopping through the σ-conjugated domains developed along Si main chains.3
Recently, we investigated carrier transport properties
in various organic polysilanes.9 Then we happened to obtain a result that may suggest the molecular weight dependence of hole drift mobilities. This result considerably
inspired our interest in the molecular-weight dependences of hole drift mobilities of polysilanes because it
has long been considered that drift mobilities are independent of the molecular weights of polysilanes when they
are comparatively high.4 Our first detailed study was
performed by using poly(methylphenylsilane)s with narrow molecular weight distribution. Analysis of the results
according to the disorder formalism proposed by Bässler
et al.10–12 indicated that the increase in the molecular
weight mainly caused the increase in the positional disorder parameters ( Σ ). 13,14 Later we prepared
poly(methylphenylsilane)s with much narrower molecular-weight distribution than before, carefully avoiding a
tailing of low molecular weight components.15,16 More
detailed re-investigation using these samples showed that
Σ values decreased with increasing molecular weights,15
being quite opposite to the tendency of our first study. The reliability of the latter result is much higher than that of our first study in every respect, such as the purity of the
polysilane samples and the uniformity of the polysilane
films. We tentatively proposed two models to interpret
this unexpected result on the basis of the difference in
the average distance between σ-conjugated domains and
the partial orientation of the main chains.15,16
In this article, the two models were checked in the light
of an additional experimental result obtained from a
positron annihilation technique that gives information on
the microscopic inner structure of materials.
Experimental Procedure
Samples used in this experiment were six
poly(methylphenylsilane)s with various molecular weights
and with small dispersity. Their weight-average molecular weights varied over two orders of magnitude, from ca. 10,000 to ca. 1,000,000.
TABLE I. The Disorder Parameters and Related Characteristics for Poly(methylphenylsilane) Samples with Various Molecular Weights

Sample   MW (×10³)   Mw/Mn   viscosity* (mPa·s)   σ (meV)   Σ      µ0 (cm²/Vs)
PS1      14          1.27    4.17                 90.24     3.00   3.52 × 10⁻²
PS10     108         1.28    —                    92.06     2.86   4.47 × 10⁻²
PS50     526         1.41    221.02               89.09     2.33   1.70 × 10⁻²
PS100    1144        1.67    425.34               88.60     2.20   1.68 × 10⁻²

* Measured on the respective toluene solutions; — not measured.
Hole drift mobilities for the polysilane samples were measured by means of the conventional TOF (Time-of-Flight)
technique. Samples for the TOF measurements were of
sandwich type of Au/polysilane (2.33 µm to about 6.22 µm)/
bisazo compound (ca. 1 µm)/Al/PET film. Details concerning preparation of polysilane samples and the TOF measurements were described previously.15,16
Measurements of the free volume space in polysilane films
were performed by means of the positron–electron annihilation lifetime spectroscopy using the conventional fast–
fast coincidence system. Two BaF2 scintillators coupled with
photomultipliers (HF3378-01, Hamamatsu Photonics,
Shizuoka, Japan) were employed to detect 1.28 MeV (birth)
and 511 keV (annihilation) γ-rays. We employed 22NaCl as
a radioactive source and a Kapton film (thickness: 25 µm)
as a cover foil to envelope the radioactive source.
Poly(methylphenylsilane) was cast from its toluene solution on a pure iron substrate (size: 8 × 8 × 0.2 mm³, Residual Resistance Ratio (RRR): ≥6000) and dried for 1 h at
room temperature and 2 h at 80°C. A pair of polysilane
samples was prepared and closely placed on both sides of
the radioactive source sandwiched by two Kapton foils.
Positron lifetime measurements were performed at room
temperature; 1,500,000 counts were accumulated for each
lifetime spectrum. All lifetime spectra were computer-fitted by using “POSITRONFIT” in the PATFIT-88 program
of Kirkegaard et al.17 In this calculation, source correction
terms were inputted as fixed values and subtracted from
the lifetime spectrum for each polysilane/Fe sample.
The viscosity of toluene solutions of poly(methylphenylsilane) was measured by a viscometer (LVT CP-40,
Brookfield, Eng. Lab., Stoughton, MA USA).
The density of each polysilane film was measured by a
pycnometric measurement of a NaBr aqueous solution
(density range: approximately 1.00 g/cm3 to 1.41 g/cm3)
whose density was adjusted to equal that of the particular poly(methylphenylsilane) sample.
UV absorption measurements were performed on five
polysilane solutions of equal concentration prepared by weight with a concentration precision of <0.03%. Toluene was used as the solvent to minimize solvent evaporation during the preparation of the polysilane solutions.
Results and Discussion
In the present study, we attempted to analyze the temperature and electric-field dependences of the hole drift mobilities for each polysilane sample according to the disorder formalism of Bässler et al.,12 which is expressed as Eq. 1:
µ = µ0 exp[–(2σ/3kT)²] exp[C{(σ/kT)² – Σ²}F^(1/2)]  (Σ > 1.5),  (1)
where µ0 is the drift mobility of a hypothetical crystalline
structure (with no disorder), σ is the energy width of
Gaussian distribution of hopping sites (σ/kT is the degree
of the energetic disorder of hopping sites), Σ is the degree of
the positional disorder of hopping sites, and C is a constant.
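As an illustration of Eq. 1, the Python sketch below evaluates the hole drift mobility for one set of parameters. The field F and the constant C are placeholders chosen only for illustration (C is not reported in this work); σ, Σ, and µ0 are taken from the PS1 entry of Table I.

import math

# Sketch of Eq. 1 (disorder formalism). C and F are illustrative placeholders only.
kB = 8.617e-5          # Boltzmann constant, eV/K

def mobility(mu0, sigma_eV, Sigma, F, T=295.0, C=2.9e-4):
    """Hole drift mobility from Eq. 1; F in V/cm, sigma in eV, C in (cm/V)^1/2 (assumed)."""
    s = sigma_eV / (kB * T)
    return mu0 * math.exp(-(2.0 * s / 3.0) ** 2) * math.exp(C * (s**2 - Sigma**2) * math.sqrt(F))

# Example with the PS1 parameters (sigma = 90.24 meV, Sigma = 3.00):
print(mobility(mu0=3.52e-2, sigma_eV=0.09024, Sigma=3.00, F=2.0e5))   # of order 1e-4 cm^2/Vs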
Table I shows the disorder parameters obtained for four
polysilane samples with various molecular weights, together with the viscosity for each toluene solution. While
the values of σ stay almost constant, Σ values obviously
decrease from 3.00 to 2.20 with increasing molecular
weights. In the previous communication,15 we tentatively
proposed a model that this variation in Σ values was ascribed to the change in the average distance between the
neighboring sites caused by the strain force arising from
the entanglement of Si main chains. But in a recent report,16 we also attempted to interpret this result by proposing the other model that the variation in Σ may be
caused by the partial orientation of the higher molecular
weight polysilanes due to the mechanical force applied by
a bar-coater.
First, we discuss the plausibility of the former model.15
The average distance between the neighboring sites mentioned above is considered to correspond to free space
called “free volume,” which is a direct measure of the inter-main chain distances. We proceeded to measure the
sizes of the free volumes in the polysilane films by means
of a positron annihilation technique18,19 and also to measure the densities of the polysilane films that supplement
the information necessary to describe the inner structure of the polysilane films.
Figure 1 shows positron lifetime spectra for poly(methylphenylsilane) samples with various molecular weights; the
lifetime spectrum for the substrate with no polysilane film
is shown in Fig. 1(D). Comparison of the two lifetime spectra in Fig. 1(D) clearly shows that new slopes appear that
are ascribed to the annihilations of positrons and positroniums in the polysilane film. Analysis of the lifetime spectra in this study shows that five-component deconvolution
gives the best fit for all samples. Two components of the
five are the source correction terms arising from the annihilation of positrons and positroniums in Kapton foils.
Thus, three components are left for polysilane films: τ1, τ2,
and τ3 in the order of lifetime value. The shortest lifetime
component (τ1) is a mixture of two components: one from
the annihilation of positrons in the substrates and the
other from the self-annihilation of para-positroniums in
the polysilane films. The middle lifetime component (τ2) is
considered to correspond to the annihilation of positrons
in microscopic space such as vacancy and chain defects in
the polysilane films and/or the annihilation of orthopositronium in the bulk of the polysilane films.20,21 The
longest lifetime component (τ3) is ascribed to the pick-off
annihilation of ortho-positroniums in free volume in the
polysilane films. We are most interested in these sizes of
free volumes (i.e., τ3 values), which are equal to the average distance between the σ-conjugated domains dwelling
on Si main chains. We estimated the free volume size as
the average free volume radii (FVR) calculated from τ3 values according to Eqs. 2 and 3, which are semiempirically
given on the basis of the assumption that a free volume
space is spherical.22
τ = 0.5 {1 – R/R0 + sin(2πR/R0)/2π}⁻¹,  (2)
R = R0 – ∆R (∆R = 0.166 nm),  (3)
where τ is the ortho-positronium lifetime (i.e., τ3 in the
present study) and R is the radius of a spherical free
volume.
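A minimal sketch of this inversion is given below; it recovers the free volume radius from an ortho-positronium lifetime by bisection and reproduces, within rounding, the FVR values listed below in Table II.

import math

# Sketch of Eqs. 2 and 3: free volume radius R (nm) from the o-Ps pick-off lifetime tau3 (ns).
DELTA_R = 0.166  # nm

def lifetime(R):
    """Eq. 2: o-Ps lifetime (ns) for a spherical free volume of radius R (nm)."""
    R0 = R + DELTA_R                       # Eq. 3 rearranged
    x = R / R0
    return 0.5 / (1.0 - x + math.sin(2.0 * math.pi * x) / (2.0 * math.pi))

def radius_from_lifetime(tau3, lo=0.05, hi=1.0, tol=1e-6):
    """Invert Eq. 2 by bisection (the lifetime increases monotonically with R)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lifetime(mid) < tau3:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(radius_from_lifetime(2.27) * 10.0)   # about 3.1 Angstrom, cf. FVR for PS1 in Table II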
Lifetime values (τ2 and τ3) and FVR for each polysilane
sample are summarized in Table II, together with the
polysilane densities. The values of τ2 give information on
the bulk of the polysilane film, but their detailed discussion is beyond our scope in this report.
TABLE II. Positron Lifetimes, Free Volume Radii (FVR), and Densities Relating to Poly(methylphenylsilane)s with Various Molecular Weights

Sample   τ2 (ns)   τ3 (ns)   FVR (Å)   density (g/cm³)
PS1      0.64      2.27      3.09      1.10
PS10     0.65      2.19      3.02      1.10
PS50     0.71      2.21      3.04      1.10
PS100    0.75      2.26      3.09      1.10
Figure 2. Schematic illustration of the hypothesis that the relative densities of the σ-conjugated domains and the void silylene units differ between (a) high and (b) low molecular weight polysilane films.
Figure 1. Lifetime spectra for poly(methylphenylsilane)s with
various molecular weights. Lifetime spectrum for the Kapton
source/pure iron substrate system is shown in (D).
The values of τ3 and, thus, the FVR stay almost constant against the change
in the molecular weight by more than 2 orders. This result shows that the average free volume sizes in the
polysilane films, i.e., the average inter-chain distance, is
constant against the change in the molecular weight from
ca. 10,000 to ca. 1,000,000. Table II also shows that the
densities of the four polysilane samples are precisely equal.
Combining these two supplementary results, we can depict the inner structure of the poly(methylphenylsilane)
films; the densities of both occupied and unoccupied volumes in the polysilane films do not change against the
change in the molecular weight, meaning the invariance
of the density of silylene units in the polysilane films of
various molecular weights.
We now propose a model to elucidate the variation in Σ
values against the change in the molecular weight shown
in Table I. According to the disorder formalism by Bässler
et al.,12 the parameter “Σ ” tells the degree of the positional disorder of the hopping sites, which in the case of
polysilanes corresponds to the σ-conjugated domains that
in turn are composed of 15 to 30 silylene units.3 The present
results show that the number of silylene units per unit
volume in the polysilane films is extremely similar despite
the difference in the molecular weight by 2 orders. At first
glance, the present results contradict the former model,
but this discrepancy would be easily avoided by assuming
that there are some void silylene units that do not participate in forming a σ-conjugated domain, as pointed out by
Hayashi et al.,23 who introduced the concept that there
are two types of silylene units in a polysilane film: one in
a σ-conjugated domain and the other between the σ-conjugated domains. We propose a hypothesis that such void
silylene units are located dominantly near the ends of
polysilane chains, on the basis of an INDO/S calculation
for (H2Si)20 by Klingensmith et al.,24 who showed that the
transition density near the end of the polysilane molecule
was much lower than that in the middle. Schematic illustration of the model on the basis of this hypothesis is shown
in Fig. 2.
The key features in this model are two-fold. One is the
spatial disposition of the σ-conjugated domains, which
we have finally figured out from the results in Table II,
that suggests similar free volume spaces among the
polysilane films of different molecular weights. The other
is the number of the σ-conjugated domains per unit volume of the polysilane films. Because the σ-conjugated
domains are generally accepted to be a chromophore for
UV absorption of polysilanes, we attempted to substantiate the latter feature by an absorbance measurement
Figure 3. UV absorption spectra of toluene solutions of
poly(methylphenylsilane)s with various molecular weights. Measured wavelength range is from 280 to 450 nm.
Figure 4. Molecular-weight dependence of absorbance in
poly(methylphenylsilane)s.
of five polysilane solutions with precisely equal concentration. Although the absolute value of absorbance in
solution is sometimes different from that in film, we may
assume that their relative values do not change significantly between solution and film when the absorbances of several polysilanes are measured. Figure 3 shows the absorption spectra (one of three repeated measurements that gave precisely equal results); each spectrum is due to the typical σ–σ* transition for polysilanes, peaking at ca. 340 nm. Figure 3 reveals two
important features. One is a slight red shift of ca. 2 nm
in the absorption maximum wavelength (λmax) with increasing molecular weights. Trefonas et al.25 reported that
λmax in polysilanes abruptly increases with increasing
chain lengths at the shorter chain length region and approaches a limiting value at the chain length of 40 to 50.
But close examination of their original experimental figure suggests a slight increase in λmax even in the chain
length region from 50 to 1350. Our experiment manifests
that this slight red shift still continues even in the higher
chain length degree up to ca. 10,000 for PS100. This slight
red shift suggests that the size of the σ-conjugated domain continues to develop very slightly with increasing
chain-lengths even in the chain length of ca. 10,000. As
shown in Table I, the variation in σ values with changing molecular weights is very small (by approximately 1
to 3 meV). We reported that the values of σ were almost
constant against the change in the molecular weight in
the previous reports,15,16 but it may better be interpreted
that σ values slightly decrease with increasing molecular weights reflecting the slight variation in the distribution of the sizes of the σ-conjugated domains with
changing molecular weights. The other feature in Fig. 3
is the obvious increase in absorbance at λmax for each
polysilane sample with increasing molecular weights, indicating that the number of the σ-conjugated domains
per unit volume becomes larger with increasing molecular weight. This trend is more clearly seen in Fig. 4, which
depicts the variation of absorbance against the molecular weight.
This variation of the absorbance in Fig. 4 was simulated
according to Lambert–Beer’s law on the basis of the following considerations: (1) each chromophore (σ-conjugated
domain) consists of 20 silylene units, (2) there are 10 void
silylene units that do not contribute to a chromophore at
each end of a polysilane chain, (3) the value of the absorptivity per Si–Si bond is determined as 11,100 M⁻¹ cm⁻¹. In
this simulation, the number of void silylene units was taken
as a fitting parameter and determined to be 10 for the best
fit. Consideration 1 may be generally accepted because it
was reported that a σ-conjugated domain consists of 15 to
30 silylene units3 and the number of silylene units of 20 is
within this range. Consideration 3 is calculated on the basis of the experimental absorbance of PS100 by assuming
it has no void units, which is in fair agreement with the
value of 12,000 M⁻¹ cm⁻¹ obtained by Harrah and Zeigler.26
The best-fit simulation curve is illustrated by the dotted
line in Fig. 4; the fitting with the experimental plots is satisfactory, which suggests the validity of this model that is
based on the hypothesis that 10 silylene units at each end
of a polysilane chain are void in forming a chromophore.
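A minimal sketch of this simulation is given below. The monomer mass used to convert molecular weight to degree of polymerization is an assumption (≈120.2 g/mol for the methylphenylsilylene unit), so the computed concentrations only approximately reproduce the σ-conjugated domain concentrations quoted later in the text.

# Sketch of the simulation: relative absorbance when 10 silylene units at each chain end
# do not contribute to a chromophore (Lambert-Beer with an effective absorptivity).
MONOMER_MASS = 120.2      # g/mol, assumed mass of the methylphenylsilylene unit
N_VOID_PER_END = 10       # fitted value from the simulation in the text
EPS_PER_BOND = 11100.0    # absorptivity per Si-Si bond, M^-1 cm^-1, from consideration (3)

def conjugated_fraction(mw):
    n = mw / MONOMER_MASS                            # approximate degree of polymerization
    return max(n - 2 * N_VOID_PER_END, 0.0) / n      # fraction of units inside chromophores

for name, mw in [("PS1", 14e3), ("PS10", 108e3), ("PS50", 526e3), ("PS100", 1144e3)]:
    f = conjugated_fraction(mw)
    eps_eff = EPS_PER_BOND * f                       # effective absorptivity per silylene unit
    print(name, round(100 * f, 2), round(eps_eff))   # approx. 83%, 98%, 99.5%, 99.8%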
In the INDO/S calculation by Klingensmith et al. (vide
supra),24 it is reported that a single gauche link somewhere
in the all-trans backbone of (H2Si)20 separates the chromophore into two segments and the lowest energy excitation was localized on the longer segment. Applying their
model to the present case of poly(methylphenylsilane), a
conformational break predominantly occurs near the end
of a polymer chain making the end segment (the shorter
one) void in forming a σ-conjugated domain. Thus, higher
molecular weight polysilanes have smaller numbers of void
segments per unit volume, i.e., larger numbers of σ-conjugated domains per unit volume (larger absorption) leading to smaller Σ values.
Next, we check the validity of the latter model,16 which
proposes that the variation in Σ with changing molecular weights may be caused by the partial orientation of
the higher molecular weight polysilane chains due to
the mechanical force applied by a bar-coater. The key
factors for this model are the length of the main chains
(i.e., the molecular weight) and the viscosity of the
polysilane solutions used in the sample preparation for
the TOF measurements. As shown in Table I, the viscosity of the solutions obviously increases with increasing molecular weight, suggesting that the larger
mechanical force applied by a bar-coater induces the
Figure 5. Concentration dependences of the positional disorder parameter. (• and solid line) 1,1-bis(di-4-tolylaminophenyl)cyclohexane doped bisphenol-A-polycarbonate system27; () tri-p-tolylamine doped bisphenol-A-polycarbonate system28; () our results. The values for the tri-p-tolylamine doped bisphenol-A-polycarbonate system were calculated from three hole transport parameters: T0, θ, and γ.
higher degree of partial orientation for the higher molecular weight polysilane chains leading to the smaller
Σ values. We attempted to detect the main-chain orientation by polarizing-microscopic observation, but the
observation for each bar-coated sample showed no detectable orientation of Si main chains. However, we can
not abandon the possibility of the partial orientation
completely because µ0 may reflect the very slight orientation of the main chains. The values of µ 0 show a tendency to decrease with increasing molecular weights
although their variation is not always systematic. In the disorder formalism, the shape of the hopping sites is regarded as isotropic, and therefore µ0 values reflect only the inter-site distance. But the shape of the hopping sites (i.e., σ-conjugated domains) in polysilanes is quite anisotropic, and therefore not only the inter-site distance but also the shape of the hopping sites may be reflected in µ0, e.g., in the hypothetical crystalline state of oriented polysilane films. The present tendency of µ0 may demand that not only the void silylene units but also the effects of the partial orientation of the Si main chains be taken into consideration. Introducing this additional effect, the number of main chains parallel to the direction of the carrier transport becomes smaller with increasing molecular weight, leading to a decrease in µ0. On the other hand, the inter-site distance becomes shorter with increasing molecular weight because of the higher density of σ-conjugated domains, which would lead to an increase in µ0. Table I is interpreted to show that the decrease in µ0 arising from the decreased number of main chains parallel to the direction of carrier transport slightly predominates over the increase in µ0 due to the shorter inter-site distance. The variation in the carrier transport properties with changing molecular weight shown in Table I is therefore considered to result from the effects of both the void silylene units and the slight partial orientation of the main chains.

Figure 5. Concentration dependences of the positional disorder parameter. (• and solid line) 1,1-bis(di-4-tolylaminophenyl)cyclohexane doped bisphenol-A-polycarbonate system27; () tri-p-tolylamine doped bisphenol-A-polycarbonate system28; () our results. The values for the tri-p-tolylamine doped bisphenol-A-polycarbonate system were calculated from three hole transport parameters: T0, θ, and γ.
It is of interest to compare the σ-conjugated hopping sites of polysilanes with the molecular hopping sites of molecularly doped polymers, the original model from which the disorder formalism was derived,10–12 focusing on how Σ values are affected by the concentration of the respective kinds of hopping sites. Although it was found that the variation in the carrier transport seen in Table I was caused by the void silylene units and the slight partial orientation of the main chains, we believe the variation in Σ is caused almost entirely by the former effect and not by the slight orientation, which is not detected by the polarizing-microscopic observation. The latter effect is seen only in µ0, which is very sensitive to the shape and direction of the hopping sites. It is assumed that the σ-conjugated domain concentration is 100% when all silylene units in a polysilane film contribute to the formation of the σ-conjugated domains. If a void segment at each end of a polysilane molecule consists of 10 silylene units, according to our simulation, the σ-conjugated domain concentrations calculated for PS100, PS50, PS10, and PS1 are 99.79%, 99.57%, 97.77%, and 82.52%, respectively. Borsenberger27 studied the concentration dependences of the disorder parameters for the 1,1-bis(di-4-tolylaminophenyl)cyclohexane doped bisphenol-A-polycarbonate system and reported that the values of Σ increased with decreasing dopant concentrations, increasing drastically from 2 to 3 as the concentration decreased from 100% to 80%. Figure 5 shows the dependence of Σ values on the concentration of the hopping sites in polysilanes and those in two kinds of molecularly doped polymers studied by Borsenberger.27,28 Although the concentration range in the present experiment is much narrower than that adopted in Borsenberger's experiment,27 our results agree very well with his in the concentration range from 100% to 80%. This agreement is strong evidence that the disorder formalism originally proposed for carrier transport in molecularly doped polymers can also be applied to polysilanes, which are typical main-chain polymers.
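To make the arithmetic behind these concentrations explicit, the following short Python sketch reproduces the calculation under the stated model: a chain of N silylene units loses 10 units at each end to void segments, so the domain concentration is (N − 20)/N. The degrees of polymerization used here are illustrative values back-calculated from the percentages above, not figures taken from this study.

```python
# Illustrative sketch (not from the paper): sigma-conjugated domain
# concentration for a chain of N silylene units when 10 units at each
# chain end are assumed void (do not join a sigma-conjugated domain).

N_VOID_PER_END = 10          # void silylene units at each chain end (from the fit)

def domain_concentration(n_units):
    """Fraction of silylene units that belong to sigma-conjugated domains."""
    void = 2 * N_VOID_PER_END            # both chain ends contribute void units
    return max(n_units - void, 0) / n_units

# Hypothetical degrees of polymerization chosen only to illustrate the trend;
# the paper reports the resulting percentages, not these chain lengths.
samples = {"PS100": 9500, "PS50": 4650, "PS10": 900, "PS1": 114}

for name, n in samples.items():
    print(f"{name}: {100 * domain_concentration(n):.2f}% of units in domains")
```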
Summary
Carrier transport properties in poly(methylphenylsilane)s with various molecular weights were studied in
the framework of disorder formalism of Bässler et al.10–12
Conclusions in this study are summarized as follows:
1. The increase in the molecular weight of poly(methylphenylsilane)s from ca. 10,000 to ca. 1,000,000
mainly causes the lowering of the values of Σ.
2. Both the average size of free volume in the polysilane
films and the polysilane densities are invariant against the change in molecular weight by two orders of magnitude.
3. A model to explain the variation in Σ against the
molecular weight was proposed on the basis of the
void silylene units that do not participate in forming a σ-conjugated domain.
4. The absorbance at λmax of five polysilane solutions with precisely equal concentrations clearly increases with increasing molecular weight, and a slight red shift of ca. 2 nm in λmax is also observed.
5. The partial orientation not detected by polarizing-microscopic observation is reflected as the variation in µ0, which is sensitive to the shape and the direction of the hopping sites.
References
1. R. D. Miller and J. Michl, Polysilane high polymers, Chem. Rev. 89, 1359 (1989).
2. R. G. Kepler, J. M. Zeigler, L. A. Harrah, and S. R. Kurtz, Photocarrier generation and transport in σ-conjugated polysilanes, Phys. Rev. B 35, 2818 (1987).
3. M. A. Abkowitz, M. J. Rice, and M. Stolka, Electric transport in silicon backbone polymers, Phil. Mag. B 61, 25 (1990).
4. M. Stolka, H.-J. Yuh, K. McGrane, and D. M. Pai, Hole transport in organic polymers with silicon backbone (polysilylenes), J. Polym. Sci. Polym. Chem. Ed. 25, 823 (1987).
5. M. Stolka and M. A. Abkowitz, Electric transport in glassy Si-backbone polymers, J. Non-Cryst. Solids 97/98, 1111 (1987).
6. M. A. Abkowitz and M. Stolka, Common features in the electronic transport behavior of diverse glassy solids, Phil. Mag. Lett. 58, 239 (1988).
7. M. Abkowitz, F. E. Knier, H.-J. Yuh, R. J. Weagley, and M. Stolka, Electronic transport in amorphous silicon backbone polymers, Solid State Commun. 62, 547 (1987).
8. K. Yokoyama and M. Yokoyama, Trap-controlled charge carrier transport in organopolysilanes doped with trap-forming organic compounds, Solid State Commun. 73, 199 (1990).
9. T. Dohmaru, K. Oka, T. Yajima, M. Miyamoto, Y. Nakayama, T. Kawamura, and R. West, Hole transport in polysilanes with diverse side-chain substituents, Phil. Mag. B 71, 1069 (1995).
10. H. Bässler, Localized states and electronic transport in single component organic solids with diagonal disorder, Phys. Stat. Sol. B 107, 9 (1981).
11. H. Bässler, Charge transport in molecularly doped polymers, Phil. Mag. B 50, 347 (1984).
12. P. M. Borsenberger, L. Pautmeier, and H. Bässler, Charge transport in disordered molecular solids, J. Chem. Phys. 94, 5447 (1991).
13. M. Miyamoto, T. Dohmaru, R. West, and Y. Nakayama, Substituent and molecular-weight effects on hole transport in polysilanes (in Japanese), J. Soc. Electrophotography Jpn. 33, 209 (1994).
14. M. Miyamoto, Y. Nakayama, L. Han, K. Oka, R. West, and T. Dohmaru, Effect of molecular weight on hole drift mobility in polysilanes, Proc. of Polymer for Microelectronics, Kawasaki, Japan, 1993, p. 176.
15. T. Nakamura, K. Oka, H. Naito, M. Okuda, Y. Nakayama, and T. Dohmaru, Effect of molecular weight on hole transport in polysilanes, Solid State Commun. 101, 503 (1997).
16. T. Dohmaru, T. Nakamura, K. Oka, F. Hori, R. Oshima, Y. Nakayama, H.
Naito, and M. Okuda, Effect of molecular weight on hole transport in
polysilanes, in Proc. of IS&T’s 12th Int. Congress on Advances
in NIP Tech., IS&T, Springfield, VA, 1996, p. 471.
17. P. Kirkegaard, M. Eldrup, O. E. Mogensen, and N. J. Pedersen, Program
system for analysing positron life time spectra and angular correlation
curves, Comp. Phys. Commun. 23, 307 (1981).
18. Y. C. Jean, Positron annihilation spectroscopy for chemical analysis: a
novel probe for microstructural analysis of polymers, Microchem. J. 42,
72 (1990).
19. R. A. Pethrick, Positron annihilation: a probe for nanoscale voids and free volume?, Prog. Polym. Sci. 22, 1 (1997).
20. Y. Ohko, A. Uedono and Y. Ujihira, Thermal variation of free volumes
size distribution in polypropylenes. Probed by positron annihilation life
time technique, J. Polym. Sci. B, Polym. Phys. 33, 1183 (1995).
21. Y. Ujihira and H. Nakanishi, Nondestructive analysis by positron measurement (in Japanese), Radioisotop. 30, 511 (1981).
22. H. Nakanishi and Y. C. Jean, Positron and Positronium Chemistry, D. M.
Schrader and Y. C. Jean, Eds., Elsevier, Amsterdam, 1988, p. 95.
23. H. Hayashi, T. Kurando, and Y. Nakayama, Prephotobleaching process
in polysilane films, Jpn. J. Appl. Phys. 36, 1250 (1997).
24. K. A. Klingensmith, J. W. Downing, R. D. Miller, and J. Michl, Electronic excitation in poly(di-n-hexylsilane), J. Am. Chem. Soc. 108, 7438 (1986).
25. P. Trefonas, R. West, R. D. Miller, and D. Hofer, Organosilane high polymers: electronic spectra and photodegradation, J. Polym. Sci., Polym.
Lett. Ed. 21, 823 (1983).
26. L. A. Harrah and J. M. Zeigler, Electronic spectra of polysilanes,
Macromol. 20, 601 (1987).
27. P. M. Borsenberger, The concentration dependence of the hole mobility of 1,1-bis(di-4-tolylaminophenyl)cyclohexane doped bisphenol-A-polycarbonate, J. Appl. Phys. 72, 5283 (1992).
28. P. M. Borsenberger, Hole transport in tri-p-tolylamine doped bisphenol-A-polycarbonate, J. Appl. Phys. 68, 6263 (1990).
JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998
Edge Estimation and Restoration of Gaussian Degraded Images
Ziya Telatar*† and Önder Tüzünalp‡
Ankara University Faculty of Science, Department of Electronic Engineering, 06100-Besevler, Ankara, Turkey
The blur function of a degraded image is often unknown a priori. The blur function must first be estimated from the degraded image data before the image can be restored. We propose an algorithm to address this blur estimation problem. The algorithm estimates the restoration filter parameters from edge information in the degraded image and uses them to solve the restoration problem. The information that relates the variance of the Gaussian blur kernel to the edge content of the degraded image is exploited. Simulation results of image restoration illustrate the performance of the proposed estimation method.
Journal of Imaging Science and Technology 42: 370–374 (1998)
Introduction
The restoration of images degraded by blur is still a central
problem in image processing. Blur can be introduced by
atmospheric turbulence, improperly focused lenses,
relative motion, or other environmental factors between
an object being photographed and an image scanner. The
restoration of degraded images differs in each case.
The problem of deblurring of images with known blur
function has been dealt with extensively in the literature.
The restoration algorithms include Fourier domain
methods (inverse filtering, 1–3 blind deconvolution, 3–5
Cepstrum,6–8 etc.) or spatial domain methods.9 In many
applications, however, the blur function is unknown.
Therefore, the estimation or identification of the blur function directly from the blurred image has been the focus of a great deal of interest. A number of techniques have been
proposed to address this problem.
Chang, Tekalp, and Erdem10 have proposed a blur identification algorithm in which the observed image is segmented into N segments for blur identification. Reeves and Mersereau11 have used a
generalized cross validation method for blur identification.
Kayargadde and Martens 12 have used polynomial
transformations to estimate the edge parameters and
image blur.
Although several methods exist to restore degraded images, there is still room for improvement.3 In this work we propose a new algorithm for the restoration of images degraded by an unknown Gaussian blur, using edge estimation. The proposed algorithm follows an iterative scheme to converge to the blur function and then restores the degraded image after a certain number of iterations. The following sections present the blur model, our restoration algorithm, experimental results, and conclusions.

Original manuscript received October 3, 1997

* IS&T Member
† e-mail: [email protected]; TEL: ++90 312 2126720/1145; FAX: ++90 312 223 23 95
‡ e-mail: [email protected]; TEL: ++90 312 2126720/1221; FAX: ++90 312 223 23 95

© 1998, IS&T—The Society for Imaging Science and Technology
Problem Identification and the Proposed Method
In this work, we address the problem of deconvolution
of unknown Gaussian degradation from an edge
estimation. To describe our techniques, we begin in this
section with a brief description of a blurred image, and
state some important properties relating to it.
Generally, a blurred image can be modeled as follows:
y(n1,n2) = g(n1,n2)*h(n1,n2) + v(n1,n2),
(1)
where the original image g(n1,n2) has been blurred by the
function h(n1,n2) with an additive noise v(n1,n2). Additive
noise may come from the imaging system, independent of the original image, and the parameters of the additive-noise degradation model of an imaging system are generally known. Thus, additive
noise may be easily removed from the degraded image by
using special image processing techniques such as Wiener
filtering before the restoration. Therefore the additive
noise problem has been left out here; rather, we refer the
reader to Refs. 1 through 5 and 9 for further details.
Neglecting the additive noise, Eq. 1 can be rewritten as,
y(n1,n2) = g(n1,n2)*h(n1,n2).
(2)
Equation 2 states that degradation is the result of the
convolution between the original scenery and the blur
function.
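As a concrete illustration of Eqs. 1 and 2, the sketch below synthesizes a blurred observation from a known test image. It is a minimal example assuming NumPy and SciPy are available; the random test image, kernel shape, and noise level are placeholders, not the data used in this study.

```python
# Minimal sketch of the degradation model y = g * h (+ v), Eqs. 1-2.
# Assumes NumPy/SciPy; the test image and kernel are arbitrary stand-ins.
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
g = rng.random((64, 64))                 # stand-in for the original image g(n1, n2)

# Simple normalized 5 x 5 blur kernel h(n1, n2) (uniform here for brevity).
h = np.ones((5, 5)) / 25.0

v = 0.01 * rng.standard_normal(g.shape)  # small additive noise v(n1, n2)

y_noisy = convolve2d(g, h, mode="same", boundary="symm") + v   # Eq. 1
y = convolve2d(g, h, mode="same", boundary="symm")             # Eq. 2 (noise neglected)
```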
The blur function h(n1,n2) given in Eq. 2 could have
statistically different distributions, and different models
could be identified for each distribution problem. As a
result, the restoration problem may become very complex.
Insofar as the Gaussian distribution, in general, subsumes the other distributions, we assume that all blurring effects have a Gaussian, or normal, distribution. For
example, atmospheric turbulence, unfocused imaging
systems, motion, or evaporation effects could cause the
original image to blur with a Gaussian distribution. A Gaussian distribution can be modeled as

h1(n1, n2) = [1/(2πσ²)] exp[−(n1² + n2²)/(2σ²)].
(3)
Equation 3 represents the filter model in which the
variance and the matrix size have an effect on the filter
performance. This model will be used for the restoration
of the blurred image. As long as the variance and the
matrix size can be appropriately arranged, a good
approximation of the original image should be obtained.
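A small sketch of the filter model of Eq. 3 follows; it builds a Gaussian kernel for a given variance and matrix size. The normalization to unit sum is an assumption added here so the kernel conserves image brightness; the paper itself only states the analytic form.

```python
# Sketch of the Gaussian filter model of Eq. 3 for a given variance and
# (odd) matrix size; the unit-sum normalization is an added assumption.
import numpy as np

def gaussian_kernel(variance, size):
    half = size // 2
    n1, n2 = np.meshgrid(np.arange(-half, half + 1),
                         np.arange(-half, half + 1), indexing="ij")
    h1 = np.exp(-(n1**2 + n2**2) / (2.0 * variance)) / (2.0 * np.pi * variance)
    return h1 / h1.sum()                 # normalize so the kernel sums to 1

h1 = gaussian_kernel(variance=13.75, size=13)   # values used for the child image in Table I
```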
In this study, we use the gradient edge detection method3 to estimate the filter model parameters. In a blurred image, regions of high frequency, called edge pixels, are spread over the neighboring pixels, causing the loss of image details. Thus, the edge map of a blurred image does not contain more edge lines or points than that of the original one. Using this property, we can say that the edge map of an image contains important information about the degradation.

Figure 1. Edge estimation and restoration algorithm block diagram.
Figure 1 shows a block diagram for the algorithm. The
process to converge to the blur function of the degraded
image is as follows:
Step 1. Find the edge map of the actual image.
Step 2. Choose the filter model parameters, variance and
matrix size from step 1.
Step 3. Construct a filter using the parameters in step 2.
Step 4. Restore the degraded image.
Step 5. Find the edge map of step 4.
Step 6. Compare step 5 and previous edge map. If step 5
> the previous edge map, filter parameters are
actual, else filter parameters are the previous.
Step 7. Choose the next value of the filter parameters.
Step 8. Construct a new filter and repeat steps 5
through 8.
After a certain number of iterations, the best edge map gives the best filter model parameters, which are used for designing the restoration filter. (Note that different variances are used for a fixed matrix size of the filter model in the algorithm; in other words, the variance is searched for a fixed blur matrix size, and then the matrix size is changed.) Therefore, for each iteration step,

bi+1 = bi + ∆bi[n1(i + 1); n2(i + 1)],
(4)

where bi+1 is the convergence vector of the model coefficients and ∆bi is the correction term that depends on the measurements over a period.

The choice of the amplitude level of the edge points is particularly important because the edge algorithm may detect low-level noise as an edge point. To suppress false edges, we first introduced a fixed threshold k and then accepted as edge points only the amplitudes at or above this threshold. If k is too low, noise may be detected as an edge point; if k is too high, some edge points may not be detected. Thus, the threshold level k has been chosen as 35% of the amplitude of the maximum edge pixel. As a result, the relation between the variance and the edge map is defined as

σ2 = f(∇) = {Σn1 Σn2 ∇[y(n1; n2) ≥ k]}max,
(5)

where σ2 is the variance, ∇ is the gradient operator, and y is the blurred image. Equation 5 is used to compute an appropriate matrix size and variance for the restoration filter model.
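A rough Python sketch of this estimation step is given below. It computes a gradient-magnitude edge map, keeps only pixels above 35% of the maximum edge amplitude, and derives a scalar score in the spirit of Eq. 5 that is then compared across candidate variances, following Steps 1 through 8. The Sobel gradient, the scoring by edge-pixel count, and the simple regularized inverse filter used as the restoration step are assumptions made for illustration; they are not the authors' code.

```python
# Sketch of the parameter search (Steps 1-8) driven by an Eq. 5-style
# edge score.  The gradient operator, the 35% threshold, and the
# regularized inverse filter used for "restore" are illustrative choices.
import numpy as np
from scipy.ndimage import sobel
from numpy.fft import fft2, ifft2

def edge_score(img, k_fraction=0.35):
    """Count of pixels whose gradient magnitude reaches the threshold k."""
    grad = np.hypot(sobel(img, axis=0), sobel(img, axis=1))
    k = k_fraction * grad.max()
    return np.count_nonzero(grad >= k)

def gaussian_kernel(variance, size):
    half = size // 2
    n1, n2 = np.meshgrid(np.arange(-half, half + 1),
                         np.arange(-half, half + 1), indexing="ij")
    h = np.exp(-(n1**2 + n2**2) / (2.0 * variance))
    return h / h.sum()

def restore(y, variance, size, eps=1e-2):
    """Placeholder restoration: regularized inverse filtering with the model kernel."""
    pad = np.zeros_like(y)
    pad[:size, :size] = gaussian_kernel(variance, size)
    H = fft2(np.roll(pad, (-(size // 2), -(size // 2)), axis=(0, 1)))
    return np.real(ifft2(fft2(y) * np.conj(H) / (np.abs(H) ** 2 + eps)))

def estimate_blur(y, variances, size=13):
    """Pick the candidate variance whose restoration gives the richest edge map."""
    best_var, best_score = None, -1
    for var in variances:                      # Steps 3-8: try each candidate filter
        score = edge_score(restore(y, var, size))
        if score > best_score:                 # Step 6: keep parameters if the edge map improves
            best_var, best_score = var, score
    return best_var
```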
Having estimated the filter model parameters, we use
the Fourier domain Cepstrum transform for the filtering.
The Cepstrum algorithm has been extensively used for
image processing applications and its features are well
documented in the literature.3,6–8 The Fourier domain Cepstrum transform of Eq. 2 is obtained as,
y′(ω1,ω2) = g′(ω1,ω2) + h′(ω1,ω2).
(6)
Equation 6 shows that, by using the Fourier domain Cepstrum transformation, the blurred image is decomposed into the sum of an original-scene component and a blur-effect component.
Let the filter designed after the iterations be h1(n1,n2); the filtering process in the Cepstrum domain is then

y′(ω1,ω2) = g′(ω1,ω2) + h′(ω1,ω2) − h′1(ω1,ω2).
(7)

Minimizing the error between the blur function model parameters and the constructed filter model parameters yields the improvement in image quality.
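The additive decomposition of Eqs. 6 and 7 can be sketched as follows: taking the logarithm of the Fourier magnitude turns the convolution of Eq. 2 into a sum, and subtracting the log-spectrum of the designed filter h1 before inverting approximates the restored image. The use of the magnitude spectrum only (keeping the original phase) and the small stabilizing constant are assumptions of this sketch, not details given in the paper.

```python
# Sketch of Eqs. 6-7: in the log-Fourier ("Cepstrum") domain the blurred
# image separates into scene + blur terms, and the designed filter h1 is
# subtracted.  Magnitude-only log and the eps term are illustrative choices.
import numpy as np
from numpy.fft import fft2, ifft2

def log_spectrum(x, eps=1e-8):
    return np.log(np.abs(fft2(x)) + eps)      # y'(w1, w2) of Eq. 6 (magnitude part)

def cepstral_deblur(y, h1, eps=1e-8):
    Y = fft2(y)
    H1 = fft2(h1, s=y.shape)                  # zero-pad the designed filter to image size
    log_mag = np.log(np.abs(Y) + eps) - np.log(np.abs(H1) + eps)   # Eq. 7
    restored_spec = np.exp(log_mag) * np.exp(1j * np.angle(Y))     # keep original phase
    return np.real(ifft2(restored_spec))
```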
Mean squared prediction error is then computed to
obtain the restoration error:
E = Σn1 Σn2 [g(n1, n2) − ynew(n1, n2)]²,
(8)
where g(n1,n2) and ynew(n1,n2) are the original and restored
images, respectively. The energy of the original signal E1
is defined by
E1 = Σn1 Σn2 [g(n1, n2)]².
(9)
To evaluate the improvement in the restored image, we
combine Eqs. 8 and 9,
I = 20 log10 (E1/E).
(10)
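The error and improvement measures of Eqs. 8 through 10 translate directly into a few lines of code; the sketch below assumes NumPy arrays of equal shape for the original and restored images.

```python
# Sketch of the evaluation metrics: restoration error E (Eq. 8), signal
# energy E1 (Eq. 9), and the improvement figure I in dB (Eq. 10).
import numpy as np

def improvement_db(g, y_new):
    e = np.sum((g - y_new) ** 2)      # Eq. 8: summed squared prediction error
    e1 = np.sum(g ** 2)               # Eq. 9: energy of the original image
    return 20.0 * np.log10(e1 / e)    # Eq. 10
```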
Experimental Results
The performance of the proposed algorithm has been
investigated with three different types of blurred images.
The restoration results are presented in this section with a
200 × 200 pixel simulated child image, a real world degraded photographic image, and a real world satellite image. The proposed algorithm estimates the size and variance of the blur function from the blurred image and restores it.
Figure 2. (a) Blurred child image with variance 13.75 (left above); (b) edge map of (a) (right above); (c) image restored by a filter not estimated correctly from the degraded image (middle left); (d) edge map of (c) (middle right); (e) image resulting from the iterations (left below); (f) edge map of (e) (right below).
TABLE I. Mean Square Error Measurement in Blurred and Restored Child Image

Image            Estimated Variance      MSE in blurred image   MSE in restored image   Improvement (dB)
Child (Fig. 2)   5.5 (7 × 7 pix.)        1145.3                 4.811                   75.8
                 10.25 (13 × 13 pix.)    2400.0                 4.988                   75.5
                 13.75 (13 × 13 pix.)    3154.1                 6.463                   73.3
Figures 2(a) and 2(b) show the Gaussian blurred child
image and its edge map in which image details have been
lost. It is seen that not enough edge pixels are on the edge
map. Figures 2(c) and 2(d) show restoration results from
an incorrectly estimated filter parameter. So, the image is
still not very clear. Figures 2(e) and 2(f) present the restored
image and its edge map. The threshold operation prevents small noise from being detected as edge points. The performance of the restoration is shown by the improvement in the image quality in Table I.
The algorithm has also been applied to real life degraded
images and considerable improvement has been observed
in the resulting images. Figures 3(a) and 3(b) show an
original photograph image and its edge map. The image
was blurred by out of focus lenses. Degradation again has
a Gaussian distribution. The restoration result of the
image is depicted in Fig. 3(c) and its edge map in Fig. 3(d).
Table II also presents the improvement in image quality.
Figures 4(a), 5(a), and 6(a) show some real world satellite
images taken by the Hubble Space Telescope. These images
TABLE II. Restoration Results in the Real World Images

Images   Estimated Variance   Improvement   Image type
Fig. 3   5                    26.6          Photographic
Fig. 4   3                    28.7          Satellite
Fig. 5   7                    26.4          Satellite
Fig. 6   4                    30.15         Satellite
have been degraded by atmospheric turbulence. Thus,
some details on the images have been lost. The restored
images are shown in Figs. 4(c), 5(c), and 6(c), respectively.
Improvements in all of the satellite images are considerable as given in Table II.
Conclusion
This paper develops a new restoration algorithm for
unknown Gaussian degraded images. Variance and matrix
size of the convolutional Gaussian effect are estimated and
accordingly the blurred image is restored. Our experimental
results show that the proposed method performs effective
restoration for degraded images. If the original scene has
only been degraded by the blur function as in the case of
the simulated child image, the restoration result is
satisfactory. However, there are some unmeasured observation effects in real world images, and these effects cannot be controlled. Our filter model compensates for some of these unmeasured effects, but not all. So the restoration results and improvements for the real world images given in Table II do not reach the performance obtained for the simulated image.
Figure 3. (a) A real world photograph image (left above); (b) edge map of (a) (right above); (c) restored image from (a) (left below); (d) edge map of (c).
Figure 4. (a) A real world satellite image (left above); (b) edge map of (a) (right above); (c) restored image from (a) (left below); (d) edge map of (c).
Figure 5. (a) A real world satellite image (left above); (b) edge map of (a) (right above); (c) restored image from (a) (left below); (d) edge map of (c).
Figure 6. (a) A real world satellite image (left); (b) restored image from (a) (right).
The algorithm needs 125 iteration steps with 43 min
required on a 90-MHz Pentium computer. If the matrix
size is chosen properly, iteration steps can be reduced to
20 with very low computing time, which can be important
for real-time applications. Under these conditions,
restoration quality may decrease, however.
References
1. B. L. McGlamery, Restoration of turbulence-degraded images, J. Opt.
Soc. Amer. 57, 293 (1967).
2. G. M. Robbins and T. S. Huang, Inverse filtering for linear shift variant
imaging systems, Proc. IEEE 60, 862 (1972).
3. J. S. Lim, Two Dimensional Signal and Image Processing, Prentice Hall, 1990.
4. J. W. Goodman, Introduction to Fourier Optics, McGraw Hill, 1968.
5. S. C. Pohlig, New techniques for blind deconvolution, Opt. Eng. 20, 281
(1981).
6. D. E. Dudgeon, The computation of two dimensional Cepstra, IEEE Trans.
Acoust. Speech Signal Proc. ASSP-25, 476 (1977).
7. J. K. Lee, M. Kabrisky, M. E. Oxley, S. K. Rogers, and D. W. Ruck, The
complex Cepstrum applied to two-dimensional images, Patt. Recog. 26,
1579 (1993).
8. P. A. Petropulu and C. L. Nikias, The complex Cepstrum and Bicepstrum:
Analytic performance evaluation in the presence of Gaussian noise,
IEEE Tran. Acoust. Speech Signal Proc. 38, 1246 (1990).
9. O. Shunichiro, Restoration of images degraded by motion blur using
matrix operators, Int. J. Systems Sci. 22, 937 (1991).
10. M. M. Chang, A. M. Tekalp and A. T. Erdem, Blur identification using
bispectrum, IEEE Trans. Signal Proc. 39, 2323 (1991).
11. S. J. Reeves and R. M. Mersereau, Blur identification by the method of
generalized cross-validation, IEEE Trans. Image Proc. 1, 301 (1992).
12. V. Kayargadde and J. B. Martens, Estimation of edge parameters and
image blur using polynomial transforms, CVGIP 56, 442 (1994).
13. G. Demoment, Image reconstruction and restoration: Overview of common estimation structures and problems, IEEE Trans. Acoust. Speech
Signal Process. 37, 2024 (1989).
14. Z. Telatar, Adaptive restoration of blurred satellite images, in ISI 51st
Session of the International Statistical Institute, Book 2, 499, (1997).
15. Y. Ando, A. Hansuebsai and K. Khantong, Digital restoration of faded
color images by subjective method, J. Imag. Sci. Tech. 41, 259 (1997).
The Business Directory
Electronic Imaging
and
Color Reproduction
Business Directory Ad
Journal of Imaging
Science and Technology
Burt Saunders
Consultant
8384 Short Tract Rd.
Nunda, NY 14517
716-468-5013
IS&T Mbrs. U.S. $100: Non-Mbrs. U.S. $150
Contact: Pam Forness
TO THE NON-IMPACT
PRINTING INDUSTRY
Six Issues
IS&T, 7003 Kilworth Lane
Springfield, VA 22151
703-642-9090: FAX: 703-642-9094
E-mail: [email protected]
AUTOMATE
COLOR & DENSITY MEASUREMENTS
of Proofs and Calibration Sheets
X-Y scanning stages and software for measuring sheets up to 30′′ × 40′′ with the handheld instruments that you presently use to
make measurements manually
David L. Spooner, PE
rhoMetric Associates, Ltd.
Business: 978-448-5485
Home:
978-448-5583
Edgar B. Gutoff, Sc.D., P.E.
PHOTOTHERMOGRAPHY
Consulting Chemical Engineer
194 Clark Road, Brookline, MA 02146
Phone/Fax: 617-734-7081
John Winslow, Consultant
•13 years experience with Fortune 500
Company
•Specializing in Silver Organic/Silver
Halide Systems
• Coating Seminars
Co-authored Coating and Drying Defects (1995),
Modern Coating and Drying Technology (1992),
and The Application of SPC to Roll Products (1994).
WARREN SOLODAR
Consulting Chemist
consulting in:
• electrostatics • static problems
• electrographics • xerographics • dielectric materials
556 Lowell Rd.
Groton, MA 01450
• Consulting in slot, slide, curtain, and roll
coating; in coating die design, and in
drying technology.
• PATENT SEARCHES
• EXPERT WITNESS
• MATERIALS SOURCING
Please direct your inquiries
in confidence to:
ELECTROSTATIC CONSULTING ASSOCIATES
Telephones
(302) 754-9045• FAX (302) 764-5808
• INK JET INKS
• DYES
• PIGMENTS
PAUL J. MALINARIC
consulting engineer
Call to Inquire for Fax number
2918 N. Franklin Street
Wilmington, DE 19802-2933
• Drying Software
CONSULTING CHEMIST
480 Montgomery Avenue
Merion Station, PA 19066-1213
Phone or Fax: 610-664-5321
EPPING GmbH—
PES Laboratorium
R & D & E for Electrostatic Powder Physics
Measurement and Sale of Devices for
Charge, Conductivity and Magnetic
Parameters for Powders.
Contact:
Send inquiries to: John Winslow
Phototherm Consulting Ltd.
407 Orchard Lane
So. St. Paul, MN 55075
612-450-1650
Andreas J. Kuettner
Carl-Orff-Weg 7
85375 Neufahrn bei Freising
Germany
TEL +49-8165-960 35
FAX +49-8165-960 36
E-mail [email protected]
Positions Available and
Positions Wanted
Can now be found on the IS&T homepage
Please visit us at http://www.imaging.org
click on
EMPLOYMENT
OPPORTUNITIES
IS&T
1999 IS&T HONORS AND AWARDS NOMINATION
(It is important that the Committee have complete and accurate information when making selections.
Please complete the questionnaire thoroughly and accurately.)
Name_________________________________________________________________________________________
Employer___________________________________________Current Position______________________________
Home Address____________________________________________________Phone#________________________
Business Address__________________________________________________Phone#________________________
Recommended Honor/Award______________________________________________________________________
How does this person qualify for the recommended honor/award?__________________________________________
______________________________________________________________________________________________
______________________________________________________________________________________________
Place and date of birth_____________________________________________________________________________
Education (include year and school for degrees)_______________________________________________________
______________________________________________________________________________________________
______________________________________________________________________________________________
Brief employment history_________________________________________________________________________
______________________________________________________________________________________________
______________________________________________________________________________________________
IS&T member?______________________________________________________How long?__________________
Current or previous positions______________________________________________________________________
Previous IS&T honors/awards_____________________________________________________________________
Other organization memberships/honors_____________________________________________________________
______________________________________________________________________________________________
Please attach a bibliography. (Any additional information about the candidate may be submitted on an additional sheet.)
Send this form to: IS&T
7003 Kilworth Lane
Springfield, VA 22151
703/642-9090; Fax: 703/642-9094
E-mail: [email protected]
Nominated by
____________________________________________
Name
Date
____________________________________________
Address
____________________________________________
____________________________________________
Deadline: January 1, 1999
Voice
FAX
____________________________________________________________________________
E-MAIL
IS&T Honors and Awards—Call for Nominations
Each year IS&T Honors and Awards committee selects scientists, engineers, educators and students who have made outstanding contributions to the field of imaging. Your nominations are
needed. A list of prior recipients may be found on the Society’s web site—www.imaging.org.
Honorary Member
Honorary membership, the highest award bestowed by the Society, recognizes outstanding contributions to
the advancement of imaging science or engineering.
Edwin H. Land Medal
The Edwin H. Land Medal is endowed by Polaroid Corporation and awarded in alternate years by IS&T and
the Optical Society of America. The award recognizes an individual who has demonstrated, from a base of
scientific knowledge, pioneering entrepreneurial creativity that has had major public impact.
Chester F. Carlson Award
The Chester F. Carlson Award, sponsored by Xerox Corporation, Webster Research Center, was awarded for
the first time in 1985. The award has been established to recognize outstanding work in the science or technology
of electrophotography.
Lieven Gevaert Medal
The Lieven-Gevaert Award, sponsored by Bayer Corporation/Agfa Division, recognizes outstanding
contributions in the field of silver halide photography.
Kosar Memorial Award
The Kosar Memorial Award, sponsored by the Tri-State Chapter, recognizes contributions in the area of
unconventional imaging.
Raymond C. Bowman Award
The Raymond C. Bowman Award is sponsored by the Tri-State Chapter. The award is given in recognition of
an individual who has been instrumental in fostering, encouraging, helping, and otherwise facilitating
individuals, either young or adult, in the pursuit of a career, beginning with an appropriate education, in the
technical-scientific aspects of photography or imaging science.
Fellowship
Fellowship is awarded to a Regular Member for outstanding achievement in imaging science or engineering.
Senior Membership
Senior Membership is awarded for long term service to the Society at the national level.
Journal Award (Science)
The Journal Award recognizes an outstanding contribution in the area of basic science, published in the
Journal of Imaging Science and Technology during the preceding year.
Charles E. Ives Award (Engineering)
The Charles E. Ives Award, sponsored by IS&T’s Rochester Chapter, is given in recognition of an outstanding
contribution published originally in the Journal of Imaging Science and Technology during the preceding
calendar year. The publication should be in the general area of applied science or engineering, concerned with
the successful application of scientific and engineering principles to an imaging problem or with a technical
problem solved with imaging technology.
Itek Award
The Itek Award is for an outstanding original student publication in the field of imaging science and engineering.
Service Award
The Service Award is given in recognition of service to a Chapter, or to the Society.
IS&T
Recent Progress
Series
Keeping up with the latest technical information is a task that
becomes increasingly difficult.
This is not only caused by the
large amount of information, but
also by its dispersed distribution
at a variety of conferences. The
“Recent Progress” series collects, through the eyes of the Society for Imaging Science and
Technology, technical information from several conferences
and publications into a concise
treatise of a subject. This series
allows the professional to stay
up-to-date and to find the relevant data in the covered field
quickly and efficiently.
Now Available
• Recent Progress in Color
Management and Communications 1998 (Mbr. $65;
Non-Mbr. $75)
• Recent Progress in Color
Science 1997 (Mbr. $65;
Non-Mbr. $75)
• Recent Progress in Toner
Technology 1997 (Mbr. $65;
Non-Mbr. $75)
• Recent Progress in Ink-Jet
Technologies 1996 (Mbr. $55;
Non-Mbr. $65)
• Recent Progress in Digital
Halftoning 1995 (Mbr. $55;
Non-Mbr. $65)
Plus shipping and handling:
$4.50 U.S.;
$8.50 outside the U.S.A.
Contact IS&T
to order Today!
Phone: 703-642-9090
Fax: 703-642-9094
E-mail: [email protected]
www.imaging.org
IS&T’s NIP14:
International
Conference on
Digital
Printing
Technologies
October 18–23, 1998
The Westin Harbour Castle Hotel
Toronto, Ontario, Canada
General Chair: Dr. David Dreyfuss
Lexmark International, Inc.
Come and meet with us in Toronto for IS&T's NIP14: International Conference on Digital Printing Technologies. Over
the years, the NIP Conferences have emerged as the preeminent forum for discussion of advances and directions in the
field of non-impact and digital printing technologies. A comprehensive program of more than 170 contributed papers
from leading scientists and engineers is planned, along with daily keynote addresses, an extensive program of tutorials, a print gallery, and an exhibition of digital printing products, components, materials, and equipment. Following the presentations each day, the authors will be available for one-on-one discussions. The proposed program topics are:
• Electrostatic Marking
Processes
• Electrostatic Marking
Materials
• Photoreceptors
• Ink-Jet Processes
• Ink-Jet Materials
• Media for Digital
Printing
• Print and Image Quality
• Color/Science/Image
Processing
• Advanced and Novel
Printing
• Desktop, Commercial and
Industrial InkJet Printing
• Thermal Printing
• Liquid Toner Processes
and Materials
• Electrography and
Magnetography
• Textile and Fabric
Printing
• Production Digital
Printing
• Quality Control
Instrumentation
• Wide and Grand Format
Printing
• Enabling Technologies
behind Recent Major
Product Announcements
The Society for Imaging Science and Technology
7003 Kilworth Ln., Springfield, VA 22151
703-642-9090; FAX: 703-642-9094;
E-Mail: [email protected]
Preliminary Program now available
September 7-11, 1998 • University of Antwerp (UIA) Belgium
Secretariat: Jan De Roeck c/o Agfa-Gevaert N.V., Septestraat 27, B-2640
Mortsel, Belgium. Tel: +32(1)3 444.88.78; Fax: +32(1)3 444.88.71; e-mail:
icps.be; http://www.icps.be
Conference sessions in three tracks:
I.
Nanostructured Materials for Imaging
Symposium on Advanced Characterization Techniques for Nanostructured Materials
Symposium on Environmental Issues for Imaging Systems
II. Printing and Non-AgX Imaging Systems
Symposium on Textile Printing and Related Industrial Applications
III. Electronic Imaging
Symposium on Information Technology: A new Century for Medical Imaging
Short Courses:
Digital Image Preservation
Ink Jet Printing: The Basics of Photorealistic Quality
Colour in Multimedia
Comparing the Imaging Physics of Film and CCD Sensors
Electronic Imaging Systems Fundamentals
Modern Display Technologies
Questions and Answers to Colour
Image Processing in Computed Radiography
Franziska Frey
Annette Jaffe
Lindsay MacDonald
Michael Kriss
Nitin Sampat
Patrick Vandenberghe
Jean-Pierre Van de Capelle
Piet Vuylsteke
IS&T—The Society for Imaging Science and Technology
7003 Kilworth Lane, Springfield, VA 22151
President
ROBERT GRUBER, Xerox Corporation, 800 Phillips Road, W114-40D, Webster, NY 14580
Voice: 716-422-5611
FAX: 716-422-6039
e-mail: [email protected]
Executive Vice President
JOHN D. MEYER, Hewlett Packard Laboratories, 1501 Page Mill Rd., 2U-19, P.O. Box 10490, Palo Alto, CA 94304
Voice: 650-857-2580
FAX: 650-857-4320
e-mail: [email protected]
Publications Vice President
REINER ESCHBACH, Xerox Corporation, 800 Phillips Road, 0128-27E, Webster, New York 14580
Voice: 716-422-3261
FAX: 716-422-6117
e-mail: [email protected]
Conference Vice President
WAYNE JAEGER, Tektronix, M/S 61-IRD, 26600 S. W. Parkway, Wilsonville, OR 97070-1000
Voice: 503-685-3281
FAX: 503-685-4366
e-mail: [email protected]
Vice Presidents
JAMES KING, Adobe Systems Inc., 345 Park Ave., MS: W14, San Jose, CA 95110-2704
Voice: 408-536-4944
FAX: 408-536-6000
e-mail: [email protected]
JAMES R. MILCH, Eastman Kodak Company Research Labs, Bldg. 65, 1700 Dewey Ave., Rochester, New York 14650
Voice: 716-588-9400
FAX: 716-588-3269
e-mail: [email protected]
W. E. NELSON, Texas Instruments, P. O. Box 655474, MS 63, Dallas, TX 75265
Voice: 972-575-0270
FAX: 972-575-0090
e-mail: [email protected]
SHIN OHNO, Sony Corporation, Business & Professional Systems Co., 4-14-1 Okata, Atsugi 243, Kanagawa 243-0021, Japan
Voice: 81-462-27-2373
FAX: 81-462-27-2374
e-mail: [email protected]
MELVILLE R. V. SAHYUN, Department of Chemistry, University of Wisconsin, Eau Claire, WI 54702
Voice: 715-836-4175
FAX: 715-836-4979
e-mail: [email protected]
DEREK WILSON, Coates Electrographics, Ltd., Norton Hill, Midsomer Norton, Bath, BA3 4RT, England
Voice: 44-1761-408545
FAX: 44-1761-418544
e-mail: [email protected]
Secretary
BERNICE ROGOWITZ, IBM Corp., T. J. Watson Research, P. O. Box 704, M/S H2-B62, Yorktown Heights, NY 10598-0218
Voice: 914-784-7954
FAX: 914-784-6245
e-mail: [email protected]
Treasurer
GEORGE MARSHALL, Lexmark International, Inc., 6555 Monarch Rd., Dept. 57R/031A, Boulder, CO 80301
Voice: 303-581-5052
FAX: 303-581-5097
e-mail: [email protected]
Immediate Past President
JAMES OWENS, Eastman Kodak Company, Research Labs., M.C. 01822, Rochester, NY 14650
Voice: 716-477-7603
FAX: 716-477-0736
e-mail: [email protected]
Executive Director
CALVA LEONARD, IS&T, 7003 Kilworth Lane, Springfield, VA 22151
Voice: 703-642-9090
FAX: 703-642-9094
e-mail: [email protected]
Chapter Officers
BINGHAMTON, NEW YORK (BI)
Bruce Resnick, Director
Albert Levit, President
Christopher Turock, Secretary
ROCHESTER, NEW YORK (RO)
Joanne Weber, Director
Dennis Abramsohn, President
Joanne Weber, Secretary
TRI-STATE
James Chung, Director
Frederic Grevin, President
Robert Uzenoff, Secretary
BOSTON, MASSACHUSETTS (BO)
Lynne Champion, Director
Jeffrey Seideman, President
Jim Boyack, Secretary
ROCHESTER INSTITUTE OF
TECHNOLOGY (RT)
Faculty Advisors
Zoran Ninkov and Jonathan Arney
TWIN CITIES, MINNESOTA (TC)
Stan Busman, Director
Jeanne Haubrich, President
Susan K. Yarmey, Secretary
EUROPE (EU)
Hans Jörg Metz,
Director/President
Open, Secretary
RUSSIA (RU)
Michael V. Alfimov,
Director/President
T. Slavnova, Secretary
WASHINGTON, DC (WA)
Joseph Kitrosser, Director
Open, President
Open, Secretary
KOREA (KO)
J.-H. Kim, Director
Young S. You, President
TOKYO, JAPAN (JA)
Yoichi Miyake, Director
Tadaaki Tani, President
Takashi Kitamura, Secretary
IS&T Corporate Members
The Corporate Members of your Society provide a significant amount of financial support that assists IS&T in disseminating information and
providing professional services to imaging scientists and engineers. In turn, the Society provides a number of material benefits to its Corporate
Members. For complete information on the Corporate Membership program, contact IS&T, 7003 Kilworth Lane, Springfield, VA 22151.
Sustaining Corporate Members
Applied Science Fiction
8920 Business Park Drive
Austin, TX 78759
Imation Corporation
1 Imation Place
Oakdale, MN 55128-3414
Tektronix, Inc.
P.O. Box 4675
Beaverton, OR 97076-4675
Eastman Kodak Company
343 State Street
Rochester, NY 14650
Lexmark International, Inc.
740 New Circle Road NW
Lexington, KY 40511
Xerox Corporation
Webster Research Center
Webster, NY 14580
Hewlett Packard Labs.
1501 Page Mill Road
Palo Alto, CA 94304
Polaroid Corporation
P.O. Box 150
Cambridge, MA 02139
Supporting Corporate Members
Konica Corporation
No. 1 Sakura-machi
Hino-shi, Tokyo 191 Japan
Kodak Polychrome Graphics
401 Merritt 7
Norwalk, CT 06851
Xeikon N.V.
Vredebaan 71
2640 Mortsel, Belgium
Donor Corporate Members
Agfa Division Bayer Corp.
100 Challenger Road
Ridgefield Park, NJ 07760
BASF Corporation
100 Cherry Hill Road
Parsippany, NJ 07054
Canon, Inc.
Shimomaruko 3-30-2
Ohta-ku, Tokyo 146 Japan
Clariant GmbH
Division Pigments & Additives
65926 Frankfurt am Main Germany
Delphax Systems
Canton Technology Center
5 Campanelli Circle
Canton, MA 02021
Felix Schoeller Jr. GmbH & Co. KG
Postfach 3667
49026 Osnabruck, Germany
Fuji Photo Film USA, Inc.
555 Taxter Road
Elmsford, NY 10523
Fuji Xerox Company Ltd.
3-5 Akasaka, 3-chome
Minato-ku, Tokyo 107 Japan
Hallmark Cards, Inc.
Chemistry R & D
2501 McGee, #359
Kansas City, MO 64141-6580
Hitachi Koki Co., Ltd.
1060 Takeda, Hitachinaka-City
Ibaraki- Pref 312 Japan
KDY Inc.
9 Townsend West
Nashua, NH 03063
Ilford Photo Corporation
West 70 Century Road
Paramus, NJ 07653
Kind & Knox Gelatin, Inc.
P.O. Box 927
Sioux City, IA 51102
Minolta Co., Ltd.
1-2, Sakuramachi
Takatsaki, Osaka 569 Japan
Mitsubishi Electric
5-1-1 Ofuna, Kamakura
Kanagawa 247 Japan
Nitta Gelatin NA Inc.
201 W. Passaic Street
Rochelle Park, NJ 07662
Questra Consulting
300 Linden Oaks
Rochester, NY 14625
Research Laboratories of Australia
7, Valetta Road, Kidman Park
S. Australia, 5025, Australia
Ricoh Company Ltd.
15-5, Minami-Aoyama
1-chome, Minato-ku, Tokyo 107 Japan
SKW Biosystems, Inc.
2021 Cabot Boulevard West
Langhorne, PA 19047
Sharp Corporation
492 Minosho-cho
Yamatokoriyama 639-11 Japan
Sony Corporation
6-7-35 Kita-shinagawa
Shinagawa, Tokyo 141 Japan
Sony Electronic Photography &
Printing
3 Paragon Drive
Montvale, NJ 07645
Trebla Chemical Company
8417 Chapin Ind. Drive
St. Louis, MO 63114
XMX Corporation
46 Manning Road
Billerica, MA 01821-3944