automatic generation of tactile graphics

AUTOMATIC GENERATION OF TACTILE GRAPHICS
by
Thomas P. Way
A thesis submitted to the Faculty of the University of Delaware in partial
fulfillment of the requirements for the degree of Master of Science in Computer and
Information Sciences
Fall 1996
© 1996 Thomas P. Way
All Rights Reserved
AUTOMATIC GENERATION OF TACTILE GRAPHICS
by
Thomas P. Way
Approved: Kenneth E. Barner, Ph.D.
Professor in charge of thesis on behalf of the Advisory Committee
Approved: Errol L. Lloyd, Ph.D.
Professor in charge of thesis on behalf of the Advisory Committee
Approved: Errol L. Lloyd, Ph.D.
Chairman of the Department of Computer and Information Sciences
Approved: John C. Cavanaugh, Ph.D.
Interim Associate Provost for Graduate Studies
ACKNOWLEDGMENTS
Work on this project was performed at the University of Delaware's Applied Science and Engineering Laboratories, operated jointly with and located at
the Alfred I. duPont Institute, in Wilmington, Delaware. As part of the "Science,
Engineering and Mathematics" project, funding was provided by the National Science Foundation, grant number HRD-9450019. Additional funding was provided by
the Nemours Research Programs.
I thank Dr. Barner and Dr. Richard Foulds for their wisdom and insight,
and for providing me with the opportunity to perform graduate work at the Applied Science and Engineering Laboratories. Thanks to Dr. Lloyd for his generous
support and encouragement, particularly as the deadline approached. Heartfelt appreciation goes to Dr. Lori Pollock for her guidance and moral support. Special
thanks is extended to my colleagues in the SEM project and at the Applied Science
and Engineering Laboratories for their suggestions, criticism and praise. Sincere
appreciation is extended to the fourteen brave souls who gave of their time to serve
as "human lab rats" in the two incarnations of my experiments.
This thesis and the hours of research and intense study it represents would
not have been possible without the encouragement and financial support of my
parents, Stan and Laurie Way, my grandmothers, Ellen B. Way and Margaret M.
Pelland, and my mother-in-law, Mary B. Larsen. The caring and support of my
family, from my brother John, sisters Melinda and Julie, and brother-in-law Ray, to
my extended family that is sprinkled about the country in California, Connecticut,
Illinois, Maine, Maryland, Michigan, Virginia, and Washington, has been a great
source of strength throughout this endeavor. Thanks also to my cats Bess, George,
Harriet and Onessa for always making sure that my clothing was coated with plenty
of cat hair at the start of each day.
Thanks to my daughter Emma for reminding me daily that there are things
far more important than research, reading lists and report cards. Finally, a special
thank you is long overdue to my wife Laura, for her love, understanding and strength
in the face of this unpredictable and arduous journey through graduate school. I
appreciate it more than you can know.
TABLE OF CONTENTS
LIST OF FIGURES : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : x
LIST OF TABLES : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : xiv
ABSTRACT : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : xvi
Chapter
1 INTRODUCTION : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 1
2 BACKGROUND : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 10
2.1 The Human Sensory System : : : : : : : : : : : : : : : : : : : : : : : 11
2.1.1 Information and Sense : : : : : : : : : : : : : : : : : : : : : : 11
2.1.2 Bandwidth Comparison : : : : : : : : : : : : : : : : : : : : : 11
2.2 Tactual Perception : : : : : : : : : : : : : : : : : : : : : : : : : : : : 13
2.2.1 Cutaneous Sensing : : : : : : : : : : : : : : : : : : : : : : 13
2.2.2 Spatial Sensing : : : : : : : : : : : : : : : : : : : : : : : : 14
2.2.3 Tactile Pattern Perception : : : : : : : : : : : : : : : : : : 15
2.2.4 Aiding Comprehension : : : : : : : : : : : : : : : : : : : : : 16
2.3 The Blind Population : : : : : : : : : : : : : : : : : : : : : : : : : : : 18
2.3.1 Definition of Terms : : : : : : : : : : : : : : : : : : : : : : : : 18
2.3.2 Misconceptions : : : : : : : : : : : : : : : : : : : : : : : : : : 18
2.3.3 The Blind Computer User : : : : : : : : : : : : : : : : : : : : 19
2.4 Access Technology for Blind Computer Users : : : : : : : : : : : : : : 20
2.4.1 Static Tactile Graphics : : : : : : : : : : : : : : : : : : : : : : 21
2.4.1.1 Raised-line drawing boards : : : : : : : : : : : : : : 23
2.4.1.2 Tactile-experience pictures : : : : : : : : : : : : : : 23
2.4.1.3 Buildup displays : : : : : : : : : : : : : : : : : : : 23
2.4.1.4 Embossed paper displays : : : : : : : : : : : : : : : : 23
2.4.1.5 Braille graphics : : : : : : : : : : : : : : : : : : : 24
2.4.1.6 Vacuum-forming method : : : : : : : : : : : : : : : : : 24
2.4.1.7 Microcapsule paper : : : : : : : : : : : : : : : : : : 25
2.4.1.8 Other methods : : : : : : : : : : : : : : : : : : : : : 28
2.4.1.9 Summary : : : : : : : : : : : : : : : : : : : : : : : : 28
2.4.2 Auditory Interfaces : : : : : : : : : : : : : : : : : : : : : : 28
2.4.3 Dynamic Tactile Interfaces : : : : : : : : : : : : : : : : : : 30
2.4.4 Haptic Interfaces : : : : : : : : : : : : : : : : : : : : : : : 33
2.4.5 Dynamic Tactile Display Research : : : : : : : : : : : : : : : 34
2.4.6 Moving Toward Effective Tactile Display of Graphics : : : : : 35
2.5 Representation of Images : : : : : : : : : : : : : : : : : : : : : : : : : 36
2.5.1 Quantization : : : : : : : : : : : : : : : : : : : : : : : : : : : 36
2.5.2 Computerized Representation : : : : : : : : : : : : : : : : : : 37
2.6 Image Processing : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 37
2.6.1 Applicability to Tactual Perception and TACTICS : : : : : : 39
3 TACTICS: TACTILE IMAGE CREATION SYSTEM : : : : : : : 40
3.1 Automatic Generation of Tactile Graphics : : : : : : : : : : : : : : : 40
3.2 Genesis of TACTICS : : : : : : : : : : : : : : : : : : : : : : : : : : : 41
3.3 Image Processing Algorithms : : : : : : : : : : : : : : : : : : : : : : 42
3.3.1 Notation : : : : : : : : : : : : : : : : : : : : : : : : : : : 42
3.3.2 Edge Detection : : : : : : : : : : : : : : : : : : : : : : : : 44
3.3.3 Blurring : : : : : : : : : : : : : : : : : : : : : : : : : : : 46
3.3.4 Segmentation : : : : : : : : : : : : : : : : : : : : : : : : : 47
3.3.5 Negation : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 50
3.3.6 Median Filtering : : : : : : : : : : : : : : : : : : : : : : : : : 51
3.4 Image Processing Tools : : : : : : : : : : : : : : : : : : : : : : : : : : 52
3.5 Tactile Imaging : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 52
3.5.1 Description : : : : : : : : : : : : : : : : : : : : : : : : : : : : 52
3.5.2 Development : : : : : : : : : : : : : : : : : : : : : : : : : : : 53
3.5.3 Sequencing of Algorithms : : : : : : : : : : : : : : : : : : : : 53
3.6 Tactile Output : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 54
3.6.1 Microcapsule Paper : : : : : : : : : : : : : : : : : : : : : : : : 54
3.6.2 Tactile Image Enhancer : : : : : : : : : : : : : : : : : : : : : 55
3.6.3 Additional Equipment : : : : : : : : : : : : : : : : : : : : : : 56
3.7 Experimental Procedure for Tactile Image Creation : : : : : : : : : : 56
3.7.1 Acquisition of Images : : : : : : : : : : : : : : : : : : : : : : : 56
3.7.2 Simplification : : : : : : : : : : : : : : : : : : : : : : : : : : : 57
3.7.3 Tactilization : : : : : : : : : : : : : : : : : : : : : : : : : : : : 57
4 EVALUATION OF TACTICS : : : : : : : : : : : : : : : : : : : : : : 58
4.1 Overview of Experimental Protocol : : : : : : : : : : : : : : : : : : : 58
4.1.1 Selection of Subjects : : : : : : : : : : : : : : : : : : : : 59
4.1.2 Production of Materials : : : : : : : : : : : : : : : : : : : 59
4.1.3 Aggregate Image Processes : : : : : : : : : : : : : : : : : : 60
4.1.4 Psychophysics and Experimental Procedure Justification : : : 67
4.1.4.1 Detection : : : : : : : : : : : : : : : : : : : : : : : 67
4.1.4.2 Discrimination : : : : : : : : : : : : : : : : : : : : 67
4.1.4.3 Identification : : : : : : : : : : : : : : : : : : : : 68
4.1.4.4 Comprehension : : : : : : : : : : : : : : : : : : : : : 68
4.2 Experiments : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 69
4.2.1 Pilot Study : : : : : : : : : : : : : : : : : : : : : : : : : : : : 69
4.2.1.1 Subjects : : : : : : : : : : : : : : : : : : : : : : : : : 70
4.2.1.2 Materials : : : : : : : : : : : : : : : : : : : : : : : 70
4.2.1.3 Procedure : : : : : : : : : : : : : : : : : : : : : : : 71
4.2.1.4 Results : : : : : : : : : : : : : : : : : : : : : : : : 71
4.2.1.5 Discussion of pilot study : : : : : : : : : : : : : : : 72
4.2.2 Simple Discrimination Experiment : : : : : : : : : : : : : : : 74
4.2.2.1 Subjects : : : : : : : : : : : : : : : : : : : : : : : 74
4.2.2.2 Materials : : : : : : : : : : : : : : : : : : : : : : : 75
4.2.2.3 Procedure : : : : : : : : : : : : : : : : : : : : : : : 75
4.2.2.4 Results : : : : : : : : : : : : : : : : : : : : : : : : 76
4.2.3 Timed Discrimination Experiment : : : : : : : : : : : : : : : : 78
4.2.3.1 Subjects : : : : : : : : : : : : : : : : : : : : : : : 78
4.2.3.2 Materials : : : : : : : : : : : : : : : : : : : : : : : 79
4.2.3.3 Procedure : : : : : : : : : : : : : : : : : : : : : : : 79
4.2.3.4 Results : : : : : : : : : : : : : : : : : : : : : : : : 79
4.2.3.5 Comparison with simple discrimination : : : : : : : : : 81
4.2.4 Identification Experiment : : : : : : : : : : : : : : : : : : : : 81
4.2.4.1 Subjects : : : : : : : : : : : : : : : : : : : : : : : 82
4.2.4.2 Materials : : : : : : : : : : : : : : : : : : : : : : : 82
4.2.4.3 Procedure : : : : : : : : : : : : : : : : : : : : : : : 82
4.2.4.4 Results : : : : : : : : : : : : : : : : : : : : : : : : 83
4.2.5 Comprehension Experiment : : : : : : : : : : : : : : : : : : : 84
4.2.5.1 Subjects : : : : : : : : : : : : : : : : : : : : : : : 84
4.2.5.2 Materials : : : : : : : : : : : : : : : : : : : : : : : 85
4.2.5.3 Procedure : : : : : : : : : : : : : : : : : : : : : : : 85
4.2.5.4 Results : : : : : : : : : : : : : : : : : : : : : : : : 86
4.2.6 Significance of Results : : : : : : : : : : : : : : : : : : : : : : 87
5 OBSERVATIONS, DISCUSSION AND CONCLUSIONS : : : : : 89
5.1 Observations : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 89
5.2 Discussion : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 92
5.3 Conclusions : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 95
6 FUTURE DIRECTIONS : : : : : : : : : : : : : : : : : : : : : : : : : : 97
6.1 Development of End User Application : : : : : : : : : : : : : : : 97
6.2 Extension to Refreshable Tactile Display : : : : : : : : : : : : : 98
6.3 Multimodal Interface : : : : : : : : : : : : : : : : : : : : : : : 98
6.4 Mapping Color to Texture : : : : : : : : : : : : : : : : : : : : : 99
BIBLIOGRAPHY : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 100
Appendix
A LISTING OF IMAGES : : : : : : : : : : : : : : : : : : : : : : : : : : : 108
A.1 Pilot Study Images : : : : : : : : : : : : : : : : : : : : : : : : : : : : 108
A.2 TACTICS Evaluation Images : : : : : : : : : : : : : : : : : : : : : : 108
B SIMPLE AND TIMED DISCRIMINATION IMAGE PAIRINGS 110
B.1 Preparation : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 110
B.2 Flexi-Paper Pairs : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 110
B.3 Matsumoto Kosan Paper Pairs : : : : : : : : : : : : : : : : : : : : : : 111
C IDENTIFICATION EXPERIMENT IMAGES AND CATEGORIES : : : : : : : : : : 112
C.1 Preparation : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 112
C.2 Listing of Images and Categories : : : : : : : : : : : : : : : : : : : : 112
D COMPREHENSION EXPERIMENT IMAGES, DESCRIPTIONS AND QUESTIONS : : : : 114
D.1 Preparation : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 114
D.2 Listing of Images, Descriptions and Questions : : : : : : : : : : : : : 114
E COLLECTED TACTICS PARAMETERS : : : : : : : : : : : : : : : 118
F HUMAN SUBJECTS REVIEW BOARD EXEMPTION : : : : : 119
G TACTILE IMAGE EXAMPLES : : : : : : : : : : : : : : : : : : : : : 120
LIST OF FIGURES
1.1 Astronaut Edwin E. Aldrin, Jr. poses beside a deployed U.S. flag on
the surface of the moon. (NASA) : : : : : : : : : : : : : : : : : : : 2
1.2 First ever electron micrograph of Ebola Zaire virus, taken by Dr. F.
A. Murphy at the Centers for Disease Control in 1976. Diagnostic
specimen in cell culture at 160,000X magnification. (CDC) : : : : : 2
1.3 Figure 1.1 after image was processed using TACTICS. : : : : : : : 9
1.4 Figure 1.2 after image was processed using TACTICS. See
Appendix G for samples of expanded tactile images. : : : : : : : : : 9
2.1 Microcapsule paper (enlarged view) showing layer of polystyrene
microcapsules on polyethylene or paper transport medium. : : : : : 25
2.2 Microcapsule paper after image is affixed to the surface by
photocopying or ink drawing. : : : : : : : : : : : : : : : : : : : : : 27
2.3 Simplified view of the Tactile Image Enhancer, showing internal
workings of the device for expanding previously exposed
microcapsule paper. : : : : : : : : : : : : : : : : : : : : : : : : : : 27
2.4 Microcapsule paper after exposure in image enhancer, showing
expanded capsules. Note that capsules may not expand fully when
only partially covered by printing, although this degree of expansion
is unpredictable. : : : : : : : : : : : : : : : : : : : : : : : : : : : : 27
2.5 Telesensory's Optacon II in action [83]. User places index finger of
one hand on vibrotactile pin array and guides scanner across
material to be viewed with other hand. (Telesensory) : : : : : : : : 30
2.6 Layout of the vibrotactile pin matrix display of the Optacon. : : : : 31
2.7 Active pin matrix display of the Optacon, demonstrating display of
the capital letter S. : : : : : : : : : : : : : : : : : : : : : : : : : : : 32
2.8 Tactile Vision Substitution System (TVSS) [98]. : : : : : : : : : : : 33
3.1 Format of two-dimensional image. : : : : : : : : : : : : : : : : : : : 42
3.2 Before and after Sobel edge detection algorithm. (public domain) : 44
3.3 Image before and after application of blurring algorithm. : : : : : : 46
3.4 Image before and after application of K-means segmentation
algorithm, with K = 2. : : : : : : : : : : : : : : : : : : : : : : : : : 47
3.5 Image before and after application of an adaptive K-means
segmentation. : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 49
3.6 Image before and after application of negation algorithm. : : : : : : 50
3.7 A noisy processed image before and after the application of median
filtering. : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 51
3.8 Tactile Image Enhancer. (Repro-Tronics) : : : : : : : : : : : : : : : 55
4.1 Original unprocessed grayscale image of the chimney end of a house.
(public domain) : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 61
4.2 Image of house before and after processing using Sobel edge
operator with thresholding. : : : : : : : : : : : : : : : : : : : : : : 61
4.3 Image of house before and after processing using Sobel edge
operator without thresholding. : : : : : : : : : : : : : : : : : : : : : 61
4.4 Image of house before and after processing using K-means adaptive
segmentation algorithm. : : : : : : : : : : : : : : : : : : : : : : : : 63
4.5 Image of house before and after processing using Sobel edge
operator without thresholding followed by K-means segmentation. : 63
4.6 Comparison of effect of Sobel edge detection using fixed thresholding
from Figure 4.2 (left) with Sobel edge detection utilizing adaptive
K-means segmentation (for thresholding) from Figure 4.5 (right). : 63
4.7 Image of house before and after processing using K-means
segmentation followed by Sobel edge detection. : : : : : : : : : : : 65
4.8 Images of a face demonstrating the difference between two sequences
of processing. From left to right: Original image, image after Sobel
edge detection without thresholding followed by K-means
segmentation, and image after K-means segmentation followed by
Sobel edge detection. (US Govt) : : : : : : : : : : : : : : : : : : : 65
4.9 Image of house before and after processing using the aggregate
sequence of processes: blurring, Sobel edge detection without
thresholding, K-means segmentation and median filtering. : : : : : 66
4.10 Comparison of image of house using the aggregate process from
Figure 4.9 (left) and the same aggregate sequence of processes with
the exception of the initial blurring step (right). : : : : : : : : : : : 66
G.1 Electron micrograph of Ebola Zaire virus before and after processing
with TACTICS. (CDC) : : : : : : : : : : : : : : : : : : : : : : : : 120
G.2 Figure G.1 expanded on microcapsule paper. : : : : : : : : : : : : : 121
G.3 Image of space shuttle Challenger landing before and after
processing with TACTICS. (NASA) : : : : : : : : : : : : : : : : : : 122
G.4 Figure G.3 expanded on microcapsule paper. : : : : : : : : : : : : : 123
G.5 Image of moon before and after processing with TACTICS. (NASA) 124
G.6 Figure G.5 expanded on microcapsule paper. : : : : : : : : : : : : : 125
G.7 Image of a face before and after processing with TACTICS. (US
Govt) : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 126
G.8 Figure G.7 expanded on microcapsule paper. : : : : : : : : : : : : : 127
G.9 Image of a desktop computer before and after processing with
TACTICS. (public domain) : : : : : : : : : : : : : : : : : : : : : : 128
G.10 Figure G.9 expanded on microcapsule paper. : : : : : : : : : : : : : 129
G.11 Image of a tornado in Oklahoma before and after processing with
TACTICS. (public domain) : : : : : : : : : : : : : : : : : : : : : : 130
G.12 Figure G.11 expanded on microcapsule paper. : : : : : : : : : : : : 131
G.13 Image of Emma before and after processing with TACTICS.
(personal) : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 132
G.14 Figure G.13 expanded on microcapsule paper. : : : : : : : : : : : : 133
LIST OF TABLES
2.1 Summary of information bandwidth limitations for three senses [45]. 12
2.2 Reading modes used by a group of 7,987 totally blind students. : : 20
4.1 Summary of per subject average results of the tactile image
matching task for five image processes [94]. : : : : : : : : : : : : : 72
4.2 Summary of overall results of simple discrimination task for four
image processes. The Aggregate Process is comprised of blurring,
Sobel edge detection without thresholding, K-means adaptive
segmentation, and median filtering, applied in that order. : : : : : 77
4.3 Summary of percentage of correct responses comparing effects of two
varieties of microcapsule paper on simple discrimination task. : : : 77
4.4 Summary of percentage of correct responses comparing results of
blind versus sighted subjects performing simple discrimination task. 78
4.5 Summary of overall results of timed discrimination task for four
image processes. : : : : : : : : : : : : : : : : : : : : : : : : : : : : 80
4.6 Summary of percentage of correct responses comparing effects of two
varieties of microcapsule paper on timed discrimination task. : : : : 80
4.7 Summary of percentage of correct responses comparing results of
blind versus sighted subjects performing timed discrimination task. 80
4.8 Summary of percentage of correct responses comparing results of all
subjects on simple discrimination versus timed discrimination tasks. 81
4.9 Summary of overall results of identification task for four image
processes. : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 83
4.10 Summary of percentage of correct responses comparing results of
blind versus sighted subjects performing identification task. : : : : 83
4.11 Summary of results of comprehension task for three subtasks and
overall comprehensibility of tactile images prepared using Aggregate
process. : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 87
4.12 Summary of percentage of correct responses comparing results of
blind versus sighted subjects performing comprehension task. : : : 87
E.1 Summary of parameters relevant to TACTICS and tactile image
perception. : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 118
ABSTRACT
Access to visual information by blind and visually impaired persons is often
achieved through its manual translation into tactile form. This conversion is a time-consuming effort involving the use of glue, string, scissors, cardboard and other craft
materials, tracing paper and marking pens, or computer-aided drawing packages, to
produce a tangible representation of the original image. Although worthwhile, such
an approach is neither timely nor easily reproducible, and clearly necessitates the
involvement of a specially skilled sighted individual in the process.
Computers excel at displaying information via multiple media, including the
CDROM and ubiquitous Internet. This omnipresence of the computer in everyday
life provides ready availability to a myriad of graphical, textual and auditory information for sighted and blind individuals alike. For blind computer users, text-based
information is output as synthesized speech or as braille via a special purpose printer
or display. The surging prevalence of the graphical user interface (GUI), however,
introduces severe impediments for the blind community, resulting from a forsaking
of the textual in favor of the visual. This trend toward visual display techniques
means that, in the midst of the dawning "Information Age," a blind person has
reduced access to information. Pictures, drawings, video and animation are not
directly accessible to the blind computer user.
This thesis develops a composite software/hardware system for automatic
translation of electronic images into tactile form. In this system, an aggregate
process comprised of a sequence of image processing algorithms is applied to an
image to produce a simplified version of the original. This caricaturized image is
subsequently output in a raised tactile graphic form on microcapsule paper, suitable
for display to a blind person.
To motivate the techniques used in this system, topics in human perception,
tactile graphics production and image processing are explored. To provide access to
visual information for blind persons, an understanding of how we as humans interface
with the world around us and how tactile graphics are produced is vital. A summary
of pertinent background regarding human factors and perceptual issues, particularly
as they relate to blindness, is provided. The technologies and techniques for tactile
graphic production are reviewed, as is current research in this area. The use of
image processing techniques for purposes of simulating some aspects of the visual
system is justified, and applicable algorithms for such processing are discussed.
Presented next are the specific techniques used in this system to produce
tactile images from visual ones rapidly and automatically. The efficacy of these
techniques is examined in terms of recognizability, classifiability and comprehensibility as measured in a series of experiments. Finally, future directions in which this
work may lead are discussed.
Chapter 1
INTRODUCTION
"One picture is worth a thousand words [36]." So reads the well-worn age-old
adage. The professor describes a particularly challenging concept to the class and
finds herself awash in a sea of blank stares. What does she do next? She puts
chalk to slate and illustrates the difficult topic with a diagram. The glazed looks are
replaced with light bulbs. A man is strolling along a busy city sidewalk. He reads
the newspaper as he walks, oblivious to where he is stepping. We see a banana peel
on the walkway ahead. Instantly, we anticipate what will happen next. Moments
later, a shoe lands squarely on the peel and the newspaper flies into the air as the
man tumbles to the pavement. Consider the remarkable live television image of a
human being standing on the surface of the moon in 1969 (Figure 1.1). Consider
the consequences of the sinister microscopic Ebola virus, so deadly it kills its prey
in weeks or even days (Figure 1.2). Words alone cannot express the full impact of
such images in the way that the pictures can.
Visual information, whether to illustrate a point, make us chuckle or inspire
us, is all around us and speaks to us in a most powerful way. Suppose that you do
not have the sense of sight. All of those pictures are now virtually inaccessible to
you. A diagram on the chalkboard is nothing more than a series of bone-jangling
squeaks. A walk down a city sidewalk is a cacophony of footsteps, car horns and
street vendors. The television screen is just a slightly curved piece of glass that
crackles with static electricity when you brush your fingers across it. A photograph
is simply a slick piece of paper. Blindness eliminates access to the myriad of visual
information that many of us rely on to make our way in the world each day.
Figure 1.1: Astronaut Edwin E. Aldrin, Jr. poses beside a deployed U.S. flag on
the surface of the moon. (NASA)
Figure 1.2: First ever electron micrograph of Ebola Zaire virus, taken by Dr. F.
A. Murphy at the Centers for Disease Control in 1976. Diagnostic
specimen in cell culture at 160,000X magnification. (CDC)
For persons who are blind, the answer to this impaired access is to rely upon
the senses of hearing and touch. In 1834, Louis Braille perfected an embossed-dot
code for sightless reading and writing, regarded as one of the most significant contributions to the education of blind persons and replacing a less effective embossed
letter system [57]. With braille, letters of the alphabet, numbers and other symbols are represented by raising various combinations of dots in a six dot (2 × 3) or
eight dot (2 × 4) rectangular array. Braille is the standard method for producing
books that blind persons can read [26]. Computers can convert the written word
into speech, so that any text printed on the computer screen can be spoken aloud.
Neither braille nor audio output, however, can yet provide good access to raw visual
information [32]. Future applications may one day include an artificially intelligent
image-to-text converter that would examine a computerized picture and generate
a textual description that could subsequently be output as speech. This problem,
known as Image Understanding, remains unsolved due to (1) information lost when
a two-dimensional image is created from the three-dimensional world, (2) object
occlusion, and (3) the effects of inter-reaction of visual phenomena on the value of
each pixel [9, 12, 74, 78]. Specific applications of image understanding techniques
are currently in use in the areas of mobile robot navigation, complex manufacturing tasks, medical image processing, and analysis of satellite images [74]. However,
solving this difficult problem of artificial vision in the general case is not likely to
happen soon. The richness and variety of the spatial information contained in the
visual domain requires thousands upon thousands of words to describe the complete
content adequately, a method which is of questionable practicality and far beyond
the state of the art. This absence of practical image description techniques points
out the necessity for effective tactile rendering.
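As a small illustration of the six dot (2 × 3) and eight dot (2 × 4) braille cells mentioned above, the following Python sketch represents a cell as a set of raised dots. The dot numbering, the three sample letters and all function names are assumptions introduced here for illustration; they are not part of the system described in this thesis. The mapping onto the Unicode Braille Patterns block is included only as a convenient way to display a cell.

```python
# Conventional dot numbering (an assumption; the text does not state it):
#   1 4
#   2 5
#   3 6
#   7 8   (eight dot cells only)
# Each dot number maps to a (row, column) position in the cell.
DOT_POSITIONS = {
    1: (0, 0), 2: (1, 0), 3: (2, 0),
    4: (0, 1), 5: (1, 1), 6: (2, 1),
    7: (3, 0), 8: (3, 1),
}

# Three sample letters, each given as the set of dots that are raised.
LETTERS = {"a": {1}, "b": {1, 2}, "c": {1, 4}}


def cell_grid(dots, rows=3):
    """Return a rows x 2 grid of booleans; True marks a raised dot.

    Use rows=3 for a six dot cell and rows=4 for an eight dot cell.
    """
    grid = [[False, False] for _ in range(rows)]
    for dot in dots:
        row, col = DOT_POSITIONS[dot]
        if row >= rows:
            raise ValueError(f"dot {dot} does not fit in a {rows}-row cell")
        grid[row][col] = True
    return grid


def render(grid):
    """Draw a cell as text: 'o' for a raised dot, '.' for a flat position."""
    return "\n".join("".join("o" if up else "." for up in row) for row in grid)


def cell_to_unicode(dots):
    """Map a set of dots to the matching Unicode braille pattern character.

    In the Unicode Braille Patterns block, dot n corresponds to bit n - 1
    above the base code point U+2800.
    """
    return chr(0x2800 + sum(1 << (dot - 1) for dot in dots))


if __name__ == "__main__":
    for letter, dots in LETTERS.items():
        print(letter, cell_to_unicode(dots))
        print(render(cell_grid(dots)))
```

A set of dots is used rather than a fixed-size array so that the same value can describe either cell size; only the rendering step needs to know how many rows the cell has.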
One straightforward method of providing access to an image for blind people
is to associate with the image a brief textual description that can be accessed at
will. Such a description necessitates special preparation by a sighted person [8, 25].
Furthermore, this preparation must be done for every image to assure complete
accessibility. For limited applications this method might be feasible, but it is not
practical in the general case. For example, with the literally millions of images on
the Internet already, and a constant stream of new ones pouring onto the network
daily, it is difficult to envision providing a textual description for each.
There are certain images that, because of their particular timeliness or rapid
change, would be quite difficult to describe adequately using text or audio. Consider
images of a satellite weather map that may be updated once a second, twenty-four
hours a day. Clearly, inclusion of an individually written textual description of such
images, even given the current state of the art, is improbable at best.
The sense of touch is relied upon frequently by blind persons in lieu of sight.
One common method of presenting visual images in a touchable or tactile fashion
is through use of tactile graphics. The term tactile refers to the sense of touch [55].
Tactile graphics provide a raised representation of such visually useful materials as
maps, graphs and other simple drawings. By current practice, these are prepared by
a sighted person individually and by hand. This preparation is neither timely nor
efficient. Timeliness, however, is not a major issue for infrequently changing items,
such as maps [25].
Many blind persons rely upon the computer as a pipeline connecting them
to a deep well of easily accessible textual information. The dawning of the so-called
"Information Age" has brought with it a shift from textual to graphical representation of information. Everywhere one goes on the Internet, glitzy icons, images and
animation are replacing words. This explosive growth of reliance upon graphics as
the choice for information presentation, which includes the dominance of the graphical user interface (GUI), has had a significant positive impact on sighted computer
users and a drastic negative one on blind computer users [11, 92]. The volume of
graphical information residing on the Internet and present by the very nature of the
GUI paradigm makes it impractical to include a textual description with each and
every graphic.
Some barriers are overcome with new commercial GUI-friendly screen review
and speech synthesis software and hardware. These systems can be combined with
well-developed technologies such as embossed braille printers and braille cell displays
to provide limited access. Directory navigation and text-based tasks such as word
processing in the GUI environment can be handled by keeping an off-screen model of
the on-screen graphics [65, 71]. In such a model, words are drawn on the computer
screen as pictures of these words, collections of pixels set to the right color and
intensity in the right positions on the screen. Meanwhile, each word of the original
text is kept in a location in the computer memory, and is associated with its picture
on the screen. In this way, words can be provided to a speech synthesizer or braille
display, thus giving access to a person who cannot see the screen [15, 32].
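The off-screen model described above can be pictured as a table that pairs each word's text with the screen rectangle where its picture was drawn. The Python sketch below is a minimal illustration under that assumption; the class names, method names and pixel coordinates are hypothetical and do not describe any particular screen review product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenWord:
    """One word of text plus the rectangle its picture occupies on screen."""
    text: str
    x: int       # left edge, in pixels
    y: int       # top edge, in pixels
    width: int
    height: int

    def contains(self, px, py):
        """True if the pixel (px, py) falls inside this word's rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


class OffScreenModel:
    """Keeps the original text of every word drawn on the screen."""

    def __init__(self):
        self._words = []

    def draw_word(self, text, x, y, width, height):
        # A real system would also paint pixels here; this sketch records
        # only the association between the text and its screen position.
        self._words.append(ScreenWord(text, x, y, width, height))

    def word_at(self, px, py):
        """Return the text under a screen position, or None if there is none."""
        for word in reversed(self._words):  # most recently drawn word wins
            if word.contains(px, py):
                return word.text
        return None

    def reading_order(self):
        """Return all words top to bottom, then left to right."""
        ordered = sorted(self._words, key=lambda w: (w.y, w.x))
        return [w.text for w in ordered]


if __name__ == "__main__":
    model = OffScreenModel()
    model.draw_word("Hello", 0, 20, 50, 12)
    model.draw_word("File", 0, 0, 40, 12)
    model.draw_word("Edit", 50, 0, 40, 12)
    # The resulting strings could be handed to a speech synthesizer
    # or a braille display.
    print(model.reading_order())
    print(model.word_at(55, 5))
```

Looking words up by screen position is what lets a screen reviewer answer "what is under the cursor?", while the reading-order list serves sequential output to speech or braille.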
One commercial device, the Optacon (see page 30), can produce a vibrotactile
representation of whatever words or drawings pass underneath its hand-held scanner.
The Optacon display is a fingertip-sized matrix of tiny pins that vibrate individually
in response to an object, such as text or a simple drawing, viewed by the scanner [83].
Thus, an "a" feels like a vibrating letter "a" to the finger. Unfortunately, this "a" may
also feel like similarly shaped letters such as "c", "e", "o", "s" or "u". This inherent ambiguity
means that even with quite a bit of training, reading with an Optacon is slow [10,
21, 22]. Other experimental means, including an Optacon-based device, have proven
successful in affording limited tactile access to very simple symbols, providing a
means to distinguish between, for instance, a circle, a triangle and an X [99].
These methods, however, fall short in their ability to provide access to complex visual information, such as photographs. A photograph is a two-dimensional
depiction of a three-dimensional view. By complex we mean an image with the qualities of a typical photograph. These qualities include having many shades of color
and levels of intensity, shadows and other depth cues such as overlap and relative
size, and complicated shapes. We glean clues about orientation of, and relative positions among, objects in a photograph from shadows, shape and size [18]. Presenting
such complex information as a tactile image is nontrivial, to say the least.
Tactile imaging is the process of turning a visual item, such as a picture,
into a touchable raised version of the image so that this tactile rendition faithfully
represents the original information. Properly done, tactile imaging provides access
for blind persons to visual information that is inaccessible via other means such
as audio or textual description. Tactual perception, the physiological capabilities
of the human sensory system to explore and discern via the sense of touch, is well
understood. Factors such as the size and shape of the fingertip, temporal and
spatial response of the nerve receptors in the skin, and incorporation of kinesthetic,
or haptic, cues must be considered. These factors limit the size and detail of tactile
images to within the response ranges of these various factors [55, 82].
The way in which the mind perceives and classifies images is a well-studied
area, one in which a number of theories have developed. Among these, perhaps
the most accepted view is that of human memory being arranged hierarchically
from general to specific in terms of one or more qualities of the object being perceived. Whether the information is visual or tactile, the brain uses this same general
framework for classification [18, 42, 46]. Thus, producing usable tactile images from
photographs is a challenge requiring a careful balance of resolution, size, shape and
detail. Having too much detail in a tactile image will result in much of its content
being lost, actually degrading its clarity and utility due to an information overload
of sorts. This overload results from limitations of tactual perception, particularly
the physiological disparity between the resolution of the human eye and fingertip.
Including too little detail will result in a tactile image that may not feel like anything more than a simple shape, not adequately representing the original image at
all [43]. This ambiguity is due to the manner in which the brain categorizes what it
perceives, in this case classifying tactually indistinguishable items as the same, even
though the unprocessed visual originals may have been quite different.
In this thesis, one major step toward creating access to complex visual images
is considered. The well-studied areas of tactual perception, the human sensory system in general, image processing techniques, and tactile graphics are discussed. To
justify a heretofore unexplored combination of factors and theories from these areas,
a broad array of necessary background information is provided. This information
plays a formative and vital role in the motivation of this research. These techniques,
taken individually, are common, general and well-known. Taken as a whole and in
very specific combinations, the results are unique and noteworthy.
Specifically, the content begins with a brief review of the human sensory
system, focusing on how we interface with our world. Relevant statistics regarding
the blind population are presented, and an overview of blind computer user interface
technology is provided. Perception at the tactual and mental levels and related
human performance parameters are discussed, and the visual and tactual senses are
contrasted. This content propels a further discussion of image processing techniques
that can roughly simulate the abilities of these perceptual systems.
This background information then is used to motivate a discussion of a new
method for the automatic translation of visual images into tactile images called
the TACTile Image Creation System (TACTICS) [95, 96]. This prototype system
provides access to previously inaccessible visual information using image processing
and tactile graphics production techniques. The goal of this system is to free the
blind computer user from reliance upon a sighted individual to prepare custom
tactile graphics, or tactics [30], and to overcome the considerable time delay in doing
so. Tactile images (Figure 1.3 & Figure 1.4) of photographic images are produced
by TACTICS in seconds or minutes as opposed to hours or days. The components
and mechanics of the system, including the image processing algorithms, output
medium and overall procedure for conversion of images from visual to tactile, are
described as well.
Justification for the techniques used by TACTICS is presented in the form of
results of a series of experiments. These experiments explore a range of tasks from
simple discrimination to image content comprehension. From a careful analysis
of the results, conclusions are drawn regarding the effectiveness of TACTICS, and
future extensions to the system, as well as related areas of future work, are proposed.
It is hoped that the reader of this thesis will have a "Why didn't I think of
that?" reaction to the research and results presented here. Although the scientific
underpinnings of this system are comprehensive and complex, the system itself is
straightforward, elegant and intuitive. If the eventual effect of this research is to
afford better access to visual information for blind persons, then perhaps it will be
judged to be a valuable contribution to science.
Figure 1.3: Figure 1.1 after image was processed using TACTICS.
Figure 1.4: Figure 1.2 after image was processed using TACTICS. See Appendix G
for samples of expanded tactile images.
Chapter 2
BACKGROUND
The efficacy of a method for automatically converting visual information
into tactile information necessarily is dependent upon a variety of factors, which are
reviewed in this chapter. To guide the design of such a system, an understanding
of the human factors of sensation and perception, including how the sense of touch
compares to the sense of sight, is important. There are lessons to be learned from
past and current techniques for tactile graphic production and other non-visual
methods used by blind persons to access computer-based information. The medium
for the description of visual information that is under consideration in this thesis
is the computerized image. How such images are represented and the techniques
that can be used to operate upon them are explored, and their correspondence to
human tactual perception is considered. The background provided in this chapter
will be used to motivate the prototype system and experimental protocol detailed
in following chapters.
Footnote 1. See Appendix E for a summary of various parameters related to tactual perception and affecting the design of TACTICS.
2.1 The Human Sensory System
The fundamental issue in presenting visual information in a meaningful tactile
form is the understanding of some basics of human sensory perception. By reviewing
how the human sensory system collects and comprehends information and what the
limits are to the type and amount of information the senses can process, it may
be possible to identify factors that can play a role in the conversion of information
intended for one sense to a form suitable for another sense.
2.1.1 Information and Sense
Humans receive all of their information about the world around them using
one or more of five senses [18]. The Gustatory Sense provides information on taste
qualities such as sweet, salty, sour and bitter. Often working in conjunction with
taste is the Olfactory Sense, which provides smell information. The Auditory Sense,
our hearing, allows us to receive auditory information such as music, speech and
noise. The Tactual Sense is comprised of touch and kinesthesis, providing information about such physical world qualities as temperature, perception of texture,
position and motion. Finally, the Visual Sense, our sense of sight, is how we receive
visual information including color, brightness, depth of field, and motion.
2.1.2 Bandwidth Comparison
The bandwidth of a sense refers to the capacity of that sense to receive and
perceive information. Studies show that vision, as one might intuitively expect, is
our highest bandwidth sense, followed by hearing and touch (Table 2.1) [45]. The
Visual Sense is two orders of magnitude better at carrying information than the
Auditory Sense, which is two orders of magnitude better than the Tactual Sense.

Table 2.1: Summary of information bandwidth limitations for three senses [45].
Sense Modality          Limit (bits/sec)
Skin (vibrotactile)     10^2  (see footnote 2)
Ear                     10^4
Eye                     10^6

The Gustatory and Olfactory Senses are much more prone than the others
to the effects of adaptation, and are not efficient at carrying information at a rate
anywhere near that of even the Tactual Sense. Adaptation refers to the tendency
of a sense to grow accustomed to a stimulus, thereby becoming less sensitive to it
over time. Taste and smell are prone to adaptation and have comparatively slow
recovery times, while the other three senses have speedier recovery times that are
roughly proportional to their bandwidths. As the highest bandwidth and most
resilient sense, vision is clearly of the greatest importance among the senses, and
therefore the hardest to do without. By comparison, the other senses have lower to
much lower information capacities which makes the problem of sensory substitution
for vision a difficult one to address [18, 43].
The implications for development of a vision substitution system are significant by virtue of this large bandwidth disparity. Visual information cannot simply
be mapped directly to the auditory or tactual domains, but clearly must be reduced
by some bandwidth correlated scaling factor. Further, this scaling must preserve
the meaning of the original visual information to be useful. It is this information
reduction task which forms the basis for the system we develop in this thesis.
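As a rough illustration of the scale of this reduction, the order-of-magnitude limits in Table 2.1 can be turned into a scaling factor. The sketch below assumes the table values are exactly 10^2, 10^4 and 10^6 bits/sec; the function names are hypothetical and the result is only an order-of-magnitude estimate, not a measured quantity.

```python
# Approximate channel limits from Table 2.1, in bits per second.
BANDWIDTH_BITS_PER_SEC = {"eye": 10**6, "ear": 10**4, "skin": 10**2}


def reduction_factor(source: str, target: str) -> float:
    """How many times more information the source sense carries than the target."""
    return BANDWIDTH_BITS_PER_SEC[source] / BANDWIDTH_BITS_PER_SEC[target]


def seconds_to_convey(bits: float, sense: str) -> float:
    """Time needed to pass a given number of bits through one sense at its limit."""
    return bits / BANDWIDTH_BITS_PER_SEC[sense]
```

Under these assumptions, reduction_factor("eye", "skin") evaluates to 10,000, which suggests that a visual image must shed roughly four orders of magnitude of information before it suits the fingertip.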
Footnote 2. The results of previous research indicated that the human fingertip processes
vibrotactile signals at a rate no more than 10 bits/sec [20, 31, 48].
2.2 Tactual Perception
Tactual perception primarily refers to active exploratory and manipulative
touch. Study of the physiological factors involved in tactual perception is important
if one is to gain an understanding of how best to create tactile images. For a
tactile image to be useful, a blind person must be able to explore it with the sense
of touch, usually the fingers, and extract some content information. Thus, limits
to tactual perception, such as resolution of the human fingertip, image scale as a
factor of comprehension, and how the mind processes such information are important
considerations [54, 55].
2.2.1 Cutaneous Sensing
The basic physiology of the human skin defines limits to the ability of our
sense of touch. Of particular importance to tactile graphics are the difference limen
and its relation to temporal response thresholds and masking phenomena. The
difference limen is the minimum statically discernible displacement between two
points such that the points are distinct. In effect, this is tactile resolution, which for
the skin of the fingertip is approximately 2.5 mm. When statically felt, two points
closer than this distance tend to feel like one point, whereas two points farther apart
than this feel like two distinct points [82]. This figure indicates that the resolution
of the fingertip is much lower than that of the human eye. Therefore, we can safely say
that tactile images require lower resolution than visual images. The definitive work
on this two-point threshold, including its use as an indicator of the relative spatial
resolution as a function of body locus, is in [97].
2.2.2 Spatial Sensing
Spatial sensing incorporates what we know about static sensing, embellished
with further measurements of sensory abilities taken during motion of the finger [80].
Related to the two-point difference limen is the minimum discernible displacement
of a point on a surface. For highly smooth surfaces and under carefully controlled
laboratory conditions, a 2-micron high point can be felt using active touch [50]. The
height of a braille dot, an easily discernible object, is in the range 0.02 to 0.05 cm [26].
This is a generally acceptable range of heights for tactile graphics, with heights at
the upper end of the range naturally providing relative improvements in perceptibility [55], much as brighter lighting or higher volume can improve perceptibility
in the visual and auditory domains. The limiting factor for the height of tactile
graphics is inherent in the media in which they are produced.
Spatial tactile discrimination has been measured using square-wave gratings of varying groove amplitudes and separations under conditions of active exploration [39, 55, 82]. Sequences of gratings were presented to the distal pad of
the right index finger in both the same and orthogonal orientations to the axis of
the finger. Observers noted differences in orientation of the grooves, which revealed
the distance at which orientation of grooves became indiscriminable. This study
demonstrated that the minimum tactually discernible grating resolution is 1.0 mm,
and that such discrimination improves linearly as the grating width increases above
1.0 mm. This result is due to the forward masking effect of one stimulus upon perception of subsequent stimuli. The cutaneous receptors in the skin require a period of
time to recover after cessation of one stimulus before correct sensing of a subsequent
stimulus can begin [54].
Taken together, these factors appear to indicate that the resolution of a tactile
image should be somewhat finer than 1 dot/mm to produce a relatively smooth feel
to the image, while resolutions much lower than this seem to provide little or no
benefit to tactile perceptibility. For comparison, a resolution of 1 dot/mm equals
25.4 dots/inch, and the resolution of a standard laser printer is at least as fine as 300
dots/inch. This resolution is sufficient and significant, since the system developed
in this thesis relies on a laser printer in the tactile graphics production process.
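The unit conversion behind this comparison is simple enough to check directly. The following sketch assumes only the figures quoted above (1 dot/mm as the tactile target and 300 dots/inch for the printer); the function names are hypothetical.

```python
MM_PER_INCH = 25.4


def dots_per_mm_to_dpi(dots_per_mm: float) -> float:
    """Convert a resolution in dots per millimetre to dots per inch."""
    return dots_per_mm * MM_PER_INCH


def printer_dots_per_tactile_dot(printer_dpi: float,
                                 tactile_dots_per_mm: float = 1.0) -> float:
    """Printer dots available, along one axis, to draw a single tactile dot."""
    return printer_dpi / dots_per_mm_to_dpi(tactile_dots_per_mm)
```

At 300 dots/inch the printer places nearly twelve of its own dots across each 1 mm tactile dot, so printer resolution is unlikely to be the limiting factor.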
2.2.3 Tactile Pattern Perception
The visual sense responds well to minute differences in stimulus, while the
sense of touch tends to need greater variation in stimulus patterns to succeed in
perceptual tasks [44, 55]. Although touch can discriminate and recognize complex
tactile patterns [43], such perception involves a number of complicated cognitive
processes [47].
There is strong basis for the supposition that spatial information, which includes graphics, is stored in the visual cortex portion of the brain [46]. This mechanism is similar for sighted and blind persons, regardless of whether this information
is gathered using the sense of sight or touch. Research indicates that the ability
to store and subsequently retrieve tactually perceived spatial information can vary
greatly from individual to individual. This variation depends to a significant degree on the level of visual memory (see page 18) a blind person possesses, as often
determined by the age of the onset of blindness. There is comparatively little variation in such ability among the sighted population [77]. The storage and retrieval
of spatial information is believed to be organized in a hierarchical fashion in the
brain, which classifies information based on gross characteristics first, followed by
detailed characteristics [9, 90]. Although the resolution of the sense of touch degrades slowly with age [69], which unfortunately equates with a statistical rise in
blindness [77, 79], experience with tactile graphics can make up for this slight loss
of touch sensitivity [43, 93].
The method typically used by a blind person to explore a tactile graphic
tends to support the hierarchical view of human spatial memory. The exploration
by a blind person of a tactile graphic generally is performed in two stages. First, the
entire image is explored as a whole, providing a general tactile overview. Second, the
details of the tactile image are explored. Research has verified this methodology [34]
and has shown that this technique is used by blindfolded sighted persons as well.
These results indicate that the concept of a hierarchical structure of the human
spatial memory is a reasonable assumption.
It is important to note that the acuity of the touch sense is comparable to
blurred vision in similar tasks [1, 53]. The significance of this relationship is that
any tactile representation of visual information, based on what we already know
about tactual perception, should be sufficiently simple to make up for this reduced
level of acuity [21, 25, 55, 82]. This result supports our choice of pursuing methods
of image simplification in producing tactile images from their visual counterparts.
2.2.4 Aiding Comprehension
Comprehension of a tactile display is increased when the reader is somehow
clued in to what will be felt [25]. Just as one expects photographs in a newspaper to
have an associated caption, so too would one reasonably expect that the comprehensibility of a tactile image would be enhanced by including some associated textual
information. This enhancement can be accomplished using standard techniques,
such as by incorporating braille text with an image or by using speech output from
a computer speech synthesizer to add information and increase comprehension.
In a photograph, information about the relative depth within the field of
view of objects is provided by masking, shadows and size [18, 46]. This information
is not readily discernible in a tactile format and is a factor which can inhibit the
comprehensibility of a tactile image. One surprising side effect of congenital blindness (see page 18) on comprehension is the relative insensitivity to orientation of
the tactile graphic being touched. Where blindfolded sighted subjects in one study
were confused by a rotated or non-upright tactile graphic representation of a known
object, blind subjects suffered little confusion. These blind subjects were quite facile
at mentally rotating the spatial information perceived from the graphic representation, performing much better at comprehension tasks than the sighted subjects
under the same conditions [16].
Representing depth and perspective in a tactile image is difficult, if not impossible, using a two-dimensional tactile display medium. Further, the congenitally
blind individual lacks a visual frame of reference for interpretation of such inherently three-dimensional information when it is mapped onto a two-dimensional display [55]. This shortcoming of two-dimensional tactile graphics display methods can
be handled by some of the up-and-coming haptic display technologies (see page 33).
2.3 The Blind Population
2.3.1 Definition of Terms
The American Foundation for the Blind recommends that the term blind be
reserved for individuals with no usable sight whatsoever, while low vision, visually
impaired or partially sighted can be used to describe those with some usable vision.
These terms coincide with standard medical diagnostic guidelines which divide visual
impairment into two classications: no light perception (NLP) and light perception
(LP). An individual with corrected visual acuity of 20/200 or worse in the better eye or a
visual field of 20 degrees or less in the better eye is considered legally blind. A blind
person is either congenitally blind, being blind from birth or during the first five years
of life and possibly lacking visual memory, or adventitiously blind, with blindness
beginning after the age of five and with the probable presence of visual memory. Visual
memory means the ability to classify and remember objects we perceive in terms of
visual characteristics, such as shape, size, color, position and perspective [77].
2.3.2 Misconceptions
There exist numerous misconceptions regarding blind persons [37, 60, 77].
Positive misconceptions are that blind people are exceptionally musical, possess
extraordinary senses of hearing and touch, and are highly intelligent. Negative
misconceptions include suppositions of helplessness, dependence, laziness and lack
of intelligence. Of particular relevance is the supposed increased sense of touch.
Touch sensitivity varies little from person to person, with no statistical difference
between the sighted and blind population [56]. However, it does seem reasonable
that a blind person may be more accustomed to relying on the sense of touch and
interpreting tactual information [5, 43].
2.3.3 The Blind Computer User
Statistics released by the World Health Organization in 1987 estimate that
there are 30- to 40-million blind people in the world [77]. According to 1989 statistics from the National Society to Prevent Blindness, approximately 500,000 U.S.
residents are legally blind [77]. Of those figures, roughly ten percent are totally
without sight [79].
The increase in the general population's reliance upon the computer carries
over to the blind population as well [11]. As the number of computer users continues
to grow quite rapidly, any precise count of users would obviously be out of date
even before it was written down. However, what is certain is that this number
is sufficiently large to support an assertion that blind computer users make up a
sizable group. It is worth noting that the availability and affordability of synthetic
speech output via computer has broadened access to information for this population
as compared to braille access to the same information.
According to the American Printing House for the Blind (APH), of the blind
population residing in the United States and of reading age, fewer than 16 percent
are fluent in braille, while worldwide the figure is lower still [93]. Another study
cites the braille fluency rate among blind and visually impaired computer users at
10 percent [32]. While these low braille literacy rates are discouraging, there is some
reason for optimism in the future. In a study of school systems for blind children,
more than one third of the students were found to be fluent in braille, although
audio output, either in the form of recorded books or speech synthesis, was still the
mode of choice at the time of the study (Table 2.2) [93, 100].

Table 2.2: Reading modes used by a group of 7,987 totally blind students.
Method                  Percentage (see footnote 3)
Aural                   61
Braille                 37
Braille & Large Type    1
Large Type              1

The size of the blind population in proportion to the general population is
expected to remain steady [77, 79]. The portion of the visually impaired population
that has some residual sight, and that can access computers using sight-enhancement
techniques such as screen magnifiers, will not necessarily be helped by the research in
this thesis. While the theories and methods developed here have wide applications,
including the fields of telecommunications, rehabilitation engineering and computer
vision, the focus here will be on providing access to those blind and visually impaired
persons who cannot benefit from currently existing sight enhancement technology.
For purposes of this thesis, this group will be referred to as blind computer users.
Footnote 3. Note that a small percentage (approximately 2%) possessed enough residual
sight to make use of Large Type, either alone or in combination with Braille
writing, although due to either extremely low acuity or a narrow field of view
these students were classified as totally blind [100].
2.4 Access Technology for Blind Computer Users
Blind persons have a great many means for accessing textual and visual
information [10, 14, 15, 17, 24, 25, 30, 29, 32, 49, 61, 88]. A number of these
methods already do or can be adapted to provide blind computer users with access
to graphical information. Many traditional methods of access, such as braille output
in one form or another, are, and continue to be, widely used. Their efficacy is
unquestioned. Some relatively recent developments, such as speech output, are also
effective and quickly merging with traditional methods to create new standards for
access. Research is active in the development of dynamic and refreshable tactile
displays [15, 24]. Innovations in the materials and techniques used to display visual
information in a non-visual fashion are achieving some success [27, 92]. These new
methods show promise, although technology continues to lag behind concept.
The task of accessing visual information is one of mapping information from
the visual domain to that of one of the other senses. Knowing that this is essentially
an information volume-reduction problem, given that the bandwidth of each of the
other four senses is significantly lower than that of vision, it is helpful to look at
some of the more successful approaches to tackling this problem before developing
additional solutions. These methods fall into the general categories of Static Tactile
Graphics, Auditory Interfaces, Dynamic Tactile Graphics and Haptic Interfaces. In
addition to these available means, there is active research in this area that is worth
reviewing as well. Note that there is no current technology available for mapping
vision to the senses of taste or smell.
2.4.1 Static Tactile Graphics
Methods for production of static tactile graphics are varied and usually require the intervention of a sighted person in their preparation [25, 88]. This active
participation is a consequence of the difficulty of converting visual information into
tactile information, the Image Understanding Problem. Clearly a picture on a flat
computer screen is of no use to a blind person, necessitating the involvement of a
sighted individual should access to such a picture's content be desired.
The process of converting computer graphics to tactile graphics can be a
labor-intensive and time-consuming one. There are three important steps in this
process: (1) editing, (2) transferral and (3) production. Consider any original two-dimensional graphic, such as a pencil sketch, ink drawing, graph, diagram, illustration or printed picture.
For a tactile graphic display to be comprehensible, it must not contain too
much information. General design guidelines, developed through years of practical
application and refinement of technique, suggest that a tactile graphic should contain the least amount of information possible to convey the content of the image
successfully. Clutter or an overabundance of detail in a tactile image can detract
from its usability and hamper one's ability to understand its content [44, 55]. Thus,
it is important to simplify complex images in the editing step of the process of converting them to tactile images. Experience shows that a tactile graphic that is too
large or too small detracts from comprehensibility as well [99]. The size of a tactile
image should be kept within a hand span, or roughly 3in to 5in on a side.
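To make the size guideline concrete, the sketch below estimates how many tactile dots fit along each side of an image scaled to a hand span. It assumes a tactile resolution of 1 dot/mm, borrowed from the grating threshold discussed in Section 2.2.2, and a hypothetical helper name, fit_to_hand_span; it is an illustration of the arithmetic, not the procedure used by TACTICS.

```python
MM_PER_INCH = 25.4


def fit_to_hand_span(width_px: int, height_px: int,
                     max_side_in: float = 5.0,
                     dots_per_mm: float = 1.0) -> tuple:
    """Return (cols, rows) of a tactile dot grid whose longer side spans max_side_in.

    The aspect ratio of the original image is preserved.
    """
    max_dots = max_side_in * MM_PER_INCH * dots_per_mm
    scale = max_dots / max(width_px, height_px)
    cols = max(1, round(width_px * scale))
    rows = max(1, round(height_px * scale))
    return cols, rows
```

For a 640 by 480 pixel original, a 5 in span yields a grid of 127 by 95 dots, and a 3 in span yields 76 by 57. A screen image therefore has to lose most of its pixels before it can be felt.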
Transferral entails placing the image onto some tactile output medium. A
picture is rst traced on tracing paper, and then is transferred to the tactile display
material using carbon paper and retracing. Other methods for transferral include the
pantograph, which is an instrument consisting of four arms jointed in parallelogram
form. It is adjustable to produce tracings of smaller, the same, or larger sizes.
Using grids to scale images is also a common technique, as is use of the enlargement
capabilities of modern photocopier machines.
The production step is where the physical tactile graphic is produced. There
are numerous methods considered standard; without exception, all require the intervention of a sighted person to translate a visual image into a tactile one. There
are a number of commonly used methods for tactile graphic production [24, 25, 88],
including the following:
2.4.1.1 Raised-line drawing boards
Designed to be used by blind persons for producing raised-line drawings, this
common tool is also useful for fast production of tactile versions of visual originals.
A stylus produces a raised line when drawn over a plastic film, giving an instant
tactile representation.
2.4.1.2 Tactile-experience pictures
This method is often used for young children. Pictures are constructed of a
variety of materials, including wood, plastic, cloth, sandpaper, fur, and metal, which
are glued to a stiff cardboard backing. This method involves individually fashioning
each piece out of the desired material and assembling the resulting pieces into the
tactile picture.
2.4.1.3 Buildup displays
Similar in method to tactile-experience pictures, buildup displays rely on
multiple layers of paper to build up a raised drawing. Additional materials, such as
wire, string and even staples, may be added to enhance the drawing.
2.4.1.4 Embossed paper displays
This technique reproduces a drawing on heavy paper using a collection of
embossing tools. A reverse view of a sketch is rst transferred to the back of a sheet
of embossing paper. The tools are then used to trace the sketch, embossing it as a
series of raised dots.
2.4.1.5 Braille graphics
Graphics embossing can be produced more simply and speedily using a standard braille printer connected to a computer. Operating in graphics mode, the
printer maps pixels (see page 37) of the original image to braille dots to produce
the embossed version of the picture. The resolution of this method is low; to be
effective, the original image must be a simple line drawing. This method has two
distinct advantages: many blind computer users have access to a braille printer and
no sighted intervention is required for its use. Hence, with the proper processing
techniques applied to images, as will be described in the discussion of TACTICS (see
page 41), it may be possible to utilize such a printer to produce adequate tactile
representations of pictures.
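One plausible way to map pixels to embossed dots is block reduction: each square block of pixels becomes a single dot, raised when enough of the block is black. The sketch below is an assumption made for illustration; it does not describe the mapping used by any actual braille printer, and the names to_dot_grid and render are hypothetical.

```python
def to_dot_grid(pixels, block, threshold=0.5):
    """Reduce a binary image (rows of 0/1, 1 = black) to a grid of raised dots.

    Each block-by-block cell becomes one dot, raised (True) when the fraction
    of black pixels in the cell is at least `threshold`.
    """
    rows, cols = len(pixels), len(pixels[0])
    grid = []
    for top in range(0, rows, block):
        grid_row = []
        for left in range(0, cols, block):
            # Cells at the right and bottom edges may be smaller than block x block.
            cell = [pixels[r][c]
                    for r in range(top, min(top + block, rows))
                    for c in range(left, min(left + block, cols))]
            grid_row.append(sum(cell) / len(cell) >= threshold)
        grid.append(grid_row)
    return grid


def render(grid):
    """Text preview of the dot grid: 'o' for a raised dot, '.' for flat paper."""
    return "\n".join("".join("o" if dot else "." for dot in row) for row in grid)
```

Because every block collapses to a single dot, fine detail disappears, which is consistent with the observation that only simple line drawings survive this kind of output.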
2.4.1.6 Vacuum-forming method
This method, also known as "thermoforming," excels at producing multiple
copies of a tactile graphic in a very durable format. It requires a raised master made
of stable or unpliable material. Next, the master is placed on a perforated tray in
the vacuum-forming machine. A sheet of thin plastic is fastened over the master
such that it forms an airtight cover. A heating unit is placed over the plastic as
air is sucked out from below the master, deforming the now pliant plastic over the
master. Once cooled, the plastic sheet is a durable replica of the original. This
process can take as little as one minute, which is acceptable for producing multiple
copies.
2.4.1.7 Microcapsule paper
Referred to variously as "capsule paper," "swell paper" or "puff paper," this
is a quick and economical way to produce tactile graphics. It is paper that has been
coated with microscopic capsules of polystyrene (Figure 2.1), each being 100 µm
in diameter.
There are two types of microcapsule paper available on the international market. Flexi-Paper is a polyethylene-based paper manufactured by Repro-Tronics, in
Westwood, New Jersey [73]. It is tan in color and is quite durable under conditions of folding and crumpling. The Matsumoto Kosan Company of Osaka, Japan,
produces a paper-based version [58], white in color, that provides for blind persons
a more familiar stiff feel resembling that of heavy braille embossing paper while
being less resistant to the effects of folding than Flexi-Paper. Both are comparable in price ($1.00 U.S. per sheet). With an unexpanded capsule diameter of
100 µm, the unexpanded resolution of both brands is therefore 10^2 capsules/cm (2.54 ×
10^2 capsules/in). The capsules expand upward and outward consistently to a diameter (height) of 0.2 mm to 1.0 mm, yielding an expanded resolution of 10 to 50
capsules/cm (25 to 127 capsules/in). In practical observations in the laboratory,
the typical expanded diameter is 0.3 mm and typical expanded height is 1.0 mm.
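The resolution figures follow directly from the capsule diameter, on the assumption that capsules sit edge to edge in a single row. A short check, with hypothetical function names:

```python
MM_PER_CM = 10.0
MM_PER_INCH = 25.4


def capsules_per_cm(diameter_mm: float) -> float:
    """Capsules that fit side by side along one centimetre."""
    return MM_PER_CM / diameter_mm


def capsules_per_inch(diameter_mm: float) -> float:
    """Capsules that fit side by side along one inch."""
    return MM_PER_INCH / diameter_mm
```

An unexpanded diameter of 0.1 mm (100 µm) gives about 100 capsules/cm, or 254 capsules/in. The expanded range of 0.2 mm to 1.0 mm gives 50 down to 10 capsules/cm, or 127 down to about 25 capsules/in.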
Figure 2.1: Microcapsule paper (enlarged view) showing layer of polystyrene microcapsules on polyethylene or paper transport medium.
To benefit from this expanded resolution, a printer should have a resolution
of at least 127 dots/inch, the best possible resolution of expanded microcapsule
paper based on manufacturers' specications. Printing at a higher resolution will not
produce a gain in tactile image resolution since the polystyrene capsules expand both
upward and outward, meeting to create a contiguous surface with other expanded
capsules within the range of the above noted resolution. Thus, a typical laser printer
with a resolution of 300 dots=inch is entirely adequate for initial output of the image
to be expanded. The amplitude of this expansion is aected by the temperature of
the heating element, with higher temperatures producing slightly more pronounced
expansion.
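The resolution figures quoted for microcapsule paper follow from simple arithmetic on capsule diameter: a 100 µm capsule gives 100 capsules/cm, and expanded diameters of 0.2 mm to 1.0 mm give 50 to 10 capsules/cm. The following minimal Python sketch reproduces that arithmetic. The function names are illustrative only and are not part of any system described in this thesis.

```python
MM_PER_CM = 10.0
MM_PER_INCH = 25.4


def capsules_per_cm(diameter_mm):
    """Number of capsules of the given diameter that fit side by side in 1 cm."""
    return MM_PER_CM / diameter_mm


def capsules_per_inch(diameter_mm):
    """Number of capsules of the given diameter that fit side by side in 1 inch."""
    return MM_PER_INCH / diameter_mm


def printer_is_sufficient(printer_dpi, smallest_expanded_diameter_mm=0.2):
    """True if the printer resolves at least as finely as the expanded capsules.

    The 0.2 mm default is the smallest expanded diameter quoted in the text.
    """
    return printer_dpi >= capsules_per_inch(smallest_expanded_diameter_mm)


if __name__ == "__main__":
    print(round(capsules_per_cm(0.1)), "capsules/cm unexpanded")
    print(round(capsules_per_inch(0.2)), "capsules/in at best expanded resolution")
    print("300 dpi laser printer sufficient:", printer_is_sufficient(300))
```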
Original graphics are photocopied onto the microcapsule paper using a standard office copy machine (Figure 2.2). Graphics can also be applied to the microcapsule paper using ink pens, markers and other drawing implements. The only
requirement is that the graphic be rendered in black. Once the image is applied to
the microcapsule paper, it is inserted image side up into a heating machine, referred
to as the Tactile Image Enhancer (Figure 2.3). For expanding multiple pages, each
exposed sheet of microcapsule paper must be fed individually into the Enhancer.
When exposed to a heat source of 120-125 degrees Celsius (248-257 degrees
Fahrenheit), portions of the paper that are printed in black expand. The microcapsules beneath the black lines of a diagram absorb more heat than the other
microcapsules and expand in diameter, raising the drawing from the background
(Figure 2.4).
An added benefit is that one can draw directly on the microcapsule paper,
which then can be raised immediately. The time taken to raise one drawing already
on a sheet of microcapsule paper is approximately ten seconds. Even accounting for
printing from a computer, photocopying onto the microcapsule paper, and subsequent raising, the entire process is still reasonably fast. Instant raised lines can be
produced on microcapsule paper using a new heat-pen device developed by Repro-Tronics.
Figure 2.2: Microcapsule paper after image is affixed to the surface by photocopying or ink drawing.
Figure 2.3: Simplified view of the Tactile Image Enhancer, showing internal workings of the device (heating element, paper path and transport rollers) for expanding previously exposed microcapsule paper.
Figure 2.4: Microcapsule paper after exposure in image enhancer, showing expanded capsules. Note that capsules may not expand fully when only
partially covered by printing, although this degree of expansion is unpredictable.
2.4.1.8 Other methods
Numerous other methods exist for producing tactile graphics, although none
are widely used. For purposes of completeness we mention only their names here.
These additional methods include relief maps, cork maps and graphs, nonfigurative
pictures, sewing-machine diagrams, embossed aluminum-foil displays, movable-parts
displays, flannel-board diagrams, magnetic-board diagrams, electroforming processing, nyloprint, silk screening, the solid-dot process, foam-ink printing, storm relief
printing, and screen drawings. Exhaustive coverage of all of the above techniques
is available in a variety of sources, including [15, 24, 25, 88].
2.4.1.9 Summary
These static display methods typically produce long-lasting, effective displays
of static visual information. For dynamic information, such as material displayed
on a computer screen, other access methods are more appropriate.
2.4.2 Auditory Interfaces
This thesis focuses on the production of tactile graphic output of information of a primarily graphical or visual nature, but it is worth noting that auditory
output is the method of choice for display of textual information for blind computer
users [15, 32]. While there is a wide variety of methods for production of tactile
graphics, output of computer-generated speech is more generic. Screen review software is used by the blind computer user to explore the textual material and to
select the desired passage. Typically, the software sends the text it encounters to
a hardware device, such as a speech-synthesis card added as an enhancement to a
computer, for conversion from text to speech [86]. There are many such software
programs and hardware devices widely available on the market. The usability of the user interface and the quality of the produced speech in such software and
hardware vary from manufacturer to manufacturer.
One major benefit of speech output is that users who cannot read braille can use
it; in addition, it is generally quite affordable. Reliable speech synthesizers are available for most computers, and the quality of speech is typically quite good. Perhaps
the most attractive feature of the screen review and speech synthesis output method
is adjustable speaking speed, enabling a blind person to listen at 300 words/minute
or more [15, 81, 88], a speed that is quite competitive with typical sighted-reading
speeds of 250 to 500 words/minute [23].
The Nomad is an example of a multimodal device, combining static tactile
graphics with audio output. A tactile graphic, such as a map, is produced and affixed
to the display surface of the Nomad. This surface is addressable via computer, and
each region can be mapped to sounds that will play in response to the associated
region being touched. The Nomad is well suited to museum displays and shopping-mall maps but requires assistance from a sighted person for configuration [24].
2.4.3 Dynamic Tactile Interfaces
Currently, the only dynamic tactile display device in wide use is the Optacon
(Figure 2.5). It is a vibrotactile display, composed of a fingertip-sized matrix of 144
vibrating pins, arranged in a 24-row, 6-column format (Figure 2.6). This display is
contained in a portable case (8 in × 6 in × 2 in, 4.0 lbs) and is powered by one 5-volt, rechargeable, nickel-cadmium battery. Vibration is caused by piezoelectric film
bimorphs, which vibrate with varying amplitude at 230 Hz in response to varying
levels of current. Its use involves placing the finger of one hand onto the vibrotactile
display pad and using the other hand to pass a scanning device over the desired text
or image.
Figure 2.5: Telesensory's Optacon II in action [83]. User places index finger of one
hand on vibrotactile pin array and guides scanner across material to
be viewed with other hand. (Telesensory)
Figure 2.6: Layout of the vibrotactile pin matrix display of the Optacon.
The Optacon was designed as an alternative to braille for reading printed
text; but reading speeds are slower (50 words/minute after months of training
and practice) than with braille (104 words/minute), and much slower than with
synthesized speech output (300+ words/minute) [24, 26, 91]. The price of a new
Optacon, in the neighborhood of $4,000.00 U.S., is also an issue for some [24, 83].
As of the publication of this thesis, the company which produces the Optacon,
Telesensory, plans to discontinue production; and negotiations are underway with
other companies to continue production in the future [84].
During use, the pins of the Optacon display react independently in a one-to-one mapping of pixels, or groups of pixels, to pins in response to an image or
text passed under the lens of the scanner. Black regions of the scanned item cause
pins to vibrate while white regions inhibit vibration. Thus, a letter, line or picture
feels like a vibrating replica of the original [83] (Figure 2.7). However, the vibrating
display produces a noticeable amount of buzzing noise, and the vibration itself tends
to temporarily dull the sense of touch on the finger resting on the display after a
period of use.
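As an illustration of the mapping from pixels, or groups of pixels, to pins, the sketch below reduces a grayscale image to a 24-row, 6-column grid of on/off pin states, with dark regions switching pins on. The block-averaging rule, the threshold of 128 and the name image_to_pins are assumptions made for illustration; they do not describe the Optacon's actual electronics.

```python
PIN_ROWS, PIN_COLS = 24, 6  # Optacon matrix layout given in the text


def image_to_pins(image, threshold=128):
    """Map a grayscale image (list of rows, 0 = black, 255 = white) to pin states.

    Each pin covers one rectangular group of pixels. A pin is active (True)
    when the mean intensity of its group is darker than `threshold`.
    """
    height, width = len(image), len(image[0])
    pins = []
    for r in range(PIN_ROWS):
        top = r * height // PIN_ROWS
        bottom = max((r + 1) * height // PIN_ROWS, top + 1)
        pin_row = []
        for c in range(PIN_COLS):
            left = c * width // PIN_COLS
            right = max((c + 1) * width // PIN_COLS, left + 1)
            group = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
            pin_row.append(sum(group) / len(group) < threshold)
        pins.append(pin_row)
    return pins


if __name__ == "__main__":
    page = [[255] * 12 for _ in range(48)]  # blank white page
    for y in range(48):
        page[y][5] = page[y][6] = 0          # one vertical black stroke
    for row in image_to_pins(page)[:4]:
        print("".join("#" if active else "." for active in row))
```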
Figure 2.7: Active pin matrix display of the Optacon, demonstrating display of
the capital letter S.
A precursor to the Optacon was the Tactile Vision Substitution System
(TVSS) (Figure 2.8), which used a similar technique to display a vibrating representation of an image on a user's back [6, 98]. The image was captured by a
television camera and sent to a more widely spaced array of vibrating pins. The
eventual goal was a system by which a blind person could
wear a video camera and backpack display and actually maneuver through the world
using the vibrating representation of what the camera saw for guidance. The technique may have been ahead of its time, being bulky and noisy, even by early 1970s
standards. Modern technology may yet produce such a system for independent,
walk-around vision replacement [10, 17, 22].
Producing a dynamic tactile display is an active area of research. In a subsequent section we review some prominent research in this area.
Figure 2.8: Tactile Vision Substitution System (TVSS) [98].
2.4.4 Haptic Interfaces
The term haptic refers to the proprioceptive, or positional, sense, which is
an extension of touch [41]. Thus, a haptic interface can represent three or more dimensions, whereas a tactile display provides only two dimensions. Haptic interfaces
are an important display method in virtual reality systems, capable of reproducing
a sense of position in space, interaction of forces, and even textures. Of course, the
original information must be multidimensional as well, often generated by math-graphing packages or custom graphing software.
Examples of this highly active area of research include development of a
method for display of graphs of mathematical functions and scientific data using a
three-degree-of-freedom device called the PHANToM [29, 30, 59], protein molecule
docking simulations [14], three-dimensional volume haptization [35], and successful
experiments in simulating textures with an enhanced joystick device [61, 62].
These devices are generally very expensive ($10,000.00 U.S. and up) and so
are still relegated to a small number of research facilities. It is hoped that eventually
affordable haptic interfaces will be readily available, providing blind computer users
with an even greater ability to explore traditionally visual information physically.
An in-depth study of haptic interfaces is beyond the scope of this work, although
progress in this area is clearly important to note. An extensive bibliography on this
topic is available in [62].
2.4.5 Dynamic Tactile Display Research
Enabling blind persons to access visual data on a computer meaningfully is
an area of vigorous research. Some of the more pertinent projects from the present
and near past include:
• A virtual tactile tablet incorporating a vibrotactile display module demonstrated that increasing a graphic's size and its display resolution improved
recognition, while merely varying the complexity of a graphic's geometric
shape did not dramatically affect object recognition [99].
• Experiments with a single-pin tactile mouse revealed that immediate tactile
feedback improved response times in GUI navigation tasks [85].
• The use of nickel-titanium shape-memory alloy (SMA) to provide actuation
of a tactile display shows promise as the basis for a lightweight and portable
display, although the power consumed and the heat produced by such a display
are still high. Further, current shape-memory alloy suffers from brittleness,
slow response and recovery times, and lack of long-term durability [33].
• A 64-solenoid, four-level, pin-based fingertip display, used to investigate tactual comprehension improvement through representation of levels of graphics
image intensity by varying pin heights on the display [28].
• A virtual tactile computer display which uses electromechanically actuated
pins in a rectangular tactile array comparable in size to the sensing area of
the fingertip [40].
• The use of polymer gels, or electrorheological fluids, for fabrication of actuators
which then conceivably could be used in the development of a tactile display.
Such fluids become firm when current is passed through them and could also
serve as the basis for a direct-touch, deformable tactile display [27, 63, 64, 68].
• Past research delved into electrocutaneous stimulators, which delivered tiny
electrical shocks to the skin, and air jet stimulators, which replaced the pin
array with an arrangement of tiny holes where puffs of air are aimed at the
skin [22]. Neither method was particularly successful, and both
are generally regarded by the mainstream research community as
unworthy of further consideration.
2.4.6 Moving Toward Effective Tactile Display of Graphics
Audio output is not a solution for most graphics problems because of the
difficulty of the Image Understanding Problem. In order for synthesized speech
output to provide adequate access to an image, the image would first have to be
understood by the computer, an unlikely occurrence at present. The most promising
direction for research is toward creation of a refreshable tactile display. Such a
display would be the tactile equivalent of a standard computer screen, or cathode
ray tube, providing direct access to the graphical contents of the computer.
For such a dynamic display to be usable by blind persons, attention must be
paid to how graphic material is to be displayed. Clearly, the fingertip possesses a
much lower resolution than the eye, so complex visual information must be simplified
somehow.
Developing a system for performing such simplification, including factors related to method, effectiveness, usability, and future applicability, is the scope and
direction of this thesis work.
2.5 Representation of Images
An image is an alternative representation of some visual scene [52, 78]. These
representations include sketches, drawings, photographs, computerized graphics and
pictures, and motion picture film and videotape. For purposes of this thesis, we can
safely restrict the discussion to computerized images.
2.5.1 Quantization
In order to create a computer image from some other type, some form of
quantization is performed. In this process, samples of the image are taken using a
scanner or digital camera at some regular interval and size, based on the desired
resolution of the final quantized image. Each sample is assigned a discrete value, or
set of values, that represents the intensity or color of the sample as closely as possible.
In the process of performing this discretization, some resolution and clarity of the
original is necessarily lost, at least with any practical system. This loss is due to
sampling round-off error when mapping the analog real world into the digital world
of the computer; the mapping itself is what allows the image to be processed by computer [72, 78].
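Quantization can be sketched in a few lines: sample an analog scene at regular intervals and round each sample to the nearest of 256 discrete levels. In the hypothetical example below, the "scene" is any function returning an intensity between 0.0 and 1.0; this function, the sampling at cell centers, and the names quantize and sample_scene are assumptions made for illustration.

```python
def quantize(value, levels=256):
    """Map an analog intensity in [0.0, 1.0] to the nearest discrete level."""
    value = min(max(value, 0.0), 1.0)
    return int(round(value * (levels - 1)))


def sample_scene(scene, width, height):
    """Sample scene(x, y) on a regular width-by-height grid of cell centers.

    x and y run over [0, 1); the result is a list of rows of 8-bit values.
    """
    return [
        [quantize(scene((col + 0.5) / width, (row + 0.5) / height))
         for col in range(width)]
        for row in range(height)
    ]


if __name__ == "__main__":
    def ramp(x, y):
        return x  # a horizontal black-to-white ramp

    image = sample_scene(ramp, 4, 1)
    print(image)
    # The round-off error of each sample is at most half of one level.
    worst = max(abs(v / 255 - (c + 0.5) / 4) for c, v in enumerate(image[0]))
    print(worst <= 0.5 / 255)
```

Increasing `width` and `height` reduces the loss of spatial resolution, while increasing `levels` reduces the round-off error in intensity.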
2.5.2 Computerized Representation
The basic unit of the computerized image is the picture element or pixel [52].
For images represented solely as shades of gray, each pixel is assigned a single value,
typically an 8-bit integer. Thus, such an 8-bit grayscale image has an intensity range
of 256 levels of gray, with 0 typically indicating black and 255 indicating white.
Similarly, color images have three such 8-bit intensity levels associated with each
pixel, one each for the Red, Green and Blue components. Each pixel in this 24-bit
color RGB image therefore can represent over 16 million (256^3) colors. Conceptually,
and physically, an image is stored in a two- or three-dimensional array (see Section 3.3.1)
in the computer's memory.
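The storage scheme just described can be made concrete with a short sketch. Nested Python lists stand in for the two- and three-dimensional arrays; the helper names are hypothetical and chosen only for this example.

```python
GRAY_LEVELS = 2 ** 8            # 256 intensities per 8-bit channel
RGB_COLORS = GRAY_LEVELS ** 3   # 256^3 distinct 24-bit colors


def make_gray_image(width, height, fill=0):
    """Two-dimensional array: image[row][col] is one intensity (0 black, 255 white)."""
    return [[fill] * width for _ in range(height)]


def make_rgb_image(width, height, fill=(0, 0, 0)):
    """Three-dimensional array: image[row][col] holds [red, green, blue]."""
    return [[list(fill) for _ in range(width)] for _ in range(height)]


if __name__ == "__main__":
    gray = make_gray_image(4, 3, fill=255)
    rgb = make_rgb_image(4, 3)
    rgb[0][0] = [255, 0, 0]  # set the top-left pixel to pure red
    print(len(gray), len(gray[0]), RGB_COLORS)
```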
For purposes of this research, we consider primarily complex computer images, quantized representations of photographs, electron micrographs, individual
video images, etc., as these present the greatest difficulties when creating a tactile
representation. Simple images, such as sketches, diagrams, and line drawings often
can be converted straightforwardly into tactile form. Complex images are typically
composed of a broad and unpredictable mixture of shape, color, intensity, and other
real-world complexities, presenting the most significant challenges to access by the
blind computer user.
2.6 Image Processing
Image processing is a broad term describing the algorithmic transformation of
an image from one form to another [72]. Processes are divided into general categories
of point processes, area processes, frame processes and geometric processes [52].
Point processes are the simplest and most frequently used of the image processing
operations. A point process is an algorithm that modifies a pixel's value in an image
based solely upon that single pixel's value or location. Common point processes are
image brightening, negative images, image thresholding, image contrast stretching
and image pseudocoloring.
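Because a point process looks at one pixel at a time, every such operation can be expressed as a per-pixel function applied across the image. The sketch below shows brightening, thresholding and contrast stretching in that form. The function names, the default threshold of 128, and the integer arithmetic are assumptions made for illustration.

```python
def apply_point_process(image, fn):
    """Apply fn to every pixel independently; image is a list of rows of 0-255 values."""
    return [[fn(pixel) for pixel in row] for row in image]


def clamp(value):
    """Keep a result inside the valid 8-bit range."""
    return max(0, min(255, value))


def brighten(image, amount):
    return apply_point_process(image, lambda p: clamp(p + amount))


def threshold(image, level=128):
    """Pixels at or above `level` become white, all others black."""
    return apply_point_process(image, lambda p: 255 if p >= level else 0)


def stretch_contrast(image):
    """Linearly rescale so the darkest pixel maps to 0 and the brightest to 255."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:
        return [row[:] for row in image]  # flat image: nothing to stretch
    return apply_point_process(image, lambda p: (p - lo) * 255 // (hi - lo))


if __name__ == "__main__":
    sample = [[50, 100, 150]]
    print(brighten(sample, 120))
    print(threshold(sample))
    print(stretch_contrast(sample))
```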
Area processes use groups of pixels surrounding a central pixel of interest to
derive information about an image. This group of pixels, often referred to as a neighborhood, is examined in some algorithmic fashion as a group. This examination, for
instance, can determine the brightness trend information or spatial frequency, with
the result utilized in determining a new value applied to the central pixel of the
neighborhood. Examples of area processes include edge enhancement and detection, image sharpening, smoothing and blurring, and removal of random noise. An
area-process algorithm typically involves the convolution of the image with weighting factors
contained in a convolution kernel. Convolution (see Section 3.3.2) can be thought of as a
weighted summation process, which produces a new value for a central pixel based
on some function of the values of a number of its neighbors.
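The weighted summation just described can be sketched as a single routine that slides a kernel over the image. This is an illustrative sketch, not the implementation used in TACTICS: the name apply_kernel, the `divisor` parameter, and the choice to repeat edge pixels at the image border are all assumptions. The kernel is applied without flipping, which gives the same result as true convolution for symmetric kernels.

```python
def apply_kernel(image, kernel, divisor=1):
    """Weighted sum of each pixel's neighborhood, using the weights in `kernel`.

    image   -- list of rows of 0-255 intensities
    kernel  -- square list of rows with odd side length
    divisor -- scale factor applied to the weighted sum
    """
    height, width = len(image), len(image[0])
    radius = len(kernel) // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            total = 0
            for ky, kernel_row in enumerate(kernel):
                for kx, weight in enumerate(kernel_row):
                    # Clamp coordinates so border pixels reuse the nearest edge value.
                    yy = min(max(y + ky - radius, 0), height - 1)
                    xx = min(max(x + kx - radius, 0), width - 1)
                    total += weight * image[yy][xx]
            row.append(max(0, min(255, total // divisor)))
        out.append(row)
    return out


if __name__ == "__main__":
    identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    picture = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]
    print(apply_kernel(picture, identity))
    print(apply_kernel(picture, box, divisor=9))
```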
Frame processes use information from two or more images, or video frames,
together with a combination function to produce a new image. Among the many
practical applications of frame processes are motion detection, background removal,
image-quality enhancement and image combination.
Geometric processes change the spatial positioning or arrangement of pixels
within an image based upon some geometric transformation. Typical operations
performed by geometric processes include image scaling, sizing, rotation, translation
and mirror imaging. Example uses include spatial aberration correction, image
composition and special effects.
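For completeness, the two remaining categories can each be illustrated in a line or two. The sketch below shows one frame process (differencing two frames as a crude motion detector) and two geometric processes (mirroring and a quarter-turn rotation). The threshold value and all function names are assumptions made for illustration.

```python
def frame_difference(frame_a, frame_b, threshold=30):
    """Frame process: mark pixels whose intensity changed by more than `threshold`."""
    return [
        [255 if abs(a - b) > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


def mirror_horizontal(image):
    """Geometric process: reverse the order of the pixels in every row."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Geometric process: the bottom row becomes the first column, and so on."""
    return [list(column) for column in zip(*image[::-1])]


if __name__ == "__main__":
    print(frame_difference([[10, 200]], [[12, 50]]))
    print(mirror_horizontal([[1, 2, 3]]))
    print(rotate_90_clockwise([[1, 2], [3, 4]]))
```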
2.6.1 Applicability to Tactual Perception and TACTICS
Production of tactually perceivable tactile images bears some similarity to the
challenges of the field of computer vision. The aim of computer vision is to provide automatic analysis of an image on which some decision can be based [12, 66].
Image processing techniques are invariably used in this task to transform an image in such a way as to produce some form of useful output. Similarly, the aim of
TACTICS is to present a visual image in a tactile format such that it is useful in
some way to an observer. Image processing techniques would appear to be a natural
approach to use. The limits to tactile resolution, and the understood importance of
reducing to an essential minimum the information presented to the fingertip, clearly
call for a simplifying transformation of complex images.
Many image processing algorithms are known for accomplishing various simplifying transformations on an image [7, 9, 52, 70, 72, 76, 87]. We can reduce
a photograph to line information only, remove noise, caricaturize a human face,
reduce resolution or separate an image into distinct regions. These techniques,
and others, are motivated and applied in a tactile graphics creation system called
TACTICS. Viewed in terms of computer vision, the aim of this prototype system is
to process images automatically such that the result can be output as useful,
in this case comprehensible, tactile graphics.
Chapter 3
TACTICS: TACTILE IMAGE CREATION SYSTEM
Converting visual information into tactile information in an automatic, timely
and ultimately comprehensible fashion is the force propelling development of this
prototype system. The lessons learned from the areas of tactual perception, tactile
graphic production and the applicability of image processing techniques to tactile
graphic generation are extended to and applied in the creation of this system. The
details of the system, including the justification for its development, the specific
algorithms used for image simplification, the software and hardware utilized, and the
complete procedure for acquiring, transforming and tactilizing visual information,
are discussed.
3.1 Automatic Generation of Tactile Graphics
The production of tactile graphics, as we have seen, can be a time-consuming
process of careful translation from visual to tactile form necessitating the involvement of a sighted person. Cost and timeliness prevent most blind persons from
having ready access to the abundant high-quality computer images available on the
Internet and elsewhere. With an automatic method for performing such translations, increased access to the wealth of computerized graphical information could
be provided. Such information is, at present, essentially inaccessible, requiring the
intervention of a sighted person to perform conversion from visual to tactile form.
Automatic computerized conversion can be accomplished affordably, using readily
available or easily adaptable technology, combined with the appropriate image processing techniques.
A technique for the automatic generation of tactile graphics involves acquiring
an image, performing some simplifying processing, and displaying the result on a
tactile output medium, such as microcapsule paper or a dynamic, real-time tactile
display.
3.2 Genesis of TACTICS
The TACTile Image Creation System (TACTICS) is an attempt to advance
the state of the art of research in the area of automatic tactile graphic generation.
This prototype system is made up of software and hardware components, making
use of available image processing packages and static tactile graphic production
techniques.
The impetus behind the development of this experimental system was a perceived lack of research being performed in addressing accessibility issues related to
complex image information. The focus of much of the research in computer access
to graphical information for blind persons is restricted to narrow categories of information, such as mathematical formulae, iconic navigation, or better auditory access
to text. Our aim is to provide a general method for providing access to photographic
and other visual information that is in electronic form. It is hoped that this thesis
will serve as a starting point in what is an exciting and heretofore uncharted area
of research, rich with implications and possibilities.
3.3 Image Processing Algorithms
There are a great many algorithms that process images to produce a wide
variety of effects. In this thesis we are concerned with the effect more than with the
specific means. For a thorough understanding of how the classes of algorithms we
have chosen operate on images, and how they relate to our goal of image simplification, we present a brief and somewhat simplified introduction to each of them. For
purposes of this discussion, we assume that an image is grayscale, although these algorithms have forms that work equally well for color images. Since we are concerned
neither with moving images nor geometric transformations, we do not consider frame
or geometric processes; rather, we restrict coverage to a number of point and area
processes. Detailed theoretical treatment of image processing techniques is available
in [72], while an implementation-oriented approach is given in [52].
3.3.1 Notation
For clarity, the notation used within this thesis to describe images and image
processing algorithms is defined here. A grayscale image $X$ of overall width $w$ and
height $h$ can be represented by a two-dimensional array of points, each of which has
a certain value, denoted by $X_{m,n}$, representing the brightness or intensity of that
point (Figure 3.1).
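\[
\begin{bmatrix}
X_{11} & X_{12} & \cdots & X_{1w} \\
X_{21} & X_{22} & \cdots & X_{2w} \\
X_{31} & X_{32} & \cdots & X_{3w} \\
\vdots & \vdots & \ddots & \vdots \\
X_{h1} & X_{h2} & \cdots & X_{hw}
\end{bmatrix}
\]
Figure 3.1: Format of two-dimensional image.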
A color image has a set of three intensity values, one each for the red, green
and blue components of each pixel, associated with each position in the array¹.
Formally, an 8-bit grayscale image is described by:
\[
X = \{\, X_{m,n} : 1 \le m \le w,\; 1 \le n \le h,\; X_{m,n} \in \{0, 1, \ldots, 255\} \,\}
\tag{3.1}
\]
The set of points $N$ in a square region of width $w'$ surrounding a given point
is the neighborhood of that point. For points that are closer than $\frac{w'-1}{2}$ points to an
image boundary, the neighborhood will include only those points falling within the
image. The neighborhood of a point $X_{m,n}$ is denoted by the set:
\[
N_{m,n} = \left\{\, X_{i,j} :\; w' \text{ is odd},\;
\max\!\left(m - \tfrac{w'-1}{2},\, 1\right) \le i \le \min\!\left(m + \tfrac{w'-1}{2},\, w\right),\;
\max\!\left(n - \tfrac{w'-1}{2},\, 1\right) \le j \le \min\!\left(n + \tfrac{w'-1}{2},\, h\right)
\,\right\}
\tag{3.2}
\]
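The neighborhood definition translates directly into code. The sketch below keeps the 1-based indices of the notation, with m running across the width and n down the height, and clips the window at the image boundary exactly as the max and min terms do. Storing the image as a list of rows, so that point (m, n) lives at image[n - 1][m - 1], is an assumption of this example, as is the function name.

```python
def neighborhood(image, m, n, w_prime=3):
    """Return the intensities in the w_prime-wide square around point (m, n).

    m is the 1-based column (1..w) and n the 1-based row (1..h).
    Points of the window that fall outside the image are simply omitted.
    """
    if w_prime % 2 == 0:
        raise ValueError("w_prime must be odd")
    h, w = len(image), len(image[0])
    reach = (w_prime - 1) // 2
    return [
        image[j - 1][i - 1]
        for j in range(max(n - reach, 1), min(n + reach, h) + 1)
        for i in range(max(m - reach, 1), min(m + reach, w) + 1)
    ]


if __name__ == "__main__":
    picture = [[1, 2, 3],
               [4, 5, 6],
               [7, 8, 9]]
    print(neighborhood(picture, 2, 2))  # full 3 x 3 window
    print(neighborhood(picture, 1, 1))  # clipped at the top-left corner
```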
An algorithm $a$ is represented by a mathematical function $F_a$ that transforms
an image $X$ into a processed image $Y$, as follows:
\[
Y = F_a(X)
\tag{3.3}
\]
¹ Although color images are often represented in this RGB format, numerous
other representational schemes exist. Among the most common of these methods are: Cyan, Magenta and Yellow (CMY), Hue, Saturation and Value (HSV),
Hue, Saturation and Lightness (HLS), Hue, Saturation and Intensity (HSI), and
Hue, Chrominance and Intensity (HCI).
3.3.2 Edge Detection
An edge detection algorithm attempts to locate and highlight edges in an
image (Figure 3.2). These edges are simply the portions of an image where there is
a rapid change in intensity. The faster such a transition is made from light to dark,
or vice versa, the more likely an edge detection algorithm is to consider the center of
such a transition as an edge. Each pixel that is found to be part of an edge is set to
the color white, while non-edge pixels can be left alone or assigned the color black
using some thresholding function. A common version of this algorithm is the Sobel
edge detector, which accomplishes edge detection by using the scaled average of one
of a 3 × 3 pixel neighborhood's horizontal or vertical directional derivative, as first
described in [70]. The Sobel edge detection function makes use of two matrices, or
masks, one each for the vertical and horizontal directions:
\[
V = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}
\qquad
H = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}
\tag{3.4}
\]
Figure 3.2: Before and after Sobel edge detection algorithm. (public domain)
These masks are convolved over an image. Generally speaking, convolution
is a linear-only algorithm that involves passing over an input image pixel by pixel,
applying some transformation to each point or to the neighborhood of a point to
generate a new value, and then placing that new value at the same position in an
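output image. In the case of the Sobel edge detection function $F_S$, the two masks $V$
and $H$ are applied as follows for each point $(m, n)$ in image $X$, where $\odot$ denotes element-by-element multiplication of the 3 × 3 neighborhood by the mask:
\[ A_{m,n} = N_{m,n} \odot V \tag{3.5} \]
\[ B_{m,n} = N_{m,n} \odot H \tag{3.6} \]
\[ A'_{m,n} = \sum_{u \in A_{m,n}} u \tag{3.7} \]
\[ B'_{m,n} = \sum_{v \in B_{m,n}} v \tag{3.8} \]
\[ F_S(X_{m,n}) = \sqrt{A'^{\,2}_{m,n} + B'^{\,2}_{m,n}} \tag{3.9} \]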
This is a very computationally expensive operation to perform, particularly
for larger images, due to the necessary 20 multiplications, 19 additions and 1 square-root operation per pixel. There are numerous methods described in the literature
that can speed up this process. In the system implemented for this thesis, the technique used is a combination of shortcuts and a simple comparison and thresholding.
Note that one-third of the elements in each mask are 0s, so a third of the multiplications can be eliminated. Rather than multiply elements by -1 or 2, a unary negative
sign or left-shift-by-one-bit operation is used, respectively. The computational cost
of these two modifications is equivalent to an addition rather than multiplication
step. Since the maximum intensity value of a pixel is 255, squared values above
65025 (255 × 255, which is precomputed one time only) can be merely assigned 255.
Finally, for the remaining computations, the square root is taken as in Equation 3.9.
With these modest modifications, the number of operations performed is reduced
to 2 multiplications (the squaring operations), 17 additions and 1 square root.
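The shortcuts described above can be sketched as follows: the zero entries of each mask are skipped, the weight of 2 is applied with a one-bit left shift, and sums of squares above 65025 are set to 255 without taking a square root. This is an illustrative reconstruction, not the thesis code. Leaving the one-pixel image border at 0 and truncating the square root to an integer are assumptions of this example.

```python
import math

MAX_SQUARED = 255 * 255  # 65025, precomputed one time only


def sobel(image):
    """Sobel edge magnitude of a grayscale image given as a list of rows (0-255)."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]  # border pixels stay 0 (assumption)
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            nw, no, ne = image[y - 1][x - 1], image[y - 1][x], image[y - 1][x + 1]
            we, ea = image[y][x - 1], image[y][x + 1]
            sw, so, se = image[y + 1][x - 1], image[y + 1][x], image[y + 1][x + 1]
            # Mask V: the zero column is skipped, -1 is a subtraction, 2 is a shift.
            a = (ne - nw) + ((ea - we) << 1) + (se - sw)
            # Mask H: the zero row is skipped in the same way.
            b = (nw - sw) + ((no - so) << 1) + (ne - se)
            squared = a * a + b * b  # the only two multiplications
            if squared > MAX_SQUARED:
                out[y][x] = 255      # saturate without taking a square root
            else:
                out[y][x] = int(math.sqrt(squared))
    return out


if __name__ == "__main__":
    step = [[0, 0, 0, 255, 255] for _ in range(3)]  # dark-to-light vertical edge
    for row in sobel(step):
        print(row)
```

Edge pixels come out white (255) and flat regions black (0), matching the convention in the text; a negation step would be needed before output on microcapsule paper.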
3.3.3 Blurring
Often referred to in the literature as low-pass filtering, blurring reduces the
detail in an image by removing the high-frequency component [78]. It accomplishes
this by using the values of all pixels in a neighborhood, assigning some function
of those values to the center pixel. Applying a Gaussian function and applying an averaging
function are two common techniques to accomplish blurring. Averaging is the most
straightforward and fastest technique and, considering the low resolution of the
human fingertip, is sufficient. The blurring function $F_B$ is described as:
\[
F_B(X_{m,n}) = \frac{\sum_{v \in N_{m,n}} v}{|N_{m,n}|}
\tag{3.10}
\]
Applying this function to all pixels in an image produces a blurry version of
the original image (Figure 3.3).
Figure 3.3: Image before and after application of blurring algorithm.
This is also described as the convolution over $X$ by a blurring mask or kernel.
For example, the blurring algorithm used in this research is accomplished with the
following 3 × 3 kernel $B$:
\[
B = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
\tag{3.11}
\]
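A minimal sketch of the averaging blur is given below. Each output pixel is the sum of its neighborhood divided by the number of points in that neighborhood, so pixels near the border are averaged over fewer points. The function name and the use of integer division are assumptions made for illustration.

```python
def blur(image, w_prime=3):
    """Averaging blur: each pixel becomes the mean of its w_prime-wide neighborhood.

    image is a list of rows of 0-255 intensities. Neighborhoods are clipped at
    the image boundary, so the divisor is the actual neighborhood size.
    """
    height, width = len(image), len(image[0])
    reach = (w_prime - 1) // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            values = [
                image[j][i]
                for j in range(max(y - reach, 0), min(y + reach, height - 1) + 1)
                for i in range(max(x - reach, 0), min(x + reach, width - 1) + 1)
            ]
            row.append(sum(values) // len(values))  # integer mean (assumption)
        out.append(row)
    return out


if __name__ == "__main__":
    spot = [[0, 0, 0],
            [0, 90, 0],
            [0, 0, 0]]
    for row in blur(spot):
        print(row)
```

A single bright pixel is spread over its neighbors and reduced in intensity, which is the loss of fine detail that the blurring step is meant to produce.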
3.3.4 Segmentation
Images are generally composed of one or more regions, defined as sections or
segments of an image whose members are closely related by color or intensity. A
common technique for locating segments is called K-means segmentation [51, 76, 89]
(Figure 3.4). In this algorithm, each pixel is assigned to one of some number K of
different groups, based on its own intensity level. This technique divides pixels with
closely related intensities into like groups or clusters, producing an image that is
segmented by intensity. A similar segmentation can be performed based on color.
Algorithmically, the K-means segmentation applied to image X is described as
follows [89]:
Figure 3.4: Image before and after application of K-means segmentation algorithm, with K = 2.
Step 1. Choose K initial cluster centers $z_1(1), z_2(1), \ldots, z_K(1)$. These can
be chosen arbitrarily as, say, the intensity values of the first K pixels in X,
or evenly spaced across the range 0 to 255 as is implemented in the system
described in this thesis.
Step 2. At the kth iterative step, distribute the intensity values $\{X_{m,n}\}$ among
the K cluster domains, using the relation:
\[
X_{m,n} \in S_j(k) \quad \text{if} \quad |X_{m,n} - z_j(k)| < |X_{m,n} - z_i(k)|
\tag{3.12}
\]
$\forall i = 1, 2, \ldots, K,\; i \ne j$, where $S_j(k)$ denotes the set of intensity values whose
cluster center is $z_j(k)$.
Step 3. From the results of Step 2, compute the new cluster centers $z_j(k+1)$,
$j = 1, 2, \ldots, K$, such that the sum of the squared distances from all points
in $S_j(k)$ to the new cluster center is minimized. This is simply the mean of
$S_j(k)$, given by:
\[
z_j(k+1) = \frac{\sum_{X_{m,n} \in S_j(k)} X_{m,n}}{|S_j(k)|}, \quad j = 1, 2, \ldots, K
\tag{3.13}
\]
The name "K-means" derives from the way each of the K cluster centers is iteratively
updated with the average value of its cluster.
Step 4. If $z_j(k+1) = z_j(k)$ for $j = 1, 2, \ldots, K$, the algorithm has converged
and can be terminated. Otherwise, go back to Step 2 and continue.
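The four steps above can be sketched compactly. Several details below are assumptions of this example rather than statements about the thesis implementation: the initial centers are spaced evenly from 0 to 255 inclusive, ties go to the lower-numbered cluster, an empty cluster keeps its previous center, and each output pixel is painted with the rounded mean of its cluster.

```python
def kmeans_segment(image, k=2, max_iter=100):
    """Segment a grayscale image (list of rows, 0-255) into k intensity clusters.

    Returns (segmented_image, centers).
    """
    pixels = [p for row in image for p in row]

    def nearest(value, centers):
        return min(range(k), key=lambda j: abs(value - centers[j]))

    # Step 1: initial centers evenly spaced across 0-255 (spacing is an assumption).
    centers = [255.0 * j / (k - 1) for j in range(k)] if k > 1 else [127.5]
    for _ in range(max_iter):
        # Step 2: assign every intensity to the cluster with the closest center.
        clusters = [[] for _ in range(k)]
        for p in pixels:
            clusters[nearest(p, centers)].append(p)
        # Step 3: each new center is the mean of its cluster.
        updated = [
            sum(members) / len(members) if members else centers[j]
            for j, members in enumerate(clusters)
        ]
        # Step 4: stop once no center moves.
        if updated == centers:
            break
        centers = updated
    segmented = [[int(round(centers[nearest(p, centers)])) for p in row] for row in image]
    return segmented, centers


if __name__ == "__main__":
    picture = [[10, 20, 30],
               [200, 210, 220]]
    result, found = kmeans_segment(picture, k=2)
    print(result)
    print(found)
```

With K = 2 the output is a two-level image, which can then be thresholded to pure black and white for tactile output.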
The fundamental drawback of this general statistical analysis of, or histogram-based approach to, image segmentation is the inherent disregard for spatial coherence [67]. Adaptive segmentation attempts to take into account a smaller portion
of an image, producing a segmentation based only on that portion. The effect of
this process can be to retain more of the original image information, producing a
segmentation which more closely resembles the original (Figure 3.5). This result
often is achieved at some computational expense and many times produces a result
only marginally better than a straightforward segmentation algorithm for purposes
of image simplification and automatic tactile graphics generation.
As implemented for this thesis, the adaptive version of the algorithm performs
the same steps as the K-means segmentation algorithm, with the difference being
that it operates to convergence on each pixel in X before moving to the next pixel.
Thus, the K-means algorithm is performed on some subset or window of, and in
complete isolation from, the image as a whole. Inspiration for this implementation
is drawn from portions of an adaptive segmentation algorithm that uses a Gibbs
random field model and a hierarchical approach described in [67].
Figure 3.5: Image before and after application of an adaptive K-means segmentation.
3.3.5 Negation
The negation of an image is produced by inverting the intensity of each pixel
in the image (Figure 3.6): each pixel is visited in turn and reassigned its
inverted value. Negation is described by this simple function:
\[
F_N(X_{m,n}) = 255 - X_{m,n} \tag{3.14}
\]
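Equation 3.14 translates directly into code. The following Python sketch assumes an 8-bit grayscale image stored as a list of rows; the function name `negate` and the `max_intensity` parameter are illustrative and are not taken from the XV extension.

```python
def negate(image, max_intensity=255):
    """Return the negative of a grayscale image given as a list of rows."""
    # Eq. 3.14: each pixel is replaced by the maximum intensity minus its value.
    return [[max_intensity - x for x in row] for row in image]


negative = negate([[0, 255], [100, 30]])
```

Applying `negate` twice returns the original image, which is a convenient sanity check on any implementation.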
Every home photographer is familiar with the negatives that are returned
with developed film. The negation of a computerized image is just such a negative
image. Negation often is applied in conjunction with another algorithm. In the
case of a strictly black and white or binary image with more black than white,
subtracting the intensity of each pixel from the maximum reverses the field and, it
is hoped, makes foreground features such as edges black. This negation improves
the legibility of a tactile image, specifically when it is output on microcapsule paper,
since the black portion of the image raises while the white portion remains flat.
Figure 3.6: Image before and after application of negation algorithm.
3.3.6 Median Filtering
Median filtering is a method for removal of noise from an image [72]. Generally, noise in an image is described as an individual pixel of greatly differing intensity,
or outlier, compared to the typical pixel in a neighborhood. Differentiating noise
from minute detail, or filtering out noise while leaving the desired image intact, is
not always so straightforward [4], particularly when an image is complex. Performing edge detection on an image, as is often done in our TACTICS processing,
tends to accentuate these outliers, whether noise or detail.
The median filtering algorithm sorts the intensity values of pixels in a neighborhood, assigning the median value of the neighborhood to the center pixel. This is
repeated for all pixels in the image, with the effect being a reduction in the number
of outliers while preserving edges and non-noisy portions of the image (Figure 3.7).
An especially fast version of the median filtering algorithm can be found in [38].
The function $F_M$ for median filtering is described as:
\[
F_M(X_{m,n}) = \mathrm{Median}(N_{m,n}) \tag{3.15}
\]
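A straightforward (not the fast) version of Equation 3.15 can be sketched as follows. The square neighborhood of radius 1, the clipping of the neighborhood at the image border, and the use of the upper median for even-sized border neighborhoods are assumptions made for this example; the function name `median_filter` is likewise illustrative.

```python
def median_filter(image, radius=1):
    """Replace every pixel by the median of its (2*radius+1)^2 neighborhood."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for m in range(rows):
        for n in range(cols):
            # Gather the neighborhood N(m, n), clipped at the image border.
            neighborhood = sorted(
                image[i][j]
                for i in range(max(0, m - radius), min(rows, m + radius + 1))
                for j in range(max(0, n - radius), min(cols, n + radius + 1))
            )
            # Eq. 3.15: the middle element of the sorted values. For an
            # even-sized border neighborhood this takes the upper median,
            # so the output always contains values present in the input.
            out[m][n] = neighborhood[len(neighborhood) // 2]
    return out


noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
cleaned = median_filter(noisy)
```

In this example the single bright outlier in the center is replaced by the value of its neighbors, while the uniform surroundings are left unchanged.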
Figure 3.7: A noisy processed image before and after the application of median
filtering.
3.4 Image Processing Tools
The software for our prototype system for automatic generation of tactile
images is implemented in the C programming language as an extension to the X Windows image processing application XV, developed at the University of Pennsylvania [13]. As of publication, the complete source code for this package is readily
available via anonymous ftp at ftp.cis.upenn.edu in the directory pub/xv. The
license fee is quite reasonable for this user-friendly software, and it was found to be
easily extended to include additional image processing algorithms. The extended
version is available via ftp at ftp.asel.udel.edu in pub/sem/xv-mod.tar.Z. Instructions on how to add additional algorithms to the XV package are in the file
xvalg.c.
Some preliminary experimentation with various image processing algorithms
was performed using MATLAB's Image Processing Toolbox [87]. The flexibility
of MATLAB, combined with its wide acceptance and availability, made this an
attractive and practical development platform.
3.5 Tactile Imaging
3.5.1 Description
Tactile imaging is the conversion of a visual image into a form that is perceivable using the sense of touch. This conversion can be accomplished using a variety
of techniques. TACTICS performs this conversion automatically by applying image
processing algorithms to a complex image, such as photographic and other visual
information from the areas of science, engineering, mathematics, medicine, art and
others. This conversion is done entirely in software, developed as an extension to
XV. This prototype system involves a number of experimenter-selected sequences
of algorithms applied in a controlled fashion, although this process easily could be
implemented to run entirely unsupervised, and in fact some trials were conducted
to verify this conclusion.
3.5.2 Development
Development of the software package involved acquiring the source code for
XV, conducting a search of image processing literature to determine the techniques
best suited for the purposes of image transformation, and implementing a number
of these algorithms as extensions to XV. The algorithms that were chosen represent
some of the most widely used or standard techniques, although some attention was
paid initially to more sophisticated methods. It was found that elegant and sophisticated algorithms, while of considerable benefit in the visual domain, produced little
if any benefit when used to produce tactile images. This absence of benefit is due
primarily to the lower resolution of the fingertip, which cannot take advantage of
details finer than its physiologically imposed limits. Generally, simpler was found
to be better in the course of this research.
3.5.3 Sequencing of Algorithms
When more than one image processing algorithm is applied to an image, the
sequence of application can greatly affect the outcome. For example, applying edge
detection to an original image followed by segmentation produces a relatively simple
and smooth outline of the original, while applying segmentation followed by edge
detection produces a more complex and jagged outline of the original. Another
example of the effect of sequenced algorithms is in the use of a blurring algorithm.
By blurring an original image before applying edge detection, the resulting edges
are thicker and falsely identified edges occur less often. These examples
illustrate the importance of considering the interactions among image processing
algorithms when attempting to convert an original image into a simplified version
suitable for tactile exploration. In the next chapter a number of pertinent algorithm
sequences will be discussed and their use in TACTICS will be motivated.
3.6 Tactile Output
3.6.1 Microcapsule Paper
Microcapsule paper was chosen as an output medium due to its wide availability, relatively low cost, and ability to render tactile graphics quickly. We compared
the two known brands of paper on the market, Repro-Tronics Flexi-Paper and a
paper imported by the Matsumoto Kosan Company. A comparison of the manufacturers' specifications for the two types of paper reveals that there is very little
difference in the vital qualities of resolution, response time, cost and displacement.
One laboratory observation, as measured using a mill-meter, is that the displacement achievable with the Matsumoto Kosan paper tends to be more consistent in
practice than the Repro-Tronics paper. Measurements reveal this to be the case,
but also show that typical displacement is approximately 1mm for both varieties of
paper.
The significant difference between the two appears to be the durable nature
of the Flexi-Paper, which is highly resistant to folding and crumpling. The stiffer
Matsumoto paper is more familiar in feel to the blind community, being similar to
the heavy paper used by embossing braille printers, but is prone to cracking and
creasing under adverse conditions. For purposes of our experiments, we used both
types of paper and discovered that subjects often preferred the slightly stiffer feel
of the Matsumoto paper versus the spongier feel of the Flexi-Paper.
3.6.2 Tactile Image Enhancer
To develop, or puff up, the microcapsule paper we used a Repro-Tronics
Tactile Image Enhancer (Figure 3.8). The device has a motor-driven roller which
passes the paper face up underneath a tubular light bulb. The heat from the lamp is
absorbed by the dark regions of printing on the paper, causing the polystyrene microcapsules in those areas to expand but leaving the unprinted regions flat. The time
taken to develop a single sheet of either type of microcapsule paper is approximately
ten seconds.
Figure 3.8: Tactile Image Enhancer. (Repro-Tronics)
3.6.3 Additional Equipment
Original and processed computerized images were first printed out on a commercial 600 dpi office laser printer. Next, they were copied onto microcapsule paper
using a typical office photostatic copier machine. Other than these devices, the Tactile Image Enhancer, and the computer itself, the only additional material needed
in the prototype system was a large supply of both varieties of microcapsule paper.
3.7 Experimental Procedure for Tactile Image Creation
The procedure for producing a tactile image from a visual one is straightforward. The involvement of a sighted person is necessary in the current stage of
our research system. Future versions of TACTICS could be made to operate in an
unsupervised manner, eliminating the need for a sighted person to be involved. The
procedure involves three phases: Acquisition, Simplification, and Tactilization.
3.7.1 Acquisition of Images
Images were acquired in a fairly random manner from standard image processing benchmark collections, scientific data acquisition, and from a wide array of
sources available on the World Wide Web. Every attempt was made to select a
representative sampling of the available images (see Appendix A). We also looked
for candidates from similar classes of images, for example faces, or more generally
rounded images, which could prove difficult to distinguish from one another once
simplified, as a way to test how ambiguity is dealt with by our prototype system
when used by experimental subjects.
3.7.2 Simplification
The preparation of simplified images was achieved using a number of differing,
aggregate, image processing sequences. An image was rst loaded into XV. Then,
the applicable sequence of image processing algorithms was applied. Finally, this
processed image was printed on a laser printer in preparation for expansion in the
subsequent phase.
3.7.3 Tactilization
The printed version of the processed image was photocopied onto one of the
two types of microcapsule paper. The microcapsule paper was then fed through
the Tactile Image Enhancer, creating the raised tactile image. This procedure was
repeated for all images using the variety of image processing algorithm sequences as
specified in the experiment protocol.
Chapter 4
EVALUATION OF TACTICS
The primary goal of the procedures used by TACTICS to convert visual information automatically into tactile information is to provide meaningful access to
previously inaccessible content. A series of experiments was conducted to evaluate
the effects of this prototype system upon a subject's ability to (1) discriminate,
(2) identify and (3) comprehend tactile representations of visual information. A
general accounting of subject selection and experimental material production, including the use of various image processing techniques, is provided. The selection
of the specific aggregate image processes for use in these experiments is discussed
and justification is given linking these processes with theories of psychophysics.
For each experiment conducted as part of this evaluation, descriptions of the
subjects, materials used and experimental procedures are provided. The results of
each experiment, including data comparing results based on types of microcapsule
paper and the level of vision of subjects, are reported and analyzed.
4.1 Overview of Experimental Protocol
The protocol used in these experiments was designed to evaluate the effect
of TACTICS upon the accessibility of visual information in a tactile form. Every
attempt was made to acquire a diverse sample of subjects and images and to assure
that experimental materials were produced in an automatic and uniform fashion
free from the aesthetic biases of a sighted person.
4.1.1 Selection of Subjects
Blind, low-vision and sighted subjects were used in the following experiments.
As previously noted, the tactile acuity of blind and sighted persons, whether male or
female, is essentially identical [56], although blind persons tend to have more experience making active use of the sense of touch [93], while sighted subjects generally
have a more highly developed visual memory [77]. Any difference in the performance
of blind and sighted subjects is noted and discussed.
4.1.2 Production of Materials
As mentioned earlier, images were gathered electronically from a variety of
sources and were prepared rst by grayscaling any that were color images to achieve
uniformity. This homogeneity was necessary because microcapsule paper expands
only in response to the color black. Depending on the experiment, one or more
image processing algorithms were then applied in a specic order to each image.
Once the images were processed, they were printed out on a standard office
laser printer, photocopied onto sheets of microcapsule paper, and expanded using
the Tactile Image Enhancer. Both types of microcapsule paper (see page 25) were
used in the production of experimental materials.
4.1.3 Aggregate Image Processes
The five experiments conducted made use of image simplification techniques
selected from a collection of seven aggregate image processes, defined here as:
1. No Processing: A tactile image is produced directly from the original grayscaled version of the image (Figure 4.1). Experimentally, these images serve
as a benchmark against which the effectiveness of further processing can be
measured. The unprocessed image represents the visual information in its raw
form, the state in which it is currently available without the intervention of a
sighted person.
2. Edge Detection (with thresholding): Emphasizing the edge information
in an image might be all the simplification that is needed. Much of the theory
previously discussed indicates that converting an image into a simpler sketch
or line-drawing representation should enhance recognition. The Sobel edge detection operator is used here (Figure 4.2), as it is widely used and considered
to be effective for general purpose edge detection, although any one of a number of edge detectors could quite easily be substituted. Note that thresholding
is performed on the image, with edge points being set to one intensity value
while non-edge points are set to a second value. In this way, a binary edge-only
version of the original is produced. Depending on the implementation of the
thresholding conducted in association with a given edge detection algorithm,
it may be necessary to apply negation to the result (see Section 3.3.5).
3. Edge Detection (without thresholding): By eliminating the thresholding
inherent in the standard Sobel edge detection algorithm, and instead merely
replacing each point in an image with the raw output of the Sobel operator
for that point, a slightly different result is produced (Figure 4.3). Note that
Figure 4.1: Original unprocessed grayscale image of the chimney end of a house.
(public domain)
Figure 4.2: Image of house before and after processing using Sobel edge operator
with thresholding.
Figure 4.3: Image of house before and after processing using Sobel edge operator
without thresholding.
the edges are still highlighted but there is quite a bit of background noise
remaining in the image. In this form the image is not particularly useful
for purposes of tactile graphics because most of it would still expand when
developed on microcapsule paper; but when coupled with a subsequent K-means segmentation, an adaptive thresholding technique, the resulting image
has a more complete edge detection than the edge detector that uses fixed
thresholding (see aggregate process 5 below). The reason for this is that
strict thresholding tends to disregard some
of the less defined edge information, while in this case that information is
left behind potentially to be recognized by the more sophisticated adaptive
thresholding performed by the K-means segmentation algorithm.
4. Segmentation: Performing a segmentation divides an image into regions. In
this application, we perform a binary segmentation via adaptive thresholding
using the K-means segmentation algorithm, which produces regions of white
and black only (Figure 4.4). This representation is modeled on the way it
is believed that the mind classifies and stores image information, namely in
some hierarchical fashion, from general characteristics to specific [9, 90]. In
the case of segmentation, general characteristics are emphasized. Note that in
some instances negation (see Section 3.3.5) was applied following an application of
segmentation to emphasize content rather than background.
5. Edge Detection (without thresholding) and Segmentation: As mentioned
above, performing a segmentation on a previously unthresholded, edge detected image serves further to enhance edge information (Figure 4.5) that
might normally be ignored by a standard edge detector that uses fixed thresholding. By using this aggregate process, the dubious result of applying edge
detection without thresholding is actually advantageous in that a more completely edge detected image is produced (Figure 4.6).
Figure 4.4: Image of house before and after processing using K-means adaptive
segmentation algorithm.
Figure 4.5: Image of house before and after processing using Sobel edge operator
without thresholding followed by K-means segmentation.
Figure 4.6: Comparison of effect of Sobel edge detection using fixed thresholding
from Figure 4.2 (left) with Sobel edge detection utilizing adaptive K-means segmentation (for thresholding) from Figure 4.5 (right).
6. Segmentation and Edge Detection: A comparison with the reverse procedure, namely application of segmentation rst followed by edge detection
with thresholding, is revealing (Figure 4.7). Note that, while the simplified
image still resembles the original, the representation is more discontinuous and
noisy, the result of extracting segmented region edges. Since thresholding is
performed by the initial K-means segmentation, the two varieties of the Sobel edge detector described here produce identical results. For more complex
images, such as faces, the results are even more noticeable (Figure 4.8).
7. Blurring, Edge Detection, Segmentation and Median Filtering: This
aggregate process takes into account as much of the previously discussed cognitive and perceptual theory as possible to produce a result that, at least
visually, appears to be quite simple (Figure 4.9) while still clearly resembling
the original. The blurring step represents the lower bandwidth capabilities
of the fingertip as compared with the eye. The result of this blurring has a
potentially beneficial side-effect, thicker edges, which appears during the subsequent edge detection. Without the initial blurring step, the resulting lines
in the final representation tend to be thinner and sometimes less continuous
(Figure 4.10).
When the edge detector is applied without thresholding to the blurred image,
edges appear thicker due to the slight spreading or softening of rapid intensity
changes in the original. The segmentation step, as before, cleans up the result
of the edge detector. The final median filtering step removes any stray noise
that was not removed by the segmentation. In fact, there is a proportion of
noise that is enhanced rather than removed by the adaptive thresholding of
the segmentation step. Median filtering counteracts much of that effect.
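As an illustration of how the aggregate processes above chain together, the following Python sketch implements the fifth process: the raw Sobel gradient magnitude followed by a two-cluster K-means used as an adaptive threshold. It is a simplified stand-in for the XV extension, not a copy of it. The function names, the zero-valued one-pixel border, the initialization of the two centers at the minimum and maximum magnitudes, and the choice to render edges as black (0) so that they would raise on microcapsule paper are all assumptions.

```python
import math


def sobel_magnitude(image):
    """Raw (unthresholded) Sobel gradient magnitude; the border is left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for m in range(1, rows - 1):
        for n in range(1, cols - 1):
            gx = (image[m - 1][n + 1] + 2 * image[m][n + 1] + image[m + 1][n + 1]
                  - image[m - 1][n - 1] - 2 * image[m][n - 1] - image[m + 1][n - 1])
            gy = (image[m + 1][n - 1] + 2 * image[m + 1][n] + image[m + 1][n + 1]
                  - image[m - 1][n - 1] - 2 * image[m - 1][n] - image[m - 1][n + 1])
            out[m][n] = math.hypot(gx, gy)
    return out


def two_means_threshold(values, max_iter=100):
    """Two-cluster K-means on scalar values; returns the midpoint of the centers."""
    c0, c1 = float(min(values)), float(max(values))
    for _ in range(max_iter):
        low = [v for v in values if abs(v - c0) <= abs(v - c1)]
        high = [v for v in values if abs(v - c0) > abs(v - c1)]
        n0 = sum(low) / len(low)                    # low always holds the minimum
        n1 = sum(high) / len(high) if high else c1  # keep c1 if the cluster is empty
        if (n0, n1) == (c0, c1):
            break
        c0, c1 = n0, n1
    return (c0 + c1) / 2


def edges_then_segment(image):
    """Aggregate process 5: Sobel without thresholding, then binary K-means."""
    magnitude = sobel_magnitude(image)
    threshold = two_means_threshold([v for row in magnitude for v in row])
    # Strong-gradient pixels become black (0), everything else white (255).
    return [[0 if v > threshold else 255 for v in row] for row in magnitude]


step_edge = [[0, 0, 255, 255, 255] for _ in range(5)]
outline = edges_then_segment(step_edge)
```

For the vertical step edge used here, the interior rows of `outline` contain a two-pixel-wide black line along the intensity change, with white elsewhere. Swapping the order of the two stages would give the sixth aggregate process instead.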
Figure 4.7: Image of house before and after processing using K-means segmentation followed by Sobel edge detection.
Figure 4.8: Images of a face demonstrating the difference between two sequences
of processing. From left to right: Original image, image after Sobel
edge detection without thresholding followed by K-means segmentation, and image after K-means segmentation followed by Sobel edge
detection. (US Govt)
Figure 4.9: Image of house before and after processing using the aggregate sequence of processes: blurring, Sobel edge detection without thresholding, K-means segmentation and median filtering.
Figure 4.10: Comparison of image of house using the aggregate process from Figure 4.9 (left) and the same aggregate sequence of processes with the
exception of the initial blurring step (right).
4.1.4 Psychophysics and Experimental Procedure Justification
To evaluate the effectiveness of this processing for automatic generation of
tactile images from visual images, five sets of experiments were performed. These
experiments were designed to measure performance on a basic psychophysical level.
The field of psychophysics, the study of physical and psychological aspects of perception and their interrelationships, identifies four basic perceptual tasks: (1) detection,
(2) discrimination, (3) identification, and (4) comprehension [18]. As with all the
senses, these four attributes apply to tactual perception, which is a major concern
regarding the methods put forth in this thesis.
4.1.4.1 Detection
Measuring detection using the sense of touch involves designing a task that
addresses the question, "Is there anything there?" As previously discussed, many
limits of the physical detection abilities of the fingertip are known. Since the properties of microcapsule paper produce tactile graphics that are well within the range
of such touch perception, any experiment designed here would be trivial. It
is therefore safe to assume that TACTICS produces tactile images that are
detectable. Accordingly, no experiments were performed to measure detection of tactile
images, as all experiments relied on the implicit ability of subjects to detect the
raised tactile images.
4.1.4.2 Discrimination
The ability to discriminate is an important perceptual task for any of the
senses. Discrimination answers the question, "Is this stimulus different from that
one?" For the sense of touch, discrimination tells us simply whether two tactile
objects are the same or different. The experiments to measure the effectiveness of
TACTICS to aid in discrimination involved a task similar to the traditional matching
game of Concentration. In the study, subjects felt one of a closed set of similarly
processed tactile images and then attempted to locate the identical tactile image
from among a randomly arranged duplicate set. In further experiments, subjects
felt a series of arbitrarily paired tactile images for a period of time and then reported
whether or not each of the pairs felt similar or dissimilar.
4.1.4.3 Identification
Being able to identify what something is by its perceived characteristics is
another basic perceptual task. Identification as it applies to tactile images involves
the effectiveness of a representational technique in allowing a person to answer, "What
is it?" The cognitive load imposed by identification is higher than that for detection
or discrimination, so the experiment to measure it is also more involved. In the
experiment to assess this factor, subjects felt a series of tactile images, and for each
image were given four categories and asked to identify into which category each
stimulus belonged.
4.1.4.4 Comprehension
Comprehension means that questions regarding the content of an image
should be answerable. Comprehension is generally accepted as a key consideration
in the effectiveness of any perceptual event and therefore is an important factor to
explore in the design of an interface to a GUI environment for blind computer users.
This experiment measured how well a selected TACTICS aggregate image process
affected comprehension of tactile images. Subjects were provided with a brief description of each image and then were asked a number of questions regarding the
content of each image.
4.2 Experiments
Five experiments were conducted to evaluate the effectiveness of the system.
The first was a pilot study, which measured simple discrimination of tactile images
and was aimed at determining whether or not further exploration of these techniques
was worth pursuing. More rigorous tests were then performed to examine simple
and timed discrimination and tactile image identication and comprehension. Note
that approval to conduct these experiments was obtained from the University of
Delaware Human Subjects Review Board (see Appendix F). All experiments were
conducted by this author.
4.2.1 Pilot Study
A pilot study was conducted to determine whether the use of image simplification for purposes of automatic generation of tactile graphics was a worthwhile
technique to explore further [94]. A set of eight digital images was collected. This
set purposely included some ambiguity of overall shape. The set comprised
rounded images (three faces and a hot air balloon) and square-shaped images (the
chimney of a house, a notebook computer, a space-shuttle launch and a diagram of a
human heart).
A matching task, described in more detail below, was performed to measure each subject's ability to discriminate among the tactile images. Results were
recorded for each of the subjects regarding successful versus unsuccessful matches
for each of the images and processes applied. The results for the various processes
were compared, and some interesting anecdotal evidence was noted.
4.2.1.1 Subjects
A group of four sighted subjects, two male and two female, all in the 20- to 40-year-old range, was used in the study. The subjects all participated voluntarily and
were co-workers of the author at the Applied Science and Engineering Laboratories,
a joint research facility of the University of Delaware and the A.I. duPont Institute.
For the experiment, subjects were blindfolded and given no information regarding
the content of the tactile images.
4.2.1.2 Materials
Each image was processed in five ways: (1) using grayscaling alone (for uniformity), (2) K-means adaptive segmentation, (3) Sobel edge detection, (4) K-means
segmentation followed by Sobel edge detection, and (5) Sobel edge detection followed
by K-means segmentation. For each combination of processing, the eight resulting
images were printed out in the same size (2.5 in x 2.5 in), and arranged in an arbitrary
order on a single blank sheet of paper.
A second sheet was prepared using the same processed images arranged in
a different random order. Subsequently, these sheets were photocopied onto microcapsule paper (Repro-Tronics) and raised using the enhancer device. Thus, the
resulting experimental materials consisted of pairs of sheets of raised images, one
pair for each of the five processing sequences (see Appendix B). Each sheet contained all eight images, and each member of a pair had the eight images arranged
in a different order from its mate.
4.2.1.3 Procedure
Subjects were asked to perform a basic matching task using each pair of sheets
of processed tactile images. For each of the five types of aggregate processes the
appropriate pair of sheets was placed on a table in front of the seated and blindfolded
subject. First, the subject's hand was placed onto one processed image on one sheet,
and the subject was allowed to explore the image freely. Then, the subject's hand
was guided to an arbitrary location on the second sheet and the subject attempted
to locate the identical object on this second sheet. This task was repeated for each
of the eight images on each of the five pairs of identically processed sheets.
4.2.1.4 Results
Table 4.1 contains the results of the pilot experiment. The table columns
indicate the type and order of image processing used, the average number of matches
out of eight per subject, the average percentage of matches per subject and analysis
of variance for each of the algorithm combinations used. Note that analysis of
variance was used to gauge interaction between the group of unprocessed Grayscale
versions of images and each of the groups of images processed in other ways. Analysis
of variance was also used to explore interaction of the results of various processing
versus the results expected by chance (12.5%, or 1/8).
Table 4.1: Summary of per subject average results of the tactile image matching
task for five image processes [94].

Image process          Mean Matches   Mean Pct. Matched   p (vs. Grayscale)   p (vs. Chance)
Grayscale              2.25/8         28%                 1.00e+00            2.50e-03
and K-means            6.25/8         78%                 2.85e-05            7.60e-07
and Sobel              4.75/8         59%                 4.01e-04            5.53e-06
and K-means & Sobel    5.75/8         72%                 3.60e-03            2.29e-04
and Sobel & K-means    7.75/8         97%                 4.47e-06            1.71e-07
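The percentage column of Table 4.1 and the chance level of 12.5% follow directly from the mean match counts, as the short Python check below shows. The dictionary keys are descriptive labels chosen for this example; the match counts are those reported in the table.

```python
# Mean number of correct matches out of eight, as reported in Table 4.1.
mean_matches = {
    "Grayscale": 2.25,
    "K-means": 6.25,
    "Sobel": 4.75,
    "K-means then Sobel": 5.75,
    "Sobel then K-means": 7.75,
}
n_images = 8

# Picking one of eight images at random succeeds with probability 1/8.
chance_pct = 100 / n_images

# Percentage matched, rounded to the nearest whole percent as in the table.
pct_matched = {name: round(100 * m / n_images) for name, m in mean_matches.items()}
```

Every process, including Grayscale alone at 28%, lies above the 12.5% chance level; the analysis of variance in the table addresses whether those differences are statistically meaningful.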
4.2.1.5 Discussion of pilot study
The results of the pilot study indicate that even a modest amount of simplification yields a marked improvement in tactile image discrimination. Comparison of
the means shows that all image processing techniques used increased the subjects'
chances for correctly locating matching tactile images. Images that were simpler at
the outset were recognized more easily in all cases. In particular, the illustration
of the human heart chambers and a photograph of an opened notebook computer
tended to be distinguishable even with no processing, probably due to a white background and simple initial representation. There was often confusion among the three
images of human faces and a hot air balloon, each of which had an essentially oval
shape. We observed a general tendency among subjects for discrimination ability to
increase as tactile images became simpler.
Analysis of variance indicates statistically significant interaction between each
of the forms of processing used when compared with the unprocessed Grayscale originals. The processing that utilized edge detection followed by segmentation showed
the greatest interaction in addition to the best mean performance for the matching task. The other forms of processing also showed strong interaction, indicating
simplification of various forms had a noticeable effect.
Compared with results expected by chance, there are strong indications of
interaction for all forms of processing with the exception of Grayscale alone. Thus,
it is fair to say that it is quite probable that the improvement in subject performance
is not merely the result of random chance.
Some interesting anecdotal evidence was gathered. A number of subjects
reported, upon feeling the processed images, that they thought there was more than
one face among the images, though none had any idea ahead of time as to the
content of the images. This content identification was not reported upon feeling the
original unprocessed images.
After an initial period of trying various exploratory techniques, each of the
subjects independently arrived at the same method for exploring the images. The
tendency was to use the outside edges of an image for gross classification and comparison. Once this gross comparison was made, the details of the interior of an image
were explored, seemingly to differentiate among those with similar overall shapes.
The crucial result of this pilot study was that simplification techniques, applied automatically to electronic images using computerized image processing, improved discrimination for tactile images. When combined with the anecdotal accounts of the content identification that was apparently facilitated by image simplification, this result strongly indicated that this method was valid and deserved
further investigation.
Because of the strength of these preliminary findings, improvements were
made to the prototype system and four additional experiments were designed and
conducted. These four experiments tested simple discrimination, timed discrimination, identication, and comprehension.
4.2.2 Simple Discrimination Experiment
Two forms of tactile discrimination experiments were conducted. In this
first experiment, subjects were allowed to explore freely the initial and secondary
tactile images for a total of one minute per pair. This provided the subjects with
enough time to glean some information about both the general shape of the image
and some of the more prominent internal details. As described below, a matching task was conducted to measure the effectiveness of the four image processing
techniques under consideration when applied strictly for purposes of discrimination.
This task is roughly analogous to that of a sighted person leisurely browsing through
photographs in a magazine or on the Internet, for instance.
In addition to measuring the effects of processing on discrimination, a study
was conducted to determine the effect on discrimination of one form of microcapsule
paper versus the other. Results from this comparison of microcapsule papers will be
extrapolated to other more complex tactual perception tasks of identification and
comprehension.
4.2.2.1 Subjects
Ten subjects ranging in age from 22 to 60 were used in this experiment. The
subjects participated voluntarily and came from a variety of backgrounds, including
college students, homemakers, computer programmers, and a retired chemist. All
subjects were educated to at least the four-year college level. Seven subjects were
male, three were female. Three subjects were blind, seven were sighted. The two
male blind subjects were adventitiously blind, one at age 19, the other at age 39. The
one female blind subject was congenitally blind. Additionally, one male subject was
classified as low vision. Subjects had little to no experience with tactile images and
microcapsule paper, although one blind male subject used similar tactile materials
as study aids for a college course.
4.2.2.2 Materials
Materials were produced on both types of microcapsule paper using identical
tactile images for each set. Each set consisted of 40 sheets, each with a pair of
raised tactile images per sheet, one on either side of a raised line that divided each
sheet in half (see Appendix B). Each tactile image was limited to four inches in
width, which is within the width of one hand span. The height of each image
followed proportionally from the scaling of the width and also stayed well within
the height of one hand span. Samples were drawn from the set of original images
(see Appendix A) to prepare the testing materials, which were comprised of image
pairs. Half of the pairs consisted of identical images, and half were not identical.
Each pair was prepared using each one of the four processes under consideration,
and the same processing was applied to both images on a sheet. The four processes
used were (1) no processing, (2) K-means adaptive segmentation, (3) Sobel edge
detection with thresholding, and (4) an aggregate process of blurring, Sobel edge
detection without thresholding, K-means segmentation and median filtering.
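As an illustration of process (3), the following is a minimal sketch of Sobel edge detection followed by thresholding. It is not the TACTICS implementation: the function name `sobel_edges`, the border handling, and the threshold value are assumptions chosen for the example, and the thesis text here does not state the threshold that was actually used.

```python
import numpy as np

def sobel_edges(image, threshold):
    """Return a binary edge map: 1 where the Sobel gradient magnitude exceeds threshold."""
    img = np.asarray(image, dtype=float)
    # Standard 3x3 Sobel kernels for the horizontal and vertical gradients.
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    # Replicate border pixels so the output keeps the input shape (an assumption).
    padded = np.pad(img, 1, mode="edge")
    rows, cols = img.shape
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    # Accumulate the kernel response one offset at a time (correlation form;
    # the sign convention does not matter once the magnitude is taken).
    for dr in range(3):
        for dc in range(3):
            window = padded[dr:dr + rows, dc:dc + cols]
            gx += kx[dr, dc] * window
            gy += ky[dr, dc] * window
    magnitude = np.hypot(gx, gy)
    return (magnitude > threshold).astype(np.uint8)

# A dark square on a light background; the threshold of 100 is a hypothetical value.
demo = np.full((8, 8), 255.0)
demo[2:6, 2:6] = 0.0
edges = sobel_edges(demo, threshold=100.0)
```

In this toy input, flat regions (both the background and the interior of the square) produce no response, so only the outline of the square survives, which is the kind of simplification the process is meant to provide for raised-line output.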
4.2.2.3 Procedure
Subjects were asked to perform a discrimination task using one complete set
of 40 tactile-image pairs. Subjects were seated at a table, blindfolded if sighted,
and presented with each of the 40 sheets from a given set in an arbitrary sequence.
For each sheet, subjects freely explored the pair of tactile images on the sheet for
a period of time totaling one minute and were then asked to report whether the
images felt the same or different. Subjects also could reply that they could not say
one way or the other, although this reply was rarely used. During this procedure,
responses were recorded, as were any unsolicited comments made by the subject in
reaction to the materials or procedure. Subjects were given neutral feedback after
each matching task along the lines of, "Good. Now here is the next one."
Overall, each complete set of 40 tactile image sheets was used with five (one-half) of the subjects, so that some comparison could be made of the two forms of
microcapsule paper under identical experimental conditions. The same set of 40
sheets of testing materials that was used for a subject in this simple discrimination
experiment was randomly reordered and used in the following timed discrimination
experiment for the same subject. Note that half of the subjects completed the simple
and timed discrimination tasks using the Repro-Tronics paper, and the other half
the Matsumoto Kosan paper. With the exception of the type of paper on which the
materials were prepared, the two complete sets of testing materials were identical
in every respect.
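The structure of one set of testing materials, and its reordering between the simple and timed tasks, can be sketched as follows. This is an illustrative reading of the description above, assuming 10 image pairs (5 identical, 5 not) each prepared with all four processes to give 40 sheets. The names `build_sheet_set` and `presentation_order` and the seed values are hypothetical; the thesis does not describe how the random reordering was actually carried out.

```python
import random

# Labels for the four processes under consideration.
PROCESSES = ["none", "k-means", "sobel", "aggregate"]

def build_sheet_set(num_pairs=10):
    """Build one sheet per (pair, process); the first half of the pairs are identical."""
    sheets = []
    for pair_id in range(num_pairs):
        identical = pair_id < num_pairs // 2
        for process in PROCESSES:
            sheets.append({"pair": pair_id, "identical": identical, "process": process})
    return sheets

def presentation_order(sheets, seed):
    """Return a shuffled copy so the original set is left untouched."""
    order = list(sheets)
    random.Random(seed).shuffle(order)
    return order

sheets = build_sheet_set()
# The same 40 sheets are presented in a different order for each task.
simple_order = presentation_order(sheets, seed=1)   # hypothetical seed
timed_order = presentation_order(sheets, seed=2)    # hypothetical seed
```

Shuffling a copy rather than the set itself keeps the two presentation orders independent while guaranteeing that both tasks use exactly the same sheets.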
4.2.2.4 Results
The results of the simple discrimination experiment are summarized in the
following three tables. Table 4.2 provides an overview of how subjects performed on
average for each of the four image processes applied. Analyses of variance indicate
significant interaction between the unprocessed original and any of the other processing performed. Compared with results expected by chance, analyses of variance
indicate significant interaction for all forms of processing. In the case where no
processing was used, analysis of variance does not indicate interaction.
Table 4.3 compares the results of using one type of microcapsule paper versus
the other. Table 4.4 compares the performance of blind versus sighted subjects. In
these tables, analyses of variance do not indicate any interaction between groups
of subjects based on the different processes applied, whether compared by output
medium or level of vision.
In these tables, Matches refers to the sum of all correct responses in the
discrimination task made by the 10 subjects in all trials. The Mean Pct. Matched is
the computed average percentage of these correct responses. The results of analyses
of variance are denoted by p and compare results for each of the forms of processing
with those for the unprocessed originals, and with chance (50%).
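For readers unfamiliar with the statistic behind the p values, the following is a minimal sketch of a one-way analysis of variance F statistic. The exact ANOVA design used in these experiments is not specified in this section, so the one-way, between-groups form is an assumption, and the scores below are placeholder values invented for illustration, not the experimental data.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Placeholder scores (correct responses out of 10 per subject); NOT the thesis data.
unprocessed = [5, 4, 6, 5, 5]
processed = [9, 8, 10, 9, 9]
f_stat, df_b, df_w = one_way_anova_f([unprocessed, processed])
```

A p value is then obtained from the upper tail of the F distribution with the two degrees of freedom; a statistics library such as SciPy (`scipy.stats.f_oneway`) performs both steps. A large F, and hence a small p, indicates that the difference between group means is large relative to the spread within groups.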
Table 4.2: Summary of overall results of simple discrimination task for four image
processes. The Aggregate Process is comprised of blurring, Sobel edge
detection without thresholding, K-means adaptive segmentation, and
median filtering, applied in that order.
Image process          Matches   Mean Pct. Matched   p (vs. None)   p (vs. Chance)
No Processing          50/100    50.00%              1.00e+00       1.00e+00
K-means Segmentation   83/100    83.00%              2.07e-05       3.49e-07
Sobel Edge Detection   81/100    81.00%              4.70e-06       1.53e-09
Aggregate Process      95/100    95.00%              4.40e-08       1.93e-11
Table 4.3: Summary of percentage of correct responses comparing effects of two
varieties of microcapsule paper on simple discrimination task.
Image process          Flexi-Paper   Matsumoto-Kosan   p
No Processing          48.00%        52.00%            6.41e-01
K-means Segmentation   90.00%        78.00%            2.60e-01
Sobel Edge Detection   80.00%        82.00%            3.05e-01
Aggregate Process      96.00%        94.00%            3.59e-01
Table 4.4: Summary of percentage of correct responses comparing results of blind
versus sighted subjects performing simple discrimination task.
Image process          Blind Subjects   Sighted Subjects   p
No Processing          53.33%           52.86%             2.94e-01
K-means Segmentation   83.33%           81.43%             6.01e-01
Sobel Edge Detection   63.33%           84.29%             6.43e-02
Aggregate Process      90.00%           95.71%             7.45e-01
4.2.3 Timed Discrimination Experiment
The second experiment imposed a strict time limit of 10 seconds for exploration of each pair of images. This limited-time experiment was designed to measure
the effectiveness of TACTICS image processing techniques in a situation reminiscent of a sighted person skimming or quickly scanning through a series of images,
making quick determinations. The goal of this experiment was to test how use of
these image simplification methods might affect the ability of a blind computer user
to perform browsing and navigation tasks using touch in a GUI environment on a
level comparable to a sighted computer user.
4.2.3.1 Subjects
The same subjects were used for this experiment as in the previous simple
discrimination experiment. Since this experiment was always performed immediately following the simple discrimination experiment, subjects had gained limited
experience with and developed individual techniques for exploring the tactile image
materials.
4.2.3.2 Materials
Materials were the same as in the simple discrimination experiment (see Appendix B), with identical materials being used for the same subject for both the
simple and timed discrimination experiments. Although the identical materials were
used for a given subject, they were randomly reordered to counteract possible bias
related to ordering.
4.2.3.3 Procedure
The procedure for this experiment was identical to the previous discrimination task, with the single exception being that subjects were limited to 10 seconds
per image-pair matching task.
4.2.3.4 Results
The results of the timed discrimination experiment are summarized in the
following three tables. Table 4.5 provides an overview of how subjects performed
for each of the four image processes applied. Analyses of variance indicate significant interaction between the unprocessed original and any of the other processing
performed. Compared with results expected by chance, analyses of variance indicate some degree of interaction for all forms of processing. In the case where no
processing was used, however, analysis of variance does not indicate interaction.
Table 4.6 compares the results of using one type of microcapsule paper versus
the other. Table 4.7 compares the performance of blind versus sighted subjects.
Analyses of variance for these two tables show that there is no statistical evidence of
interaction between groups of subjects based on processing used, whether compared
by output medium or level of vision.
Table 4.5: Summary of overall results of timed discrimination task for four image
processes.
Image process          Matches   Mean Pct. Matched   p (vs. None)   p (vs. Chance)
No Processing          55/100    55.00%              1.00e+00       3.82e-02
K-means Segmentation   77/100    77.00%              1.80e-03       1.34e-04
Sobel Edge Detection   73/100    73.00%              2.90e-03       1.24e-04
Aggregate Process      87/100    87.00%              4.67e-05       3.24e-06
Table 4.6: Summary of percentage of correct responses comparing effects of two
varieties of microcapsule paper on timed discrimination task.
Image process          Flexi-Paper   Matsumoto-Kosan   p
No Processing          50.00%        58.00%            1.33e-02
K-means Segmentation   92.00%        68.00%            2.30e-01
Sobel Edge Detection   74.00%        72.00%            8.47e-01
Aggregate Process      96.00%        82.00%            2.30e-01
Table 4.7: Summary of percentage of correct responses comparing results of blind
versus sighted subjects performing timed discrimination task.
Image process          Blind    Sighted   p
No Processing          43.33%   60.00%    6.54e-01
K-means Segmentation   86.67%   72.86%    4.91e-01
Sobel Edge Detection   73.33%   72.86%    1.96e-01
Aggregate Process      93.33%   84.29%    7.47e-01
4.2.3.5 Comparison with simple discrimination
Results of the performance of subjects in the simple and timed discrimination
tasks are compared in Table 4.8. While the trend based on mean performance was for
subjects to discriminate tactile images slightly less successfully under time pressure,
analyses of variance did not indicate any interaction between the two discrimination
modalities. This result is therefore inconclusive in regard to the effect of time
pressure on tactile discrimination.
Table 4.8: Summary of percentage of correct responses comparing results of all
subjects on simple discrimination versus timed discrimination tasks.
Image process          Simple   Timed    p
No Processing          50.00%   55.00%   2.85e-01
K-means Segmentation   83.00%   77.00%   4.03e-01
Sobel Edge Detection   81.00%   73.00%   1.61e-01
Aggregate Process      95.00%   87.00%   2.26e-01
4.2.4 Identification Experiment
For the identification task, subjects explored a series of tactile images and
for each attempted to classify it into one of four categories that varied for each
image. This task was designed to provide some insight into the effectiveness of
TACTICS to produce a tactile image that resembles the original in such a way that
it is identifiable given some small amount of pre-information. This is analogous to
the visual task of identifying photographs based on some small amount of textual
information, such as a caption.
4.2.4.1 Subjects
The subjects used for this experiment were the same, though now somewhat
more experienced with tactile image exploration and interpretation, as those used
in the previous two discrimination experiments.
4.2.4.2 Materials
For this experiment, 10 images were selected from the original set of 30 images
that reflected a diversity of shape and content. Each image was processed using each
of the four processes under consideration, with the result of each process placed onto
an individual sheet of the Matsumoto Kosan microcapsule paper. The result of this
preparation was a set of 40 sheets, each with a tactile image that was processed in
one of four ways.
For each of the 10 images, four possible categories were defined (see Appendix C). Of these four categories, one correctly identified the content of the image,
two identified objects that may closely resemble the content of the image, and one
resembled the content of the image less closely.
4.2.4.3 Procedure
For each of the 40 arbitrarily presented sheets, the experimenter verbally
listed the four possible categories as the subject freely explored the tactile image.
At the conclusion of a period of no more than 30 seconds, the subject was asked to
state which category most closely matched the tactile image that was explored. The
responses were recorded, and the procedure was similarly repeated for all tactile
images in the set.
4.2.4.4 Results
The results of the tactile image identication experiment are shown in the
following two tables. Table 4.9 summarizes overall performance of subjects on the
identification task for each of the four forms of image processing applied in the production of the tactile images. Analyses of variance indicate significant interaction
between the unprocessed original and any of the other processing performed. Significant interaction is also indicated by analyses of variance comparing the various
processing with results expected by chance (25%).
Table 4.10 compares performance of blind versus sighted subjects in the same
task and for the same four processes. Analyses of variance comparing sighted and
blind subjects did not indicate interaction between the subject groups.
Table 4.9: Summary of overall results of identification task for four image processes.
Image process          Identifications   Pct. Identified   p (vs. None)   p (vs. Chance)
No Processing          7/100             7.00%             1.00e+00       1.83e-06
K-means Segmentation   55/100            55.00%            3.84e-09       2.24e-07
Sobel Edge Detection   46/100            46.00%            6.36e-07       2.02e-04
Aggregate Process      85/100            85.00%            5.05e-13       8.93e-13
Table 4.10: Summary of percentage of correct responses comparing results of blind
versus sighted subjects performing identification task.
Image process          Blind    Sighted   p
No Processing          10.00%   5.71%     4.83e-01
K-means Segmentation   56.67%   54.29%    7.89e-01
Sobel Edge Detection   43.33%   47.14%    7.23e-01
Aggregate Process      76.67%   88.57%    1.13e-01
4.2.5 Comprehension Experiment
Based on results of the Pilot study, which were further supported by the discrimination and identification experiments, the aggregate process of edge detection
followed by segmentation was found to be best among those considered for improving performance in a basic tactile image discrimination task. We used this process
as a foundation, enhancing its effect by adding an initial blurring step and following
up with a median filtering step. By applying blurring initially to the original image, detail was reduced and edges generated by subsequent processes were thicker
and more easily perceived. The application of median filtering as a post process
removed the rare instances of undesired noise that remained. The applicability and
effectiveness of the algorithms and sequencing chosen for this aggregate process were
supported by results of the discrimination and identification experiments.
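The roles of the two added steps can be illustrated with a small sketch. The code below assumes a 3x3 box blur and a 3x3 median filter; the kernel sizes, the choice of a box blur over some other smoothing filter, and the function names are assumptions made for the example, since this section does not give those details of the TACTICS implementation.

```python
import numpy as np

def _neighborhoods(image):
    """Stack the nine 3x3 neighbors of every pixel along a new first axis."""
    img = np.asarray(image, dtype=float)
    # Replicate border pixels so the output keeps the input shape.
    padded = np.pad(img, 1, mode="edge")
    rows, cols = img.shape
    return np.stack([padded[dr:dr + rows, dc:dc + cols]
                     for dr in range(3) for dc in range(3)])

def box_blur(image):
    """Replace each pixel with the mean of its 3x3 neighborhood."""
    return _neighborhoods(image).mean(axis=0)

def median_filter(image):
    """Replace each pixel with the median of its 3x3 neighborhood."""
    return np.median(_neighborhoods(image), axis=0)

# A flat image with one isolated bright pixel, standing in for leftover noise.
noisy = np.zeros((5, 5))
noisy[2, 2] = 255.0
cleaned = median_filter(noisy)   # the isolated pixel is removed entirely
blurred = box_blur(noisy)        # the pixel is spread over its 3x3 neighborhood
```

The contrast between the two outputs reflects the reasoning above: blurring spreads intensity changes over a wider area, which tends to thicken the edges found by a later edge detector, while the median filter discards isolated outliers without smearing them.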
This experiment measured the ability of subjects to comprehend tactile images prepared using the aggregate processing. The assumption was made that unprocessed images would be incomprehensible; and, indeed, this assumption was
supported by results of the discrimination and identification experiments. Another
assumption used in the design of this experiment was that the aggregate process,
having provided the best results for previous tasks, would be the best choice for this
task as well.
4.2.5.1 Subjects
The same subjects were used in this experiment as in the previous three experiments. Having performed the three previous experiments, by this point subjects
were more experienced and also quite comfortable with the exploration of the tactile
materials being used.
4.2.5.2 Materials
For this experiment, 10 images were selected from the original set of 30, based
on a diversity of content and shape. Each image was processed using the aggregate
process, and placed onto a sheet of Matsumoto Kosan microcapsule paper for raising.
Due to the results of previous experiments which indicated no interaction based on
type of paper used, use of this paper was deemed sufficient.
Associated with each of the 10 processed images was a brief one- or two-sentence description of the image and four questions designed to test a subject's
comprehension of the image's content (see Appendix D). Questions were of the
"True or False," "Multiple Choice" and "Locate the (fill in the blank)" variety, with
the number of choices limited to two.
The questions were designed to be of the sort that generally would be easily
answered by a sighted person viewing the original image. Some questions asked the
subject to locate some feature in the image, such as, "Locate the tail fin of the space
shuttle." Other questions concerned understanding some feature in the image. For
example, associated with the image of a space shuttle in the process of landing, one
question was, "Is the shuttle landing from left to right, or right to left?" Finally,
the third and most difficult form of question asked the subject to reason about and
draw some conclusion about the content of the image; for example, with an image
of a desktop computer the question was, "Is the computer on or off?"
4.2.5.3 Procedure
For each of the 10 tactile images, subjects were first read the brief description
of the image while the subject explored the image. As the subject continued to
explore the image freely, each one of the four questions, together with the two
possible answers for each, concerning the image was read aloud. For questions with a
verbal response, the experimenter recorded the subject's reply on the data-collection
sheet. For questions in which a subject was asked to locate a specific feature in the
image, the experimenter observed the movement of the subject's hand and noted
both the final location the subject indicated and the subject's verbal reply.
Also recorded were any unsolicited remarks or comments made by the subject and observations made by the experimenter during the experimental procedure.
Comments often referred to the difficulty a subject may be having with a particular
question or tactile image, some interesting discovery that had been made by the
subject regarding the image, or the reasoning used by the subject in reaching a particular conclusion. Observations made by the experimenter included noting initial
reactions of the subject, exploratory movements used, and any other reactions that
seemed noteworthy.
4.2.5.4 Results
Results of the tactile image comprehension experiment are shown in the following two tables. The first, Table 4.11, displays subject performance for the three
comprehension subtasks as well as overall performance. Analyses of variance comparing these results with chance (50%) indicate significant interaction, suggesting
little possibility that the successful performance of subjects was random.
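As a quick arithmetic check, the percentages in Table 4.11 can be recomputed from the reported counts. The short sketch below uses only the correct/total counts from that table; the variable and function names are illustrative.

```python
# (correct, total) pairs copied from Table 4.11.
subtasks = {
    "Feature location": (108, 130),
    "Feature understanding": (132, 160),
    "Content reasoning": (87, 110),
}

def percent(correct, total):
    """Percentage of correct replies, rounded to two decimals as in the table."""
    return round(100.0 * correct / total, 2)

per_task = {name: percent(c, t) for name, (c, t) in subtasks.items()}

# The overall row is the sum of the three subtasks.
overall_correct = sum(c for c, _ in subtasks.values())
overall_total = sum(t for _, t in subtasks.values())
overall_pct = percent(overall_correct, overall_total)
```

The three subtask totals sum to 400 replies, which matches 10 subjects answering four questions about each of 10 images, and the recomputed percentages agree with the tabulated values.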
Table 4.12 compares how blind and sighted subjects performed in this experiment. Analyses of variance comparing subjects based on level of vision indicate
probable interaction for the location task. No interaction is indicated when comparing subjects by level of vision and the understanding and reasoning tasks.
The disparity in performance on the location task may be due to differences
in visual memory, with sighted subjects possessing more familiarity with visual
material in general than blind subjects [77]. As a result, sighted subjects are more
aware of relative size and position of objects as represented in an image. This
difference in positional awareness could account for the differences in performance
of blind subjects versus sighted subjects for the location task. The lack of significant
differences in performance for the understanding and reasoning tasks could indicate
that a more developed visual memory is not necessary to these tasks.
Table 4.11: Summary of results of comprehension task for three subtasks and
overall comprehensibility of tactile images prepared using Aggregate
process.
Comprehension Task      Correct Replies   Pct. Correct   p
Feature location        108/130           83.08%         6.53e-08
Feature understanding   132/160           82.50%         3.49e-09
Content reasoning       87/110            79.09%         4.45e-14
Overall                 327/400           81.75%         1.44e-15
Table 4.12: Summary of percentage of correct responses comparing results of blind
versus sighted subjects performing comprehension task.
Comprehension Task      Blind    Sighted   p
Feature location        69.23%   89.01%    5.30e-03
Feature understanding   89.58%   79.46%    1.37e-01
Content reasoning       78.78%   79.22%    8.96e-01
Overall                 80.00%   82.50%    4.27e-01
4.2.6 Significance of Results
The analyses of variance for all experiments did not indicate interaction between groupings of subjects based on level of vision. This result is expected based
on results of previous studies that found no significant difference between the tactile
abilities of blind and sighted persons [56]. For groupings based on output medium,
analyses of variance again did not indicate interaction. Since the characteristics
of the two types of papers are similar in most respects, this result is not remarkable. However, the two forms of paper do vary in the property of stiffness, with
Matsumoto-Kosan paper being significantly stiffer than the Repro-Tronics Flexi-Paper, which is flexible by design. It appears that stiffness alone is not a significant
factor in any of the tactual perception abilities we measured. In spite of the lack of
statistical differences, some subjects indicated a preference for the stiffer Matsumoto-Kosan paper over the Flexi-Paper, noting its "better clarity" or "nicer feel." These
personal reactions did not appear to translate into differences in the performance of
subjects.
Comparing mean performance on the various tasks versus chance performance reveals an apparent trend of improvement based in some measure on the
degree of simplification. More formally, analyses of variance based on type of processing showed significant interaction between each form of processing used when
compared directly with unprocessed originals. These analyses repeatedly indicated
that the application of simplifying image processing techniques in the translation of
visual images to tactile images improved performance of subjects in discrimination,
identification and comprehension tasks. This result is quite favorable, particularly
when compared with subject performance on similar tasks using unprocessed tactile
images.
Equally as important as these statistically significant results are the observational and anecdotal evidence gathered during these experiments. The significance
of that evidence, in light of the results from these experiments, will be discussed in
the following chapter.
Chapter 5
OBSERVATIONS, DISCUSSION AND CONCLUSIONS
In the course of evaluating TACTICS, a number of general observations were
made that anecdotally enhance the raw tabulated results. These observations and
results are discussed, and conclusions are drawn, regarding the effectiveness of TACTICS as a method for providing blind persons with tactile access to visual information.
5.1 Observations
While conducting these experiments, the experimenter recorded observations
in addition to the raw response data. These observations included unsolicited comments from the subjects, notes regarding exploratory techniques used by subjects,
and other actions and remarks made by the subjects during the experiments. The
observations are summarized here for each of the four experiments followed by more
general observations.
During the simple discrimination experiment, subjects typically took more
time for the first few tasks while they became accustomed to exploration of the
tactile images and experimented with different techniques for exploring them. Most
subjects developed a two-handed approach to this discrimination task, using one
hand for each of the images in a pair and synchronizing the movements of the two
hands. While using this technique, subjects usually first attempted to determine
the general shape of the image. Then, subjects performed further exploration, again
in tandem, to examine details of the image.
One striking observation made during the timed discrimination experiment
is that, compared with the simple discrimination experiment, subjects often seemed
much more confident of their answers in spite of the limited time given for exploration. Some of this confidence may have been the result of experience gained during
the previous experiment and perhaps confidence in the techniques developed as a
result. One technique developed by many subjects was the use of a brushing motion,
drawing the fingertips of the hand the length of each tactile image. This technique
was fast, and seemed to provide enough basis to discriminate between tactile images.
This brushing technique was performed with two hands in tandem or with a single
hand on each image individually approximately equally as often.
The identification experiment proved to be the most challenging for subjects
based on their reactions during the experiment. They often expressed frustration at
not being sure about which category was the best match for a given image. Subjects
often used a process of elimination to narrow down possible choices from among the
four categories given for each tactile image. Another technique subjects used was
to explore an image four times, basing each exploration on the assumption that the
image was one of the four categories. Guessing was a common strategy used by
subjects when they could not determine a category for an image. Guessing occurred
most frequently with the relatively feature-free unprocessed images.
Subjects seemed especially to enjoy the final experiment measuring tactile
image comprehension. The combination of the experience gained in the previous
three experiments and the up-front description of each tactile image seemed to give
subjects a great deal of confidence in exploring the images and answering the questions. It was not uncommon for subjects in the process of answering one question
to make comments and remarks about the contents of the images that turned out
to be the answer to later questions. For example, while exploring a tactile image of
the Space Shuttle, a number of subjects indicated the locations of the nose and tail
of the vehicle while answering a question about its direction of travel. An image
of an astronaut working on the surface of the Moon was the most difficult for all
subjects. Visual observations made while the subjects were exploring this particular
image indicated that the presence of an edge denoting the horizon as well as the
busy pattern of edges generated by the texture of the surface of the Moon made image
comprehension difficult. Subjects frequently mistook a U.S. flag in the image for
the astronaut's backpack, and that miscalculation caused incorrect answers to other
questions about the astronaut's position and activity.
In general, each subject tended to have relative ease or difficulty with the
same images and processing that other subjects had ease or difficulty with, respectively. Subjects also tended to gain confidence in performing the various tasks as
they gained experience with exploring the tactile images. The general technique
that was arrived at by each of the subjects was one of exploring rst the overall
shape and size of a tactile image, then feeling for details.
Blind subjects had more difficulty with some of the more visual concepts, particularly with images of large objects such as planets, the Space Shuttle and very
small objects, such as the Streptococcus bacteria and Ebola virus. Blind subjects often expressed more apprehension at the outset than sighted subjects, although blind
subjects had more experience relying on the sense of touch than sighted subjects.
For subjects who performed the two discrimination tasks using Flexi-Paper,
there was an initial reaction to the improved clarity provided in the third experiment
which was conducted using the Matsumoto Kosan microcapsule paper. Subjects did
not appear to have more difficulty in similar tasks using one variety of paper versus
the other. Another comment regarding Flexi-Paper was that some pairs of images
seemed to be expanded to differing heights. Interestingly, careful measurements
taken in the laboratory using a mill-meter revealed that heights of the tactile images
referred to were identical. Possible explanations are that the tactile acuity for the
left and right hands may have varied slightly for some subjects, or that the difference
in stiffness of the two papers may have affected the outcome.
5.2 Discussion
Comparison of various results of these experiments provides further insight
into the degree of effectiveness of TACTICS for automatic generation of tactile
images. Comparing mean percentage of matches for the simple and timed discrimination experiments reveals that performance degraded fairly uniformly, and even
then only slightly, when going from the untimed to timed task. For unprocessed tactile images, discrimination for both tasks was about chance (50%). Blind subjects
successfully discriminated between tactile images about 10% less frequently than
sighted subjects, although it must be noted that various analyses of variance did
not indicate a statistical significance for this observation. One possible explanation
for this slight difference in performance is a lower level of pre-experiment confidence
among the blind subjects, who tended to be somewhat more apprehensive about
how well they would perform in the experiments.
With the exception of segmentation, a comparison of results from the simple discrimination experiment based on the two types of microcapsule paper was
unrevealing. Ability to discriminate segmented tactile images was significantly better for Flexi-Paper than Matsumoto Kosan paper. For the timed discrimination
experiment, Flexi-Paper produced a significantly higher percentage of successful
discriminations than the Matsumoto Kosan paper for segmentation and the aggregate process, and slightly higher for Sobel edge detection. An anecdotal explanation
for this result may be that subjects reported a more positive, albeit subjective, reaction to the stiffness of Flexi-Paper over Matsumoto Kosan paper. As previously
mentioned, specifications and actual measurements comparing the expanded characteristics of the two papers do not provide empirical evidence for any difference.
For the identification and comprehension experiments, it is likely that there
would be little difference in resulting ability to identify and comprehend tactile
images based solely on use of different microcapsule paper. There was no element
of time pressure imposed in the conducting of these two experiments; it was time
pressure which seemed to produce degraded ability to discriminate in the timed
discrimination experiment.
In judging the effectiveness of various forms of processing on original images
to produce tactile images, the conclusions from the Pilot Study are supported. In
general, simplification to any degree produces improvement in discrimination rate.
For the discrimination experiments, when no processing was applied at all, success
rates tended to be at about chance. Subjects discriminated correctly between tactile
images about 75% of the time for images processed using Sobel edge detection alone,
and slightly better than that for images that were prepared using segmentation. The
aggregate process allowed subjects to correctly discriminate from 85% to 90% of the
time, and some subjects performed perfectly.
It is difficult to say what the effect of experience with exploration of tactile images was on the results of these experiments. One blind subject had some
experience using tactile materials to aid his study for a college course; but those
materials were strictly expanded versions of unprocessed line drawings, and his experience with them did not seem to have a noticeable effect on his performance in
the experiments. None of the subjects had experience with automatically generated
tactile images of the form used in these experiments prior to participation. Tallied
and calculated results aside, there is some reason for optimism regarding improvement with experience. Without exception, subjects were observed becoming more
comfortable and condent with the tactile representations during the course of the
experiments. Additionally, many began to recognize the overall shape or features
of some tactile images they had explored earlier in the experiment. It is important
to note that, although subjects \recognized" some tactile images, they had no understanding of the content. Recognition in this case was simply feeling some tactile
pattern or shape that had been felt on an earlier sheet and commenting aloud on
that observation.
The identification experiment proved to be the most difficult for subjects
when compared with the mean percentage of matches for the other experiments. The
aggregate process again produced the best rate of success, followed by segmentation
and Sobel edge detection, with no processing trailing far behind. Although blind and
sighted subjects performed nearly identically on segmented and on edge detected
images, sighted subjects performed better when exploring images prepared using the
aggregate process, correctly identifying image content more than 90% of the time
while blind subjects were successful 75% of the time. This difference may be due
to the visual nature of the images and the content therein, and perhaps due to an
unintentional visual bias in preparation of the experimental materials.
For the comprehension experiment, subjects generally performed quite well
on all tactile images and questions, with the exception of one particular image of an
astronaut working on the surface of the Moon. The featured content of most images
was photographed either straight-on or in profile, producing tactile images that were
straightforward to explore and comprehend. The astronaut image was captured at
somewhat of a downward angle, producing a confusing horizon line crossing the
image at the level of the astronaut's neck. Further research is needed to determine
what image processing techniques exist, or can be developed, to handle adequately
potentially confusing information, such as horizon lines, within the framework of
automatic conversion to tactile representation.
Overall, blind and sighted subjects performed about the same on all tasks.
Blind subjects tended to have more difficulty than sighted subjects in locating specific features within images, perhaps due to a less developed visual memory or lack
of experience with characteristics of visual representation from which the tactile
images were generated. Blind subjects performed better than sighted subjects with
tasks involving understanding the content of tactile images, perhaps due to more
experience in relying on the sense of touch to gather and interpret information.
5.3 Conclusions
The objective of this work was to provide meaningful access to computer-based visual information to blind persons, and to do so automatically. Image processing techniques were applied to images to produce simplified versions of the originals, appropriate for output as tactile graphics. These image processing algorithms,
and the aggregate processes resulting from various combinations, were selected based
on effects that were analogous to principles of psychophysics and the science of tactual perception. The result was a system that converts a visual image into a tactile
image in an automatic, timely and comprehensible fashion, as supported by results
of evaluative experiments.
Although this individual study did not test, and does not fully support, the
theory that performance with tactile images improves with experience, the observations made during experimental evaluation provide testimony in favor of this
possibility. There is no reason to think that the expression "practice makes perfect"
applies everywhere except to tactile images, as indicated by subject recognition of
shapes and patterns that were encountered in an earlier task.
The significance of the development of this prototype system is that it makes
it clear that reasonable and comprehensible access to visual information can be provided to blind persons, and that this can be done without the intervention of a sighted facilitator.
Thus a blind computer user, for instance, could "surf the web" unaided and at a
much better level of comprehension than possible with text alone.
This increased access to visual material can facilitate broader educational
and professional opportunities, particularly in areas with a strong tendency toward
visual presentation of information. For example, persons with disabilities, including
blindness, are currently underrepresented in science-, engineering- and mathematics-related disciplines. The techniques developed in this system can translate the visual
information from these fields into tactile form, providing students and professionals with better access to diagrams, graphs and images ranging in scale from the
microscopic to the cosmic.
Chapter 6
FUTURE DIRECTIONS
The effectiveness of TACTICS at converting visual information into comprehensible tactile information lends credence to the possibility of future investigation
in this and related areas. Among the possibilities are:
6.1 Development of End User Application
TACTICS can be developed further into a stand-alone application. Such an
application would be invokable from the command line, perhaps being called in place
of a print routine from a web browser. If a blind computer user desired to explore a
tactile version of an image, the application would automatically handle processing
of the image. At present, there are extra steps involved that may require assistance
from a sighted person, namely:
1. Retrieving the printout
2. Loading microcapsule paper into a photocopier machine
3. Photocopying the tactile image onto the microcapsule paper
4. Raising the tactile image using a device such as the Tactile Image Enhancer
6.2 Extension to Refreshable Tactile Display
The use of image processing appears to be a natural and effective method for
production of simplified images suitable for output in tactile form. With such effective pre-processing available, the task of expedient output becomes more important.
There is a definite need for real-time dynamic tactile display technology that could
display tactile images efficiently.
The techniques developed in this thesis for converting visual information
into tactile information lend themselves to use as a front-end to such a real-time,
dynamic, tactile-display device. Such a display would overcome the reliance on a
sighted person that a blind person might experience when utilizing microcapsule
paper as an output method. One limitation of past technology developed to display
tactile graphics was that its effectiveness was determined by the relative simplicity
of the material being displayed. Using image processing techniques, as in TACTICS,
visual information could be prepared readily for meaningful display in tactile form.
6.3 Multimodal Interface
Simplified tactile representations of images, maps and other infrequently
changing visual items could be combined with touch-screen technology to create
a multimodal interface. With some initial configuration, positions on an image or
map could be associated with audio feedback, as with the Nomad (see page 29). The
advantage of this approach would be the speed with which tactile materials could
be prepared, and the flexibility offered by the automatic simplification techniques
of TACTICS.
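The association between positions on a tactile sheet and audio feedback can be pictured as a simple lookup table. The following Python sketch is illustrative only and is not part of TACTICS: the Hotspot class, the label_at function, the coordinates and the labels are all hypothetical, and a real system would pass the returned label to a speech synthesizer rather than return a string.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Hotspot:
    """A rectangular area of the tactile sheet tied to a spoken label (hypothetical)."""
    left: int
    top: int
    right: int
    bottom: int
    label: str

    def contains(self, x: int, y: int) -> bool:
        # Edges are treated as part of the hotspot.
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def label_at(hotspots: Iterable[Hotspot], x: int, y: int) -> Optional[str]:
    """Return the label of the first hotspot containing the touch point, if any."""
    for spot in hotspots:
        if spot.contains(x, y):
            return spot.label
    return None


# Hypothetical "initial configuration" for a small map, in touch-screen units.
MAP_HOTSPOTS = [
    Hotspot(0, 0, 99, 49, "Pacific Ocean"),
    Hotspot(100, 0, 199, 49, "California"),
]
```

A touch at (150, 10) falls in the second rectangle, so label_at would return its label; a touch outside every rectangle returns None, which an interface might treat as silence.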
6.4 Mapping Color to Texture
Segmentation divides a two-dimensional visual representation into regions
based on related colors or intensity levels. The result of such a segmentation could
be used subsequently to associate the color of each region with a distinct texture,
thus providing a blind person with more complete access to the original content of
the visual information.
One long-standing problem of graph theory was the four-color conjecture,
the notion being that any planar graph, for our purposes a two-dimensional visual
representation such as a map or photograph, could be segmented into regions and
those regions colored using only four colors, and with no two adjacent regions being
assigned the same color [19, 75]. Originally posed in 1852 by Francis Guthrie,
the four-color conjecture was finally proved in 1977 [2, 3], although finding a four-coloring is not necessarily fast. Given that four colors are sufficient, relaxing the
coloring to some reasonably small number (say 10) would allow a very fast coloring to
be performed. Thus, a tactile image, simplified using TACTICS, could be segmented
and colored quickly using any of a number of simple graph-coloring algorithms.
Textures are produced using simple patterns that produce palpable textures when
raised. By uniquely mapping colors to these textures, it may be possible to preserve
much of the original visual information.
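As a rough illustration of the coloring step, the Python sketch below applies a greedy graph-coloring heuristic to a region adjacency graph and maps each color index to a texture name. It is a sketch under stated assumptions, not the method used in TACTICS: the function names, the texture names and the example scene are hypothetical, and the adjacency graph is assumed to be symmetric and already extracted from a segmented image. Greedy coloring in this order carries no guarantee of staying within ten colors, so the sketch raises an error if the textures run out.

```python
def greedy_coloring(adjacency):
    """Give each region the smallest color index unused by its colored neighbors.

    `adjacency` maps a region name to the set of regions it touches and is
    assumed to be symmetric (if a touches b, then b touches a).
    """
    colors = {}
    # Color high-degree regions first; a common heuristic, not a requirement.
    for region in sorted(adjacency, key=lambda r: (-len(adjacency[r]), r)):
        used = {colors[n] for n in adjacency[region] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[region] = color
    return colors


# Hypothetical palette of ten palpable patterns.
TEXTURES = [
    "dots", "horizontal lines", "vertical lines", "crosshatch",
    "diagonal lines", "checks", "waves", "zigzag", "coarse dots", "blank",
]


def assign_textures(adjacency, textures=TEXTURES):
    """Map every region to a texture so that touching regions always differ."""
    colors = greedy_coloring(adjacency)
    if max(colors.values(), default=-1) >= len(textures):
        raise ValueError("not enough textures for this coloring")
    return {region: textures[c] for region, c in colors.items()}


# Hypothetical scene: which segmented regions share a border.
SCENE = {
    "sky": {"hill", "sun"},
    "sun": {"sky"},
    "hill": {"sky", "house"},
    "house": {"hill"},
}
```

Calling assign_textures(SCENE) yields a texture per region in which no two bordering regions share a texture, which is the property a reader's fingers would rely on to tell adjacent regions apart.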
Even simpler would be to apply a K-means segmentation to an image, with
K = desired number of colors, and apply the color-texture mapping to the result.
This method might not provide as good a texture mapping as a more computationally expensive technique, but it would certainly be fast and may be sufficient for
enabling comprehension of tactile images, which is the goal.
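The K-means alternative can be sketched in a few lines of Python. The version below clusters gray-level intensities only, using the standard Lloyd iteration with evenly spread starting centers; it is a minimal sketch, not the TACTICS implementation, and the function name, the starting-center rule and the sample pixel values are assumptions made for illustration. Each resulting cluster label could then index into a list of textures.

```python
def kmeans_intensity(pixels, k, iterations=20):
    """Cluster gray levels into k groups; returns (centers, labels).

    `pixels` is a flat list of intensities (for example 0 to 255).
    """
    lo, hi = min(pixels), max(pixels)
    # Deterministic start: centers spread evenly across the observed range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(p - centers[c])) for p in pixels]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers, labels


# Hypothetical row of pixels with dark, mid and bright areas; K = 3.
SAMPLE = [10, 12, 11, 200, 205, 198, 100, 102]
CENTERS, LABELS = kmeans_intensity(SAMPLE, 3)
```

With K set to the desired number of textures, every pixel label is already a texture index, so no separate coloring pass is needed; the trade-off is that neighboring regions with similar intensity end up sharing a texture.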
BIBLIOGRAPHY
[1] P. Apkarian-Stielau and J.M. Loomis. A comparison of tactile and blurred
visual form perception. Perception and Psychophysics, 18(5), 1975.
[2] K. Appel and W. Haken. Every planar map is four colorable, part I: Discharging. Ill. J. Math., 21, 1977.
[3] K. Appel, W. Haken, and J. Koch. Every planar map is four colorable, part
II: Reducibility. Ill. J. Math., 21, 1977.
[4] G.R. Arce, N.C. Gallagher, and T.A. Nodes. Median filters: Theory and
applications. In T.S. Huang, editor, Advances in Computer Vision and Image
Processing, volume 2. JAI Press, 1986.
[5] N.C. Barraga. Sensory perceptual development. In G.T. Scholl, editor, Foundations of Education for the Blind and Visually Handicapped Children and
Youth: Theory and Practice. American Foundation for the Blind, 1986.
[6] K.L. Beauchamp, D.W. Matheson, and L.A. Scadden. Effect of stimulus-change method on tactile-image recognition. Perceptual and Motor Skills, 33,
1971.
[7] P.J. Benson, D.I. Perrett, and D.N. Davis. Towards a quantitative understanding of facial caricatures. In V. Bruce and M. Burton, editors, Processing
Images of Faces. Ablex Publishing Corporation, Norwood, New Jersey, 1992.
[8] B. Betts, D. Burlingame, G. Fischer, J. Foley, M. Green, D. Kasik, S.T. Kerr,
D. Olsen, and J. Thomas. Goals and objectives for user interface software.
Computer Graphics, 21, 1987.
[9] I. Biederman. Human image understanding: Recent research and a theory.
Computer Vision, Graphics and Image Processing, 32, 1985.
[10] J.C. Bliss, M. Katcher, C.H. Rogers, and R.P. Shepard. Optical-to-tactile
image conversion for the blind. IEEE Transactions on Man-Machine Systems,
MMS-11, 1970.
[11] L.H. Boyd, W.L. Boyd, and G.C. Vanderheiden. The graphical user interface:
Crisis, danger and opportunity. Journal of Visual Impairment and Blindness,
84, 1990.
[12] R.D. Boyle and R.C. Thomas. Computer Vision: A First Course. Blackwell
Scientific Publications, London, 1988.
[13] J. Bradley. XV Online Documentation. University of Pennsylvania, 3.10 edition, 1994.
[14] F.P. Brooks, M. Ouh-Young, J.J. Batter, and P.J. Kilpatrick. Project GROPE - haptic displays for scientific visualization. In 17th Annual ACM Conference on
Computer Graphics and Interactive Techniques - SIGGRAPH '90, volume 24
of Computer Graphics, New York, August 1990. ACM.
[15] D. Burger. Improved access to computers for the visually handicapped: New
prospects and principles. IEEE Transactions on Rehabilitation Engineering,
2(3), 1994.
[16] P.A. Carpenter and P. Eisenberg. Mental rotation and the frame of reference
in blind and sighted individuals. Perception and Psychophysics, 23(2), 1978.
[17] C.C. Collins. Tactile television - mechanical and electrical image projection.
IEEE Transactions on Man-Machine Systems, MMS-11, 1970.
[18] S. Coren and L.M. Ward. Sensation and Perception (3rd Edition). Harcourt
Brace Jovanovich, San Diego, 1989.
[19] T.H. Cormen, C.E. Leiserson, and R.L. Rivest. Introduction to Algorithms.
MIT Press, Cambridge, Massachusetts, 1990.
[20] J.C. Craig. Vibrotactile pattern perception: Extraordinary observers. Science,
196, 1977.
[21] J.C. Craig. Some factors affecting tactile pattern perception. International
Journal of Neuroscience, 19, 1983.
[22] J.C. Craig and C.E. Sherrick. Dynamic tactile displays. In W. Schiff and
E. Foulke, editors, Tactual Perception: A Sourcebook. Cambridge University
Press, 1982.
[23] D. Crystal. The Cambridge Encyclopedia of Language. Cambridge University
Press, Cambridge, 1987.
[24] F. Deconinck and P. Verschueren. TIDE project 103 GUIB: A model of the
understanding of graphical information by blind people. Technical report,
Final Report, June 1993.
[25] P.K. Edman. Tactile Graphics. American Foundation for the Blind, New York,
1992.
[26] E. Foulke. Reading braille. In W. Schiff and E. Foulke, editors, Tactual
Perception: A Sourcebook. Cambridge University Press, 1982.
[27] J. Fricke and H. Baehring. Design of a tactile graphic I/O tablet and its integration into a personal computer system for blind users. Electronic Proceedings
of the 1994 EASI High Resolution Tactile Graphics Conference, Available from
http://www.rit.edu/easi/, 1994.
[28] S.F. Frisken-Gibson, P. Bach-y-Rita, W.J. Tompkins, and J.G. Webster. A
64-solenoid, four-level fingertip search display for the blind. IEEE Transactions on Biomedical Engineering, BME-34(12), 1987.
[29] J.P. Fritz and K.E. Barner. Design of a haptic graphic system. Proceedings of
the RESNA '96 Annual Conference, 1996.
[30] J.P. Fritz, T.P. Way, and K.E. Barner. Haptic representation of scientific data
for visually impaired or blind persons. In Proceedings of the CSUN Conference
on Technology and Disability, 1996.
[31] L.H. Goldish and H.E. Taylor. The optacon: A valuable device for blind
persons. New Outlook for the Blind, 68(2), 1974.
[32] D. Griffith. Computer access for persons who are blind or visually impaired:
Human factors issues. Human Factors, 32(4), 1990.
[33] C.J. Hasser and J.M. Weisenberger. Preliminary evaluation of a shape-memory
alloy tactile feedback display. Advances in Robotics, Mechatronics and Haptic
Interfaces, 49, 1993.
[34] R. Hinton. First introduction to tactiles. The British Journal of Visual Impairment, 9(3), 1991.
[35] K. Hirota and M. Hirose. Simulation and presentation of curved surface in
virtual reality environment through surface display. In Proceedings - Virtual
Reality Annual International Symposium '95, Los Alamitos, California, 1995.
IEEE Computer Society Press.
[36] E.D. Hirsch, J.F. Kett, and J. Trefil. The Dictionary of Cultural Literacy.
Houghton Mifflin Company, Boston, 1988.
[37] L.T. Hoshmand. Blindisms: Some observations and propositions. Education
of the Visually Handicapped, May 1975.
[38] T.S. Huang, G.J. Yang, and G.Y. Tang. A fast two dimensional median filtering algorithm. Proceedings of the IEEE Conference on Pattern Recognition
and Image Processing, 1978.
[39] K.O. Johnson and J.R. Phillips. Tactile spatial resolution: Two-point discrimination, gap detection, grating resolution, and letter recognition. Journal of
Neurophysiology, 46, 1981.
[40] P. Jubinski. VIRTAC, a virtual tactile computer display. Proceedings of the
Johns Hopkins National Search for Computing Applications to Assist Persons
with Disabilities, 1992.
[41] J.M. Kennedy. Haptic pictures. In W. Schiff and E. Foulke, editors, Tactual
Perception: A Sourcebook. Cambridge University Press, 1982.
[42] R.L. Klatzky. Human Memory: Structures and Processes. W.H. Freeman and
Company, New York, 2nd edition, 1980.
[43] R.L. Klatzky, S.J. Lederman, and V.A. Metzger. Identifying objects by touch:
An "expert system". Perception and Psychophysics, 37(4), 1985.
[44] R.L. Klatzky, S.J. Lederman, and C. Reed. There's more to touch than meets
the eye: The salience of object attributes for haptics with and without vision.
Journal of Experimental Psychology: General, 116, 1987.
[45] K.J. Kokjer. The information capacity of the human fingertip. IEEE Transactions on Systems, Man, and Cybernetics, SMC-17(1), 1987.
[46] S.M. Kosslyn. Image and Brain: The Resolution of the Imagery Debate. MIT
Press, Cambridge, Massachusetts, 1994.
[47] L.E. Krueger. The psychophysics of touch. In W. Schiff and E. Foulke, editors,
Tactual Perception: A Sourcebook. Cambridge University Press, 1982.
[48] Z. Kuc. A bidirectional vibrotactile communication system: Tactual display
design and attainable data rates. VLSI and Computer Peripherals, 1989.
COMPEURO '89 - 3rd Annual European Computer Conference.
[49] M. Kurze, L. Reichert, and T. Strothotte. Access to business graphics for
blind people. Proceedings of the RESNA 17th Annual Conference, 1994.
[50] R.H. LaMotte and J. Whitehouse. Tactile detection of a dot on a smooth
surface: Peripheral neural events. Journal of Neurophysiology, 56(4), 1986.
[51] A. Lev, S.W. Zucker, and A. Rosenfeld. Iterative enhancement of noisy images.
IEEE Transactions on Systems, Man and Cybernetics, SMC-7(6), 1976.
[52] C.A. Lindley. Practical Image Processing in C. John Wiley and Sons, Inc.,
New York, 1991.
[53] J.M. Loomis. On the tangibility of letters and braille. Perception and Psychophysics, 29, 1981.
[54] J.M. Loomis. Tactile pattern perception. Perception, 10, 1981.
[55] J.M. Loomis and S.J. Lederman. Tactual perception. In K.R. Boff, L. Kaufman, and J.P. Thomas, editors, Handbook of Perception and Human Performance. John Wiley and Sons, Inc., 1986.
[56] B. Lowenfeld. Effects of blindness on the cognitive functions of children. In
B. Lowenfeld, editor, Berthold Lowenfeld on Blindness and Blind People. American Foundation for the Blind, New York, 1981.
[57] B. Lowenfeld. The Changing Status of the Blind: From Separations to Integration. Charles C. Thomas, Springfield, Illinois, 1975.
[58] Matsumoto Kosan Co. LTD. Stereo copying system for the blind. Product
handbook, 1990.
[59] T. Massie and K. Salisbury. The PHANToM haptic interface: a device for
probing virtual objects. In Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems, ASME Winter Annual Meeting, 1994.
[60] B.S. Miller and W.H. Miller. Extinguishing `blindisms': A paradigm for intervention. Education of the Visually Handicapped, Spring 1976.
[61] M. Minsky, M. Ouh-Young, O. Steele, F. Brooks, and M. Behensky. Feeling
and seeing: Issues in force display. In Proceedings of the Symposium on 3D
Real-Time, 1990.
[62] M.D.R. Minsky. Computational Haptics: The Sandpaper System for Synthesizing Texture for a Force-Feedback Display. PhD thesis, Massachusetts Institute
of Technology, June 1995.
[63] A.H. Mitwalli, S.B. Leeb, T. Tanaka, and U. Sinha. Polymer gel actuators
- status report. In Proceedings of the 29th Universities Power Engineering
Symposium, Galway, Ireland, 1994.
[64] G.J. Monkman. An electrorheological tactile display. Presence, 1(2), 1992.
[65] E.D. Mynatt and G. Weber. Nonvisual presentation of graphical user interfaces: Contrasting two approaches. In Proceedings of the CHI'94 Conference
on Human Factors in Computer Systems. ACM, 1994.
[66] V.S. Nalwa. A Guided Tour of Computer Vision. Addison-Wesley Publishing
Company, Reading, Massachusetts, 1993.
[67] T.N. Pappas. An adaptive clustering algorithm for image segmentation. IEEE
Transactions on Signal Processing, 40(4), 1992.
[68] R. Peier. Possible applications of polymer and photopolymer technologies to high resolution tactile graphics. Electronic Proceedings of the
1994 EASI High Resolution Tactile Graphics Conference, Available from
http://www.rit.edu/easi/, 1994.
[69] L. Petrosino and D. Fucci. Temporal resolution of the aging tactile sensory
system. Perceptual and Motor Skills, 68, 1989.
[70] K.K. Pingle. Visual perception by a computer. In A. Grasselli, editor, Automatic Interpretation and Classification of Images. Academic Press, 1969.
[71] L.H.D. Poll and R.P. Waterham. Graphical user interfaces and visually disabled users. IEEE Transactions on Rehabilitation Engineering, 3(1), 1995.
[72] W.K. Pratt. Digital Image Processing. John Wiley and Sons, New York, 1991.
[73] Repro-Tronics Inc., Westwood, New Jersey. Setup and Operating Instructions
for the Tactile Image Enhancer, 1994. Product specifications.
[74] E. Rich and K. Knight. Artificial Intelligence. McGraw-Hill, Inc., New York,
2nd edition, 1992.
[75] F.S. Roberts. Applied Combinatorics. Prentice-Hall, Inc., Englewood Cliffs,
New Jersey, 1984.
[76] A. Rosenfeld and L.S. Davis. Image segmentation and image models. Proceedings of the IEEE, 67(5), 1979.
[77] J. Sardegna and T.O. Paul. The Encyclopedia of Blindness and Vision Impairment. Facts On File, New York, 1991.
[78] R.J. Schalkoff. Digital Image Processing and Computer Vision: An Introduction to Theory and Implementations. John Wiley and Sons, New York,
1989.
[79] G.T. Scholl. What does it mean to be blind. In G.T. Scholl, editor, Foundations of Education for the Blind and Visually Handicapped Children and
Youth: Theory and Practice. American Foundation for the Blind, 1986.
[80] A.S. Schwartz, A.J. Perey, and A. Azulay. Further analysis of active and
passive touch in pattern discrimination. Bulletin of the Psychonomic Society,
6(1), 1975.
[81] R.S. Schwertfeger. Making the GUI talk. Byte Magazine, December 1991.
[82] C.E. Sherrick and J.C. Craig. The psychophysics of touch. In W. Schiff and
E. Foulke, editors, Tactual Perception: A Sourcebook. Cambridge University
Press, 1982.
[83] Telesensory Systems, Inc., Palo Alto, California. OPTACON Owner's Manual:
Model R1D, 1978.
[84] Telesensory Systems, Inc., Palo Alto, California. Optacon Announcement,
Available from http://www.telesensory.com, 1996.
[85] J.A. Terry and H. Hsiao. Tactile feedback in a computer mouse. Proceedings
of the 14th Northeast Conference on Bioengineering, 1988.
[86] J.P. Thomas. JAWS User's Guide and Reference Manual, Second Edition.
Henter-Joyce, Inc., St. Petersburg, FL, 1994.
[87] C.M. Thompson and L. Shure. Image Processing Toolbox: For Use with MATLAB. The Math Works, Inc., Natick, Massachusetts, 1995.
[88] J.H. Todd. Resources, media, and technology. In G.T. Scholl, editor, Foundations of Education for the Blind and Visually Handicapped Children and
Youth: Theory and Practice. American Foundation for the Blind, 1986.
[89] J.T. Tou and R.C. Gonzalez. Pattern Recognition Principles. Addison-Wesley
Publishing Company, Reading, Massachusetts, 1974.
[90] B. Tversky and D. Baratz. Memory for faces: Are caricatures better than
photographs. Memory and Cognition, 13(1), 1985.
[91] G.C. Vanderheiden. Systems 3 - an interface to graphic computers for blind
users. Proceedings of the RESNA 13th Annual Conference, 1990.
[92] G.C. Vanderheiden. Dynamic and static strategies for nonvisual presentation of graphic information. Electronic Proceedings of the 1994
EASI High Resolution Tactile Graphics Conference, Available from
http://www.rit.edu/easi/, 1994.
[93] M.E. Ward. The visual system. In G.T. Scholl, editor, Foundations of Education for the Blind and Visually Handicapped Children and Youth: Theory and
Practice. American Foundation for the Blind, 1986.
[94] T. Way and K. Barner. Towards automatic generation of tactile graphics.
Proceedings of the RESNA '96 Annual Conference, 1996.
[95] T.P. Way and K.E. Barner. Automatic visual to tactile translation, part I:
Human factors, access methods and image manipulation. IEEE Transactions
on Rehabilitation Engineering, 5:81-94, March 1997.
[96] T.P. Way and K.E. Barner. Automatic visual to tactile translation, part
II: Evaluation of the tactile image creation system. IEEE Transactions on
Rehabilitation Engineering, 5:95-105, March 1997.
[97] S. Weinstein. Intensive and extensive aspects of tactile sensitivity as a function
of body part, sex, and laterality. In D.R. Kenshalo, editor, The Skin Senses.
Charles C. Thomas, Springfield, IL, 1968.
[98] B.W. White, F.A. Saunders, L. Scadden, P. Bach-y-Rita, and C.C. Collins.
Seeing with skin. Perception and Psychophysics, 7, 1970.
[99] S.F. Wiker, G. Vanderheiden, S. Lee, and S. Arndt. Development of tactile
mice for blind access to computers: Importance of stimulation locus, object
size and vibrotactile display resolution. Proceedings of the Human Factors
Society 35th Annual Meeting, 1991.
[100] D.H. Willis. Relationship between visual acuity, reading mode, and school
systems for blind children: A 1979 replication. American Printing House,
1979.
Appendix A
LISTING OF IMAGES
These images are available for experimental purposes via the World Wide
Web at http://www.asel.udel.edu/sem/research/tactile/appendix.html.
A.1 Pilot Study Images
1. Close-up of President Bill Clinton
2. Close-up of a researcher (Tom Way)
3. Close-up of Albert Einstein
4. Hot air balloon
5. Chimney end of a house
6. Notebook computer
7. Diagram of a human heart
8. Space shuttle launch
A.2 TACTICS Evaluation Images
1. Desktop computer
2. Desktop computer (another angle)
3. Notebook computer
4. Astronaut taking soil sample
5. Astronaut planting flag pole
6. Space shuttle landing (left to right)
7. Space shuttle landing (right to left)
8. Double-layer plume nuclear mushroom cloud
9. Single-layer plume nuclear mushroom cloud
10. Micrograph of the eyeball of Drosophiliaeye (house fly)
11. Electron micrograph of a Streptococcus bacteria (96,000x)
12. Planet Saturn
13. Planet Jupiter
14. Moon
15. Chocolate chip cookie
16. Close-up of President Ronald Reagan
17. Close-up of President Bill Clinton
18. Close-up of a researcher (Tom Way)
19. Close-up of Albert Einstein
20. Hot air balloon
21. Two-shot of Beavis and Butthead
22. Two-shot of Bill Clinton and Al Gore
23. Chinese student blocking tanks in Tiananmen Square
24. Chinese student blocking tanks in Tiananmen Square (another angle)
25. Golden Gate Bridge in San Francisco
26. Twin Towers in New York City
27. Tornado funnel cloud in Oklahoma
28. Electron micrograph of a cell shedding HIV particles
29. Electron micrograph of a Pinosyllis Heterocirrata worm
30. Electron micrograph of the Ebola virus
Appendix B
SIMPLE AND TIMED DISCRIMINATION
IMAGE PAIRINGS
B.1 Preparation
Note that each pair was processed four different ways, using the four image
processes under investigation.
B.2 Flexi-Paper Pairs
1. Desktop computer & Notebook computer
2. Double-layer plume nuclear mushroom cloud & Single-layer plume nuclear
mushroom cloud
3. Micrograph of the eyeball of Drosophiliaeye (house fly) & Moon
4. Close-up of President Bill Clinton & Close-up of President Bill Clinton
5. Two-shot of Bill Clinton and Al Gore & Two-shot of Bill Clinton and Al Gore
6. Chinese student blocking tanks in Tiananmen Square & Chinese student blocking tanks in Tiananmen Square (another angle)
7. Twin Towers in New York City & Twin Towers in New York City
8. Tornado funnel cloud in Oklahoma & Tornado funnel cloud in Oklahoma
9. Electron micrograph of a cell shedding HIV particles & Electron micrograph
of a cell shedding HIV particles
10. Electron micrograph of a Pinosyllis Heterocirrata worm & Electron micrograph
of the Ebola virus
B.3 Matsumoto Kosan Paper Pairs
1. Desktop computer (another angle) & Desktop computer (another angle)
2. Astronaut taking soil sample & Astronaut planting flag pole
3. Space shuttle landing (left to right) & Space shuttle landing (left to right)
4. Electron micrograph of a Streptococcus bacteria (96,000x) & Electron micrograph of a Streptococcus bacteria (96,000x)
5. Planet Saturn & Planet Saturn
6. Close-up of President Ronald Reagan & Hot air balloon
7. Moon & Chocolate chip cookie
8. Close-up of Albert Einstein & Close-up of Albert Einstein
9. Two-shot of Beavis and Butthead & Two-shot of Bill Clinton and Al Gore
10. Golden Gate Bridge in San Francisco & Twin Towers in New York City
Appendix C
IDENTIFICATION EXPERIMENT
IMAGES AND CATEGORIES
C.1 Preparation
Note that each image used was processed four different ways, using the four
image processes under investigation. The four categories associated with each image
were arbitrarily arranged for each of the four applied processes.
C.2 Listing of Images and Categories
1. Desktop computer
A. an office building
B. a notebook computer
C. a desktop computer
D. a trampoline
2. Notebook computer
A. a painting hanging on a wall
B. an open cardboard box
C. a notebook computer
D. a desktop computer
3. Micrograph of the eyeball of Drosophiliaeye (house fly)
A. the Moon
B. an oatmeal raisin cookie
C. an eyeball of a fly
D. a helicopter in flight
4. Electron micrograph of a Streptococcus bacteria (96,000x)
A. the face of Albert Einstein
B. a magnified Streptococcus bacteria
C. the end of a stethoscope
D. a punching bag
5. Planet Saturn
A. a Frisbee
B. a helicopter in flight
C. the planet Jupiter
D. the planet Saturn
6. Planet Jupiter
A. the planet Saturn
B. the planet Jupiter
C. a Frisbee
D. the face of President Clinton
7. Chocolate chip cookie
A. a baseball
B. a chocolate chip cookie
C. the Moon
D. the face of President Clinton
8. Close-up of President Ronald Reagan
A. the face of former president Ronald Reagan
B. a hot air balloon
C. a magnified Streptococcus bacteria
D. the Moon
9. Hot air balloon
A. a hot air balloon
B. a punching bag
C. the planet Saturn
D. an oatmeal cookie
10. Golden Gate Bridge in San Francisco
A. the Twin Towers and the New York City skyline
B. the twin spans of the Golden Gate Bridge
C. a helicopter in flight
D. a picket fence
Appendix D
COMPREHENSION EXPERIMENT
IMAGES, DESCRIPTIONS AND QUESTIONS
D.1 Preparation
Note that each image was processed solely using the aggregate image process.
D.2 Listing of Images, Descriptions and Questions
1. Desktop computer: This is a personal computer.
1. This computer is a:
A. desktop computer
B. notebook computer
2. This computer is:
A. on
B. off
3. This computer has a mouse that is visible.
A. true
B. false
4. Locate the keyboard.
A. (successful)
B. (not successful)
2. Notebook computer: This is a personal computer.
1. This computer is a:
A. desktop computer
B. notebook computer
2. This computer is:
A. on
B. off
3. This computer has a mouse that is visible.
A. true
B. false
4. Locate the keyboard.
A. (successful)
B. (not successful)
3. Astronaut planting flag pole: This is an astronaut dressed in a spacesuit,
working on the surface of the Moon.
1. The astronaut is:
A. using a short pole to collect a lunar soil sample
B. placing a flag atop a flagpole on the Moon's surface
2. The astronaut is:
A. standing still
B. moving
3. The astronaut is facing to the:
A. left
B. right
4. Locate the astronaut's feet.
A. (successful)
B. (not successful)
4. Space shuttle landing (right to left): This is the Space Shuttle Endeavor landing in the California desert at Edwards Air Force Base.
1. The shuttle is headed to the:
A. left
B. right
2. The landing gear have already touched the ground.
A. true
B. false
3. An Air Force fighter jet escort is plainly present in the scene.
A. true
B. false
4. Locate the tail fin of the Space Shuttle.
A. (successful)
B. (not successful)
5. Double-layer plume nuclear mushroom cloud: This is a nuclear explosion,
complete with mushroom cloud.
1. How many layers of plumes are there on top of the cloud?
A. one
B. two
2. Clouds of dust have started to rise around the base of the explosion.
A. true
B. false
3. Locate the very top of the mushroom cloud.
A. (successful)
B. (not successful)
4. Locate ground zero, the likely spot where the actual bomb exploded.
A. (successful)
B. (not successful)
6. Chocolate chip cookie: This is a homemade cookie.
1. This cookie is a:
A. chocolate chip cookie
B. sugar cookie
2. Somebody has already taken a large bite out of this cookie.
A. true
B. false
3. How many chocolate chips are there?
A. 6 or fewer
B. more than 6
4. Some chips are small, others are large. Locate a large chocolate chip.
A. (successful)
B. (not successful)
7. Close-up of President Bill Clinton: This is President Bill Clinton.
1. This picture shows the President:
A. from the waist up
B. from the neck up
2. The President is wearing a brimmed hat.
A. true
B. false
3. Locate the President's mouth.
A. (successful)
B. (not successful)
4. Locate the President's eyes.
A. (successful)
B. (not successful)
8. Two-shot of Beavis and Butthead: This is a picture from MTV's cartoon
show "Beavis and Butthead," with the two stars of the show sitting on a
couch. Butthead is on the left and Beavis is on the right.
1. The one on the left, Butthead, is facing to the:
A. left
B. front
2. The one on the right, Beavis, is facing to the:
A. left
B. right
3. Which one has more hair?
A. Butthead, on the left
B. Beavis, on the right
4. One of the two has dark hair, the other has light hair. Locate the dark
hair.
A. (successful)
B. (not successful)
9. Tornado funnel cloud in Oklahoma: This is an active tornado funnel cloud
photographed recently in Oklahoma.
1. The tornado has already touched the ground.
A. true
B. false
2. There are buildings in the path of the tornado.
A. true
B. false
3. Locate the point where the tornado funnel merges with the general cloud
cover in the scene.
A. (successful)
B. (not successful)
4. Locate the point of the tornado that is closest to, or touching, the ground.
A. (successful)
B. (not successful)
10. Electron micrograph of the Ebola virus: This is a highly magnified electron
microscope picture of the deadly Ebola virus.
1. The overall shape of the virus is:
A. straight
B. curved
2. The ends of the Ebola virus are identical.
A. true
B. false
3. The head end of the virus has 3 loops, while the tail end is a single strand.
Locate the head end.
A. (successful)
B. (not successful)
4. Locate the tail end.
A. (successful)
B. (not successful)
Appendix E
COLLECTED TACTICS PARAMETERS
Table E.1: Summary of parameters relevant to TACTICS and tactile image perception.
Factor                                                           Parameters
Ratio of tactual to visual bandwidths                            1:10000
Minimum discernible separation of two points (static)            2.5 mm
Minimum discernible displacement of a point on a smooth surface  0.002 mm
Height of braille dot                                            0.2-0.5 mm
Minimum discernible separation of grooves in grating (dynamic)   1.0 mm
Resolution of laser printer                                      11.8-23.6 dots/mm (300-600 dpi)
Resolution of microcapsule paper (expanded)                      1-5 capsules/mm
Expanded displacement of microcapsule paper                      0.2-1.0 mm
Resolution of human fingertip                                    1 dot/mm
Resolution of fingertip compares with:                           very blurry vision
Human memory organization                                        Hierarchical: general to specific
Congenital blindness                                             onset up to age 5
Adventitious blindness                                           onset after age 5
Blind population (worldwide)                                     30-40 million
Blind population (U.S.)                                          500,000
Braille fluency (U.S. blind population)                          <16%
Best size for tactile image                                      3-5 in on a side
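The unit relationships behind several of these parameters can be checked with a few lines of arithmetic. The following Python sketch is illustrative only and is not part of TACTICS: the function names are hypothetical, and the only inputs taken from Table E.1 are the 300-600 dpi printer resolution, the 1 dot/mm fingertip resolution, and the 3-5 inch image size. It converts dots per inch to dots per millimeter (dividing by 25.4 mm per inch) and estimates how many fingertip-resolvable dots fit along one side of a tactile image.

```python
MM_PER_INCH = 25.4  # exact definition of the inch in millimeters


def dpi_to_dots_per_mm(dpi: float) -> float:
    """Convert a resolution in dots per inch to dots per millimeter."""
    return dpi / MM_PER_INCH


def inches_to_mm(inches: float) -> float:
    """Convert a length in inches to millimeters."""
    return inches * MM_PER_INCH


def tactile_dots_per_side(side_in: float, fingertip_dots_per_mm: float = 1.0) -> int:
    """Estimate how many fingertip-resolvable dots fit along one side of an image.

    The default of 1 dot/mm is the fingertip resolution listed in Table E.1.
    """
    return round(inches_to_mm(side_in) * fingertip_dots_per_mm)


if __name__ == "__main__":
    for dpi in (300, 600):
        print(f"{dpi} dpi = {dpi_to_dots_per_mm(dpi):.1f} dots/mm")
    for side in (3, 5):
        print(f"{side} in side = about {tactile_dots_per_side(side)} resolvable dots")
```

Under these assumptions, a 300-600 dpi printer works out to roughly 11.8-23.6 dots/mm, an order of magnitude finer than the fingertip can resolve, and a 3-5 inch image offers only about 76-127 resolvable dots per side.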
Appendix F
HUMAN SUBJECTS REVIEW BOARD EXEMPTION
Appendix G
TACTILE IMAGE EXAMPLES
Figure G.1: Electron micrograph of Ebola Zaire virus before and after processing
with TACTICS. (CDC)
Figure G.2: Figure G.1 expanded on microcapsule paper.
Figure G.3: Image of space shuttle Challenger landing before and after processing
with TACTICS. (NASA)
Figure G.4: Figure G.3 expanded on microcapsule paper.
Figure G.5: Image of moon before and after processing with TACTICS. (NASA)
Figure G.6: Figure G.5 expanded on microcapsule paper.
Figure G.7: Image of a face before and after processing with TACTICS. (US Govt)
Figure G.8: Figure G.7 expanded on microcapsule paper.
Figure G.9: Image of a desktop computer before and after processing with
TACTICS. (public domain)
Figure G.10: Figure G.9 expanded on microcapsule paper.
Figure G.11: Image of a tornado in Oklahoma before and after processing with
TACTICS. (public domain)
Figure G.12: Figure G.11 expanded on microcapsule paper.
Figure G.13: Image of Emma before and after processing with TACTICS.
(personal)
Figure G.14: Figure G.13 expanded on microcapsule paper.