automatic generation of tactile graphics
AUTOMATIC GENERATION OF TACTILE GRAPHICS

by

Thomas P. Way

A thesis submitted to the Faculty of the University of Delaware in partial fulfillment of the requirements for the degree of Master of Science in Computer and Information Sciences

Fall 1996

© 1996 Thomas P. Way
All Rights Reserved

Approved: Kenneth E. Barner, Ph.D., Professor in charge of thesis on behalf of the Advisory Committee
Approved: Errol L. Lloyd, Ph.D., Professor in charge of thesis on behalf of the Advisory Committee
Approved: Errol L. Lloyd, Ph.D., Chairman of the Department of Computer and Information Sciences
Approved: John C. Cavanaugh, Ph.D., Interim Associate Provost for Graduate Studies

ACKNOWLEDGMENTS

Work on this project was performed at the University of Delaware's Applied Science and Engineering Laboratories, operated jointly with and located at the Alfred I. duPont Institute, in Wilmington, Delaware. As part of the "Science, Engineering and Mathematics" project, funding was provided by the National Science Foundation, grant number HRD-9450019. Additional funding was provided by the Nemours Research Programs.

I thank Dr. Barner and Dr. Richard Foulds for their wisdom and insight, and for providing me with the opportunity to perform graduate work at the Applied Science and Engineering Laboratories. Thanks to Dr. Lloyd for his generous support and encouragement, particularly as the deadline approached. Heartfelt appreciation goes to Dr. Lori Pollock for her guidance and moral support. Special thanks are extended to my colleagues in the SEM project and at the Applied Science and Engineering Laboratories for their suggestions, criticism and praise. Sincere appreciation is extended to the fourteen brave souls who gave of their time to serve as "human lab rats" in the two incarnations of my experiments.
This thesis and the hours of research and intense study it represents would not have been possible without the encouragement and financial support of my parents, Stan and Laurie Way, my grandmothers, Ellen B. Way and Margaret M. Pelland, and my mother-in-law, Mary B. Larsen. The caring and support of my family, from my brother John, sisters Melinda and Julie, and brother-in-law Ray, to my extended family that is sprinkled about the country in California, Connecticut, Illinois, Maine, Maryland, Michigan, Virginia, and Washington, has been a great source of strength throughout this endeavor. Thanks also to my cats Bess, George, Harriet and Onessa for always making sure that my clothing was coated with plenty of cat hair at the start of each day. Thanks to my daughter Emma for reminding me daily that there are things far more important than research, reading lists and report cards. Finally, a special thank you is long overdue to my wife Laura, for her love, understanding and strength in the face of this unpredictable and arduous journey through graduate school. I appreciate it more than you can know.
TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
ABSTRACT

Chapter

1 INTRODUCTION
2 BACKGROUND
  2.1 The Human Sensory System
    2.1.1 Information and Sense
    2.1.2 Bandwidth Comparison
  2.2 Tactual Perception
    2.2.1 Cutaneous Sensing
    2.2.2 Spatial Sensing
    2.2.3 Tactile Pattern Perception
    2.2.4 Aiding Comprehension
  2.3 The Blind Population
    2.3.1 Definition of Terms
    2.3.2 Misconceptions
    2.3.3 The Blind Computer User
  2.4 Access Technology for Blind Computer Users
    2.4.1 Static Tactile Graphics
      2.4.1.1 Raised-line drawing boards
      2.4.1.2 Tactile-experience pictures
      2.4.1.3 Buildup displays
      2.4.1.4 Embossed paper displays
      2.4.1.5 Braille graphics
      2.4.1.6 Vacuum-forming method
      2.4.1.7 Microcapsule paper
      2.4.1.8 Other methods
      2.4.1.9 Summary
    2.4.2 Auditory Interfaces
    2.4.3 Dynamic Tactile Interfaces
    2.4.4 Haptic Interfaces
    2.4.5 Dynamic Tactile Display Research
    2.4.6 Moving Toward Effective Tactile Display of Graphics
  2.5 Representation of Images
    2.5.1 Quantization
    2.5.2 Computerized Representation
  2.6 Image Processing
    2.6.1 Applicability to Tactual Perception and TACTICS
3 TACTICS: TACTILE IMAGE CREATION SYSTEM
  3.1 Automatic Generation of Tactile Graphics
  3.2 Genesis of TACTICS
  3.3 Image Processing Algorithms
    3.3.1 Notation
    3.3.2 Edge Detection
    3.3.3 Blurring
    3.3.4 Segmentation
    3.3.5 Negation
    3.3.6 Median Filtering
  3.4 Image Processing Tools
  3.5 Tactile Imaging
    3.5.1 Description
    3.5.2 Development
    3.5.3 Sequencing of Algorithms
  3.6 Tactile Output
    3.6.1 Microcapsule Paper
    3.6.2 Tactile Image Enhancer
    3.6.3 Additional Equipment
  3.7 Experimental Procedure for Tactile Image Creation
    3.7.1 Acquisition of Images
    3.7.2 Simplification
    3.7.3 Tactilization
4 EVALUATION OF TACTICS
  4.1 Overview of Experimental Protocol
    4.1.1 Selection of Subjects
    4.1.2 Production of Materials
    4.1.3 Aggregate Image Processes
    4.1.4 Psychophysics and Experimental Procedure Justification
      4.1.4.1 Detection
      4.1.4.2 Discrimination
      4.1.4.3 Identification
      4.1.4.4 Comprehension
  4.2 Experiments
    4.2.1 Pilot Study
      4.2.1.1 Subjects
      4.2.1.2 Materials
      4.2.1.3 Procedure
      4.2.1.4 Results
      4.2.1.5 Discussion of pilot study
    4.2.2 Simple Discrimination Experiment
      4.2.2.1 Subjects
      4.2.2.2 Materials
      4.2.2.3 Procedure
      4.2.2.4 Results
    4.2.3 Timed Discrimination Experiment
      4.2.3.1 Subjects
      4.2.3.2 Materials
      4.2.3.3 Procedure
      4.2.3.4 Results
      4.2.3.5 Comparison with simple discrimination
    4.2.4 Identification Experiment
      4.2.4.1 Subjects
      4.2.4.2 Materials
      4.2.4.3 Procedure
      4.2.4.4 Results
    4.2.5 Comprehension Experiment
      4.2.5.1 Subjects
      4.2.5.2 Materials
      4.2.5.3 Procedure
      4.2.5.4 Results
    4.2.6 Significance of Results
5 OBSERVATIONS, DISCUSSION AND CONCLUSIONS
  5.1 Observations
  5.2 Discussion
  5.3 Conclusions
6 FUTURE DIRECTIONS
  6.1 Development of End User Application
  6.2 Extension to Refreshable Tactile Display
  6.3 Multimodal Interface
  6.4 Mapping Color to Texture
BIBLIOGRAPHY

Appendix

A LISTING OF IMAGES
  A.1 Pilot Study Images
  A.2 TACTICS Evaluation Images
B SIMPLE AND TIMED DISCRIMINATION IMAGE PAIRINGS
  B.1 Preparation
  B.2 Flexi-Paper Pairs
  B.3 Matsumoto Kosan Paper Pairs
C IDENTIFICATION EXPERIMENT IMAGES AND CATEGORIES
  C.1 Preparation
  C.2 Listing of Images and Categories
D COMPREHENSION EXPERIMENT IMAGES, DESCRIPTIONS AND QUESTIONS
  D.1 Preparation
  D.2 Listing of Images, Descriptions and Questions
E COLLECTED TACTICS PARAMETERS
F HUMAN SUBJECTS REVIEW BOARD EXEMPTION
G TACTILE IMAGE EXAMPLES

LIST OF FIGURES

1.1 Astronaut Edwin E. Aldrin, Jr. poses beside a deployed U.S. flag on the surface of the moon. (NASA)
1.2 First ever electron micrograph of Ebola Zaire virus, taken by Dr. F. A. Murphy at the Centers for Disease Control in 1976. Diagnostic specimen in cell culture at 160,000X magnification. (CDC)
1.3 Figure 1.1 after image was processed using TACTICS.
1.4 Figure 1.2 after image was processed using TACTICS. See Appendix G for samples of expanded tactile images.
2.1 Microcapsule paper (enlarged view) showing layer of polystyrene microcapsules on polyethylene or paper transport medium.
2.2 Microcapsule paper after image is affixed to the surface by photocopying or ink drawing.
2.3 Simplified view of the Tactile Image Enhancer, showing internal workings of the device for expanding previously exposed microcapsule paper.
2.4 Microcapsule paper after exposure in image enhancer, showing expanded capsules. Note that capsules may not expand fully when only partially covered by printing, although this degree of expansion is unpredictable.
2.5 Telesensory's Optacon II in action [83]. User places index finger of one hand on vibrotactile pin array and guides scanner across material to be viewed with other hand. (Telesensory)
2.6 Layout of the vibrotactile pin matrix display of the Optacon.
2.7 Active pin matrix display of the Optacon, demonstrating display of the capital letter S.
2.8 Tactile Vision Substitution System (TVSS) [98].
3.1 Format of two-dimensional image.
3.2 Before and after Sobel edge detection algorithm. (public domain)
3.3 Image before and after application of blurring algorithm.
3.4 Image before and after application of K-means segmentation algorithm, with K = 2.
3.5 Image before and after application of an adaptive K-means segmentation.
3.6 Image before and after application of negation algorithm.
3.7 A noisy processed image before and after the application of median filtering.
3.8 Tactile Image Enhancer. (Repro-Tronics)
4.1 Original unprocessed grayscale image of the chimney end of a house. (public domain)
4.2 Image of house before and after processing using Sobel edge operator with thresholding.
4.3 Image of house before and after processing using Sobel edge operator without thresholding.
4.4 Image of house before and after processing using K-means adaptive segmentation algorithm.
4.5 Image of house before and after processing using Sobel edge operator without thresholding followed by K-means segmentation.
4.6 Comparison of effect of Sobel edge detection using fixed thresholding from Figure 4.2 (left) with Sobel edge detection utilizing adaptive K-means segmentation (for thresholding) from Figure 4.5 (right).
4.7 Image of house before and after processing using K-means segmentation followed by Sobel edge detection.
4.8 Images of a face demonstrating the difference between two sequences of processing. From left to right: Original image, image after Sobel edge detection without thresholding followed by K-means segmentation, and image after K-means segmentation followed by Sobel edge detection. (US Govt)
4.9 Image of house before and after processing using the aggregate sequence of processes: blurring, Sobel edge detection without thresholding, K-means segmentation and median filtering.
4.10 Comparison of image of house using the aggregate process from Figure 4.9 (left) and the same aggregate sequence of processes with the exception of the initial blurring step (right).
G.1 Electron micrograph of Ebola Zaire virus before and after processing with TACTICS. (CDC)
G.2 Figure G.1 expanded on microcapsule paper.
G.3 Image of space shuttle Challenger landing before and after processing with TACTICS. (NASA)
G.4 Figure G.3 expanded on microcapsule paper.
G.5 Image of moon before and after processing with TACTICS. (NASA)
G.6 Figure G.5 expanded on microcapsule paper.
G.7 Image of a face before and after processing with TACTICS. (US Govt)
G.8 Figure G.7 expanded on microcapsule paper.
G.9 Image of a desktop computer before and after processing with TACTICS. (public domain)
G.10 Figure G.9 expanded on microcapsule paper.
G.11 Image of a tornado in Oklahoma before and after processing with TACTICS. (public domain)
G.12 Figure G.11 expanded on microcapsule paper.
G.13 Image of Emma before and after processing with TACTICS. (personal)
G.14 Figure G.13 expanded on microcapsule paper.

LIST OF TABLES

2.1 Summary of information bandwidth limitations for three senses [45].
2.2 Reading modes used by a group of 7,987 totally blind students.
4.1 Summary of per subject average results of the tactile image matching task for five image processes [94].
4.2 Summary of overall results of simple discrimination task for four image processes. The Aggregate Process is comprised of blurring, Sobel edge detection without thresholding, K-means adaptive segmentation, and median filtering, applied in that order.
4.3 Summary of percentage of correct responses comparing effects of two varieties of microcapsule paper on simple discrimination task.
4.4 Summary of percentage of correct responses comparing results of blind versus sighted subjects performing simple discrimination task.
4.5 Summary of overall results of timed discrimination task for four image processes.
4.6 Summary of percentage of correct responses comparing effects of two varieties of microcapsule paper on timed discrimination task.
4.7 Summary of percentage of correct responses comparing results of blind versus sighted subjects performing timed discrimination task.
4.8 Summary of percentage of correct responses comparing results of all subjects on simple discrimination versus timed discrimination tasks.
4.9 Summary of overall results of identification task for four image processes.
4.10 Summary of percentage of correct responses comparing results of blind versus sighted subjects performing identification task.
4.11 Summary of results of comprehension task for three subtasks and overall comprehensibility of tactile images prepared using Aggregate process.
4.12 Summary of percentage of correct responses comparing results of blind versus sighted subjects performing comprehension task.
E.1 Summary of parameters relevant to TACTICS and tactile image perception.

ABSTRACT

Access to visual information by blind and visually impaired persons is often achieved through its manual translation into tactile form.
This conversion is a time-consuming effort involving the use of glue, string, scissors, cardboard and other craft materials, tracing paper and marking pens, or computer-aided drawing packages, to produce a tangible representation of the original image. Although worthwhile, such an approach is neither timely nor easily reproducible, and clearly necessitates the involvement of a specially skilled sighted individual in the process.

Computers excel at displaying information via multiple media, including the CD-ROM and ubiquitous Internet. This omnipresence of the computer in everyday life provides ready access to a myriad of graphical, textual and auditory information for sighted and blind individuals alike. For blind computer users, text-based information is output as synthesized speech or as braille via a special purpose printer or display. The surging prevalence of the graphical user interface (GUI), however, introduces severe impediments for the blind community, resulting from a forsaking of the textual in favor of the visual. This trend toward visual display techniques means that, in the midst of the dawning "Information Age," a blind person has reduced access to information. Pictures, drawings, video and animation are not directly accessible to the blind computer user.

This thesis develops a composite software/hardware system for automatic translation of electronic images into tactile form. In this system, an aggregate process comprised of a sequence of image processing algorithms is applied to an image to produce a simplified version of the original. This caricaturized image is subsequently output in a raised tactile graphic form on microcapsule paper, suitable for display to a blind person. To motivate the techniques used in this system, topics in human perception, tactile graphics production and image processing are explored.
To provide access to visual information for blind persons, an understanding of how we as humans interface with the world around us and how tactile graphics are produced is vital. A summary of pertinent background regarding human factors and perceptual issues, particularly as they relate to blindness, is provided. The technologies and techniques for tactile graphic production are reviewed, as is current research in this area. The use of image processing techniques for purposes of simulating some aspects of the visual system is justified, and applicable algorithms for such processing are discussed. Presented next are the specific techniques used in this system to produce tactile images from visual ones rapidly and automatically. The efficacy of these techniques is examined in terms of recognizability, classifiability and comprehensibility as measured in a series of experiments. Finally, future directions in which this work may lead are discussed.

Chapter 1

INTRODUCTION

"One picture is worth a thousand words [36]." So reads the well-worn age-old adage.

The professor describes a particularly challenging concept to the class and finds herself awash in a sea of blank stares. What does she do next? She puts chalk to slate and illustrates the difficult topic with a diagram. The glazed looks are replaced with light bulbs.

A man is strolling along a busy city sidewalk. He reads the newspaper as he walks, oblivious to where he is stepping. We see a banana peel on the walkway ahead. Instantly, we anticipate what will happen next. Moments later, a shoe lands squarely on the peel and the newspaper flies into the air as the man tumbles to the pavement.

Consider the remarkable live television image of a human being standing on the surface of the moon in 1969 (Figure 1.1). Consider the consequences of the sinister microscopic Ebola virus, so deadly it kills its prey in weeks or even days (Figure 1.2). Words alone cannot express the full impact of such images in the way that the pictures can.
Visual information, whether to illustrate a point, make us chuckle or inspire us, is all around us and speaks to us in a most powerful way.

Suppose that you do not have the sense of sight. All of those pictures are now virtually inaccessible to you. A diagram on the chalkboard is nothing more than a series of bone-jangling squeaks. A walk down a city sidewalk is a cacophony of footsteps, car horns and street vendors. The television screen is just a slightly curved piece of glass that crackles with static electricity when you brush your fingers across it. A photograph is simply a slick piece of paper. Blindness eliminates access to the myriad of visual information that many of us rely on to make our way in the world each day.

Figure 1.1: Astronaut Edwin E. Aldrin, Jr. poses beside a deployed U.S. flag on the surface of the moon. (NASA)

Figure 1.2: First ever electron micrograph of Ebola Zaire virus, taken by Dr. F. A. Murphy at the Centers for Disease Control in 1976. Diagnostic specimen in cell culture at 160,000X magnification. (CDC)

For persons who are blind, the answer to this impaired access is to rely upon the senses of hearing and touch. In 1834, Louis Braille perfected an embossed-dot code for sightless reading and writing, a code regarded as one of the most significant contributions to the education of blind persons, which replaced a less effective embossed-letter system [57]. With braille, letters of the alphabet, numbers and other symbols are represented by raising various combinations of dots in a six dot (2 × 3) or eight dot (2 × 4) rectangular array. Braille is the standard method for producing books that blind persons can read [26]. Computers can convert the written word into speech, so that any text printed on the computer screen can be spoken aloud. Neither braille nor audio output, however, can yet provide good access to raw visual information [32].
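As a small illustration of the dot-array idea just described, the sketch below models a braille cell as a set of raised dot numbers and converts it to the corresponding Unicode braille pattern character. The dot numbering (1 to 3 down the left column, 4 to 6 down the right, 7 and 8 for the added bottom row of an eight-dot cell) follows the usual convention; the function names and the three-letter table are illustrative assumptions and are not part of this thesis or of TACTICS.

```python
# Illustrative sketch only: a braille cell as a set of raised dot numbers.
# Dots 1-3 run down the left column, dots 4-6 down the right column;
# dots 7 and 8 form the extra bottom row of an eight-dot cell.

# A deliberately tiny, hypothetical letter table for demonstration.
LETTER_DOTS = {
    "a": {1},
    "b": {1, 2},
    "c": {1, 4},
}


def cell_to_unicode(dots):
    """Map a set of raised dots (1-8) to its Unicode braille pattern.

    The Unicode braille block starts at U+2800, and dot n sets bit n-1.
    """
    code = 0x2800
    for dot in dots:
        code |= 1 << (dot - 1)
    return chr(code)


def cell_to_grid(dots, rows=3):
    """Render a cell as text rows, 'o' for a raised dot and '.' for flat.

    rows=3 gives the six-dot cell, rows=4 the eight-dot cell.
    """
    left = [1, 2, 3, 7][:rows]
    right = [4, 5, 6, 8][:rows]
    return ["".join("o" if d in dots else "." for d in pair)
            for pair in zip(left, right)]


if __name__ == "__main__":
    for letter, dots in LETTER_DOTS.items():
        print(letter, cell_to_unicode(dots), cell_to_grid(dots))
```

A set of dot numbers keeps the representation independent of output device: the same cell can be drawn as text, sent to a display, or embossed.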
Future applications may one day include an artificially intelligent image-to-text converter that would examine a computerized picture and generate a textual description that could subsequently be output as speech. This problem, known as Image Understanding, remains unsolved due to (1) information lost when a two-dimensional image is created from the three-dimensional world, (2) object occlusion, and (3) the effects of inter-reaction of visual phenomena on the value of each pixel [9, 12, 74, 78]. Specific applications of image understanding techniques are currently in use in the areas of mobile robot navigation, complex manufacturing tasks, medical image processing, and analysis of satellite images [74]. However, solving this difficult problem of artificial vision in the general case is not likely to happen soon. The richness and variety of the spatial information contained in the visual domain requires thousands upon thousands of words to describe the complete content adequately, a method which is of questionable practicality and far beyond the state of the art. This absence of practical image description techniques points out the necessity for effective tactile rendering.

One straightforward method of providing access to an image for blind people is to associate with the image a brief textual description that can be accessed at will. Such a description necessitates special preparation by a sighted person [8, 25]. Furthermore, this preparation must be done for every image to assure complete accessibility. For limited applications this method might be feasible, but it is not practical in the general case. For example, with the literally millions of images on the Internet already, and a constant stream of new ones pouring onto the network daily, it is difficult to envision providing a textual description for each. There are certain images that, because of their particular timeliness or rapid change, would be quite difficult to describe adequately using text or audio.
Consider images of a satellite weather map that may be updated once a second, twenty-four hours a day. Clearly, inclusion of an individually written textual description of such images, even given the current state of the art, is improbable at best.

The sense of touch is relied upon frequently by blind persons in lieu of sight. One common method of presenting visual images in a touchable or tactile fashion is through use of tactile graphics. The term tactile refers to the sense of touch [55]. Tactile graphics provide a raised representation of such visually useful materials as maps, graphs and other simple drawings. By current practice, these are prepared by a sighted person individually and by hand. This preparation is neither timely nor efficient. Timeliness, however, is not a major issue for infrequently changing items, such as maps [25].

Many blind persons rely upon the computer as a pipeline connecting them to a deep well of easily accessible textual information. The dawning of the so-called "Information Age" has brought with it a shift from textual to graphical representation of information. Everywhere one goes on the Internet, glitzy icons, images and animation are replacing words. This explosive growth of reliance upon graphics as the choice for information presentation, which includes the dominance of the graphical user interface (GUI), has had a significant positive impact on sighted computer users and a drastic negative one on blind computer users [11, 92]. The volume of graphical information residing on the Internet and present by the very nature of the GUI paradigm makes it impractical to include a textual description with each and every graphic.

Some barriers are overcome with new commercial GUI-friendly screen review and speech synthesis software and hardware. These systems can be combined with well-developed technologies such as embossed braille printers and braille cell displays to provide limited access.
Directory navigation and text-based tasks such as word processing in the GUI environment can be handled by keeping an off-screen model of the on-screen graphics [65, 71]. In such a model, words are drawn on the computer screen as pictures of these words, collections of pixels set to the right color and intensity in the right positions on the screen. Meanwhile, each word of the original text is kept in a location in the computer memory, and is associated with its picture on the screen. In this way, words can be provided to a speech synthesizer or braille display, thus giving access to a person who cannot see the screen [15, 32].

One commercial device, the Optacon (see Figure 2.5), can produce a vibrotactile representation of whatever words or drawings pass underneath its hand-held scanner. The Optacon display is a fingertip-sized matrix of tiny pins that vibrate individually in response to an object, such as text or a simple drawing, viewed by the scanner [83]. Thus, an "a" feels like a vibrating letter "a" to the finger. Unfortunately, this "a" may also feel like similarly shaped letters such as "c", "e", "o", "s" or "u". This inherent ambiguity means that even with quite a bit of training, reading with an Optacon is slow [10, 21, 22]. Other experimental means, including an Optacon-based device, have proven successful in affording limited tactile access to very simple symbols, providing a means to distinguish between, for instance, a circle, a triangle and an X [99].

These methods, however, fall short in their ability to provide access to complex visual information, such as photographs. A photograph is a two-dimensional depiction of a three-dimensional view. By complex we mean an image with the qualities of a typical photograph. These qualities include having many shades of color and levels of intensity, shadows and other depth cues such as overlap and relative size, and complicated shapes.
We glean clues about orientation of, and relative positions among, objects in a photograph from shadows, shape and size [18]. Presenting such complex information as a tactile image is nontrivial, to say the least.

Tactile imaging is the process of turning a visual item, such as a picture, into a touchable raised version of the image so that this tactile rendition faithfully represents the original information. Properly done, tactile imaging provides access for blind persons to visual information that is inaccessible via other means such as audio or textual description.

Tactual perception, the physiological capabilities of the human sensory system to explore and discern via the sense of touch, is well understood. Factors such as the size and shape of the fingertip, temporal and spatial response of the nerve receptors in the skin, and incorporation of kinesthetic, or haptic, cues must be considered. These factors limit the size and detail of tactile images to within the response ranges of these various factors [55, 82].

The way in which the mind perceives and classifies images is a well-studied area, one in which a number of theories have developed. Among these, perhaps the most accepted view is that of human memory being arranged hierarchically from general to specific in terms of one or more qualities of the object being perceived. Whether the information is visual or tactile, the brain uses this same general framework for classification [18, 42, 46].

Thus, producing usable tactile images from photographs is a challenge requiring a careful balance of resolution, size, shape and detail. Having too much detail in a tactile image will result in much of its content being lost, actually degrading its clarity and utility due to an information overload of sorts. This overload results from limitations of tactual perception, particularly the physiological disparity between the resolution of the human eye and fingertip.
Including too little detail will result in a tactile image that may not feel like anything more than a simple shape, not adequately representing the original image at all [43]. This ambiguity is due to the manner in which the brain categorizes what it perceives, in this case classifying tactually indistinguishable items as the same, even though the unprocessed visual originals may have been quite different.

In this thesis, one major step toward creating access to complex visual images is considered. The well-studied areas of tactual perception, the human sensory system in general, image processing techniques, and tactile graphics are discussed. To justify a heretofore unexplored combination of factors and theories from these areas, a broad array of necessary background information is provided. This information plays a formative and vital role in the motivation of this research. These techniques, taken individually, are common, general and well-known. Taken as a whole and in very specific combinations, the results are unique and noteworthy.

Specifically, the content begins with a brief review of the human sensory system, focusing on how we interface with our world. Relevant statistics regarding the blind population are presented, and an overview of blind computer user interface technology is provided. Perception at the tactual and mental levels and related human performance parameters are discussed, and the visual and tactual senses are contrasted. This content propels a further discussion of image processing techniques that can roughly simulate the abilities of these perceptual systems.

This background information then is used to motivate a discussion of a new method for the automatic translation of visual images into tactile images called the TACTile Image Creation System (TACTICS) [95, 96]. This prototype system provides access to previously inaccessible visual information using image processing and tactile graphics production techniques.
The goal of this system is to free the blind computer user from reliance upon a sighted individual to prepare custom tactile graphics, or tactics [30], and to overcome the considerable time delay in doing so. Tactile images (Figure 1.3 and Figure 1.4) of photographic images are produced by TACTICS in seconds or minutes as opposed to hours or days. The components and mechanics of the system, including the image processing algorithms, output medium and overall procedure for conversion of images from visual to tactile, are described as well.

Justification for the techniques used by TACTICS is presented in the form of results of a series of experiments. These experiments explore a range of tasks from simple discrimination to image content comprehension. From a careful analysis of the results, conclusions are drawn regarding the effectiveness of TACTICS, and future extensions to the system, as well as related areas of future work, are proposed.

It is hoped that the reader of this thesis will have a "Why didn't I think of that?" reaction to the research and results presented here. Although the scientific underpinnings of this system are comprehensive and complex, the system itself is straightforward, elegant and intuitive. If the eventual effect of this research is to afford better access to visual information for blind persons, then perhaps it will be judged to be a valuable contribution to science.

Figure 1.3: Figure 1.1 after image was processed using TACTICS.

Figure 1.4: Figure 1.2 after image was processed using TACTICS. See Appendix G for samples of expanded tactile images.

Chapter 2

BACKGROUND

The efficacy of a method for automatically converting visual information into tactile information necessarily is dependent upon a variety of factors, which are reviewed in this chapter. To guide the design of such a system, an understanding of the human factors of sensation and perception, including how the sense of touch compares to the sense of sight, is important.
There are lessons to be learned from past and current techniques for tactile graphic production and other non-visual methods used by blind persons to access computer-based information. The medium for the description of visual information that is under consideration in this thesis is the computerized image. How such images are represented and the techniques that can be used to operate upon them are explored, and their correspondence to human tactual perception is considered. The background provided in this chapter will be used to motivate the prototype system and experimental protocol detailed in following chapters (see footnote 1).

Footnote 1: See Appendix E for a summary of various parameters related to tactual perception and affecting the design of TACTICS.

2.1 The Human Sensory System

The fundamental issue in presenting visual information in a meaningful tactile form is the understanding of some basics of human sensory perception. By reviewing how the human sensory system collects and comprehends information and what the limits are to the type and amount of information the senses can process, it may be possible to identify factors that can play a role in the conversion of information intended for one sense to a form suitable for another sense.

2.1.1 Information and Sense

Humans receive all of their information about the world around them using one or more of five senses [18]. The Gustatory Sense provides information on taste qualities such as sweet, salty, sour and bitter. Often working in conjunction with taste is the Olfactory Sense, which provides smell information. The Auditory Sense, our hearing, allows us to receive auditory information such as music, speech and noise. The Tactual Sense is comprised of touch and kinesthesis, providing information about such physical world qualities as temperature, perception of texture, position and motion. Finally, the Visual Sense, our sense of sight, is how we receive visual information including color, brightness, depth of field, and motion.

2.1.2 Bandwidth Comparison

The bandwidth of a sense refers to the capacity of that sense to receive and perceive information. Studies show that vision, as one might intuitively expect, is our highest bandwidth sense, followed by hearing and touch (Table 2.1) [45]. The Visual Sense is two orders of magnitude better at carrying information than the Auditory Sense, which is two orders of magnitude better than the Tactual Sense. The Gustatory and Olfactory Senses are much more prone than the others to the effects of adaptation, and are not efficient at carrying information at a rate anywhere near that of even the Tactual Sense. Adaptation refers to the tendency of a sense to grow accustomed to a stimulus, thereby becoming less sensitive to it over time. Taste and smell are prone to adaptation and have comparatively slow recovery times, while the other three senses have speedier recovery times that are roughly proportional to their bandwidths.

Table 2.1: Summary of information bandwidth limitations for three senses [45].

Sense Modality                        Limit (bits/sec)
Skin (vibrotactile) (see footnote 2)  10^2
Ear                                   10^4
Eye                                   10^6

As the highest bandwidth and most resilient sense, vision is clearly of the greatest importance among the senses, and therefore the hardest to do without. By comparison, the other senses have lower to much lower information capacities, which makes the problem of sensory substitution for vision a difficult one to address [18, 43].

The implications for development of a vision substitution system are significant by virtue of this large bandwidth disparity. Visual information cannot simply be mapped directly to the auditory or tactual domains, but clearly must be reduced by some bandwidth-correlated scaling factor. Further, this scaling must preserve the meaning of the original visual information to be useful. It is this information reduction task which forms the basis for the system we develop in this thesis.
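
The size of the bandwidth disparity can be made concrete with a small calculation. The following is a minimal Python sketch, not part of TACTICS, that uses only the order-of-magnitude limits from Table 2.1. The names `BANDWIDTH_BITS_PER_SEC` and `reduction_factor` are hypothetical and introduced purely for illustration.

```python
# Order-of-magnitude information limits from Table 2.1, in bits per second.
# The dictionary and function names are illustrative assumptions.
BANDWIDTH_BITS_PER_SEC = {
    "skin": 1e2,  # vibrotactile
    "ear": 1e4,
    "eye": 1e6,
}


def reduction_factor(source: str, target: str) -> float:
    """Return how many times the source sense's limit exceeds the target's."""
    return BANDWIDTH_BITS_PER_SEC[source] / BANDWIDTH_BITS_PER_SEC[target]


eye_to_ear = reduction_factor("eye", "ear")
eye_to_skin = reduction_factor("eye", "skin")
print(f"eye -> ear:  reduce by a factor of about {eye_to_ear:g}")
print(f"eye -> skin: reduce by a factor of about {eye_to_skin:g}")
```

Under these figures, visual information would have to be reduced by roughly four orders of magnitude to fit the vibrotactile channel. Because the table gives only order-of-magnitude limits, this is a rough indication rather than a precise scaling factor.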

Footnote 2: The results of previous research indicated that the human fingertip processes vibrotactile signals at a rate no more than 10^1 bits/sec [20, 31, 48].

2.2 Tactual Perception

Tactual perception primarily refers to active exploratory and manipulative touch. Study of the physiological factors involved in tactual perception is important if one is to gain an understanding of how best to create tactile images. For a tactile image to be useful, a blind person must be able to explore it with the sense of touch, usually the fingers, and extract some content information. Thus, limits to tactual perception, such as resolution of the human fingertip, image scale as a factor of comprehension, and how the mind processes such information are important considerations [54, 55].

2.2.1 Cutaneous Sensing

The basic physiology of the human skin defines limits to the ability of our sense of touch. Of particular importance to tactile graphics are the difference limen and its relation to temporal response thresholds and masking phenomena. The difference limen is the minimum statically discernible displacement between two points such that the points are distinct. In effect, this is tactile resolution, which for the skin of the fingertip is approximately 2.5 mm. When statically felt, two points closer than this distance tend to feel like one point, whereas two points farther apart than this feel like two distinct points [82]. This figure indicates that the resolution of the fingertip is much lower than that of the human eye. Therefore, we can safely say that tactile images require lower resolution than visual images. The definitive work on this two-point threshold, including its use as an indicator of the relative spatial resolution as a function of body locus, is in [97].

2.2.2 Spatial Sensing

Spatial sensing incorporates what we know about static sensing, embellished with further measurements of sensory abilities taken during motion of the finger [80].
Related to the two-point difference limen is the minimum discernible displacement of a point on a surface. For highly smooth surfaces and under carefully controlled laboratory conditions, a 2-micron high point can be felt using active touch [50]. The height of a braille dot, an easily discernible object, is in the range 0.02 to 0.05 cm [26]. This is a generally acceptable range of heights for tactile graphics, with heights at the upper end of the range naturally providing relative improvements in perceptibility [55], much as brighter lighting or higher volume can improve perceptibility in the visual and auditory domains. The limiting factor for the height of tactile graphics is inherent in the media in which they are produced.

Spatial tactile discrimination has been measured using square-wave gratings of varying groove amplitudes and separations under conditions of active exploration [39, 55, 82]. Sequences of gratings were presented to the distal pad of the right index finger in both the same and orthogonal orientations to the axis of the finger. Observers noted differences in orientation of the grooves, which revealed the distance at which orientation of grooves became indiscriminable. This study demonstrated that the minimum tactually discernible grating resolution is 1.0 mm, and that such discrimination improves linearly as the grating width increases above 1.0 mm. This result is due to the forward masking effect of one stimulus upon perception of subsequent stimuli. The cutaneous receptors in the skin require a period of time to recover after cessation of one stimulus before correct sensing of a subsequent stimulus can begin [54].

Taken together, these factors appear to indicate that the resolution of a tactile image should be somewhat finer than 1 dot/mm to produce a relatively smooth feel to the image, while resolutions much lower than this seem to provide little or no benefit to tactile perceptibility.
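
A resolution stated in dots/mm can be converted to the dots/inch figure commonly quoted for printers. The following is a minimal Python sketch of that unit conversion, offered as an illustration rather than as part of the system described in this thesis. The function names are hypothetical, and the 300 dots/inch printer value in the usage lines is an illustrative input.

```python
MM_PER_INCH = 25.4  # exact by definition of the inch


def dots_per_mm_to_dpi(dots_per_mm: float) -> float:
    """Convert a resolution in dots/mm to dots/inch."""
    return dots_per_mm * MM_PER_INCH


def printer_is_sufficient(printer_dpi: float, required_dots_per_mm: float) -> bool:
    """Check whether a printer resolves at least the required dots/mm."""
    return printer_dpi >= dots_per_mm_to_dpi(required_dots_per_mm)


required_dpi = dots_per_mm_to_dpi(1.0)
print(f"1 dot/mm corresponds to {required_dpi:.1f} dots/inch")
print("300 dots/inch printer sufficient:", printer_is_sufficient(300, 1.0))
```

The check compares only nominal resolutions; it says nothing about how the output medium reproduces fine detail once raised.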
For comparison, a resolution of 1 dot/mm equals 25.4 dots/inch, and the resolution of a standard laser printer is at least as fine as 300 dots/inch. This resolution is sufficient and significant, since the system developed in this thesis relies on a laser printer in the tactile graphics production process.

2.2.3 Tactile Pattern Perception

The visual sense responds well to minute differences in stimulus, while the sense of touch tends to need greater variation in stimulus patterns to succeed in perceptual tasks [44, 55]. Although touch can discriminate and recognize complex tactile patterns [43], such perception involves a number of complicated cognitive processes [47]. There is strong basis for the supposition that spatial information, which includes graphics, is stored in the visual cortex portion of the brain [46]. This mechanism is similar for sighted and blind persons, regardless of whether this information is gathered using the sense of sight or touch.

Research indicates that the ability to store and subsequently retrieve tactually perceived spatial information can vary greatly from individual to individual. This variation depends to a significant degree on the level of visual memory (see page 18) a blind person possesses, as often determined by the age of the onset of blindness. There is comparatively little variation in such ability among the sighted population [77]. The storage and retrieval of spatial information is believed to be organized in a hierarchical fashion in the brain, which classifies information based on gross characteristics first, followed by detailed characteristics [9, 90]. Although the resolution of the sense of touch degrades slowly with age [69], which unfortunately equates with a statistical rise in blindness [77, 79], experience with tactile graphics can make up for this slight loss of touch sensitivity [43, 93].

The method typically used by a blind person to explore a tactile graphic tends to support the hierarchical view of human spatial memory. The exploration by a blind person of a tactile graphic generally is performed in two stages. First, the entire image is explored as a whole, providing a general tactile overview. Second, the details of the tactile image are explored. Research has verified this methodology [34] and has shown that this technique is used by blindfolded sighted persons as well. These results indicate that the concept of a hierarchical structure of the human spatial memory is a reasonable assumption.

It is important to note that the acuity of the touch sense is comparable to blurred vision in similar tasks [1, 53]. The significance of this relationship is that any tactile representation of visual information, based on what we already know about tactual perception, should be sufficiently simple to make up for this reduced level of acuity [21, 25, 55, 82]. This result supports our choice of pursuing methods of image simplification in producing tactile images from their visual counterparts.

2.2.4 Aiding Comprehension

Comprehension of a tactile display is increased when the reader is somehow clued in to what will be felt [25]. Just as one expects photographs in a newspaper to have an associated caption, so too would one reasonably expect that the comprehensibility of a tactile image would be enhanced by including some associated textual information. This enhancement can be accomplished using standard techniques, such as by incorporating braille text with an image or by using speech output from a computer speech synthesizer to add information and increase comprehension.

In a photograph, information about the relative depth within the field of view of objects is provided by masking, shadows and size [18, 46]. This information is not readily discernible in a tactile format and is a factor which can inhibit the comprehensibility of a tactile image.

One surprising side effect of congenital blindness (see page 18) on comprehension is the relative insensitivity to orientation of the tactile graphic being touched. Where blindfolded sighted subjects in one study were confused by a rotated or non-upright tactile graphic representation of a known object, blind subjects suffered little confusion. These blind subjects were quite facile at mentally rotating the spatial information perceived from the graphic representation, performing much better at comprehension tasks than the sighted subjects under the same conditions [16].

Representing depth and perspective in a tactile image is difficult, if not impossible, using a two-dimensional tactile display medium. Further, the congenitally blind individual lacks a visual frame of reference for interpretation of such inherently three-dimensional information when it is mapped onto a two-dimensional display [55]. This shortcoming of two-dimensional tactile graphics display methods can be handled by some of the up-and-coming haptic display technologies (see page 33).

2.3 The Blind Population

2.3.1 Definition of Terms

The American Foundation for the Blind recommends that the term blind be reserved for individuals with no usable sight whatsoever, while low vision, visually impaired or partially sighted can be used to describe those with some usable vision. These terms coincide with standard medical diagnostic guidelines which divide visual impairment into two classifications: no light perception (NLP) and light perception (LP). An individual with corrected visual acuity of 20/200 in the better eye or a visual field of 20 degrees or less in the better eye is considered legally blind. A blind person is either congenitally blind, being blind from birth or during the first five years of life and possibly lacking visual memory, or adventitiously blind, with blindness beginning after the age of five and with the probable presence of visual memory.
Visual memory means the ability to classify and remember objects we perceive in terms of visual characteristics, such as shape, size, color, position and perspective [77].

2.3.2 Misconceptions

There exist numerous misconceptions regarding blind persons [37, 60, 77]. Positive misconceptions are that blind people are exceptionally musical, possess extraordinary senses of hearing and touch, and are highly intelligent. Negative misconceptions include suppositions of helplessness, dependence, laziness and lack of intelligence. Of particular relevance is the supposed increased sense of touch. Touch sensitivity varies little from person to person, with no statistical difference between the sighted and blind population [56]. However, it does seem reasonable that a blind person may be more accustomed to relying on the sense of touch and interpreting tactual information [5, 43].

2.3.3 The Blind Computer User

Statistics released by the World Health Organization in 1987 estimate that there are 30 to 40 million blind people in the world [77]. According to 1989 statistics from the National Society to Prevent Blindness, approximately 500,000 U.S. residents are legally blind [77]. Of those figures, roughly ten percent are totally without sight [79].

The increase in the general population's reliance upon the computer carries over to the blind population as well [11]. As the number of computer users continues to grow quite rapidly, any precise count of users would obviously be out of date even before it was written down. However, what is certain is that this number is sufficiently large to support an assertion that blind computer users make up a sizable group.

It is worth noting that the availability and affordability of synthetic speech output via computer has broadened access to information for this population as compared to braille access to the same information.
According to the American Printing House for the Blind (APH), of the blind population residing in the United States and of reading age, fewer than 16 percent are fluent in braille, while worldwide the figure is lower still [93]. Another study cites the braille fluency rate among blind and visually impaired computer users at 10 percent [32]. While these low braille literacy rates are discouraging, there is some reason for optimism in the future. In a study of school systems for blind children, more than one third of the students were found to be fluent in braille, although audio output, either in the form of recorded books or speech synthesis, was still the mode of choice at the time of the study (Table 2.2) [93, 100].

Table 2.2: Reading modes used by a group of 7,987 totally blind students.

Method                  Percentage (see footnote 3)
Aural                   61
Braille                 37
Braille & Large Type    1
Large Type              1

The size of the blind population in proportion to the general population is expected to remain steady [77, 79]. The portion of the visually impaired population that has some residual sight, and that can access computers using sight-enhancement techniques such as screen magnifiers, will not necessarily be helped by the research in this thesis. While the theories and methods developed here have wide applications, including the fields of telecommunications, rehabilitation engineering and computer vision, the focus here will be on providing access to those blind and visually impaired persons who cannot benefit from currently existing sight enhancement technology. For purposes of this thesis, this group will be referred to as blind computer users.

2.4 Access Technology for Blind Computer Users

Blind persons have a great many means for accessing textual and visual information [10, 14, 15, 17, 24, 25, 30, 29, 32, 49, 61, 88]. A number of these methods already do or can be adapted to provide blind computer users with access to graphical information.
Many traditional methods of access, such as braille output in one form or another, are, and continue to be, widely used. Their efficacy is unquestioned. Some relatively recent developments, such as speech output, are also effective and quickly merging with traditional methods to create new standards for access. Research is active in the development of dynamic and refreshable tactile displays [15, 24]. Innovations in the materials and techniques used to display visual information in a non-visual fashion are achieving some success [27, 92]. These new methods show promise, although technology continues to lag behind concept.

Footnote 3: Note that a small percentage (approximately 2%) possessed enough residual sight to make use of Large Type, either alone or in combination with Braille writing, although due to either extremely low acuity or a narrow field of view these students were classified as totally blind [100].

The task of accessing visual information is one of mapping information from the visual domain to that of one of the other senses. Knowing that this is essentially an information volume-reduction problem, given that the bandwidth of each of the other four senses is significantly lower than that of vision, it is helpful to look at some of the more successful approaches to tackling this problem before developing additional solutions. These methods fall into the general categories of Static Tactile Graphics, Auditory Interfaces, Dynamic Tactile Graphics and Haptic Interfaces. In addition to these available means, there is active research in this area that is worth reviewing as well. Note that there is no current technology available for mapping vision to the senses of taste or smell.

2.4.1 Static Tactile Graphics

Methods for production of static tactile graphics are varied and usually require the intervention of a sighted person in their preparation [25, 88].
This active participation is a consequence of the difficulty of converting visual information into tactile information, the Image Understanding Problem. Clearly a picture on a flat computer screen is of no use to a blind person, necessitating the involvement of a sighted individual should access to such a picture's content be desired.

The process of converting computer graphics to tactile graphics can be a labor-intensive and time-consuming one. There are three important steps in this process: (1) editing, (2) transferral and (3) production.

Consider any original two-dimensional graphic, such as a pencil sketch, ink drawing, graph, diagram, illustration or printed picture. For a tactile graphic display to be comprehensible, it must not contain too much information. General design guidelines, developed through years of practical application and refinement of technique, suggest that a tactile graphic should contain the least amount of information possible to convey the content of the image successfully. Clutter or an overabundance of detail in a tactile image can detract from its usability and hamper one's ability to understand its content [44, 55]. Thus, it is important to simplify complex images in the editing step of the process of converting them to tactile images. Experience shows that a tactile graphic that is too large or too small detracts from comprehensibility as well [99]. The size of a tactile image should be kept within a hand span, or roughly 3 in to 5 in on a side.

Transferral entails placing the image onto some tactile output medium. A picture is first traced on tracing paper, and then is transferred to the tactile display material using carbon paper and retracing. Other methods for transferral include the pantograph, which is an instrument consisting of four arms jointed in parallelogram form. It is adjustable to produce tracings of smaller, the same, or larger sizes.
Using grids to scale images is also a common technique, as is use of the enlargement capabilities of modern photocopier machines.

The production step is where the physical tactile graphic is produced. There are numerous methods considered standard; without exception, all require the intervention of a sighted person to translate a visual image into a tactile one. There are a number of commonly used methods for tactile graphic production [24, 25, 88], including the following:

2.4.1.1 Raised-line drawing boards

Designed to be used by blind persons for producing raised-line drawings, this common tool is also useful for fast production of tactile versions of visual originals. A stylus produces a raised line when drawn over a plastic film, giving an instant tactile representation.

2.4.1.2 Tactile-experience pictures

This method is often used for young children. Pictures are constructed of a variety of materials, including wood, plastic, cloth, sandpaper, fur, and metal, which are glued to a stiff cardboard backing. This method involves individually fashioning each piece out of the desired material and assembling the resulting pieces into the tactile picture.

2.4.1.3 Buildup displays

Similar in method to tactile-experience pictures, buildup displays rely on multiple layers of paper to build up a raised drawing. Additional materials, such as wire, string and even staples, may be added to enhance the drawing.

2.4.1.4 Embossed paper displays

This technique reproduces a drawing on heavy paper using a collection of embossing tools. A reverse view of a sketch is first transferred to the back of a sheet of embossing paper. The tools are then used to trace the sketch, embossing it as a series of raised dots.

2.4.1.5 Braille graphics

Graphics embossing can be produced more simply and speedily using a standard braille printer connected to a computer.
Operating in graphics mode, the printer maps pixels (see page 37) of the original image to braille dots to produce the embossed version of the picture. The resolution of this method is low; to be effective, the original image must be a simple line drawing. This method has two distinct advantages: many blind computer users have access to a braille printer and no sighted intervention is required for its use. Hence, with the proper processing techniques applied to images, as will be described in the discussion of TACTICS (see page 41), it may be possible to utilize such a printer to produce adequate tactile representations of pictures.

2.4.1.6 Vacuum-forming method

This method, also known as "thermoforming," excels at producing multiple copies of a tactile graphic in a very durable format. It requires a raised master made of stable or unpliable material. Next, the master is placed on a perforated tray in the vacuum-forming machine. A sheet of thin plastic is fastened over the master such that it forms an airtight cover. A heating unit is placed over the plastic as air is sucked out from below the master, deforming the now pliant plastic over the master. Once cooled, the plastic sheet is a durable replica of the original. This process can take as little as one minute, which is acceptable for producing multiple copies.

2.4.1.7 Microcapsule paper

Referred to variously as "capsule paper," "swell paper" or "puff paper," this is a quick and economical way to produce tactile graphics. It is paper that has been coated with microscopic capsules of polystyrene (Figure 2.1), each being 100 µm in diameter. There are two types of microcapsule paper available on the international market. Flexi-Paper is a polyethylene-based paper manufactured by Repro-Tronics, in Westwood, New Jersey [73]. It is tan in color and is quite durable under conditions of folding and crumpling.
The Matsumoto Kosan Company of Osaka, Japan, produces a paper-based version [58], white in color, that provides for blind persons a more familiar stiff feel resembling that of heavy braille embossing paper while being less resistant to the effects of folding than Flexi-Paper. Both are comparable in price ($1.00 U.S. per sheet).

With an unexpanded capsule diameter of 100 µm, the unexpanded resolution of both brands is therefore 10^2 capsules/cm (2.54 × 10^2 capsules/in). The capsules expand upward and outward consistently to a diameter (height) of 0.2 mm to 1.0 mm, yielding an expanded resolution of 10 to 50 capsules/cm (25 to 127 capsules/in). In practical observations in the laboratory, the typical expanded diameter is 0.3 mm and typical expanded height is 1.0 mm.

Figure 2.1: Microcapsule paper (enlarged view) showing layer of polystyrene microcapsules on polyethylene or paper transport medium.

To benefit from this expanded resolution, a printer should have a resolution of at least 127 dots/inch, the best possible resolution of expanded microcapsule paper based on manufacturers' specifications. Printing at a higher resolution will not produce a gain in tactile image resolution since the polystyrene capsules expand both upward and outward, meeting to create a contiguous surface with other expanded capsules within the range of the above noted resolution. Thus, a typical laser printer with a resolution of 300 dots/inch is entirely adequate for initial output of the image to be expanded. The amplitude of this expansion is affected by the temperature of the heating element, with higher temperatures producing slightly more pronounced expansion.

Original graphics are photocopied onto the microcapsule paper using a standard office copy machine (Figure 2.2). Graphics can also be applied to the microcapsule paper using ink pens, markers and other drawing implements. The only requirement is that the graphic be rendered in black.
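
The capsule resolution figures quoted for microcapsule paper follow directly from the capsule diameters, assuming capsules sit edge to edge along a line. The following minimal Python sketch reproduces that arithmetic; it is an illustration only, and the function names are hypothetical.

```python
MM_PER_CM = 10.0
CM_PER_INCH = 2.54


def capsules_per_cm(diameter_mm: float) -> float:
    """Capsules per centimeter, assuming capsules packed edge to edge."""
    return MM_PER_CM / diameter_mm


def capsules_per_inch(diameter_mm: float) -> float:
    """Capsules per inch for the same edge-to-edge packing assumption."""
    return capsules_per_cm(diameter_mm) * CM_PER_INCH


# Unexpanded capsules (100 micrometers = 0.1 mm), then the stated
# expanded diameter range of 0.2 mm to 1.0 mm.
for label, diameter_mm in [("unexpanded", 0.1),
                           ("expanded, smallest", 0.2),
                           ("expanded, largest", 1.0)]:
    print(f"{label}: {capsules_per_cm(diameter_mm):.0f} capsules/cm, "
          f"{capsules_per_inch(diameter_mm):.1f} capsules/in")
```

The smallest expanded diameter gives the finest expanded resolution, which is the source of the 127 dots/inch printer requirement.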
Once the image is applied to the microcapsule paper, it is inserted image side up into a heating machine, referred to as the Tactile Image Enhancer (Figure 2.3). For expanding multiple pages, each exposed sheet of microcapsule paper must be fed individually into the Enhancer. When exposed to a heat source of 120-125 degrees Celsius (248-257 degrees Fahrenheit), portions of the paper that are printed in black expand. The microcapsules beneath the black lines of a diagram absorb more heat than the other microcapsules and expand in diameter, raising the drawing from the background (Figure 2.4). An added benefit is that one can draw directly on the microcapsule paper, which then can be raised immediately. The time taken to raise one drawing already on a sheet of microcapsule paper is approximately ten seconds. Even accounting for printing from a computer, photocopying onto the microcapsule paper, and subsequent raising, the entire process is still reasonably fast. Instant raised lines can be produced on microcapsule paper using a new heat-pen device developed by Repro-Tronics.

Figure 2.2: Microcapsule paper after image is affixed to the surface by photocopying or ink drawing.

Figure 2.3: Simplified view of the Tactile Image Enhancer, showing internal workings of the device for expanding previously exposed microcapsule paper.

Figure 2.4: Microcapsule paper after exposure in image enhancer, showing expanded capsules. Note that capsules may not expand fully when only partially covered by printing, although this degree of expansion is unpredictable.

2.4.1.8 Other methods

Numerous other methods exist for producing tactile graphics, although none are widely used. For purposes of completeness we mention only their names here.
These additional methods include relief maps, cork maps and graphs, nonfigurative pictures, sewing-machine diagrams, embossed aluminum-foil displays, movable-parts displays, flannel-board diagrams, magnetic-board diagrams, electroforming processing, nyloprint, silk screening, the solid-dot process, foam-ink printing, storm relief printing, and screen drawings. Exhaustive coverage of all of the above techniques is available in a variety of sources, including [15, 24, 25, 88].

2.4.1.9 Summary

These static display methods typically produce long-lasting, effective displays of static visual information. For dynamic information, such as material displayed on a computer screen, other access methods are more appropriate.

2.4.2 Auditory Interfaces

This thesis focuses on the production of tactile graphic output of information of a primarily graphical or visual nature, but it is worth noting that auditory output is the method of choice for display of textual information for blind computer users [15, 32]. While there is a wide variety of methods for production of tactile graphics, output of computer-generated speech is more generic. Screen review software is used by the blind computer user to explore the textual material and to select the desired passage. Typically, the software sends the text it encounters to a hardware device, such as a speech-synthesis card added as an enhancement to a computer, for conversion from text to speech [86]. There are many such software programs and hardware devices on the market and in wide availability. The usability of the user interface and the quality of the produced speech vary from manufacturer to manufacturer. One major benefit of speech output is that users who cannot read braille can use it; in addition, it is generally quite affordable. Reliable speech synthesizers are available for most computers, and the quality of speech is typically quite good.
Perhaps the most attractive feature of the screen review and speech synthesis output method is adjustable speaking speed, enabling a blind person to listen at 300 words/minute or more [15, 81, 88], a speed that is quite competitive with typical sighted-reading speeds of 250 to 500 words/minute [23].

The Nomad is an example of a multimodal device, combining static tactile graphics with audio output. A tactile graphic, such as a map, is produced and affixed to the display surface of the Nomad. This surface is addressable via computer, and each region can be mapped to sounds that will play in response to the associated region being touched. The Nomad is well suited to museum displays and shopping-mall maps but requires assistance from a sighted person for configuration [24].

2.4.3 Dynamic Tactile Interfaces

Currently, the only dynamic tactile display device in wide use is the Optacon (Figure 2.5). It is a vibrotactile display, composed of a fingertip-sized matrix of 144 vibrating pins, arranged in a 24-row, 6-column format (Figure 2.6). This display is contained in a portable case (8 in × 6 in × 2 in, 4.0 lbs) and is powered by one 5-volt, rechargeable, nickel-cadmium battery. Vibration is caused by piezoelectric film bimorphs, which vibrate with varying amplitude at 230 Hz in response to varying levels of current. Its use involves placing the finger of one hand onto the vibrotactile display pad and using the other hand to pass a scanning device over the desired text or image.

Figure 2.5: Telesensory's Optacon II in action [83]. User places index finger of one hand on vibrotactile pin array and guides scanner across material to be viewed with other hand. (Telesensory)

Figure 2.6: Layout of the vibrotactile pin matrix display of the Optacon.
The Optacon was designed as an alternative to braille for reading printed text, but reading speeds are slower (50 words/minute after months of training and practice) than with braille (104 words/minute), and much slower than with synthesized speech output (300+ words/minute) [24, 26, 91]. The price of a new Optacon, in the neighborhood of $4,000.00 U.S., is also an issue for some [24, 83]. As of the publication of this thesis, the company which produces the Optacon, Telesensory, plans to discontinue production, and negotiations are underway with other companies to continue production in the future [84].

During use, the pins of the Optacon display react independently in a one-to-one mapping of pixels, or groups of pixels, to pins in response to an image or text passed under the lens of the scanner. Black regions of the scanned item cause pins to vibrate while white regions inhibit vibration. Thus, a letter, line or picture feels like a vibrating replica of the original [83] (Figure 2.7). However, the vibrating display produces a noticeable amount of buzzing noise, and the vibration itself tends to temporarily dull the sense of touch on the finger resting on the display after a period of use.

Figure 2.7: Active pin matrix display of the Optacon, demonstrating display of the capital letter S.

The precursor to the Optacon was the Tactile Vision Substitution System (TVSS) (Figure 2.8), which used a similar technique to display a vibrating representation of an image on a user's back [6, 98]. The image was captured by a television camera and sent to a more widely spaced array of vibrating pins. The eventual goal was a system with which a blind person could wear a video camera and backpack display and actually maneuver through the world, using the vibrating representation of what the camera saw for guidance. The technique may have been ahead of its time, being bulky and noisy, even by early 1970s standards.
Modern technology may yet produce such a system for independent, walk-around vision replacement [10, 17, 22].

Producing a dynamic tactile display is an active area of research. In a subsequent section we review some prominent research in this area.

Figure 2.8: Tactile Vision Substitution System (TVSS) [98].

2.4.4 Haptic Interfaces

The term haptic refers to the proprioceptive, or positional, sense, which is an extension of touch [41]. Thus, a haptic interface can represent three or more dimensions, whereas a tactile display provides only two dimensions. Haptic interfaces are an important display method in virtual reality systems, capable of reproducing a sense of position in space, interaction of forces, and even textures. Of course, the original information must be multidimensional as well, often generated by math-graphing packages or custom graphing software. Examples of this highly active area of research include development of a method for display of graphs of mathematical functions and scientific data using a three-degree-of-freedom device called the PHANToM [29, 30, 59], protein molecule docking simulations [14], three-dimensional volume haptization [35], and successful experiments in simulating textures with an enhanced joystick device [61, 62].

These devices are generally very expensive ($10,000.00 U.S. and up) and so are still relegated to a small number of research facilities. It is hoped that eventually affordable haptic interfaces will be readily available, providing blind computer users with an even greater ability to explore traditionally visual information physically. An in-depth study of haptic interfaces is beyond the scope of this work, although progress in this area is clearly important to note. An extensive bibliography on this topic is available in [62].

2.4.5 Dynamic Tactile Display Research

Enabling blind persons to access visual data on a computer meaningfully is an area of vigorous research.
Some of the more pertinent projects from the present and near past include:

• A virtual tactile tablet incorporating a vibrotactile display module demonstrated that increasing a graphic's size and its display resolution improved recognition, while merely varying the complexity of a graphic's geometric shape did not dramatically affect object recognition [99].

• Experiments with a single-pin tactile mouse revealed that immediate tactile feedback improved response times in GUI navigation tasks [85].

• The use of nickel-titanium shape-memory alloy (SMA) to provide actuation of a tactile display shows promise as the basis for a lightweight and portable display, although the power consumed and the heat produced by such a display are still high. Further, current shape-memory alloy suffers from brittleness, slow response and recovery times, and lack of long-term durability [33].

• A 64-solenoid, four-level, pin-based fingertip display, used to investigate tactual comprehension improvement through representation of levels of graphics image intensity by varying pin heights on the display [28].

• A virtual tactile computer display which uses electromechanically actuated pins in a rectangular tactile array comparable in size to the sensing area of the fingertip [40].

• The use of polymer gels, or electrorheological fluids, for fabrication of actuators which then conceivably could be used in the development of a tactile display. Such fluids become firm when current is passed through them and could also serve as the basis for a direct-touch, deformable tactile display [27, 63, 64, 68].

Past research delved into electrocutaneous stimulators, which delivered tiny electrical shocks to the skin, and air jet stimulators, which replaced the pin array with an arrangement of tiny holes through which puffs of air are aimed at the skin [22]. Neither method was particularly successful, and both are generally regarded by the mainstream research community as unworthy of further consideration.
2.4.6 Moving Toward Effective Tactile Display of Graphics

Audio output is not a solution for most graphics problems because of the difficulty of the Image Understanding Problem. In order for synthesized speech output to provide adequate access to an image, the image would first have to be understood by the computer, an unlikely occurrence at present. The most promising direction for research is toward creation of a refreshable tactile display. Such a display would be the tactile equivalent of a standard computer screen, or cathode ray tube, providing direct access to the graphical contents of the computer.

For such a dynamic display to be usable by blind persons, attention must be paid to how graphic material is to be displayed. Clearly, the fingertip possesses a much lower resolution than the eye, so complex visual information must be simplified somehow. Developing a system for performing such simplification, including factors related to method, effectiveness, usability, and future applicability, is the scope and direction of this thesis work.

2.5 Representation of Images

An image is an alternative representation of some visual scene [52, 78]. These representations include sketches, drawings, photographs, computerized graphics and pictures, and motion picture film and videotape. For purposes of this thesis, we can safely restrict the discussion to computerized images.

2.5.1 Quantization

In order to create a computer image from some other type, some form of quantization is performed. In this process, samples of the image are taken using a scanner or digital camera at some regular interval and size, based on the desired resolution of the final quantized image. Each sample is assigned a discrete value, or set of values, that represents the intensity or color of the sample as closely as possible. In the process of performing this discretization, some resolution and clarity of the original is necessarily lost, at least with any practical system.
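A minimal sketch of the discretization step, assuming samples have already been normalized to the range 0.0 to 1.0 (the thesis does not specify a sample format). The names `quantize_sample` and `dequantize_level` are hypothetical and are not part of TACTICS.

```c
#include <stdint.h>

/* Hypothetical helper: map a normalized sample in [0.0, 1.0] to an 8-bit
 * intensity (0 = black, 255 = white). Out-of-range input is clamped. */
uint8_t quantize_sample(double sample)
{
    if (sample < 0.0) sample = 0.0;
    if (sample > 1.0) sample = 1.0;
    /* Adding 0.5 before truncation rounds to the nearest level. */
    return (uint8_t)(sample * 255.0 + 0.5);
}

/* Hypothetical inverse: the normalized value that a level stands for. */
double dequantize_level(uint8_t level)
{
    return (double)level / 255.0;
}
```

Any two samples that round to the same level become indistinguishable after quantization, which is one way to picture the loss of resolution described here.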
This loss is due to sampling round-off error when mapping the analog real world into the digital world of the computer, a mapping that allows the image to be processed by computer [72, 78].

2.5.2 Computerized Representation

The basic unit of the computerized image is the picture element or pixel [52]. For images represented solely as shades of gray, each pixel is assigned a single value, typically an 8-bit integer. Thus, such an 8-bit grayscale image has an intensity range of 256 levels of gray, with 0 typically indicating black and 255 indicating white. Similarly, color images have three such 8-bit intensity levels associated with each pixel, one each for the Red, Green and Blue components. Each pixel in this 24-bit color RGB image therefore can represent over 16 million ($256^3$) colors. Conceptually, and physically, an image is stored in a two- or three-dimensional array (see Section 3.3.1) in the computer's memory.

For purposes of this research, we consider primarily complex computer images, quantized representations of photographs, electron micrographs, individual video images, etc., as these present the greatest difficulties when creating a tactile representation. Simple images, such as sketches, diagrams, and line drawings, often can be converted straightforwardly into tactile form. Complex images typically comprise a broad and unpredictable mixture of shape, color, intensity, and other real-world complexities, presenting the most significant challenges to access by the blind computer user.

2.6 Image Processing

Image processing is a broad term describing the algorithmic transformation of an image from one form to another [72]. Processes are divided into general categories of point processes, area processes, frame processes and geometric processes [52].

Point processes are the simplest and most frequently used of the image processing operations.
A point process is an algorithm that modifies a pixel's value in an image based solely upon that single pixel's value or location. Common point processes are image brightening, negative images, image thresholding, image contrast stretching and image pseudocoloring.

Area processes use groups of pixels surrounding a central pixel of interest to derive information about an image. This group of pixels, often referred to as a neighborhood, is examined in some algorithmic fashion as a group. This examination, for instance, can determine the brightness trend information or spatial frequency, with the result utilized in determining a new value applied to the central pixel of the neighborhood. Examples of area processes include edge enhancement and detection, image sharpening, smoothing and blurring, and removal of random noise. An area-process algorithm typically involves the convolution of some weighting factors contained in a convolution kernel. Convolution (see Section 3.3.2) can be thought of as a weighted summation process, which produces a new value for a central pixel based on some function of the values of a number of its neighbors.

Frame processes use information from two or more images, or video frames, together with a combination function to produce a new image. Among the many practical applications of frame processes are motion detection, background removal, image-quality enhancement and image combination.

Geometric processes change the spatial positioning or arrangement of pixels within an image based upon some geometric transformation. Typical operations performed by geometric processes include image scaling, sizing, rotation, translation and mirror imaging. Example uses include spatial aberration correction, image composition and special effects.

2.6.1 Applicability to Tactual Perception and TACTICS

Production of tactually perceivable tactile images bears some similarity to the challenges of the field of computer vision.
The aim of computer vision is automatically to provide analysis of an image on which some decision can be based [12, 66]. Image processing techniques are invariably used in this task to transform an image in such a way as to produce some form of useful output. Similarly, the aim of TACTICS is to present a visual image in a tactile format such that it is useful in some way to an observer. Image processing techniques would appear to be a natural approach to use. The limits to tactile resolution, and the understood importance of reducing to an essential minimum the information presented to the fingertip, clearly call for a simplifying transformation of complex images.

Many image processing algorithms are known for accomplishing various simplifying transformations on an image [7, 9, 52, 70, 72, 76, 87]. We can reduce a photograph to line information only, remove noise, caricaturize a human face, reduce resolution or separate an image into distinct regions. These techniques, and others, are motivated and applied in a tactile graphics creation system called TACTICS. Viewed in terms of computer vision, the aim of this prototype system is to process images automatically such that the result can be output as useful, in this case comprehensible, tactile graphics.

Chapter 3

TACTICS: TACTILE IMAGE CREATION SYSTEM

Converting visual information into tactile information in an automatic, timely and ultimately comprehensible fashion is the force propelling development of this prototype system. The lessons learned from the areas of tactual perception, tactile graphic production and the applicability of image processing techniques to tactile graphic generation are extended to and applied in the creation of this system.
The details of the system, including the justification for its development, the specific algorithms used for image simplification, the software and hardware utilized, and the complete procedure for acquiring, transforming and tactilizing visual information, are discussed.

3.1 Automatic Generation of Tactile Graphics

The production of tactile graphics, as we have seen, can be a time-consuming process of careful translation from visual to tactile form necessitating the involvement of a sighted person. Cost and timeliness prevent most blind persons from having ready access to the abundant high-quality computer images available on the Internet and elsewhere. With an automatic method for performing such translations, increased access to the wealth of computerized graphical information could be provided. Such information is, at present, essentially inaccessible, requiring the intervention of a sighted person to perform conversion from visual to tactile form. Automatic computerized conversion can be accomplished affordably, using readily available or easily adaptable technology, combined with the appropriate image processing techniques.

A technique for the automatic generation of tactile graphics involves acquiring an image, performing some simplifying processing, and displaying the result on a tactile output medium, such as microcapsule paper or a dynamic, real-time tactile display.

3.2 Genesis of TACTICS

The TACTile Image Creation System (TACTICS) is an attempt to further the state of the art of research in the area of automatic tactile graphic generation. This prototype system is made up of software and hardware components, making use of available image processing packages and static tactile graphic production techniques. The impetus behind the development of this experimental system was a perceived lack of research being performed in addressing accessibility issues related to complex image information.
The focus of much of the research in computer access to graphical information for blind persons is restricted to narrow categories of information, such as mathematical formulae, iconic navigation, or better auditory access to text. Our aim is to provide a general method for providing access to photographic and other visual information that is in electronic form. It is hoped that this thesis will serve as a starting point in what is an exciting and heretofore uncharted area of research, rich with implications and possibilities.

3.3 Image Processing Algorithms

There are a great many algorithms that process images to produce a wide variety of effects. In this thesis we are concerned with the effect more than with the specific means. For a thorough understanding of how the classes of algorithms we have chosen operate on images, and how they relate to our goal of image simplification, we present a brief and somewhat simplified introduction to each of them. For purposes of this discussion, we assume that an image is grayscale, although these algorithms have forms that work equally well for color images. Since we are concerned neither with moving images nor geometric transformations, we do not consider frame or geometric processes; rather, we restrict coverage to a number of point and area processes. Detailed theoretical treatment of image processing techniques is available in [72], while an implementation-oriented approach is given in [52].

3.3.1 Notation

For clarity, the notation used within this thesis to describe images and image processing algorithms is defined here. A grayscale image $X$ of overall width $w$ and height $h$ can be represented by a two-dimensional array of points, each of which has a certain value, denoted by $X_{m,n}$, representing the brightness or intensity of that point (Figure 3.1).

\[
\begin{array}{cccc}
X_{11} & X_{12} & \cdots & X_{1w} \\
X_{21} & X_{22} & \cdots & X_{2w} \\
X_{31} & X_{32} & \cdots & X_{3w} \\
\vdots & \vdots & \ddots & \vdots \\
X_{h1} & X_{h2} & \cdots & X_{hw}
\end{array}
\]

Figure 3.1: Format of two-dimensional image.

A color image has a set of three intensity values, one each for the red, green and blue components of each pixel, associated with each position in the array.¹ Formally, an 8-bit grayscale image is described by:

\[
X = \{\, X_{m,n} : 1 \le m \le w;\ 1 \le n \le h;\ X_{m,n} \in \{0, 1, \ldots, 255\} \,\} \tag{3.1}
\]

The set of points $N$ in a square region of width $w'$ surrounding a given point is the neighborhood of that point. For points that are closer than $\frac{w'-1}{2}$ points to an image boundary, the neighborhood will include only those points falling within the image. The neighborhood of a point $X_{m,n}$ is denoted by the set:

\[
N_{m,n} = \left\{\, X_{i,j} : w' \text{ is odd};\ \max\!\left(m - \tfrac{w'-1}{2},\, 1\right) \le i \le \min\!\left(m + \tfrac{w'-1}{2},\, w\right);\ \max\!\left(n - \tfrac{w'-1}{2},\, 1\right) \le j \le \min\!\left(n + \tfrac{w'-1}{2},\, h\right) \right\} \tag{3.2}
\]

An algorithm $a$ is represented by a mathematical function $F_a$ that transforms an image $X$ into a processed image $Y$, as follows:

\[
Y = F_a(X) \tag{3.3}
\]

¹ Although color images are often represented in this RGB format, numerous other representational schemes exist. Among the most common of these methods are: Cyan, Magenta and Yellow (CMY), Hue, Saturation and Value (HSV), Hue, Saturation and Lightness (HLS), Hue, Saturation and Intensity (HSI), and Hue, Chrominance and Intensity (HCI).

3.3.2 Edge Detection

An edge detection algorithm attempts to locate and highlight edges in an image (Figure 3.2). These edges are simply the portions of an image where there is a rapid change in intensity. The faster such a transition is made from light to dark, or vice versa, the more likely an edge detection algorithm is to consider the center of such a transition as an edge. Each pixel that is found to be part of an edge is set to the color white, while non-edge pixels can be left alone or assigned the color black using some thresholding function.
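The thresholding function mentioned above can be as simple as a comparison against a fixed cutoff. The sketch below is an illustration only, not the TACTICS code; the function name `threshold` and the caller-supplied `level` parameter are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Point process: pixels at or above `level` become white (255) and all
 * others become black (0). `level` is a caller-chosen cutoff (assumed). */
void threshold(uint8_t *pix, size_t n, uint8_t level)
{
    for (size_t i = 0; i < n; i++)
        pix[i] = (pix[i] >= level) ? 255 : 0;
}
```

Applied to the output of an edge detector, a call such as `threshold(buf, count, 100)` would leave strong edges white and push weak responses to black; the cutoff of 100 is arbitrary.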
A common version of this algorithm is the Sobel edge detector, which accomplishes edge detection by using the scaled average of one of a 3 × 3 pixel neighborhood's horizontal or vertical directional derivative, as first described in [70]. The Sobel edge detection function makes use of two matrices, or masks, one each for the vertical and horizontal directions:

\[
V = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}
\qquad
H = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} \tag{3.4}
\]

Figure 3.2: Before and after Sobel edge detection algorithm. (public domain)

These masks are convolved over an image. Generally speaking, convolution is a linear-only algorithm that involves passing over an input image pixel by pixel, applying some transformation to each point or to the neighborhood of a point to generate a new value, and then placing that new value at the same position in an output image. In the case of the Sobel edge detection function $F_S$, the two masks $V$ and $H$ are applied as follows for each point $(m, n)$ in image $X$, where $\odot$ denotes the element-by-element product of the neighborhood with a mask:

\[
A_{m,n} = N_{m,n} \odot V \tag{3.5}
\]
\[
B_{m,n} = N_{m,n} \odot H \tag{3.6}
\]
\[
A'_{m,n} = \sum_{u \in A_{m,n}} u \tag{3.7}
\]
\[
B'_{m,n} = \sum_{v \in B_{m,n}} v \tag{3.8}
\]
\[
F_S(X_{m,n}) = \sqrt{{A'_{m,n}}^{2} + {B'_{m,n}}^{2}} \tag{3.9}
\]

This is a very computationally expensive operation to perform, particularly for larger images, due to the necessary 20 multiplications, 19 additions and 1 square-root operation per pixel. There are numerous methods described in the literature that can speed up this process. In the system implemented for this thesis, the technique used is a combination of shortcuts and a simple comparison and thresholding. Note that one-third of the elements in each mask are 0s, so a third of the multiplications can be eliminated. Rather than multiply elements by -1 or 2, a unary negative sign or left-shift-by-one-bit operation is used, respectively. The computational cost of these two modifications is equivalent to an addition rather than a multiplication step.
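The Sobel computation of Equations 3.5 to 3.9, with the zero-skipping, negation and shift shortcuts just described, might look like the following sketch. It is not the thesis implementation: the function names, the row-major buffer layout, the replication of border pixels, the clamping of large magnitudes to 255 and the integer square root are all assumptions made so that the example is self-contained.

```c
#include <stdint.h>

/* Integer square root by linear search; adequate here because the
 * argument is always below 255 * 255 when this is called. */
static uint32_t isqrt_u32(uint32_t x)
{
    uint32_t r = 0;
    while ((r + 1) * (r + 1) <= x)
        r++;
    return r;
}

/* Read pixel (col, row), clamping coordinates to the image so that border
 * pixels reuse their nearest neighbors (an assumed border policy). */
static int pixel_at(const uint8_t *img, int w, int h, int col, int row)
{
    if (col < 0) col = 0;
    if (col >= w) col = w - 1;
    if (row < 0) row = 0;
    if (row >= h) row = h - 1;
    return img[row * w + col];
}

/* Sobel magnitude for every pixel of a w-by-h 8-bit grayscale image.
 * Zero mask entries are skipped, -1 becomes a subtraction, 2 a left shift. */
void sobel(const uint8_t *in, uint8_t *out, int w, int h)
{
    for (int r = 0; r < h; r++) {
        for (int c = 0; c < w; c++) {
            int nw = pixel_at(in, w, h, c - 1, r - 1);
            int n  = pixel_at(in, w, h, c,     r - 1);
            int ne = pixel_at(in, w, h, c + 1, r - 1);
            int wv = pixel_at(in, w, h, c - 1, r);
            int e  = pixel_at(in, w, h, c + 1, r);
            int sw = pixel_at(in, w, h, c - 1, r + 1);
            int s  = pixel_at(in, w, h, c,     r + 1);
            int se = pixel_at(in, w, h, c + 1, r + 1);

            /* A': response of mask V (right column minus left column). */
            int a = (ne + (e << 1) + se) - (nw + (wv << 1) + sw);
            /* B': response of mask H (top row minus bottom row). */
            int b = (nw + (n << 1) + ne) - (sw + (s << 1) + se);

            uint32_t sum = (uint32_t)(a * a) + (uint32_t)(b * b);
            /* Magnitudes of 255 or more are clamped without a square root. */
            out[r * w + c] = (sum >= 65025u) ? 255 : (uint8_t)isqrt_u32(sum);
        }
    }
}
```

The two squarings are the only multiplications left per pixel; everything else is addition, subtraction or a one-bit shift.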
Since the maximum intensity value of a pixel is 255, squared values above 65025 (255 × 255, which is precomputed one time only) can be merely assigned 255. Finally, for the remaining computations, the square root is taken as in Equation 3.9. With these modest modifications, the number of operations performed is reduced to 2 multiplications (the squaring operations), 17 additions and 1 square root.

3.3.3 Blurring

Often referred to in the literature as low pass filtering, blurring reduces the detail in an image by removing the high frequency component [78]. It accomplishes this by using the values of all pixels in a neighborhood, assigning some function of those values to the center pixel. A Gaussian function and an averaging function are two common choices for accomplishing blurring. Averaging is the most straightforward and fastest technique and, considering the low resolution of the human fingertip, is sufficient. The blurring function $F_B$ is described as:

\[
F_B(X_{m,n}) = \frac{\sum_{v \in N_{m,n}} v}{|N_{m,n}|} \tag{3.10}
\]

Applying this function to all pixels in an image produces a blurry version of the original image (Figure 3.3).

Figure 3.3: Image before and after application of blurring algorithm.

This is also described as the convolution over $X$ by a blurring mask or kernel. For example, the blurring algorithm used in this research is accomplished with the following 3 × 3 kernel $B$:

\[
B = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \tag{3.11}
\]

3.3.4 Segmentation

Images generally comprise one or more regions, defined as sections or segments of an image whose members are closely related by color or intensity. A common technique for locating segments is called K-means segmentation [51, 76, 89] (Figure 3.4). In this algorithm, each pixel is assigned to one of some number $K$ of different groups, based on its own intensity level. This technique divides pixels with closely related intensities into like groups or clusters, producing an image that is segmented by intensity.
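For reference, the averaging blur of Equation 3.10 with a 3 × 3 neighborhood can be sketched as below. The function name `blur3x3`, the row-major buffer layout and the truncating integer mean are assumptions made for illustration; near the image border the sketch averages only the pixels that fall inside the image, following the clipped neighborhood of Equation 3.2.

```c
#include <stdint.h>

/* Mean (box) blur of a w-by-h 8-bit grayscale image with a 3 x 3
 * neighborhood. Near the border only in-image pixels are averaged,
 * so the divisor is the actual neighborhood size |N|. */
void blur3x3(const uint8_t *in, uint8_t *out, int w, int h)
{
    for (int r = 0; r < h; r++) {
        for (int c = 0; c < w; c++) {
            int sum = 0, count = 0;
            for (int dr = -1; dr <= 1; dr++) {
                for (int dc = -1; dc <= 1; dc++) {
                    int rr = r + dr, cc = c + dc;
                    if (rr < 0 || rr >= h || cc < 0 || cc >= w)
                        continue;               /* outside the image */
                    sum += in[rr * w + cc];
                    count++;
                }
            }
            out[r * w + c] = (uint8_t)(sum / count);  /* truncating mean */
        }
    }
}
```

Writing to a separate output buffer matters: blurring in place would let already-averaged values leak into the neighborhoods of pixels processed later.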
A similar segmentation can be performed based on color.

Figure 3.4: Image before and after application of K-means segmentation algorithm, with K = 2.

Algorithmically, the K-means segmentation applied to image $X$ is described as follows [89]:

Step 1. Choose $K$ initial cluster centers $z_1(1), z_2(1), \ldots, z_K(1)$. These can be chosen arbitrarily as, say, the intensity values of the first $K$ pixels in $X$, or evenly spaced across the range 0 to 255 as is implemented in the system described in this thesis.

Step 2. At the $k$th iterative step, distribute the intensity values $\{X_{m,n}\}$ among the $K$ cluster domains, using the relation:

\[
X_{m,n} \in S_j(k) \quad \text{if} \quad |X_{m,n} - z_j(k)| < |X_{m,n} - z_i(k)| \tag{3.12}
\]

for all $i = 1, 2, \ldots, K$, $i \ne j$, where $S_j(k)$ denotes the set of intensity values whose cluster center is $z_j(k)$.

Step 3. From the results of Step 2, compute the new cluster centers $z_j(k+1)$, $j = 1, 2, \ldots, K$, such that the sum of the squared distances from all points in $S_j(k)$ to the new cluster center is minimized. This is simply the mean of $S_j(k)$, given by:

\[
z_j(k+1) = \frac{\sum_{X_{m,n} \in S_j(k)} X_{m,n}}{|S_j(k)|}, \qquad j = 1, 2, \ldots, K \tag{3.13}
\]

It is from this manner in which each of the $K$ cluster centers is iteratively updated with the average value for each cluster that the name "K-means" is derived.

Step 4. If $z_j(k+1) = z_j(k)$ for $j = 1, 2, \ldots, K$, the algorithm has converged and can be terminated. Otherwise, go back to Step 2 and continue.

The fundamental drawback of this general statistical analysis of, or histogram-based approach to, image segmentation is the inherent disregard for spatial coherence [67]. Adaptive segmentation attempts to take into account a smaller portion of an image, producing a segmentation based only on that portion. The effect of this process can be to retain more of the original image information, producing a segmentation which more closely resembles the original (Figure 3.5).
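Steps 1 to 4 above can be sketched for a flat buffer of 8-bit intensities as follows. This is a hedged illustration rather than the TACTICS code: the names `nearest` and `kmeans_segment`, the `KMEANS_MAX_K` bound, the 100-iteration safety cap, the tie-breaking toward the lower-numbered cluster (Equation 3.12 leaves ties unspecified) and the choice to overwrite each pixel with its cluster center are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define KMEANS_MAX_K 16   /* illustrative upper bound on K */

/* Index of the center closest to v; ties go to the lower index. */
static int nearest(const double *center, int k, double v)
{
    int best = 0;
    double best_d = v > center[0] ? v - center[0] : center[0] - v;
    for (int j = 1; j < k; j++) {
        double d = v > center[j] ? v - center[j] : center[j] - v;
        if (d < best_d) { best_d = d; best = j; }
    }
    return best;
}

/* Segment n intensities into k clusters (2 <= k <= KMEANS_MAX_K). On
 * return each pixel holds the rounded center of its cluster. Returns the
 * number of iterations performed, or -1 on invalid arguments. */
int kmeans_segment(uint8_t *pix, size_t n, int k)
{
    double center[KMEANS_MAX_K];
    int iter, changed = 1;

    if (pix == NULL || n == 0 || k < 2 || k > KMEANS_MAX_K)
        return -1;

    /* Step 1: initial centers evenly spaced across 0..255. */
    for (int j = 0; j < k; j++)
        center[j] = 255.0 * j / (k - 1);

    for (iter = 0; changed && iter < 100; iter++) {
        double sum[KMEANS_MAX_K] = {0};
        size_t count[KMEANS_MAX_K] = {0};

        /* Step 2: assign every intensity to its nearest center. */
        for (size_t i = 0; i < n; i++) {
            int j = nearest(center, k, pix[i]);
            sum[j] += pix[i];
            count[j]++;
        }

        /* Step 3: move each non-empty cluster's center to its mean.
         * Step 4: converged when no center moved. */
        changed = 0;
        for (int j = 0; j < k; j++) {
            if (count[j] == 0)
                continue;
            double mean = sum[j] / (double)count[j];
            if (mean != center[j]) { center[j] = mean; changed = 1; }
        }
    }

    for (size_t i = 0; i < n; i++)
        pix[i] = (uint8_t)(center[nearest(center, k, pix[i])] + 0.5);
    return iter;
}
```

With K = 2 the result is a two-level image of the kind shown in Figure 3.4, although the two levels here are the cluster means rather than pure black and white.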
This result often is achieved at some computational expense and many times produces a result only marginally better than a straightforward segmentation algorithm for purposes of image simplification and automatic tactile graphics generation.

As implemented for this thesis, the adaptive version of the algorithm performs the same steps as the K-means segmentation algorithm, with the difference being that it operates to convergence on each pixel in $X$ before moving to the next pixel. Thus, the K-means algorithm is performed on some subset or window of, and in complete isolation from, the image as a whole. Inspiration for this implementation is drawn from portions of an adaptive segmentation algorithm that uses a Gibbs random field model and a hierarchical approach described in [67].

Figure 3.5: Image before and after application of an adaptive K-means segmentation.

3.3.5 Negation

The negation of an image is produced by inverting the intensity of each pixel in turn and assigning the new value to that pixel (Figure 3.6). Negation is described by this simple function:

\[
F_N(X_{m,n}) = 255 - X_{m,n} \tag{3.14}
\]

Every home photographer is familiar with the negatives that are returned with developed film. The negation of a computerized image is just such a negative image. Negation often is applied in conjunction with another algorithm. In the case of a strictly black and white or binary image with more black than white, subtracting the intensity of each pixel from the maximum reverses the field and, it is hoped, makes foreground features such as edges black. This negation improves the legibility of a tactile image, specifically when it is output on microcapsule paper, since the black portion of the image raises while the white portion remains flat.

Figure 3.6: Image before and after application of negation algorithm.

3.3.6 Median Filtering

Median filtering is a method for removal of noise from an image [72].
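As a concrete picture of such a filter, the sketch below replaces each pixel with the median of its 3 × 3 neighborhood. It is a plain insertion-sort version written for illustration, not the fast algorithm cited in this section; the function name `median3x3`, the clipped border handling and the use of the lower middle value for even-sized neighborhoods are assumptions.

```c
#include <stdint.h>

/* 3 x 3 median filter for a w-by-h 8-bit grayscale image. Near the border
 * only in-image neighbors are used. For an even number of values the
 * lower of the two middle values is taken. */
void median3x3(const uint8_t *in, uint8_t *out, int w, int h)
{
    for (int r = 0; r < h; r++) {
        for (int c = 0; c < w; c++) {
            uint8_t v[9];
            int n = 0;

            /* Gather the neighborhood, skipping positions outside the image. */
            for (int dr = -1; dr <= 1; dr++) {
                for (int dc = -1; dc <= 1; dc++) {
                    int rr = r + dr, cc = c + dc;
                    if (rr < 0 || rr >= h || cc < 0 || cc >= w)
                        continue;
                    v[n++] = in[rr * w + cc];
                }
            }

            /* Insertion sort of at most nine values. */
            for (int i = 1; i < n; i++) {
                uint8_t key = v[i];
                int j = i - 1;
                while (j >= 0 && v[j] > key) {
                    v[j + 1] = v[j];
                    j--;
                }
                v[j + 1] = key;
            }

            out[r * w + c] = v[(n - 1) / 2];
        }
    }
}
```

A single bright outlier surrounded by uniform pixels never reaches the middle of the sorted list, so it is removed without averaging its value into its neighbors.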
Generally, noise in an image is described as an individual pixel of greatly differing intensity, or outlier, compared to the typical pixel in a neighborhood. Differentiating noise from minute detail, or filtering out noise while leaving the desired image intact, is not always straightforward [4], particularly when an image is complex. Performing edge detection on an image, as is often done in our TACTICS processing, tends to accentuate these outliers, whether noise or detail. The median filtering algorithm sorts the intensity values of pixels in a neighborhood, assigning the median value of the neighborhood to the center pixel. This is repeated for all pixels in the image, with the effect being a reduction in the number of outliers while preserving edges and non-noisy portions of the image (Figure 3.7). An especially fast version of the median filtering algorithm can be found in [38]. The function $F_M$ for median filtering is described as:

\[ F_M(X_{m,n}) = \mathrm{Median}(N_{m,n}) \qquad (3.15) \]

where $N_{m,n}$ is the neighborhood of pixel $X_{m,n}$.

Figure 3.7: A noisy processed image before and after the application of median filtering.

3.4 Image Processing Tools

The software for our prototype system for automatic generation of tactile images is implemented in the C programming language as an extension to the X Windows image processing application XV, developed at the University of Pennsylvania [13]. As of publication, the complete source code for this package is readily available via anonymous ftp at ftp.cis.upenn.edu in the directory pub/xv. The license fee is quite reasonable for this user-friendly software, and it was found to be easily extended to include additional image processing algorithms. The extended version is available via ftp at ftp.asel.udel.edu in pub/sem/xv-mod.tar.Z. Instructions on how to add additional algorithms to the XV package are in the file xvalg.c. Some preliminary experimentation with various image processing algorithms was performed using MATLAB's Image Processing Toolbox [87].
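Since the prototype is written in C, Equations 3.14 and 3.15 can be sketched in that language as follows. The function names, the row-major 8-bit image layout, the 3 x 3 neighborhood, and the choice to copy border pixels unchanged are all assumptions made for illustration; they are not taken from the XV extension itself.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Negation (Eq. 3.14): replace each intensity X with 255 - X, in place. */
void negate_image(unsigned char *img, size_t n)
{
    for (size_t i = 0; i < n; i++)
        img[i] = (unsigned char)(255 - img[i]);
}

/* Comparison callback for qsort over unsigned char values. */
static int cmp_uchar(const void *a, const void *b)
{
    return (int)*(const unsigned char *)a - (int)*(const unsigned char *)b;
}

/* Median filtering (Eq. 3.15) with an assumed 3 x 3 neighborhood.
 * src and dst are w*h row-major images and must not overlap.
 * Border pixels, which lack a full neighborhood, are copied unchanged. */
void median3x3(const unsigned char *src, unsigned char *dst, int w, int h)
{
    memcpy(dst, src, (size_t)w * (size_t)h);

    for (int y = 1; y < h - 1; y++) {
        for (int x = 1; x < w - 1; x++) {
            unsigned char nb[9];
            int k = 0;

            /* Gather the neighborhood N(m,n) around the center pixel. */
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    nb[k++] = src[(y + dy) * w + (x + dx)];

            /* Sort and assign the middle value to the center pixel. */
            qsort(nb, 9, sizeof nb[0], cmp_uchar);
            dst[y * w + x] = nb[4];
        }
    }
}
```

A single bright outlier surrounded by uniform dark pixels is replaced by the dark value, which is the noise-suppressing behavior described for Figure 3.7.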
The flexibility of MATLAB, combined with its wide acceptance and availability, made it an attractive and practical development platform.

3.5 Tactile Imaging

3.5.1 Description

Tactile imaging is the conversion of a visual image into a form that is perceivable using the sense of touch. This conversion can be accomplished using a variety of techniques. TACTICS performs this conversion automatically by applying image processing algorithms to a complex image, such as photographic and other visual information from science, engineering, mathematics, medicine, art and other areas. This conversion is done entirely in software, developed as an extension to XV. The prototype system involves a number of experimenter-selected sequences of algorithms applied in a controlled fashion, although this process could easily be implemented to run entirely unsupervised, and some trials were conducted to verify this conclusion.

3.5.2 Development

Development of the software package involved acquiring the source code for XV, conducting a search of image processing literature to determine the techniques best suited for the purposes of image transformation, and implementing a number of these algorithms as extensions to XV. The algorithms that were chosen represent some of the most widely used or standard techniques, although some attention was paid initially to more sophisticated methods. It was found that elegant and sophisticated algorithms, while of considerable benefit in the visual domain, produced little if any benefit when used to produce tactile images. This absence of benefit is due primarily to the lower resolution of the fingertip, which cannot take advantage of details finer than its physiologically imposed limits. Generally, simpler was found to be better in the course of this research.

3.5.3 Sequencing of Algorithms

When more than one image processing algorithm is applied to an image, the sequence of application can greatly affect the outcome.
For example, applying edge detection to an original image followed by segmentation produces a relatively simple and smooth outline of the original, while applying segmentation followed by edge detection produces a more complex and jagged outline. Another example of the effect of sequencing is the use of a blurring algorithm. When an original image is blurred before edge detection is applied, the resulting edges are thicker and fewer edges are falsely identified. These examples illustrate the importance of considering the interactions among image processing algorithms when attempting to convert an original image into a simplified version suitable for tactile exploration. In the next chapter a number of pertinent algorithm sequences will be discussed and their use in TACTICS will be motivated.

3.6 Tactile Output

3.6.1 Microcapsule Paper

Microcapsule paper was chosen as an output medium due to its wide availability, relatively low cost, and ability to render tactile graphics quickly. We compared the two known brands of paper on the market, Repro-Tronics Flexi-Paper and a paper imported by the Matsumoto Kosan Company. A comparison of the manufacturers' specifications for the two types of paper reveals that there is very little difference in the vital qualities of resolution, response time, cost and displacement. One laboratory observation, as measured using a mill-meter, is that the displacement achievable with the Matsumoto Kosan paper tends to be more consistent in practice than that of the Repro-Tronics paper. Measurements reveal this to be the case, but also show that typical displacement is approximately 1 mm for both varieties of paper. The significant difference between the two appears to be the durable nature of the Flexi-Paper, which is highly resistant to folding and crumpling.
The stiffer Matsumoto paper is more familiar in feel to the blind community, being similar to the heavy paper used by embossing braille printers, but is prone to cracking and creasing under adverse conditions. For purposes of our experiments, we used both types of paper and discovered that subjects often preferred the slightly stiffer feel of the Matsumoto paper to the spongier feel of the Flexi-Paper.

3.6.2 Tactile Image Enhancer

To develop, or puff up, the microcapsule paper we used a Repro-Tronics Tactile Image Enhancer (Figure 3.8). The device has a motor-driven roller which passes the paper face up underneath a tubular light bulb. The heat from the lamp is absorbed by the dark regions of printing on the paper, causing the polystyrene microcapsules in those areas to expand but leaving the unprinted regions flat. The time taken to develop a single sheet of either type of microcapsule paper is approximately ten seconds.

Figure 3.8: Tactile Image Enhancer. (Repro-Tronics)

3.6.3 Additional Equipment

Original and processed computerized images were first printed out on a commercial 600 dpi office laser printer. Next, they were copied onto microcapsule paper using a typical office photostatic copier machine. Other than these devices, the Tactile Image Enhancer, and the computer itself, the only additional material needed in the prototype system was a large supply of both varieties of microcapsule paper.

3.7 Experimental Procedure for Tactile Image Creation

The procedure for producing a tactile image from a visual one is straightforward. The involvement of a sighted person is necessary in the current stage of our research system. Future versions of TACTICS could be made to operate in an unsupervised manner, eliminating the need for a sighted person to be involved. The procedure involves three phases: Acquisition, Simplification, and Tactilization.
3.7.1 Acquisition of Images

Images were acquired in a fairly random manner from standard image processing benchmark collections, scientific data acquisition, and a wide array of sources available on the World Wide Web. Every attempt was made to select a representative sampling of the available images (see Appendix A). We also looked for candidates from similar classes of images, for example faces, or more generally rounded images, which could prove difficult to distinguish from one another once simplified, as a way to test how ambiguity is dealt with by our prototype system when used by experimental subjects.

3.7.2 Simplification

The preparation of simplified images was achieved using a number of differing aggregate image processing sequences. An image was first loaded into XV. Then, the applicable sequence of image processing algorithms was applied. Finally, this processed image was printed on a laser printer in preparation for expansion in the subsequent phase.

3.7.3 Tactilization

The printed version of the processed image was photocopied onto one of the two types of microcapsule paper. The microcapsule paper was then fed through the Tactile Image Enhancer, creating the raised tactile image. This procedure was repeated for all images using the variety of image processing algorithm sequences as specified in the experiment protocol.

Chapter 4

EVALUATION OF TACTICS

The primary goal of the procedures used by TACTICS to convert visual information automatically into tactile information is to provide meaningful access to previously inaccessible content. A series of experiments was conducted to evaluate the effects of this prototype system upon a subject's ability to (1) discriminate, (2) identify and (3) comprehend tactile representations of visual information. A general accounting of subject selection and experimental material production, including the use of various image processing techniques, is provided.
The selection of the specific aggregate image processes for use in these experiments is discussed and justification is given linking these processes with theories of psychophysics. For each experiment conducted as part of this evaluation, descriptions of the subjects, materials used and experimental procedures are provided. The results of each experiment, including data comparing results based on types of microcapsule paper and the level of vision of subjects, are reported and analyzed.

4.1 Overview of Experimental Protocol

The protocol used in these experiments was designed to evaluate the effect of TACTICS upon the accessibility of visual information in a tactile form. Every attempt was made to acquire a diverse sample of subjects and images and to assure that experimental materials were produced in an automatic and uniform fashion free from the aesthetic biases of a sighted person.

4.1.1 Selection of Subjects

Blind, low-vision and sighted subjects were used in the following experiments. As previously noted, the tactile acuity of blind and sighted persons, whether male or female, is essentially identical [56], although blind persons tend to have more experience making active use of the sense of touch [93], while sighted subjects generally have a more highly developed visual memory [77]. Any difference in the performance of blind and sighted subjects is noted and discussed.

4.1.2 Production of Materials

As mentioned earlier, images were gathered electronically from a variety of sources and were prepared first by grayscaling any that were color images to achieve uniformity. This homogeneity was necessary because microcapsule paper expands only in response to the color black. Depending on the experiment, one or more image processing algorithms were then applied in a specific order to each image. Once the images were processed, they were printed out on a standard office laser printer, photocopied onto sheets of microcapsule paper, and expanded using the Tactile Image Enhancer.
Both types of microcapsule paper (see page 25) were used in the production of experimental materials.

4.1.3 Aggregate Image Processes

The five experiments conducted made use of image simplification techniques selected from a collection of seven aggregate image processes, defined here as:

1. No Processing: A tactile image is produced directly from the original grayscaled version of the image (Figure 4.1). Experimentally, these images serve as a benchmark against which the effectiveness of further processing can be measured. The unprocessed image represents the visual information in its raw form, the state in which it is currently available without the intervention of a sighted person.

2. Edge Detection (with thresholding): Emphasizing the edge information in an image might be all the simplification that is needed. Much of the theory previously discussed indicates that converting an image into a simpler sketch or line-drawing representation should enhance recognition. The Sobel edge detection operator is used here (Figure 4.2), as it is widely used and considered to be effective for general purpose edge detection, although any one of a number of edge detectors could quite easily be substituted. Note that thresholding is performed on the image, with edge points being set to one intensity value while non-edge points are set to a second value. In this way, a binary edge-only version of the original is produced. Depending on the implementation of the thresholding conducted in association with a given edge detection algorithm, it may be necessary to apply negation to the result (see Section 3.3.5).

3. Edge Detection (without thresholding): By eliminating the thresholding inherent in the standard Sobel edge detection algorithm, and instead merely replacing each point in an image with the raw output of the Sobel operator for that point, a slightly different result is produced (Figure 4.3). Note that the edges are still highlighted but there is quite a bit of background noise remaining in the image. In this form the image is not particularly useful for purposes of tactile graphics because most of it would still expand when developed on microcapsule paper; but when coupled with a subsequent K-means segmentation, an adaptive thresholding technique, the resulting image has a more complete edge detection than that of the edge detector that uses fixed thresholding (see "Edge Detection (with thresholding) and Segmentation" below). The reason for this is that strict thresholding tends to disregard some of the less defined edge information, while in this case that information is left behind, potentially to be recognized by the more sophisticated adaptive thresholding performed by the K-means segmentation algorithm.

Figure 4.1: Original unprocessed grayscale image of the chimney end of a house. (public domain)

Figure 4.2: Image of house before and after processing using Sobel edge operator with thresholding.

Figure 4.3: Image of house before and after processing using Sobel edge operator without thresholding.

4. Segmentation: Performing a segmentation divides an image into regions. In this application, we perform a binary segmentation via adaptive thresholding using the K-means segmentation algorithm, which produces regions of white and black only (Figure 4.4). This representation is modeled on the way it is believed that the mind classifies and stores image information, namely in some hierarchical fashion, from general characteristics to specific [9, 90]. In the case of segmentation, general characteristics are emphasized. Note that in some instances negation (see Section 3.3.5) was applied following an application of segmentation to emphasize content rather than background.

5. Edge Detection (with thresholding) and Segmentation: As mentioned above, performing a segmentation on a previously unthresholded, edge detected image serves further to enhance edge information (Figure 4.5) that might normally be ignored by a standard edge detector that uses fixed thresholding.
By using this aggregate process, the dubious result of applying edge detection without thresholding is actually advantageous in that a more completely edge detected image is produced (Figure 4.6).

Figure 4.4: Image of house before and after processing using K-means adaptive segmentation algorithm.

Figure 4.5: Image of house before and after processing using Sobel edge operator without thresholding followed by K-means segmentation.

Figure 4.6: Comparison of effect of Sobel edge detection using fixed thresholding from Figure 4.2 (left) with Sobel edge detection utilizing adaptive K-means segmentation (for thresholding) from Figure 4.5 (right).

6. Segmentation and Edge Detection: A comparison with the reverse procedure, namely application of segmentation first followed by edge detection with thresholding, is revealing (Figure 4.7). Note that, while the simplified image still resembles the original, the representation is more discontinuous and noisy, the result of extracting segmented region edges. Since thresholding is performed by the initial K-means segmentation, the two varieties of the Sobel edge detector described here produce identical results. For more complex images, such as faces, the results are even more noticeable (Figure 4.8).

7. Blurring, Edge Detection, Segmentation and Median Filtering: This aggregate process takes into account as much of the previously discussed cognitive and perceptual theory as possible to produce a result that, at least visually, appears to be quite simple (Figure 4.9) while still clearly resembling the original. The blurring step represents the lower bandwidth capabilities of the fingertip as compared with the eye. The result of this blurring has a potentially beneficial side effect, thicker edges, which appears during the subsequent edge detection. Without the initial blurring step, the resulting lines in the final representation tend to be thinner and sometimes less continuous (Figure 4.10).
When the edge detector is applied without thresholding to the blurred image, edges appear thicker due to the slight spreading or softening of rapid intensity changes in the original. The segmentation step, as before, cleans up the result of the edge detector. The final median filtering step removes any stray noise that was not removed by the segmentation. In fact, there is a proportion of noise that is enhanced rather than removed by the adaptive thresholding of the segmentation step. Median filtering counteracts much of that effect.

Figure 4.7: Image of house before and after processing using K-means segmentation followed by Sobel edge detection.

Figure 4.8: Images of a face demonstrating the difference between two sequences of processing. From left to right: Original image, image after Sobel edge detection without thresholding followed by K-means segmentation, and image after K-means segmentation followed by Sobel edge detection. (US Govt)

Figure 4.9: Image of house before and after processing using the aggregate sequence of processes: blurring, Sobel edge detection without thresholding, K-means segmentation and median filtering.

Figure 4.10: Comparison of image of house using the aggregate process from Figure 4.9 (left) and the same aggregate sequence of processes with the exception of the initial blurring step (right).

4.1.4 Psychophysics and Experimental Procedure Justification

To evaluate the effectiveness of this processing for automatic generation of tactile images from visual images, five sets of experiments were performed. These experiments were designed to measure performance on a basic psychophysical level. The field of psychophysics, the study of physical and psychological aspects of perception and their interrelationships, identifies four basic perceptual tasks: (1) detection, (2) discrimination, (3) identification, and (4) comprehension [18].
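The difference between aggregate processes 2 and 3 of Section 4.1.3, Sobel edge detection with and without thresholding, can be sketched in C. The sketch assumes the common 3 x 3 Sobel kernels, approximates gradient magnitude by |Gx| + |Gy| clipped to 255, and uses hypothetical function names and a caller-supplied fixed threshold t; none of these details are taken from the thesis code.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Sobel without thresholding: each interior pixel is replaced by the
 * raw operator output, here |Gx| + |Gy| clipped to 0..255 (an assumed
 * magnitude approximation). Border pixels are set to 0.
 * src and dst are w*h row-major 8-bit images and must not overlap. */
void sobel_raw(const unsigned char *src, unsigned char *dst, int w, int h)
{
    memset(dst, 0, (size_t)w * (size_t)h);

    for (int y = 1; y < h - 1; y++) {
        for (int x = 1; x < w - 1; x++) {
            const unsigned char *p = src + y * w + x;

            /* Horizontal and vertical 3 x 3 Sobel responses. */
            int gx = -p[-w - 1] + p[-w + 1]
                     - 2 * p[-1] + 2 * p[1]
                     - p[w - 1] + p[w + 1];
            int gy = -p[-w - 1] - 2 * p[-w] - p[-w + 1]
                     + p[w - 1] + 2 * p[w] + p[w + 1];

            int mag = abs(gx) + abs(gy);
            dst[y * w + x] = (unsigned char)(mag > 255 ? 255 : mag);
        }
    }
}

/* Sobel with fixed thresholding: edge points are set to 255 and
 * non-edge points to 0, giving a binary edge-only image. Negation can
 * be applied afterwards if black edges are wanted for microcapsule paper. */
void sobel_threshold(const unsigned char *src, unsigned char *dst,
                     int w, int h, int t)
{
    sobel_raw(src, dst, w, h);
    for (size_t i = 0; i < (size_t)w * (size_t)h; i++)
        dst[i] = (unsigned char)(dst[i] >= t ? 255 : 0);
}
```

On a weak edge, the raw output keeps a small nonzero response that a high fixed threshold discards. That retained response is the information a subsequent two-cluster K-means segmentation can still act on, as described for aggregate process 5.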
As with all the senses, these four perceptual tasks apply to tactual perception, which is a major concern regarding the methods put forth in this thesis.

4.1.4.1 Detection

Measuring detection using the sense of touch involves designing a task that addresses the question, "Is there anything there?" As previously discussed, many limits of the physical detection abilities of the fingertip are known. Since the properties of microcapsule paper produce tactile graphics that are well within the range of such touch perception, any detection experiment designed here would be trivial. It is therefore safe to assume that TACTICS produces tactile images that are detectable. No experiments were performed to measure detection of tactile images, as all experiments relied on the implicit ability of subjects to detect the raised tactile images.

4.1.4.2 Discrimination

The ability to discriminate is an important perceptual task for any of the senses. Discrimination answers the question, "Is this stimulus different from that one?" For the sense of touch, discrimination tells us simply whether two tactile objects are the same or different. The experiments to measure the effectiveness of TACTICS as an aid to discrimination involved a task similar to the traditional matching game of Concentration. In the study, subjects felt one of a closed set of similarly processed tactile images and then attempted to locate the identical tactile image from among a randomly arranged duplicate set. In further experiments, subjects felt a series of arbitrarily paired tactile images for a period of time and then reported whether each of the pairs felt similar or dissimilar.

4.1.4.3 Identification

Being able to identify what something is by its perceived characteristics is another basic perceptual task. Identification as it applies to tactile images concerns how effectively a representational technique allows a person to answer, "What is it?"
The cognitive load imposed by identification is higher than that for detection or discrimination, so the experiment to measure it is also more involved. In the experiment to assess this factor, subjects felt a series of tactile images, and for each image were given four categories and asked to identify the category to which the stimulus belonged.

4.1.4.4 Comprehension

Comprehension means that questions regarding the content of an image should be answerable. Comprehension is generally accepted as a key consideration in the effectiveness of any perceptual event and therefore is an important factor to explore in the design of an interface to a GUI environment for blind computer users. This experiment measured how well a selected TACTICS aggregate image process affected comprehension of tactile images. Subjects were provided with a brief description of each image and then were asked a number of questions regarding the content of each image.

4.2 Experiments

Five experiments were conducted to evaluate the effectiveness of the system. The first was a pilot study, which measured simple discrimination of tactile images and was aimed at determining whether or not further exploration of these techniques was worth pursuing. More rigorous tests were then performed to examine simple and timed discrimination and tactile image identification and comprehension. Note that approval to conduct these experiments was obtained from the University of Delaware Human Subjects Review Board (see Appendix F). All experiments were conducted by this author.

4.2.1 Pilot Study

A pilot study was conducted to determine whether the use of image simplification for purposes of automatic generation of tactile graphics was a worthwhile technique to explore further [94]. A set of eight digital images was collected. This set purposely included some ambiguity of overall shape.
The set was comprised of rounded images (three faces and a hot air balloon) and square-shaped images (the chimney of a house, a notebook computer, a space-shuttle launch and a diagram of a human heart).

A matching task, described in more detail below, was performed to measure each subject's ability to discriminate among the tactile images. Results were recorded for each of the subjects regarding successful versus unsuccessful matches for each of the images and processes applied. The results for the various processes were compared, and some interesting anecdotal evidence was noted.

4.2.1.1 Subjects

A group of four sighted subjects, two male and two female, all in the 20- to 40-year-old range, was used in the study. The subjects all participated voluntarily and were co-workers of the author at the Applied Science and Engineering Laboratories, a joint research facility of the University of Delaware and the A.I. duPont Institute. For the experiment, subjects were blindfolded and given no information regarding the content of the tactile images.

4.2.1.2 Materials

Each image was processed in five ways: (1) using grayscaling alone (for uniformity), (2) K-means adaptive segmentation, (3) Sobel edge detection, (4) K-means segmentation followed by Sobel edge detection, and (5) Sobel edge detection followed by K-means segmentation.

For each combination of processing, the eight resulting images were printed out in the same size (2.5 in x 2.5 in), and arranged in an arbitrary order on a single blank sheet of paper. A second sheet was prepared using the same processed images arranged in a different random order. Subsequently, these sheets were photocopied onto microcapsule paper (Repro-Tronics) and raised using the enhancer device. Thus, the resulting experimental materials consisted of pairs of sheets of raised images, one pair for each of the five processing sequences (see Appendix B).
Each sheet contained all eight images, and each member of a pair had the eight images arranged in a different order from its mate.

4.2.1.3 Procedure

Subjects were asked to perform a basic matching task using each pair of sheets of processed tactile images. For each of the five types of aggregate processes the appropriate pair of sheets was placed on a table in front of the seated and blindfolded subject. First, the subject's hand was placed onto one processed image on one sheet, and the subject was allowed to explore the image freely. Then, the subject's hand was guided to an arbitrary location on the second sheet and the subject attempted to locate the identical object on this second sheet. This task was repeated for each of the eight images on each of the five pairs of identically processed sheets.

4.2.1.4 Results

Table 4.1 contains the results of the pilot experiment. The table columns indicate the type and order of image processing used, the average number of matches out of eight per subject, the average percentage of matches per subject and analysis of variance for each of the algorithm combinations used. Note that analysis of variance was used to gauge interaction between the group of unprocessed Grayscale versions of images and each of the groups of various other processing used. Analysis of variance was also used to explore interaction of the results of various processing versus the results expected by chance (12.5% or 1/8).

Table 4.1: Summary of per subject average results of the tactile image matching task for five image processes [94].

Image process      Mean Matches   Mean Pct. Matched   p (vs. Grayscale)   p (vs. Chance)
Grayscale          2.25/8         28%                 1.00e+00            2.50e-03
K-means            6.25/8         78%                 2.85e-05            7.60e-07
Sobel              4.75/8         59%                 4.01e-04            5.53e-06
K-means & Sobel    5.75/8         72%                 3.60e-03            2.29e-04
Sobel & K-means    7.75/8         97%                 4.47e-06            1.71e-07

4.2.1.5 Discussion of pilot study

The results of the pilot study indicate that even a modest amount of simplification yields a marked improvement in tactile image discrimination. Comparison of the means shows that all image processing techniques used increased the subjects' chances of correctly locating matching tactile images. Images that were simpler at the outset were recognized more easily in all cases. In particular, the illustration of the human heart chambers and a photograph of an opened notebook computer tended to be distinguishable even with no processing, probably due to a white background and simple initial representation. There was often confusion among the three images of human faces and a hot air balloon, each of which had an essentially oval shape. We observed a general tendency among subjects for discrimination ability to increase as tactile images became simpler.

Analysis of variance indicates statistically significant interaction between each of the forms of processing used when compared with the unprocessed Grayscale originals. The processing that utilized edge detection followed by segmentation showed the greatest interaction in addition to the best mean performance for the matching task. The other forms of processing also showed strong interaction, indicating that simplification of various forms had a noticeable effect. Compared with results expected by chance, there are strong indications of interaction for all forms of processing with the exception of Grayscale alone. Thus, it is quite probable that the improvement in subject performance is not merely the result of random chance.

Some interesting anecdotal evidence was gathered. A number of subjects reported, upon feeling the processed images, that they thought there was more than one face among the images, though none had any idea ahead of time as to the content of the images.
This content identification was not reported upon feeling the original unprocessed images. After an initial period of trying various exploratory techniques, each of the subjects independently arrived at the same method for exploring the images. The tendency was to use the outside edges of an image for gross classification and comparison. Once this gross comparison was made, the details of the interior of an image were explored, seemingly to differentiate among those with similar overall shapes.

The crucial result of this pilot study was that simplification techniques, applied automatically to electronic images using computerized image processing, improved discrimination for tactile images. When combined with the anecdotal accounts of the content identification that was apparently facilitated by image simplification, this result strongly indicated that this method was valid and deserved further investigation. Because of the strength of these preliminary findings, improvements were made to the prototype system and four additional experiments were designed and conducted. These four experiments tested simple discrimination, timed discrimination, identification, and comprehension.

4.2.2 Simple Discrimination Experiment

Two forms of tactile discrimination experiments were conducted. In this first experiment, subjects were allowed to explore freely the initial and secondary tactile images for a total of one minute per pair. This provided the subjects with enough time to glean some information about both the general shape of the image and some of the more prominent internal details. As described below, a matching task was conducted to measure the effectiveness of the four image processing techniques under consideration when applied strictly for purposes of discrimination. This task is roughly analogous to that of a sighted person leisurely browsing through photographs in a magazine or on the Internet, for instance.
In addition to measuring the effects of processing on discrimination, a study was conducted to determine the effect on discrimination of one form of microcapsule paper versus the other. Results from this comparison of microcapsule papers will be extrapolated to the more complex tactual perception tasks of identification and comprehension.

4.2.2.1 Subjects

Ten subjects ranging in age from 22 to 60 were used in this experiment. The subjects participated voluntarily and came from a variety of backgrounds, including college students, homemakers, computer programmers, and a retired chemist. All subjects were educated to at least the four-year college level. Seven subjects were male, three were female. Three subjects were blind, seven were sighted. The two male blind subjects were adventitiously blind, one at age 19, the other at age 39. The one female blind subject was congenitally blind. Additionally, one male subject was classified as low vision. Subjects had little to no experience with tactile images and microcapsule paper, although one blind male subject used similar tactile materials as study aids for a college course.

4.2.2.2 Materials

Materials were produced on both types of microcapsule paper using identical tactile images for each set. Each set consisted of 40 sheets, each with a pair of raised tactile images, one on either side of a raised line that divided the sheet in half (see Appendix B). Each tactile image was limited to four inches in width, which is within the width of one hand span. The height of each image followed proportionally from the scaling of the width and also stayed well within the height of one hand span.

Samples were drawn from the set of original images (see Appendix A) to prepare the testing materials, which were comprised of image pairs. Half of the pairs consisted of identical images, and half were not identical.
Each pair was prepared using each one of the four processes under consideration, and the same processing was applied to both images on a sheet. The four processes used were (1) no processing, (2) K-means adaptive segmentation, (3) Sobel edge detection with thresholding, and (4) an aggregate process of blurring, Sobel edge detection without thresholding, K-means segmentation and median filtering.

4.2.2.3 Procedure

Subjects were asked to perform a discrimination task using one complete set of 40 tactile-image pairs. Subjects were seated at a table, blindfolded if sighted, and presented with each of the 40 sheets from a given set in an arbitrary sequence. For each sheet, subjects freely explored the pair of tactile images on the sheet for a period of time totaling one minute and were then asked to report whether the images felt the same or different. Subjects also could reply that they could not say one way or the other, although this reply was rarely used. During this procedure, responses were recorded, as were any unsolicited comments made by the subject in reaction to the materials or procedure. Subjects were given neutral feedback after each matching task along the lines of, "Good. Now here is the next one."

Overall, each complete set of 40 tactile image sheets was used with five (one-half) of the subjects, so that some comparison could be made of the two forms of microcapsule paper under identical experimental conditions. The same set of 40 sheets of testing materials that was used for a subject in this simple discrimination experiment was randomly reordered and used in the following timed discrimination experiment for the same subject. Note that half of the subjects completed the simple and timed discrimination tasks using the Repro-Tronics paper, and the other half the Matsumoto Kosan paper. With the exception of the type of paper on which the materials were prepared, the two complete sets of testing materials were identical in every respect.
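The aggregate process listed in Section 4.2.2.2 (blurring, Sobel edge detection without thresholding, K-means segmentation, median filtering, in that order) can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows, a 3x3 box blur, plain K-means on gradient magnitude with k = 2, and a 3x3 median filter. The kernel sizes, the value of k, and all function names are illustrative assumptions; the text names the four steps and their order but does not give these parameters here, and the adaptive K-means variant it mentions is not reproduced.

```python
from math import hypot
from statistics import median

def window(img, r, c):
    """Return the 3x3 neighborhood of (r, c), clamping at the borders."""
    h, w = len(img), len(img[0])
    return [img[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

def apply(img, fn):
    """Build a new image by applying fn to every pixel's 3x3 window."""
    return [[fn(window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]

def blur(img):
    # Assumed 3x3 box blur: reduces detail before edge detection.
    return apply(img, lambda w: sum(w) / 9.0)

def sobel(img):
    # Sobel gradient magnitude, with no thresholding applied.
    def magnitude(w):
        a, b, c, d, _, f, g, h, i = w
        gx = (c + 2 * f + i) - (a + 2 * d + g)   # right column minus left column
        gy = (g + 2 * h + i) - (a + 2 * b + c)   # bottom row minus top row
        return hypot(gx, gy)
    return apply(img, magnitude)

def kmeans_segment(img, k=2, iterations=10):
    # Plain 1-D K-means on pixel values (k >= 2); label 0 is the lowest cluster.
    values = [v for row in img for v in row]
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    nearest = lambda v: min(range(k), key=lambda j: abs(v - centers[j]))
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v)].append(v)
        centers = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
    return [[nearest(v) for v in row] for row in img]

def median_filter(img):
    # Assumed 3x3 median: removes isolated noise labels.
    return apply(img, median)

def aggregate(img):
    """Blur, Sobel (no threshold), K-means, median filter, in that order."""
    return median_filter(kmeans_segment(sobel(blur(img))))

# Hypothetical test image: a bright 10x10 square on a dark 20x20 background.
image = [[255 if 5 <= r < 15 and 5 <= c < 15 else 0 for c in range(20)]
         for r in range(20)]
labels = aggregate(image)
```

In this sketch the output is a label image in which label 1 marks the strong-edge cluster, so the square comes out as a thick outline with flat regions inside and outside labeled 0. A production version would more likely use an image library than nested lists; the pure-Python form is used only to keep the sketch self-contained.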
4.2.2.4 Results

The results of the simple discrimination experiment are summarized in the following three tables. Table 4.2 provides an overview of how subjects performed on average for each of the four image processes applied. Analyses of variance indicate significant interaction between the unprocessed original and any of the other processing performed. Compared with results expected by chance, analyses of variance indicate significant interaction for all forms of processing. In the case where no processing was used, analysis of variance does not indicate interaction.

Table 4.3 compares the results of using one type of microcapsule paper versus the other. Table 4.4 compares the performance of blind versus sighted subjects. In these tables, analyses of variance do not indicate any interaction between groups of subjects based on the different processes applied, whether compared by output medium or level of vision.

In these tables, Matches refers to the sum of all correct responses in the discrimination task made by the 10 subjects in all trials. The Mean Pct. Matched is the computed average percentage of these correct responses. The results of analyses of variance are denoted by p and compare results for each of the forms of processing with those for the unprocessed originals, and with chance (50%).

Table 4.2: Summary of overall results of simple discrimination task for four image processes. The Aggregate Process consists of blurring, Sobel edge detection without thresholding, K-means adaptive segmentation, and median filtering, applied in that order.

Image process          Matches   Mean Pct. Matched   p (vs. None)   p (vs. Chance)
No Processing          50/100    50.00%              1.00e+00       1.00e+00
K-means Segmentation   83/100    83.00%              2.07e-05       3.49e-07
Sobel Edge Detection   81/100    81.00%              4.70e-06       1.53e-09
Aggregate Process      95/100    95.00%              4.40e-08       1.93e-11

Table 4.3: Summary of percentage of correct responses comparing effects of two varieties of microcapsule paper on simple discrimination task.
Image process          Flexi-Paper   Matsumoto-Kosan   p
No Processing          48.00%        52.00%            6.41e-01
K-means Segmentation   90.00%        78.00%            2.60e-01
Sobel Edge Detection   80.00%        82.00%            3.05e-01
Aggregate Process      96.00%        94.00%            3.59e-01

Table 4.4: Summary of percentage of correct responses comparing results of blind versus sighted subjects performing simple discrimination task.

Image process          Blind Subjects   Sighted Subjects   p
No Processing          53.33%           52.86%             2.94e-01
K-means Segmentation   83.33%           81.43%             6.01e-01
Sobel Edge Detection   63.33%           84.29%             6.43e-02
Aggregate Process      90.00%           95.71%             7.45e-01

4.2.3 Timed Discrimination Experiment

The second experiment imposed a strict time limit of 10 seconds for exploration of each pair of images. This limited-time experiment was designed to measure the effectiveness of TACTICS image processing techniques in a situation reminiscent of a sighted person skimming or quickly scanning through a series of images, making quick determinations. The goal of this experiment was to test how use of these image simplification methods might affect the ability of a blind computer user to perform browsing and navigation tasks using touch in a GUI environment on a level comparable to a sighted computer user.

4.2.3.1 Subjects

The same subjects were used for this experiment as in the previous simple discrimination experiment. Since this experiment was always performed immediately following the simple discrimination experiment, subjects had gained limited experience with and developed individual techniques for exploring the tactile image materials.

4.2.3.2 Materials

Materials were the same as in the simple discrimination experiment (see Appendix B), with identical materials being used for the same subject for both the simple and timed discrimination experiments. Although the identical materials were used for a given subject, they were randomly reordered to counteract possible bias related to ordering.
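The structure of the testing materials described in Sections 4.2.2.2 and 4.2.3.2 (40 sheets per set, four processes, half of the pairs identical, reordered before each experiment) can be sketched as follows. This is an illustration only: the figure of 10 image pairs per process is inferred from 40 sheets divided among four processes, and the function names, process labels, seeds, and the way pairs are drawn are assumptions rather than the procedure actually used to prepare the sheets.

```python
import random

# Hypothetical labels for the four processes under consideration.
PROCESSES = ("none", "kmeans", "sobel", "aggregate")

def build_sheet_set(image_ids, pairs=10, seed=1996):
    """Return one set of sheets: every image pair prepared with every process."""
    rng = random.Random(seed)
    chosen = []
    for n in range(pairs):
        left = rng.choice(image_ids)
        # Alternate identical and non-identical pairs so each makes up half.
        if n % 2 == 0:
            right = left
        else:
            right = rng.choice([i for i in image_ids if i != left])
        chosen.append((left, right))
    # The same process is applied to both images on a sheet.
    return [{"process": p, "left": l, "right": r, "same": l == r}
            for p in PROCESSES for (l, r) in chosen]

def presentation_order(sheets, seed):
    """Return a reordered copy of the sheets without changing the set itself."""
    order = list(sheets)
    random.Random(seed).shuffle(order)
    return order

sheets = build_sheet_set(list(range(1, 31)))       # 30 original images assumed
simple_run = presentation_order(sheets, seed=1)    # simple discrimination order
timed_run = presentation_order(sheets, seed=2)     # reordered for the timed task
```

Keeping the sheet set fixed and shuffling only a copy mirrors the point made in the text: a given subject sees the same materials in both experiments, and only the order changes.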
4.2.3.3 Procedure

The procedure for this experiment was identical to the previous discrimination task, with the single exception being that subjects were limited to 10 seconds per image-pair matching task.

4.2.3.4 Results

The results of the timed discrimination experiment are summarized in the following three tables. Table 4.5 provides an overview for how subjects performed for each of the four image processes applied. Analyses of variance indicate significant interaction between the unprocessed original and any of the other processing performed. Compared with results expected by chance, analyses of variance indicate some degree of interaction for all forms of processing. In the case where no processing was used, however, analysis of variance does not indicate interaction.

Table 4.6 compares the results of using one type of microcapsule paper versus the other. Table 4.7 compares the performance of blind versus sighted subjects. Analyses of variance for these two tables show that there is no statistical evidence of interaction between groups of subjects based on processing used, whether compared by output medium or level of vision.

Table 4.5: Summary of overall results of timed discrimination task for four image processes.

Image process          Matches   Mean Pct. Matched   p (vs. None)   p (vs. Chance)
No Processing          55/100    55.00%              1.00e+00       3.82e-02
K-means Segmentation   77/100    77.00%              1.80e-03       1.34e-04
Sobel Edge Detection   73/100    73.00%              2.90e-03       1.24e-04
Aggregate Process      87/100    87.00%              4.67e-05       3.24e-06

Table 4.6: Summary of percentage of correct responses comparing effects of two varieties of microcapsule paper on timed discrimination task.
Image process          Flexi-Paper   Matsumoto-Kosan   p
No Processing          50.00%        58.00%            1.33e-02
K-means Segmentation   92.00%        68.00%            2.30e-01
Sobel Edge Detection   74.00%        72.00%            8.47e-01
Aggregate Process      96.00%        82.00%            2.30e-01

Table 4.7: Summary of percentage of correct responses comparing results of blind versus sighted subjects performing timed discrimination task.

Image process          Blind     Sighted   p
No Processing          43.33%    60.00%    6.54e-01
K-means Segmentation   86.67%    72.86%    4.91e-01
Sobel Edge Detection   73.33%    72.86%    1.96e-01
Aggregate Process      93.33%    84.29%    7.47e-01

4.2.3.5 Comparison with simple discrimination

Results of the performance of subjects in the simple and timed discrimination tasks are compared in Table 4.8. While the trend based on mean performance was for subjects to discriminate tactile images slightly less successfully under time pressure, analyses of variance did not indicate any interaction between the two discrimination modalities. This result is therefore inconclusive in regard to the effect of time pressure on tactile discrimination.

Table 4.8: Summary of percentage of correct responses comparing results of all subjects on simple discrimination versus timed discrimination tasks.

Image process          Simple    Timed     p
No Processing          50.00%    55.00%    2.85e-01
K-means Segmentation   83.00%    77.00%    4.03e-01
Sobel Edge Detection   81.00%    73.00%    1.61e-01
Aggregate Process      95.00%    87.00%    2.26e-01

4.2.4 Identification Experiment

For the identification task, subjects explored a series of tactile images and for each attempted to classify it into one of four categories that varied for each image. This task was designed to provide some insight into the effectiveness of TACTICS to produce a tactile image that resembles the original in such a way that it is identifiable given some small amount of pre-information. This is analogous to the visual task of identifying photographs based on some small amount of textual information, such as a caption.
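Because each image is classified into one of four categories, a subject who guesses is expected to be correct 25% of the time. As a rough illustration of how far an observed count lies from that chance level, the sketch below computes exact binomial tail probabilities. This is a different and simpler technique than the analyses of variance reported in this chapter, and it treats pooled responses as independent trials, which is an assumption. The function names are hypothetical; the two example counts, 85 of 100 and 7 of 100, match the totals reported for the aggregate process and for no processing in Table 4.9.

```python
from math import comb

def binom_pmf(i, n, p):
    """Probability of exactly i successes in n independent trials."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def upper_tail(k, n, p):
    """P(X >= k): chance of doing at least this well by guessing."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def lower_tail(k, n, p):
    """P(X <= k): chance of doing at least this badly by guessing."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

chance = 1 / 4                          # four categories per image
p_high = upper_tail(85, 100, chance)    # 85 correct of 100
p_low = lower_tail(7, 100, chance)      # 7 correct of 100
```

Both tails are small in this sketch, which is consistent with the observation that performance on unprocessed images fell well below the 25% guessing rate rather than simply matching it.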
4.2.4.1 Subjects

The subjects used for this experiment were the same, though now somewhat more experienced with tactile image exploration and interpretation, as those used in the previous two discrimination experiments.

4.2.4.2 Materials

For this experiment, 10 images were selected from the original set of 30 images that reflected a diversity of shape and content. Each image was processed using each of the four processes under consideration, with the result of each process placed onto an individual sheet of the Matsumoto Kosan microcapsule paper. The result of this preparation was a set of 40 sheets, each with a tactile image that was processed in one of four ways.

For each of the 10 images, four possible categories were defined (see Appendix C). Of these four categories, one correctly identified the content of the image, two identified objects that may closely resemble the content of the image, and one resembled the content of the image less closely.

4.2.4.3 Procedure

For each of the 40 arbitrarily presented sheets, the experimenter verbally listed the four possible categories as the subject freely explored the tactile image. At the conclusion of a period of no more than 30 seconds, the subject was asked to state which category most closely matched the tactile image that was explored. The responses were recorded, and the procedure was similarly repeated for all tactile images in the set.

4.2.4.4 Results

The results of the tactile image identification experiment are shown in the following two tables. Table 4.9 summarizes overall performance of subjects on the identification task for each of the four forms of image processing applied in the production of the tactile images. Analyses of variance indicate significant interaction between the unprocessed original and any of the other processing performed. Significant interaction is also indicated by analyses of variance comparing the various processing with results expected by chance (25%).
Table 4.10 compares performance of blind versus sighted subjects in the same task and for the same four processes. Analyses of variance comparing sighted and blind subjects did not indicate interaction between the subject groups.

Table 4.9: Summary of overall results of identification task for four image processes.

Image process          Identifications   Pct. Identified   p (vs. None)   p (vs. Chance)
No Processing          7/100             7.00%             1.00e+00       1.83e-06
K-means Segmentation   55/100            55.00%            3.84e-09       2.24e-07
Sobel Edge Detection   46/100            46.00%            6.36e-07       2.02e-04
Aggregate Process      85/100            85.00%            5.05e-13       8.93e-13

Table 4.10: Summary of percentage of correct responses comparing results of blind versus sighted subjects performing identification task.

Image process          Blind     Sighted   p
No Processing          10.00%    5.71%     4.83e-01
K-means Segmentation   56.67%    54.29%    7.89e-01
Sobel Edge Detection   43.33%    47.14%    7.23e-01
Aggregate Process      76.67%    88.57%    1.13e-01

4.2.5 Comprehension Experiment

Based on results of the Pilot study, which were further supported by the discrimination and identification experiments, the aggregate process of edge detection followed by segmentation was found to be best among those considered for improving performance in a basic tactile image discrimination task. We used this process as a foundation, enhancing its effect by adding an initial blurring step and following up with a median filtering step. By applying blurring initially to the original image, detail was reduced and edges generated by subsequent processes were thicker and more easily perceived. The application of median filtering as a post process removed the rare instances of undesired noise that remained. The applicability and effectiveness of the algorithms and sequencing chosen for this aggregate process were supported by results of the discrimination and identification experiments.

This experiment measured the ability of subjects to comprehend tactile images prepared using the aggregate processing.
The assumption was made that unprocessed images would be incomprehensible; and, indeed, this assumption was supported by results of the discrimination and identification experiments. Another assumption used in the design of this experiment was that the aggregate process, having provided the best results for previous tasks, would be the best choice for this task as well.

4.2.5.1 Subjects

The same subjects were used in this experiment as in the previous three experiments. Having performed the three previous experiments, by this point subjects were more experienced and also quite comfortable with the exploration of the tactile materials being used.

4.2.5.2 Materials

For this experiment, 10 images were selected from the original set of 30, based on a diversity of content and shape. Each image was processed using the aggregate process, and placed onto a sheet of Matsumoto Kosan microcapsule paper for raising. Due to the results of previous experiments which indicated no interaction based on type of paper used, use of this paper was deemed sufficient.

Associated with each of the 10 processed images was a brief one- or two-sentence description of the image and four questions designed to test a subject's comprehension of the image's content (see Appendix D). Questions were of the "True or False," "Multiple Choice" and "Locate the (fill in the blank)" variety, with the number of choices limited to two. The questions were designed to be of the sort that generally would be easily answered by a sighted person viewing the original image. Some questions asked the subject to locate some feature in the image, such as, "Locate the tail fin of the space shuttle." Other questions concerned understanding some feature in the image. For example, associated with the image of a space shuttle in the process of landing, one question was, "Is the shuttle landing from left to right, or right to left?"
Finally, the third and most difficult form of question asked the subject to reason about and draw some conclusion about the content of the image; for example, with an image of a desktop computer the question was, "Is the computer on or off?"

4.2.5.3 Procedure

For each of the 10 tactile images, the subject was first read the brief description of the image while exploring it. As the subject continued to explore the image freely, each one of the four questions concerning the image, together with the two possible answers for each, was read aloud. For questions with a verbal response, the experimenter recorded the subject's reply on the data-collection sheet. For questions in which a subject was asked to locate a specific feature in the image, the experimenter observed the movement of the subject's hand and noted both the final location the subject indicated and the subject's verbal reply.

Also recorded were any unsolicited remarks or comments made by the subject and observations made by the experimenter during the experimental procedure. Comments often referred to the difficulty a subject may be having with a particular question or tactile image, some interesting discovery that had been made by the subject regarding the image, or the reasoning used by the subject in reaching a particular conclusion. Observations made by the experimenter included noting initial reactions of the subject, exploratory movements used, and any other reactions that seemed noteworthy.

4.2.5.4 Results

Results of the tactile image comprehension experiment are shown in the following two tables. The first, Table 4.11, displays subject performance for the three comprehension subtasks as well as overall performance. Analyses of variance comparing these results with chance (50%) indicate significant interaction, suggesting little possibility that the successful performance of subjects was random. Table 4.12 compares how blind and sighted subjects performed in this experiment.
Analyses of variance comparing subjects based on level of vision indicate probable interaction for the location task. No interaction is indicated when comparing subjects by level of vision and the understanding and reasoning tasks.

The disparity in performance on the location task may be due to differences in visual memory, with sighted subjects possessing more familiarity with visual material in general than blind subjects [77]. As a result, sighted subjects are more aware of relative size and position of objects as represented in an image. This difference in positional awareness could account for the differences in performance of blind subjects versus sighted subjects for the location task. The lack of significant differences in performance for the understanding and reasoning tasks could indicate that a more developed visual memory is not necessary to these tasks.

Table 4.11: Summary of results of comprehension task for three subtasks and overall comprehensibility of tactile images prepared using Aggregate process.

Comprehension Task      Correct Replies   Pct. Correct   p
Feature location        108/130           83.08%         6.53e-08
Feature understanding   132/160           82.50%         3.49e-09
Content reasoning       87/110            79.09%         4.45e-14
Overall                 327/400           81.75%         1.44e-15

Table 4.12: Summary of percentage of correct responses comparing results of blind versus sighted subjects performing comprehension task.

Comprehension Task      Blind     Sighted   p
Feature location        69.23%    89.01%    5.30e-03
Feature understanding   89.58%    79.46%    1.37e-01
Content reasoning       78.78%    79.22%    8.96e-01
Overall                 80.00%    82.50%    4.27e-01

4.2.6 Significance of Results

With the exception of the feature location subtask of the comprehension experiment (Table 4.12), the analyses of variance did not indicate interaction between groupings of subjects based on level of vision. This result is expected based on results of previous studies that found no significant difference between the tactile abilities of blind and sighted persons [56]. For groupings based on output medium, analyses of variance again did not indicate interaction.
Since the characteristics of the two types of papers are similar in most respects, this result is not remarkable. However, the two forms of paper do vary in the property of stiffness, with Matsumoto-Kosan paper being significantly stiffer than the Repro-Tronics Flexi-Paper, which is flexible by design. It appears that stiffness alone is not a significant factor in any of the tactual perception abilities we measured. In spite of the lack of statistical differences, some subjects indicated a preference for the stiffer Matsumoto-Kosan paper over the Flexi-Paper, noting its "better clarity" or "nicer feel." These personal reactions did not appear to translate into differences in the performance of subjects.

Comparing mean performance on the various tasks versus chance performance reveals an apparent trend of improvement based in some measure on the degree of simplification. More formally, analyses of variance based on type of processing showed significant interaction between each form of processing used when compared directly with unprocessed originals. These analyses repeatedly indicated that the application of simplifying image processing techniques in the translation of visual images to tactile images improved performance of subjects in discrimination, identification and comprehension tasks. This result is quite favorable, particularly when compared with subject performance on similar tasks using unprocessed tactile images.

Equally as important as these statistically significant results are the observational and anecdotal evidence gathered during these experiments. The significance of that evidence, in light of the results from these experiments, will be discussed in the following chapter.

Chapter 5

OBSERVATIONS, DISCUSSION AND CONCLUSIONS

In the course of evaluating TACTICS, a number of general observations were made that anecdotally enhance the raw tabulated results.
These observations and results are discussed, and conclusions are drawn, regarding the effectiveness of TACTICS as a method for providing blind persons with tactile access to visual information.

5.1 Observations

While conducting these experiments, the experimenter recorded observations in addition to the raw response data. These observations included unsolicited comments from the subjects, notes regarding exploratory techniques used by subjects, and other actions and remarks made by the subjects during the experiments. The observations are summarized here for each of the four experiments followed by more general observations.

During the simple discrimination experiment, subjects typically took more time for the first few tasks while they became accustomed to exploration of the tactile images and experimented with different techniques for exploring them. Most subjects developed a two-handed approach to this discrimination task, using one hand for each of the images in a pair and synchronizing the movements of the two hands. While using this technique, subjects usually first attempted to determine the general shape of the image. Then, subjects performed further exploration, again in tandem, to examine details of the image.

One striking observation made during the timed discrimination experiment is that, compared with the simple discrimination experiment, subjects often seemed much more confident of their answers in spite of the limited time given for exploration. Some of this confidence may have been the result of experience gained during the previous experiment and perhaps confidence in the techniques developed as a result. One technique developed by many subjects was the use of a brushing motion, drawing the fingertips of the hand the length of each tactile image. This technique was fast, and seemed to provide enough basis to discriminate between tactile images.
This brushing technique was performed about equally often with two hands in tandem and with a single hand on each image individually.

The identification experiment proved to be the most challenging for subjects based on their reactions during the experiment. They often expressed frustration at not being sure about which category was the best match for a given image. Subjects often used a process of elimination to narrow down possible choices from among the four categories given for each tactile image. Another technique subjects used was to explore an image four times, basing each exploration on the assumption that the image was one of the four categories. Guessing was a common strategy used by subjects when they could not determine a category for an image. Guessing occurred most frequently with the relatively feature-free unprocessed images.

Subjects seemed especially to enjoy the final experiment measuring tactile image comprehension. The combination of the experience gained in the previous three experiments and the up-front description of each tactile image seemed to give subjects a great deal of confidence in exploring the images and answering the questions. It was not uncommon for subjects in the process of answering one question to make comments and remarks about the contents of the images that turned out to be the answer to later questions. For example, while exploring a tactile image of the Space Shuttle, a number of subjects indicated the locations of the nose and tail of the vehicle while answering a question about its direction of travel.

An image of an astronaut working on the surface of the Moon was the most difficult for all subjects. Visual observations made while the subjects were exploring this particular image indicated that the presence of an edge denoting the horizon as well as the busy pattern of edges generated by the texture of the surface of the Moon made image comprehension difficult. Subjects frequently mistook a U.S.
flag in the image for the astronaut's backpack, and that miscalculation caused incorrect answers to other questions about the astronaut's position and activity.

In general, each subject tended to have relative ease or difficulty with the same images and processing that other subjects had ease or difficulty with, respectively. Subjects also tended to gain confidence in performing the various tasks as they gained experience with exploring the tactile images. The general technique that was arrived at by each of the subjects was one of exploring first the overall shape and size of a tactile image, then feeling for details.

Blind subjects had more difficulty with some of the more visual concepts, particularly with images of very large objects, such as planets and the Space Shuttle, and very small objects, such as the Streptococcus bacteria and Ebola virus. Blind subjects often expressed more apprehension at the outset than sighted subjects, although blind subjects had more experience relying on the sense of touch than sighted subjects.

For subjects who performed the two discrimination tasks using Flexi-Paper, there was an initial reaction to the improved clarity provided in the third experiment, which was conducted using the Matsumoto Kosan microcapsule paper. Subjects did not appear to have more difficulty in similar tasks using one variety of paper versus the other. Another comment regarding Flexi-Paper was that some pairs of images seemed to be expanded to differing heights. Interestingly, careful measurements taken in the laboratory using a mill-meter revealed that heights of the tactile images referred to were identical. Possible explanations are that the tactile acuity for the left and right hands may have varied slightly for some subjects, or that the difference in stiffness of the two papers may have affected the outcome.

5.2 Discussion

Comparison of various results of these experiments provides further insight into the degree of effectiveness of TACTICS for automatic generation of tactile images.
Comparing mean percentage of matches for the simple and timed discrimination experiments reveals that performance degraded fairly uniformly, and even then only slightly, when going from the untimed to timed task. For unprocessed tactile images, discrimination for both tasks was about chance (50%). Blind subjects successfully discriminated between tactile images about 10% less frequently than sighted subjects, although it must be noted that various analyses of variance did not indicate a statistical significance for this observation. One possible explanation for this slight difference in performance is a lower level of pre-experiment confidence among the blind subjects, who tended to be somewhat more apprehensive about how well they would perform in the experiments.

With the exception of segmentation, a comparison of results from the simple discrimination experiment based on the two types of microcapsule paper was unrevealing. Mean ability to discriminate segmented tactile images was notably higher for Flexi-Paper than for Matsumoto Kosan paper (90% versus 78%). For the timed discrimination experiment, Flexi-Paper produced a notably higher mean percentage of successful discriminations than the Matsumoto Kosan paper for segmentation and the aggregate process, and slightly higher for Sobel edge detection, although the analyses of variance (Tables 4.3 and 4.6) did not indicate significance for these differences. An anecdotal explanation for this result may be that subjects reported a more positive, albeit subjective, reaction to the stiffness of Flexi-Paper over Matsumoto Kosan paper. As previously mentioned, specifications and actual measurements comparing the expanded characteristics of the two papers do not provide empirical evidence for any difference.

For the identification and comprehension experiments, it is likely that there would be little difference in resulting ability to identify and comprehend tactile images based solely on use of different microcapsule paper.
There was no element of time pressure imposed in the conducting of these two experiments; it was time pressure which seemed to produce degraded ability to discriminate in the timed discrimination experiment.

In judging the effectiveness of various forms of processing on original images to produce tactile images, the conclusions from the Pilot Study are supported. In general, simplification to any degree produces improvement in discrimination rate. For the discrimination experiments, when no processing was applied at all, success rates tended to be at about chance. Subjects discriminated correctly between tactile images about 75% of the time for images processed using Sobel edge detection alone, and slightly better than that for images that were prepared using segmentation. The aggregate process allowed subjects to correctly discriminate from 85% to 90% of the time, and some subjects performed perfectly.

It is difficult to say what the effect of experience with exploration of tactile images was on the results of these experiments. One blind subject had some experience using tactile materials to aid his study for a college course; but those materials were strictly expanded versions of unprocessed line drawings, and his experience with them did not seem to have a noticeable effect on his performance in the experiments. None of the subjects had experience with automatically generated tactile images of the form used in these experiments prior to participation.

Tallied and calculated results aside, there is some reason for optimism regarding improvement with experience. Without exception, subjects were observed becoming more comfortable and confident with the tactile representations during the course of the experiments. Additionally, many began to recognize the overall shape or features of some tactile images they had explored earlier in the experiment. It is important to note that, although subjects "recognized" some tactile images, they had no understanding of the content.
Recognition in this case was simply feeling some tactile pattern or shape that had been felt on an earlier sheet and commenting aloud on that observation.

The identification experiment proved to be the most difficult for subjects when compared with the mean percentage of matches for the other experiments. The aggregate process again produced the best rate of success, followed by segmentation and Sobel edge detection, with no processing trailing far behind. Although blind and sighted subjects performed nearly identically on segmented and on edge detected images, sighted subjects performed better when exploring images prepared using the aggregate process, correctly identifying image content about 89% of the time while blind subjects were successful about 77% of the time (Table 4.10). This difference may be due to the visual nature of the images and the content therein, and perhaps due to an unintentional visual bias in preparation of the experimental materials.

For the comprehension experiment, subjects generally performed quite well on all tactile images and questions, with the exception of one particular image of an astronaut working on the surface of the Moon. The featured content of most images was photographed either straight-on or in profile, producing tactile images that were straightforward to explore and comprehend. The astronaut image was captured at somewhat of a downward angle, producing a confusing horizon line crossing the image at the level of the astronaut's neck. Further research is needed to determine what image processing techniques exist, or can be developed, to handle adequately potentially confusing information, such as horizon lines, within the framework of automatic conversion to tactile representation.

Overall, blind and sighted subjects performed about the same on all tasks.
Blind subjects tended to have more difficulty than sighted subjects in locating specific features within images, perhaps due to a less developed visual memory or a lack of experience with the characteristics of the visual representations from which the tactile images were generated. Blind subjects performed better than sighted subjects on tasks involving understanding the content of tactile images, perhaps due to more experience in relying on the sense of touch to gather and interpret information.

5.3 Conclusions

The objective of this work was to provide blind persons with meaningful access to computer-based visual information, and to do so automatically. Image processing techniques were applied to images to produce simplified versions of the originals, appropriate for output as tactile graphics. These image processing algorithms, and the aggregate processes resulting from various combinations of them, were selected because their effects were analogous to principles of psychophysics and the science of tactual perception. The result was a system that converts a visual image into a tactile image in an automatic, timely and comprehensible fashion, as supported by the results of evaluative experiments.

Although this study did not test, and does not fully support, the theory that performance with tactile images improves with experience, the observations made during experimental evaluation provide testimony in favor of this possibility. There is no reason to think that the expression "practice makes perfect" applies everywhere except to tactile images; subjects' recognition of shapes and patterns encountered in an earlier task suggests that it applies here as well.

The significance of the development of this prototype system is that it shows that reasonable and comprehensible access to visual information can be provided to blind persons without the intervention of a sighted facilitator.
Thus a blind computer user, for instance, could "surf the web" unaided and at a much better level of comprehension than is possible with text alone. This increased access to visual material can facilitate broader educational and professional opportunities, particularly in areas with a strong tendency toward visual presentation of information. For example, persons with disabilities, including blindness, are currently underrepresented in science-, engineering- and mathematics-related disciplines. The techniques developed in this system can translate the visual information from these fields into tactile form, providing students and professionals with better access to diagrams, graphs and images ranging in scale from the microscopic to the cosmic.

Chapter 6 FUTURE DIRECTIONS

The effectiveness of TACTICS at converting visual information into comprehensible tactile information lends credence to the possibility of future investigation in this and related areas. Among the possibilities are:

6.1 Development of End User Application

TACTICS can be developed further into a stand-alone application. Such an application would be invokable from the command line, perhaps being called in place of a print routine from a web browser. If a blind computer user desired to explore a tactile version of an image, the application would automatically handle processing of the image. At present, there are extra steps involved that may require assistance from a sighted person, namely:

1. Retrieving the printout
2. Loading microcapsule paper into a photocopier
3. Photocopying the tactile image onto the microcapsule paper
4. Raising the tactile image using a device such as the Tactile Image Enhancer

6.2 Extension to Refreshable Tactile Display

The use of image processing appears to be a natural and effective method for producing simplified images suitable for output in tactile form. With such effective pre-processing available, the task of expedient output becomes more important.
There is a definite need for real-time dynamic tactile display technology that could display tactile images efficiently. The techniques developed in this thesis for converting visual information into tactile information lend themselves to use as a front-end to such a real-time, dynamic, tactile-display device. Such a display would overcome the reliance on a sighted person that a blind person might experience when using microcapsule paper as an output method. One limitation of past technology developed to display tactile graphics was that its effectiveness depended on the relative simplicity of the material being displayed. Using image processing techniques, as in TACTICS, visual information could be prepared readily for meaningful display in tactile form.

6.3 Multimodal Interface

Simplified tactile representations of images, maps and other infrequently changing visual items could be combined with touch-screen technology to create a multimodal interface. With some initial configuration, positions on an image or map could be associated with audio feedback, as with the Nomad (see page 29). The advantages of this approach would be the speed with which tactile materials could be prepared and the flexibility offered by the automatic simplification techniques of TACTICS.

6.4 Mapping Color to Texture

Segmentation divides a two-dimensional visual representation into regions based on related colors or intensity levels. The result of such a segmentation could be used subsequently to associate the color of each region with a distinct texture, thus providing a blind person with more complete access to the original content of the visual information.
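The association of segmented regions with textures can be illustrated with a minimal sketch. It is not part of TACTICS: it assumes a grayscale image held in a NumPy array, uses a plain one-dimensional K-means on gray levels as the segmentation step, and fills each resulting class with a small repeating binary pattern in which 1 marks a raised point. The function names, the 4x4 tile size and the particular patterns are hypothetical choices made only for illustration.

```python
import numpy as np

TILE = 4  # hypothetical tile size in pixels


def texture_tiles():
    """Return a few hypothetical 4x4 binary patterns (1 = raised point)."""
    blank = np.zeros((TILE, TILE), dtype=np.uint8)
    horiz = blank.copy()
    horiz[0, :] = 1                      # horizontal stripes
    vert = blank.copy()
    vert[:, 0] = 1                       # vertical stripes
    dots = blank.copy()
    dots[0, 0] = 1                       # sparse dots
    diag = np.eye(TILE, dtype=np.uint8)  # diagonal stripes
    return [blank, horiz, vert, dots, diag]


def kmeans_gray(values, k, iters=20):
    """Plain Lloyd iterations on scalar gray levels; one label per value."""
    # Start with centers spread evenly over the observed intensity range.
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            members = values[labels == j]
            if members.size:             # an empty cluster keeps its old center
                centers[j] = members.mean()
    return np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)


def regions_to_textures(gray, k):
    """Segment `gray` into k intensity classes and texture each class."""
    tiles = texture_tiles()
    if not 1 <= k <= len(tiles):
        raise ValueError("k must be between 1 and %d" % len(tiles))
    labels = kmeans_gray(gray.astype(float).ravel(), k).reshape(gray.shape)
    rows, cols = np.indices(gray.shape)
    out = np.zeros(gray.shape, dtype=np.uint8)
    for j in range(k):
        # Repeat tile j over the whole image, then keep it only inside class j.
        pattern = tiles[j][rows % TILE, cols % TILE]
        mask = labels == j
        out[mask] = pattern[mask]
    return labels, out


# Small demonstration: a two-tone image, dark on the left, light on the right.
demo = np.zeros((8, 8), dtype=np.uint8)
demo[:, 4:] = 255
demo_labels, demo_raised = regions_to_textures(demo, k=2)
```

Which pattern is paired with which intensity class is arbitrary in this sketch; in practice the patterns would have to be chosen so that they remain distinguishable by touch once raised.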
One long-standing problem of graph theory was the four-color conjecture: that the regions of any planar map, for our purposes a two-dimensional visual representation such as a map or photograph that has been segmented into regions, can be colored using only four colors, with no two adjacent regions being assigned the same color [19, 75]. Originally posed in 1852 by Francis Guthrie, the four-color conjecture was finally proved in 1977 [2, 3], although finding a four-coloring is not necessarily fast. Given that four colors are sufficient, relaxing the coloring to some reasonably small number (say 10) would allow a very fast coloring to be performed. Thus, a tactile image, simplified using TACTICS, could be segmented and colored quickly using any of a number of simple graph-coloring algorithms. Textures are produced using simple patterns that become palpable when raised. By uniquely mapping colors to these textures, it may be possible to preserve much of the original visual information.

Even simpler would be to apply a K-means segmentation to an image, with K equal to the desired number of colors, and apply the color-texture mapping to the result. This method might not provide as good a texture mapping as a more computationally expensive technique, but it would certainly be fast and may be sufficient for enabling comprehension of tactile images, which is the goal.

BIBLIOGRAPHY

[1] P. Apkarian-Stielau and J.M. Loomis. A comparison of tactile and blurred visual form perception. Perception and Psychophysics, 18(5), 1975.
[2] K. Appel and W. Haken. Every planar map is four colorable, part I: Discharging. Ill. J. Math., 21, 1977.
[3] K. Appel, W. Haken, and J. Koch. Every planar map is four colorable, part II: Reducibility. Ill. J. Math., 21, 1977.
[4] G.R. Arce, N.C. Gallagher, and T.A. Nodes. Median filters: Theory and applications. In T.S. Huang, editor, Advances in Computer Vision and Image Processing, volume 2. JAI Press, 1986.
[5] N.C. Barraga.
Sensory perceptual development. In G.T. Scholl, editor, Foundations of Education for the Blind and Visually Handicapped Children and Youth: Theory and Practice. American Foundation for the Blind, 1986.
[6] K.L. Beauchamp, D.W. Matheson, and L.A. Scadden. Effect of stimulus-change method on tactile-image recognition. Perceptual and Motor Skills, 33, 1971.
[7] P.J. Benson, D.I. Perrett, and D.N. Davis. Towards a quantitative understanding of facial caricatures. In V. Bruce and M. Burton, editors, Processing Images of Faces. Ablex Publishing Corporation, Norwood, New Jersey, 1992.
[8] B. Betts, D. Burlingame, G. Fischer, J. Foley, M. Green, D. Kasik, S.T. Kerr, D. Olsen, and J. Thomas. Goals and objectives for user interface software. Computer Graphics, 21, 1987.
[9] I. Biederman. Human image understanding: Recent research and a theory. Computer Vision, Graphics and Image Processing, 32, 1985.
[10] J.C. Bliss, M. Katcher, C.H. Rogers, and R.P. Shepard. Optical-to-tactile image conversion for the blind. IEEE Transactions on Man-Machine Systems, MMS-11, 1970.
[11] L.H. Boyd, W.L. Boyd, and G.C. Vanderheiden. The graphical user interface: Crisis, danger and opportunity. Journal of Visual Impairment and Blindness, 84, 1990.
[12] R.D. Boyle and R.C. Thomas. Computer Vision: A First Course. Blackwell Scientific Publications, London, 1988.
[13] J. Bradley. XV Online Documentation. University of Pennsylvania, 3.10 edition, 1994.
[14] F.P. Brooks, M. Ouh-Young, J.J. Batter, and P.J. Kilpatrick. Project grope - haptic displays for scientific visualization. In 17th Annual ACM Conference on Computer Graphics and Interactive Techniques - SIGGRAPH '90, volume 24 of Computer Graphics, New York, August 1990. ACM.
[15] D. Burger. Improved access to computers for the visually handicapped: New prospects and principles. IEEE Transactions on Rehabilitation Engineering, 2(3), 1994.
[16] P.A. Carpenter and P. Eisenberg. Mental rotation and the frame of reference in blind and sighted individuals.
Perception and Psychophysics, 23(2), 1978.
[17] C.C. Collins. Tactile television - mechanical and electrical image projection. IEEE Transactions on Man-Machine Systems, MMS-11, 1970.
[18] S. Coren and L.M. Ward. Sensation and Perception (3rd Edition). Harcourt Brace Jovanovich, San Diego, 1989.
[19] T.H. Cormen, C.E. Leiserson, and R.L. Rivest. Introduction to Algorithms. MIT Press, Cambridge, Massachusetts, 1990.
[20] J.C. Craig. Vibrotactile pattern perception: Extraordinary observers. Science, 196, 1977.
[21] J.C. Craig. Some factors affecting tactile pattern perception. International Journal of Neuroscience, 19, 1983.
[22] J.C. Craig and C.E. Sherrick. Dynamic tactile displays. In W. Schiff and E. Foulke, editors, Tactual Perception: A Sourcebook. Cambridge University Press, 1982.
[23] D. Crystal. The Cambridge Encyclopedia of Language. Cambridge University Press, Cambridge, 1987.
[24] F. Deconinck and P. Verschueren. TIDE project 103 GUIB: A model of the understanding of graphical information by blind people. Technical report, Final Report, June 1993.
[25] P.K. Edman. Tactile Graphics. American Foundation for the Blind, New York, 1992.
[26] E. Foulke. Reading braille. In W. Schiff and E. Foulke, editors, Tactual Perception: A Sourcebook. Cambridge University Press, 1982.
[27] J. Fricke and H. Baehring. Design of a tactile graphic I/O tablet and its integration into a personal computer system for blind users. Electronic Proceedings of the 1994 EASI High Resolution Tactile Graphics Conference, Available from http://www.rit.edu/easi/, 1994.
[28] S.F. Frisken-Gibson, P. Bach-y-Rita, W.J. Thompkins, and J.G. Webster. A 64-solenoid, four-level fingertip search display for the blind. IEEE Transactions on Biomedical Engineering, BME-34(12), 1987.
[29] J.P. Fritz and K.E. Barner. Design of a haptic graphic system. Proceedings of the RESNA '96 Annual Conference, 1996.
[30] J.P. Fritz, T.P. Way, and K.E. Barner.
Haptic representation of scientific data for visually impaired or blind persons. In Proceedings of the CSUN Conference on Technology and Disability, 1996.
[31] L.H. Goldish and H.E. Taylor. The optacon: A valuable device for blind persons. New Outlook for the Blind, 68(2), 1974.
[32] D. Griffith. Computer access for persons who are blind or visually impaired: Human factors issues. Human Factors, 32(4), 1990.
[33] C.J. Hasser and J.M. Weisenberger. Preliminary evaluation of a shape-memory alloy tactile feedback display. Advances in Robotics, Mechatronics and Haptic Interfaces, 49, 1993.
[34] R. Hinton. First introduction to tactiles. The British Journal of Visual Impairment, 9(3), 1991.
[35] K. Hirota and M. Hirose. Simulation and presentation of curved surface in virtual reality environment through surface display. In Proceedings - Virtual Reality Annual International Symposium '95, Los Alamitos, California, 1995. IEEE Computer Society Press.
[36] E.D. Hirsch, J.F. Kett, and J. Trefil. The Dictionary of Cultural Literacy. Houghton Mifflin Company, Boston, 1988.
[37] L.T. Hoshmand. Blindisms: Some observations and propositions. Education of the Visually Handicapped, May 1975.
[38] T.S. Huang, G.J. Yang, and G.Y. Tang. A fast two dimensional median filtering algorithm. Proceedings of the IEEE Conference on Pattern Recognition and Image Processing, 1978.
[39] K.O. Johnson and J.R. Phillips. Tactile spatial resolution: Two-point discrimination, gap detection, grating resolution, and letter recognition. Journal of Neurophysiology, 46, 1981.
[40] P. Jubinski. VIRTAC, a virtual tactile computer display. Proceedings of the Johns Hopkins National Search for Computing Applications to Assist Persons with Disabilities, 1992.
[41] J.M. Kennedy. Haptic pictures. In W. Schiff and E. Foulke, editors, Tactual Perception: A Sourcebook. Cambridge University Press, 1982.
[42] R.L. Klatzky. Human Memory: Structures and Processes. W.H. Freeman and Company, New York, 2nd edition, 1980.
[43] R.L.
Klatzky, S.J. Lederman, and V.A. Metzger. Identifying objects by touch: An "expert system". Perception and Psychophysics, 37(4), 1985.
[44] R.L. Klatzky, S.J. Lederman, and C. Reed. There's more to touch than meets the eye: The salience of object attributes for haptics with and without vision. Journal of Experimental Psychology: General, 116, 1987.
[45] K.J. Kokjer. The information capacity of the human fingertip. IEEE Transactions on Systems, Man, and Cybernetics, SMC-17(1), 1987.
[46] S.M. Kosslyn. Image and Brain: The Resolution of the Imagery Debate. MIT Press, Cambridge, Massachusetts, 1994.
[47] L.E. Krueger. The psychophysics of touch. In W. Schiff and E. Foulke, editors, Tactual Perception: A Sourcebook. Cambridge University Press, 1982.
[48] Z. Kuc. A bidirectional vibrotactile communication system: Tactual display design and attainable data rates. VLSI and Computer Peripherals, 1989. COMPEURO '89 - 3rd Annual European Computer Conference.
[49] M. Kurze, L. Reichert, and T. Strothotte. Access to business graphics for blind people. Proceedings of the RESNA 17th Annual Conference, 1994.
[50] R.H. LaMotte and J. Whitehouse. Tactile detection of a dot on a smooth surface: Peripheral neural events. Journal of Neurophysiology, 56(4), 1986.
[51] A. Lev, S.W. Zucker, and A. Rosenfeld. Iterative enhancement of noisy images. IEEE Transactions on Systems, Man and Cybernetics, SMC-7(6), 1976.
[52] C.A. Lindley. Practical Image Processing in C. John Wiley and Sons, Inc., New York, 1991.
[53] J.M. Loomis. On the tangibility of letters and braille. Perception and Psychophysics, 29, 1981.
[54] J.M. Loomis. Tactile pattern perception. Perception, 10, 1981.
[55] J.M. Loomis and S.J. Lederman. Tactual perception. In K.R. Boff, L. Kaufman, and J.P. Thomas, editors, Handbook of Perception and Human Performance. John Wiley and Sons, Inc., 1986.
[56] B. Lowenfeld. Effects of blindness on the cognitive functions of children. In B.
Lowenfeld, editor, Berthold Lowenfeld on Blindness and Blind People. American Foundation for the Blind, New York, 1981.
[57] B. Lowenfeld. The Changing Status of the Blind: From Separation to Integration. Charles C. Thomas, Springfield, Illinois, 1975.
[58] Matsumoto Kosan Co. LTD. Stereo copying system for the blind. Product handbook, 1990.
[59] T. Massie and K. Salisbury. The PHANToM haptic interface: a device for probing virtual objects. In Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems, ASME Winter Annual Meeting, 1994.
[60] B.S. Miller and W.H. Miller. Extinguishing `blindisms': A paradigm for intervention. Education of the Visually Handicapped, Spring 1976.
[61] M. Minsky, M. Ouh-Young, O. Steele, F. Brooks, and M. Behensky. Feeling and seeing: Issues in force display. In Proceedings of the Symposium on 3D Real-Time, 1990.
[62] M.D.R. Minsky. Computational Haptics: The Sandpaper System for Synthesizing Texture for a Force-Feedback Display. PhD thesis, Massachusetts Institute of Technology, June 1995.
[63] A.H. Mitwalli, S.B. Leeb, T. Tanaka, and U. Sinha. Polymer gel actuators - status report. In Proceedings of the 29th Universities Power Engineering Symposium, Galway, Ireland, 1994.
[64] G.J. Monkman. An electrorheological tactile display. Presence, 1(2), 1992.
[65] E.D. Mynatt and G. Weber. Nonvisual presentation of graphical user interfaces: Contrasting two approaches. In Proceedings of the CHI'94 Conference on Human Factors in Computer Systems. ACM, 1994.
[66] V.S. Nalwa. A Guided Tour of Computer Vision. Addison-Wesley Publishing Company, Reading, Massachusetts, 1993.
[67] T.N. Pappas. An adaptive clustering algorithm for image segmentation. IEEE Transactions on Signal Processing, 40(4), 1992.
[68] R. Peiffer. Possible applications of polymer and photopolymer technologies to high resolution tactile graphics.
Electronic Proceedings of the 1994 EASI High Resolution Tactile Graphics Conference, Available from http://www.rit.edu/easi/, 1994.
[69] L. Petrosino and D. Fucci. Temporal resolution of the aging tactile sensory system. Perceptual and Motor Skills, 68, 1989.
[70] K.K. Pingle. Visual perception by a computer. In A. Grasselli, editor, Automatic Interpretation and Classification of Images. Academic Press, 1969.
[71] L.H.D. Poll and R.P. Waterham. Graphical user interfaces and visually disabled users. IEEE Transactions on Rehabilitation Engineering, 3(1), 1995.
[72] W.K. Pratt. Digital Image Processing. John Wiley and Sons, New York, 1991.
[73] Repro-Tronics Inc., Westwood, New Jersey. Setup and Operating Instructions for the Tactile Image Enhancer, 1994. Product specifications.
[74] E. Rich and K. Knight. Artificial Intelligence. McGraw-Hill, Inc., New York, 2nd edition, 1992.
[75] F.S. Roberts. Applied Combinatorics. Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1984.
[76] A. Rosenfeld and L.S. Davis. Image segmentation and image models. Proceedings of the IEEE, 67(5), 1979.
[77] J. Sardegna and T.O. Paul. The Encyclopedia of Blindness and Vision Impairment. Facts On File, New York, 1991.
[78] R.J. Schalkoff. Digital Image Processing and Computer Vision: An Introduction to Theory and Implementations. John Wiley and Sons, New York, 1989.
[79] G.T. Scholl. What does it mean to be blind. In G.T. Scholl, editor, Foundations of Education for the Blind and Visually Handicapped Children and Youth: Theory and Practice. American Foundation for the Blind, 1986.
[80] A.S. Schwartz, A.J. Perey, and A. Azulay. Further analysis of active and passive touch in pattern discrimination. Bulletin of the Psychonomic Society, 6(1), 1975.
[81] R.S. Schwertfeger. Making the GUI talk. Byte Magazine, December 1991.
[82] C.E. Sherrick and J.C. Craig. The psychophysics of touch. In W. Schiff and E. Foulke, editors, Tactual Perception: A Sourcebook. Cambridge University Press, 1982.
[83] Telesensory Systems, Inc., Palo Alto, California. OPTACON Owner's Manual: Model R1D, 1978.
[84] Telesensory Systems, Inc., Palo Alto, California. Optacon Announcement, Available from http://www.telesensory.com, 1996.
[85] J.A. Terry and H. Hsiao. Tactile feedback in a computer mouse. Proceedings of the 14th Northeast Conference on Bioengineering, 1988.
[86] J.P. Thomas. JAWS User's Guide and Reference Manual, Second Edition. Henter-Joyce, Inc., St. Petersburg, FL, 1994.
[87] C.M. Thompson and L. Shure. Image Processing Toolbox: For Use with MATLAB. The Math Works, Inc., Natick, Massachusetts, 1995.
[88] J.H. Todd. Resources, media, and technology. In G.T. Scholl, editor, Foundations of Education for the Blind and Visually Handicapped Children and Youth: Theory and Practice. American Foundation for the Blind, 1986.
[89] J.T. Tou and R.C. Gonzalez. Pattern Recognition Principles. Addison-Wesley Publishing Company, Reading, Massachusetts, 1974.
[90] B. Tversky and D. Baratz. Memory for faces: Are caricatures better than photographs. Memory and Cognition, 13(1), 1985.
[91] G.C. Vanderheiden. Systems 3 - an interface to graphic computers for blind users. Proceedings of the RESNA 13th Annual Conference, 1990.
[92] G.C. Vanderheiden. Dynamic and static strategies for nonvisual presentation of graphic information. Electronic Proceedings of the 1994 EASI High Resolution Tactile Graphics Conference, Available from http://www.rit.edu/easi/, 1994.
[93] M.E. Ward. The visual system. In G.T. Scholl, editor, Foundations of Education for the Blind and Visually Handicapped Children and Youth: Theory and Practice. American Foundation for the Blind, 1986.
[94] T. Way and K. Barner. Towards automatic generation of tactile graphics. Proceedings of the RESNA '96 Annual Conference, 1996.
[95] T.P. Way and K.E. Barner. Automatic visual to tactile translation, part I: Human factors, access methods and image manipulation.
IEEE Transactions on Rehabilitation Engineering, 5:81-94, March 1997.
[96] T.P. Way and K.E. Barner. Automatic visual to tactile translation, part II: Evaluation of the tactile image creation system. IEEE Transactions on Rehabilitation Engineering, 5:95-105, March 1997.
[97] S. Weinstein. Intensive and extensive aspects of tactile sensitivity as a function of body part, sex, and laterality. In D.R. Kenshalo, editor, The Skin Senses. Charles C. Thomas, Springfield, IL, 1968.
[98] B.W. White, F.A. Saunders, L. Scadden, P. Bach-y-Rita, and C.C. Collins. Seeing with skin. Perception and Psychophysics, 7, 1970.
[99] S.F. Wiker, G. Vanderheiden, S. Lee, and S. Arndt. Development of tactile mice for blind access to computers: Importance of stimulation locus, object size and vibrotactile display resolution. Proceedings of the Human Factors Society 35th Annual Meeting, 1991.
[100] D.H. Willis. Relationship between visual acuity, reading mode, and school systems for blind children: A 1979 replication. American Printing House, 1979.

Appendix A LISTING OF IMAGES

These images are available for experimental purposes via the World Wide Web at http://www.asel.udel.edu/sem/research/tactile/appendix.html.

A.1 Pilot Study Images

1. Close-up of President Bill Clinton
2. Close-up of a researcher (Tom Way)
3. Close-up of Albert Einstein
4. Hot air balloon
5. Chimney end of a house
6. Notebook computer
7. Diagram of a human heart
8. Space shuttle launch

A.2 TACTICS Evaluation Images

1. Desktop computer
2. Desktop computer (another angle)
3. Notebook computer
4. Astronaut taking soil sample
5. Astronaut planting flag pole
6. Space shuttle landing (left to right)
7. Space shuttle landing (right to left)
8. Double-layer plume nuclear mushroom cloud
9. Single-layer plume nuclear mushroom cloud
10. Micrograph of the eyeball of Drosophiliaeye (house fly)
11. Electron micrograph of a Streptococcus bacteria (96,000x)
12. Planet Saturn
13. Planet Jupiter
14. Moon
15. Chocolate chip cookie
16. Close-up of President Ronald Reagan
17. Close-up of President Bill Clinton
18. Close-up of a researcher (Tom Way)
19. Close-up of Albert Einstein
20. Hot air balloon
21. Two-shot of Beavis and Butthead
22. Two-shot of Bill Clinton and Al Gore
23. Chinese student blocking tanks in Tiananmen Square
24. Chinese student blocking tanks in Tiananmen Square (another angle)
25. Golden Gate Bridge in San Francisco
26. Twin Towers in New York City
27. Tornado funnel cloud in Oklahoma
28. Electron micrograph of a cell shedding HIV particles
29. Electron micrograph of a Pinosyllis Heterocirrata worm
30. Electron micrograph of the Ebola virus

Appendix B SIMPLE AND TIMED DISCRIMINATION IMAGE PAIRINGS

B.1 Preparation

Note that each pair was processed four different ways, using the four image processes under investigation.

B.2 Flexi-Paper Pairs

1. Desktop computer & Notebook computer
2. Double-layer plume nuclear mushroom cloud & Single-layer plume nuclear mushroom cloud
3. Micrograph of the eyeball of Drosophiliaeye (house fly) & Moon
4. Close-up of President Bill Clinton & Close-up of President Bill Clinton
5. Two-shot of Bill Clinton and Al Gore & Two-shot of Bill Clinton and Al Gore
6. Chinese student blocking tanks in Tiananmen Square & Chinese student blocking tanks in Tiananmen Square (another angle)
7. Twin Towers in New York City & Twin Towers in New York City
8. Tornado funnel cloud in Oklahoma & Tornado funnel cloud in Oklahoma
9. Electron micrograph of a cell shedding HIV particles & Electron micrograph of a cell shedding HIV particles
10. Electron micrograph of a Pinosyllis Heterocirrata worm & Electron micrograph of the Ebola virus

B.3 Matsumoto Kosan Paper Pairs

1. Desktop computer (another angle) & Desktop computer (another angle)
2. Astronaut taking soil sample & Astronaut planting flag pole
3. Space shuttle landing (left to right) & Space shuttle landing (left to right)
4. Electron micrograph of a Streptococcus bacteria (96,000x) & Electron micrograph of a Streptococcus bacteria (96,000x)
5. Planet Saturn & Planet Saturn
6. Close-up of President Ronald Reagan & Hot air balloon
7. Moon & Chocolate chip cookie
8. Close-up of Albert Einstein & Close-up of Albert Einstein
9. Two-shot of Beavis and Butthead & Two-shot of Bill Clinton and Al Gore
10. Golden Gate Bridge in San Francisco & Twin Towers in New York City

Appendix C IDENTIFICATION EXPERIMENT IMAGES AND CATEGORIES

C.1 Preparation

Note that each image used was processed four different ways, using the four image processes under investigation. The four categories associated with each image were arbitrarily arranged for each of the four applied processes.

C.2 Listing of Images and Categories

1. Desktop computer
   A. an office building  B. a notebook computer  C. a desktop computer  D. a trampoline
2. Notebook computer
   A. a painting hanging on a wall  B. an open cardboard box  C. a notebook computer  D. a desktop computer
3. Micrograph of the eyeball of Drosophiliaeye (house fly)
   A. the Moon  B. an oatmeal raisin cookie  C. an eyeball of a fly  D. a helicopter in flight
4. Electron micrograph of a Streptococcus bacteria (96,000x)
   A. the face of Albert Einstein  B. a magnified Streptococcus bacteria  C. the end of a stethoscope  D. a punching bag
5. Planet Saturn
   A. a Frisbee  B. a helicopter in flight  C. the planet Jupiter  D. the planet Saturn
6. Planet Jupiter
   A. the planet Saturn  B. the planet Jupiter  C. a Frisbee  D. the face of President Clinton
7. Chocolate chip cookie
   A. a baseball  B. a chocolate chip cookie  C. the Moon  D. the face of President Clinton
8. Close-up of President Ronald Reagan
   A. the face of former president Ronald Reagan  B. a hot air balloon  C. a magnified Streptococcus bacteria  D. the Moon
9. Hot air balloon
   A. a hot air balloon  B. a punching bag  C. the planet Saturn  D. an oatmeal cookie
10. Golden Gate Bridge in San Francisco
   A. the Twin Towers and the New York City skyline  B. the twin spans of the Golden Gate Bridge  C. a helicopter in flight  D. a picket fence

Appendix D COMPREHENSION EXPERIMENT IMAGES, DESCRIPTIONS AND QUESTIONS

D.1 Preparation

Note that each image was processed solely using the aggregate image process.

D.2 Listing of Images, Descriptions and Questions

1. Desktop computer: This is a personal computer.
   1. This computer is a: A. desktop computer B. notebook computer
   2. This computer is: A. on B. off
   3. This computer has a mouse that is visible. A. true B. false
   4. Locate the keyboard. A. (successful) B. (not successful)
2. Notebook computer: This is a personal computer.
   1. This computer is a: A. desktop computer B. notebook computer
   2. This computer is: A. on B. off
   3. This computer has a mouse that is visible. A. true B. false
   4. Locate the keyboard. A. (successful) B. (not successful)
3. Astronaut planting flag pole: This is an astronaut dressed in a spacesuit, working on the surface of the Moon.
   1. The astronaut is: A. using a short pole to collect a lunar soil sample B. placing a flag atop a flagpole on the Moon's surface
   2. The astronaut is: A. standing still B. moving
   3. The astronaut is facing to the: A. left B. right
   4. Locate the astronaut's feet. A. (successful) B. (not successful)
4. Space shuttle landing (right to left): This is the Space Shuttle Endeavour landing in the California desert at Edwards Air Force Base.
   1. The shuttle is headed to the: A. left B. right
   2. The landing gear have already touched the ground. A. true B. false
   3. An Air Force fighter jet escort is plainly present in the scene. A. true B. false
   4. Locate the tail fin of the Space Shuttle. A. (successful) B. (not successful)
5. Double-layer plume nuclear mushroom cloud: This is a nuclear explosion, complete with mushroom cloud.
   1. How many layers of plumes are there on top of the cloud? A. one B. two
   2. Clouds of dust have started to rise around the base of the explosion. A. true B. false
   3. Locate the very top of the mushroom cloud. A. (successful) B. (not successful)
   4. Locate ground zero, the likely spot where the actual bomb exploded. A. (successful) B. (not successful)
6. Chocolate chip cookie: This is a homemade cookie.
   1. This cookie is a: A. chocolate chip cookie B. sugar cookie
   2. Somebody has already taken a large bite out of this cookie. A. true B. false
   3. How many chocolate chips are there? A. 6 or fewer B. more than 6
   4. Some chips are small, others are large. Locate a large chocolate chip. A. (successful) B. (not successful)
7. Close-up of President Bill Clinton: This is President Bill Clinton.
   1. This picture shows the President: A. from the waist up B. from the neck up
   2. The President is wearing a brimmed hat. A. true B. false
   3. Locate the President's mouth. A. (successful) B. (not successful)
   4. Locate the President's eyes. A. (successful) B. (not successful)
8. Two-shot of Beavis and Butthead: This is a picture from MTV's cartoon show "Beavis and Butthead," with the two stars of the show sitting on a couch. Butthead is on the left and Beavis is on the right.
   1. The one on the left, Butthead, is facing to the: A. left B. front
   2. The one on the right, Beavis, is facing to the: A. left B. right
   3. Which one has more hair? A. Butthead, on the left B. Beavis, on the right
   4. One of the two has dark hair, the other has light hair. Locate the dark hair. A. (successful) B. (not successful)
9. Tornado funnel cloud in Oklahoma: This is an active tornado funnel cloud photographed recently in Oklahoma.
   1. The tornado has already touched the ground. A. true B. false
   2. There are buildings in the path of the tornado. A. true B. false
   3. Locate the point where the tornado funnel merges with the general cloud cover in the scene. A. (successful) B. (not successful)
   4. Locate the point of the tornado that is closest to, or touching, the ground. A. (successful) B. (not successful)
10. Electron micrograph of the Ebola virus: This is a highly magnified electron microscope picture of the deadly Ebola virus.
   1. The overall shape of the virus is: A. straight B. curved
   2. The ends of the Ebola virus are identical. A. true B. false
   3. The head end of the virus has 3 loops, while the tail end is a single strand. Locate the head end. A. (successful) B. (not successful)
   4. Locate the tail end. A. (successful) B. (not successful)

Appendix E COLLECTED TACTICS PARAMETERS

Table E.1: Summary of parameters relevant to TACTICS and tactile image perception.

Factor | Parameters
Ratio of tactual to visual bandwidths | 1:10000
Minimum discernible separation of two points (static) | 2.5 mm
Minimum discernible displacement of a point on a smooth surface | 0.002 mm
Height of braille dot | 0.2-0.5 mm
Minimum discernible separation of grooves in grating (dynamic) | 1.0 mm
Resolution of laser printer | 11.8-23.6 dots/mm (300-600 dpi)
Resolution of microcapsule paper (expanded) | 1-5 capsules/mm
Expanded displacement of microcapsule paper | 0.2-1.0 mm
Resolution of human fingertip | 1 dot/mm
Resolution of fingertip compares with | very blurry vision
Human memory organization | hierarchical: general to specific
Congenital blindness | onset up to age 5
Adventitious blindness | onset after age 5
Blind population (worldwide) | 30-40 million
Blind population (U.S.) | 500,000
Braille fluency (U.S. blind population) | <16%
Best size for tactile image | 3-5 in on a side

Appendix F HUMAN SUBJECTS REVIEW BOARD EXEMPTION

Appendix G TACTILE IMAGE EXAMPLES

Figure G.1: Electron micrograph of Ebola Zaire virus before and after processing with TACTICS. (CDC)
Figure G.2: Figure G.1 expanded on microcapsule paper.
Figure G.3: Image of space shuttle Challenger landing before and after processing with TACTICS. (NASA)
Figure G.4: Figure G.3 expanded on microcapsule paper.
Figure G.5: Image of moon before and after processing with TACTICS. (NASA)
Figure G.6: Figure G.5 expanded on microcapsule paper.
Figure G.7: Image of a face before and after processing with TACTICS. (US Govt)
Figure G.8: Figure G.7 expanded on microcapsule paper.
Figure G.9: Image of a desktop computer before and after processing with TACTICS. (public domain)
Figure G.10: Figure G.9 expanded on microcapsule paper.
Figure G.11: Image of a tornado in Oklahoma before and after processing with TACTICS. (public domain)
Figure G.12: Figure G.11 expanded on microcapsule paper.
Figure G.13: Image of Emma before and after processing with TACTICS. (personal)
Figure G.14: Figure G.13 expanded on microcapsule paper.