Shorthand Alphabets and Pen Computing 1/1
SHORTHAND ALPHABETS AND PEN COMPUTING

C.C. TAPPERT
CSIS, Pace University, 861 Bedford Road, Pleasantville NY 10570
E-mail: [email protected]

This paper reviews the history of shorthand alphabets, focusing on the Roman alphabet, and analyzes the shorthand alphabets developed for machine recognition in pen computing. We begin with a description of a sequence of historical shorthand alphabets, then discuss handwriting recognition, both OCR and online recognition, and finally analyze the use of shorthand alphabets in pen computing. We find that the Graffiti recognizer in the popular Palm OS PDA devices, notably the Palm Pilot and Handspring models, cleverly combines known technology to produce a highly useful system. It uses a shorthand alphabet of easy-to-learn symbols that must be written in a prescribed way and has separate writing areas for alphabetic and numeric input.

1 Introduction

We review the history of shorthand alphabets, focusing on the Roman alphabet and not covering numerics or punctuation, and describe three early alphabets that we later relate to online alphabet systems. We then discuss handwriting recognition, both OCR and online recognition, but focus on the latter and especially on the online recognition difficulties and solution methodologies. Finally, we analyze the use of shorthand alphabets developed for machine recognition in pen computing.

2 History of Shorthand Alphabets

Shorthand is “a method of writing rapidly by substituting characters, abbreviations, or symbols for letters, words, or phrases” and can be traced back to the Greeks [11]. We focus on orthographic shorthands that use conventional spelling and have one symbol for each letter of the Roman alphabet. Here, we present and describe three documented shorthand alphabets that are relevant to this study: the Tironian, Stenographie, and Moon alphabets.

Marcus Tullius Tiro, Cicero’s secretary, devised the first extensive Latin shorthand system to record speeches in the Roman Senate.
Many Romans, including Julius Caesar, favored shorthand and this system remained in use for over a thousand years. The Tironian alphabet [11] (63 BC, Fig. 1) consists of 22 symbols (no J, U, W, or Y) and 10 of them resemble in some respect the corresponding current Roman alphabet symbols: C and I completely; K, V, and Z closely; A, B, and T partially (in one stroke); and D (lowercase delta) and H are somewhat like their lowercase counterparts.

Figure 1. Tironian Alphabet, 63 BC

During the Middle Ages shorthand became associated with witchcraft and fell into disrepute, and it was not until the late 12th century that King Henry II revived the use of Tironian shorthand. It is interesting that many famous writers throughout history preferred shorthand – Cicero’s orations, Martin Luther’s sermons, and Shakespeare’s and George Bernard Shaw’s plays were all written in a style of shorthand.

The next alphabet presented is particularly interesting because it uses a systematic graphical design approach of eight basic shapes in different orientations to denote the whole alphabet: Stenographie [4] (1602, Fig. 2).

Figure 2. Stenographie Alphabet, 1602

Note that the letters C and K are represented by the same symbol and only a few of the symbols resemble their corresponding current Roman alphabet symbols. In contrast to cursive shorthand, this is an example of geometric shorthand, which is based on geometrical figures such as circles, ovals, straight lines and combinations of these [5]. The basic shapes (Fig. 3) are I, L, U, (, α, V, Z and +.

Figure 3. Stenographie Alphabet Basic Shapes and Orientations

Several shorthand systems arose in the 19th century, including the effective Pitman (1837) and Gregg (1885) phonetic alphabets, the former using stroke thickness to differentiate some letters, and the Braille (1824) system for the blind.
Several cursive shorthands such as the 1834 Gabelsberger system [5] also originated during this time. None of these are covered here.

The next alphabet presented is Moon Type [7] (1894, Fig. 4), named for its English inventor William Moon. It is a system of embossed letters for the blind that was designed to require less finger sensitivity than Braille, targeting those blinded late in life. It consists of eight basic shapes derived from the Roman capital letters and used in varying orientations to denote the whole alphabet – the basic shapes (arranged by column in the figure) are V, J, C, L, I, Z, an angle shape, and O. Over half of these symbols resemble in some respect their corresponding current Roman alphabet symbols – eight completely and perhaps seven partially.

Figure 4. Moon Alphabet, 1894

3 Offline (OCR) Handwriting Recognition

Offline handwriting recognition, a subset of optical character recognition (OCR), uses optical scanning equipment to convert handwriting images into bit patterns for machine recognition. Although most OCR work has been on machine-printed characters, considerable effort has also been devoted to handwriting. A handwritten document can be scanned at any time after it is written, even many years later. No special equipment is necessary during the writing process, in contrast to online recognition. However, the information captured is static and does not contain any of the handwriting dynamics.

Recognizing handprinted characters is considerably more difficult than recognizing machine-printed characters due to the great variation among writers in drawing the characters. In order to reduce these variations and increase recognition accuracy, the OCR Committee of the American National Standards Institute (ANSI) developed a standard in 1974 [15] (Fig. 5). The extra strokes on the O and S were introduced to reduce ambiguity. Other countries have also developed standards.
Unfortunately, however, most writers of the documents processed by these systems do not use these standards.

Figure 5. ANSI Standard Alphabet for OCR

4 Online Handwriting Recognition

4.1 What is it?

Online or real-time handwriting recognition means that the machine recognizes the writing while the user writes. In contrast to offline devices, online devices capture the temporal or dynamic information of the writing. This information consists of the number of strokes, the order of the strokes, the direction of the writing of each stroke, and the speed of writing within each stroke. A stroke is the writing from pen down to pen up. In English, uppercase handprint averages about two strokes per letter and lowercase about one stroke, while cursive averages less than one stroke per letter.

This dynamic information can be helpful – for example, in distinguishing the statically similar two-stroke 5 from the one-stroke S. It can also complicate recognition, because the system must handle many variations of a character, some of which, like the one-stroke 5, may be similar to other characters. As an example of the large number of possible variations, the letter E can be written with one (in cursive fashion), two, three, or four strokes, and with various stroke orders and directions, and many variations appear the same when completed. The four-stroke E consisting of a vertical and three horizontal strokes (or, for that matter, any four-stroke character) has 384 variations (24 different stroke orders, each with two possible directions for each of the four strokes, or 24 × 16).

Unlike offline recognition, special equipment is required during the writing process. Tablet digitizers (electronic tablets), available since the 1950s, allow the capture of handwriting and drawing by accurately recording the x-y coordinate data of pen-tip movement.
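
The number of ways a multi-stroke character can be drawn, discussed earlier in this section, can be checked by enumeration. The following is a minimal sketch that assumes each stroke's direction is chosen independently of the others, so a four-stroke character has 24 stroke orders times 16 direction combinations, or 384 variations. The function name `stroke_variations` and the labels "forward" and "reverse" are illustrative only and do not come from any of the cited systems.

```python
from itertools import permutations, product


def stroke_variations(num_strokes):
    """List every (stroke order, per-stroke directions) pair.

    Assumes each stroke can be drawn in either of two directions,
    independently of the other strokes.
    """
    orders = list(permutations(range(num_strokes)))
    directions = list(product(("forward", "reverse"), repeat=num_strokes))
    return [(order, dirs) for order in orders for dirs in directions]


if __name__ == "__main__":
    for n in range(1, 5):
        # 1 stroke: 2, 2 strokes: 8, 3 strokes: 48, 4 strokes: 384
        print(n, len(stroke_variations(n)))
```

The count grows as n! × 2^n, which is one reason many recognizers either prescribe stroke order and direction or normalize them before matching.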
The advent of pen computers in the 1980s combined digitizers and flat displays, bringing input and output onto the same surface. This provides immediate electronic ink feedback of the digitized writing and mimics the familiar pen and paper paradigm to provide a "paper-like" interface [12, 14, 16]. With a pen computer, or a pen-enabled computer, users can not only use the pen (writing stylus) as a mouse but also write or draw as they do with pen and paper. Keyboard entry can be mimicked by touching sequences of buttons on a "soft" keyboard displayed on the screen or, alternatively, handwriting can be automatically converted to ASCII code.

Researchers have worked on many pattern recognition problems for handwriting and drawing on tablet digitizers. The more common problems for English include the recognition of handprinted (discrete) characters and the recognition of cursive writing. Available handwriting recognition products are highly accurate on careful handprinting, and some products recognize cursive script with accuracy dependent on the writing style and the regularity and clarity of the writing. In this paper we deal only with handprinting.

4.2 History of Recognition Strategies

There are many tradeoffs in designing a handwriting recognition system. For example, at one extreme, the designer puts no constraints on the user and attempts to recognize the user's normal writing. At the other extreme, the writing is severely constrained, requiring the writer to write strokes in a particular order, direction, and graphical specification.

Because computers and tablet digitizers were rather primitive in the late 1950s and 1960s, much of the early work employed simple character segmentation and recognition techniques and required the user to write the characters in a prescribed manner with respect to their stroke number, order, and direction.
Later systems tended to be more flexible in that they attempted to handle most of the ways that the alphabet characters are usually written. Some systems allowed users to train the system on their particular way of drawing each character. Many of these were actually hybrid systems, allowing a user to train the system, but also recognizing the common variations in the untrained or walk-up mode.

4.3 Handprint Recognition Difficulties and Solution Methodologies

4.3.1 Segmentation

One difficulty of handprint recognition is character segmentation (separation). Although this problem is extreme in cursive writing where several characters can be made with one stroke, it remains a significant problem with handprinted characters because they can consist of one or more strokes and it is often not clear which strokes should be grouped together. Segmentation ambiguities include the well-known character within character problem where, for example, a handprinted lowercase d might be recognized as a cl if drawn with two strokes that are somewhat apart from one another. This problem is alleviated somewhat, however, by the fact that character separation can occur only at stroke ends – not at the end of every stroke but at the end of those that end a character.

Most segmentation methods are applied prior to recognition, either under writer control or during preprocessing (machine processing prior to recognition). The writer-controlled techniques include an explicit signal from the user, temporal separation usually characterized by a time-out signal, writing characters in predefined boxes, and the use of a pen lift either alone or in combination with a time-out [16]. The preprocessing techniques typically involve spatial separation, which actually is also under writer control.

Other methods involve recognition. The stroke code method [3, 8, 16] performs stroke-by-stroke recognition prior to segmentation.
That is, each stroke is recognized immediately upon pen lift. Then, in the general case of multiple-stroke characters, the resulting stroke codes are combined to obtain the recognized characters.

Another method performs character segmentation in conjunction with recognition. In this method, each stroke end (pen lift) is a potential segmentation point, and all combinations of strokes, up to the maximum number of strokes per character, are sent to the recognition component and the results are sorted to obtain a final output [13, 16]. This run-on method can recognize characters that run together – those that touch or overlap or even those written on top of one another.

Perhaps the simplest segmentation method is to use an alphabet of only single-stroke characters. Burr [2] uses a pen lift to separate characters by instructing his writers to write each character with a single stroke. This method can also be viewed as the trivial case of either the stroke code method where each stroke code is a character or the run-on method where only single strokes are sent to the recognizer.

4.3.2 Uppercase versus Lowercase versus Digits

What makes handwritten communication possible is that differences between different characters are more significant than differences between different drawings of the same character, and this might be considered the fundamental property of writing. This property holds within the subalphabets of uppercase, lowercase, and digits, but not across them. Fig. 6 shows an example of the uppercase I, lowercase l, and number 1 all drawn identically with a single vertical stroke, and the O and 0 drawn identically with an oval.

Figure 6. Different Characters with the Same Shape

The most general solution to this problem is to handle it the way humans do by using context. This is usually done in a postprocessing (after recognition) phase that uses syntax and possibly semantics to resolve ambiguities.
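
As a rough illustration of context-based postprocessing, the sketch below resolves an ambiguous shape by looking at the character class (uppercase, lowercase, or digit) of its immediate neighbors. This is a deliberately simple heuristic assumed for illustration, not the method of any cited system; the shape names `vertical_bar` and `oval`, the `AMBIGUOUS` table, and the function `resolve` are all hypothetical.

```python
# Shapes that look the same in more than one subalphabet (assumed table).
AMBIGUOUS = {
    "vertical_bar": {"upper": "I", "lower": "l", "digit": "1"},
    "oval": {"upper": "O", "digit": "0"},
}


def char_class(ch):
    """Return the subalphabet of an already-recognized character."""
    if ch.isdigit():
        return "digit"
    if ch.isupper():
        return "upper"
    if ch.islower():
        return "lower"
    return None


def resolve(tokens):
    """Replace ambiguous shape names using neighboring character classes.

    Each token is either a recognized character or a key of AMBIGUOUS.
    The left neighbor is consulted first, then the right neighbor.
    """
    out = []
    for i, tok in enumerate(tokens):
        if tok not in AMBIGUOUS:
            out.append(tok)
            continue
        candidates = AMBIGUOUS[tok]
        neighbor_classes = [
            char_class(tokens[j])
            for j in (i - 1, i + 1)
            if 0 <= j < len(tokens) and tokens[j] not in AMBIGUOUS
        ]
        choice = next(
            (candidates[c] for c in neighbor_classes if c in candidates), None
        )
        if choice is None:
            # No usable context: fall back to the first listed candidate.
            choice = next(iter(candidates.values()))
        out.append(choice)
    return "".join(out)


if __name__ == "__main__":
    print(resolve(["2", "oval", "vertical_bar", "9"]))  # 2019
    print(resolve(["F", "vertical_bar", "L", "E"]))     # FILE
    print(resolve(["h", "e", "vertical_bar", "p"]))     # help
```

A practical system would draw on a dictionary or grammar rather than immediate neighbors alone, but the example shows why the same vertical stroke can be read correctly as I, l, or 1 once its surroundings are known.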
Simpler solutions are to use different regions of the writing surface for different subalphabets or to use special symbols to put the system into uppercase, lowercase, or digit mode [16].

4.3.3 Dynamic Writing Variation

Dynamic writing variation refers to the various ways – in stroke number, order, and direction – that characters can be drawn. In some countries, like Japan, there is more homogeneity in writing dynamics because children are taught to write each character in a specific way. In other countries, like America, this writing variation is greater. As we saw earlier with the various ways of writing E, dynamic variation can be a serious problem, and it becomes even more complex when combined with the segmentation problem.

The simplest way to handle wide dynamic variation is not to allow it. Thus, many systems, especially the early ones, prescribe the limited ways each character can be written and the writer must conform to these variations. When allowing many variations, one solution is to reorder the strokes into a normalized form, for example from left to right and top to bottom, and writing direction could be handled similarly. For example, a four-stroke E would be reordered into the vertical stroke followed by the three horizontal ones from top to bottom. Another method is to incorporate the more common variations and either inform the users as to what they are or simply let them adapt to the system by trial and error. Other techniques, like stroke codes or single-stroke alphabets mentioned earlier, can also be used to solve this problem.

4.3.4 Sloppiness of the Writing

Perhaps the most difficult problem is sloppy writing, and size and slant variation can also be included here.
The easiest solution, and that used in many systems, is to use an alphabet in which the symbols are quite different from each other, and that is the reason many of the shorthand alphabets are based on a small number of shapes in different orientations. The Roman alphabet, unfortunately, does not have this property since many letter pairs, such as the U and V, are similar. Solving the more general problem involves sophisticated recognition algorithms like dynamic programming, Markov models, and affine transformations [12, 14, 16].

4.4 Future

The ideal recognition system is one that supports handwritten input of Unicode characters and does not impose any constraints on what and where the user writes. The user should also be able to intermingle text with graphics and gestures, and the recognizer should be able to distinguish these kinds of input. Even though recent advances in handwriting recognition are promising, further research is warranted since the above goals have yet to be achieved.

Similar to the ANSI alphabet standard of constrained fonts for OCR handwriting recognition, standards for online drawing of characters have been suggested as appropriate for certain applications due to the higher recognition rate that could be achieved [16].

5 Shorthand Alphabets in Pen Computing

The use of shorthand in handwriting recognition is well established. Some of the earliest instances were in the field of CAD/CAM applications where symbols were used to represent various graphical items and commands. Later, shorthand was used to represent scientific symbols and notations, and Pitman shorthand was also implemented. Other systems used special alphabets and symbols for online character recognition and we present and discuss several of these in this section.

Any of the three historical alphabets presented above could be used for machine recognition and all of the symbols of those alphabets are drawn with a single stroke except for the K in Tironian and the + in Stenographie.
In addition to shape and orientation, online systems can also use stroke direction to differentiate among symbols. We present and discuss four alphabet systems: Organek, Allen, Goldberg, and Graffiti. Stroke direction of online symbols is usually indicated either with a dot at the start of the stroke or with an arrow at the end.

The Organek system uses straight lines in eight different directions and three different lengths [9] (Fig. 7). Each of these alphabet symbols is drawn with a single stroke (starting at the origin in the figure) and sequences of the symbols can be connected to form words. This alphabet has only one basic shape, a straight line, in four orientations (the same orientations used in other geometric shorthands), three lengths, and two stroke directions (4 × 3 × 2 = 24 symbols) to represent the first 24 letters of the alphabet. The y and z are represented in a second wheel along with numerals and other characters. These symbols do not resemble those of the Roman alphabet.

Figure 7. Organek Alphabet

The Allen alphabet [1] (Fig. 8) consists of four basic shapes in various orientations and stroke directions to denote the whole alphabet, and each of these alphabet symbols is drawn with a single stroke.

Figure 8. Allen Alphabet

The four basic stroke shapes (Fig. 9) are a straight line and three two-line-segment strokes with angle changes of 45, 90, and 135 degrees. The straight stroke has four orientations and two possible writing directions for a total of eight possibilities. Each of the angle change strokes has eight orientations but the writing direction is not used as a differentiator – that is, both writing directions are used to represent the same letter. Therefore, a total of 32 symbols (8 + 3 × 8) can be represented with this alphabet.

Figure 9. Allen Alphabet Basic Shapes and Orientations

Although there can be little similarity of these symbols to their Roman counterparts, given the symbol constraints, there is some attempt in that direction: the I matches exactly and the F and T match partially (the same as in Graffiti, see below), as does A (but not as closely as in Graffiti). The vowels, except for A, are made from straight strokes, presumably for speed of input since vowels are high-frequency letters.

The Goldberg alphabet [6] (Fig. 10) is designed for accurate machine recognition and speed of entry.

Figure 10. Goldberg Alphabet

Goldberg’s alphabet symbols come from five basic shapes (I, L, tight U, Z, and α), each rotated in four different orientations (Fig. 11), and each capable of being drawn with two stroke directions. Thus, with the five shapes, four orientations, and two stroke directions, up to 40 different output symbols (5 × 4 × 2) can be represented.

Figure 11. Goldberg Alphabet Basic Shapes and Orientations

The symbols are graphically well separated from each other for ease of machine recognition and, within the design constraints, several of the alphabet symbols are similar to their Roman, mostly lowercase, counterparts. While the specially drawn symbols of this alphabet, and to a lesser extent the two preceding alphabets, are easy to recognize automatically, the disadvantage is that one must remember the unique way to draw the symbols and consistently draw each symbol accurately.

The basic shapes are simple so they can be drawn quickly, and to optimize for writing speed the alphabet assigns the simplest shapes to the most frequently used letters. For example, straight strokes are used for the common letters a, e, i, r, and t. The symbols of this alphabet are single-stroke lexical symbols and, as with the Burr system [2], the single stroke per symbol recognition method avoids the common recognition difficulties.

The Graffiti alphabet [10] (Fig. 12) is used in the popular Palm OS devices, notably the Palm Pilot and Handspring models. Unlike the preceding alphabets, it is not composed of a small number of basic shapes but rather has a high correspondence with the Roman alphabet – twenty letters match exactly (19 with uppercase and one with lowercase Roman alphabet symbols) and six match partially. Of the partial matches, the symbol for A (a typical first stroke of the Roman A) is that of the Tironian, Stenographie, and Moon alphabets; the F (again the first stroke of the Roman F) and K (basically the second stroke of the Roman K) are close to those of the Moon alphabet; T is from Tironian; and Y is a one-stroke way to draw the Roman Y. This high correspondence with the Roman alphabet makes it easier to learn than the basic-shape alphabets described above, but at the sacrifice of graphical separability and speed of entry.

Figure 12. Graffiti Alphabet

This alphabet has one symbol that requires two strokes, the X (although there is an acceptable one-stroke variant), and the extended alphabet of this system has other multiple-stroke characters. This system uses the stroke code recognition method and separate writing areas for the alphabetic and numeric symbols to avoid the common recognition difficulties.

6 Conclusions

Shorthand alphabets are frequently designed for use in pen computing because their symbols are drawn quickly with few strokes and can essentially avoid or at least simplify the common recognition difficulties. The Graffiti system uses symbols similar to the Roman alphabet for ease of learning. The few symbols that do not correspond exactly to their Roman counterparts have one stroke in common or other similarities, and several appear to have evolved from the historical alphabets. This system uses the stroke code recognition method and separate writing areas for subalphabets to simplify the recognition process.
We conclude that the designers have cleverly combined known techniques to create a highly useful system. In spite of the clever design of Graffiti, we hear that most Palm users prefer the “soft” keyboard to handwriting input, indicating that the devices are popular more for their form factor and other attributes than for their handwriting capability. Finally, we note that the general unconstrained handprint recognition problem is as yet unsolved and will likely require advanced syntactic and semantic processing capability.

7 Acknowledgements

I thank my colleagues of the graphonomics and handwriting recognition communities for interesting and educational interactions over the years I have worked in this area.

References

1. Allen, G., “Data input grid for computer,” U.S. Patent 5,214,428, 1993.
2. Burr, D.J., “Designing a handwriting reader,” IEEE Transactions Pattern Analysis Machine Intelligence, Vol. PAMI-5, 1983, pp. 554-559.
3. Crane, H.D., “Process and apparatus involving pattern recognition,” U.S. Patent 4,718,102, 1988.
4. Daniels P.T. and Bright W. (eds.), The World’s Writing Systems, Oxford Press, 1996.
5. Glatte, H., Shorthand Systems of the World, Philosophical Library, 1959.
6. Goldberg, D., “Unistrokes for computerized interpretation of handwriting,” U.S. Patent 5,596,656, 1997.
7. Gove, P.B. (ed.), Webster’s Third New International Dictionary, 1986.
8. Loh, S-C, “On-line handwritten character recognition apparatus with non-ambiguity algorithm,” U.S. Patent 5,034,989, 1991.
9. Organek Technology Ad, “penput – Beyond character recognition!” 1991.
10. Palm Computing, “Palm Pilot: Graffiti Reference Card.”
11. Panati, C., “The Browser's Book of Beginnings,” Houghton Mifflin, 1984.
12. Schomaker L., “From handwriting analysis to pen-computer applications,” Electronics and Communication Eng. J., 1998, pp. 93-102.
13. Sklarew, R., “Handwritten keyboardless entry computer system,” U.S. Patent 4,972,496, 1990.
14. Subrahmonia J. and Zimmerman T., “Pen computing: challenges and applications,” Proc. 15th Intl. Conf. Pattern Recognition, Vol. 2, 2000, pp. 60-66.
15. Suen, C.Y., “Automatic recognition of handprinted characters – The state of the art,” Proc. IEEE, Vol. 68, No. 4, 1980, pp. 469-487.
16. Tappert C.C., Suen C.Y., and Wakahara T., "The state-of-the-art in on-line handwriting recognition," IEEE Transactions Pattern Analysis Machine Intelligence, Vol. PAMI-12, 1990, pp. 787-808.