Shorthand Alphabets and Pen Computing 1/1

SHORTHAND ALPHABETS AND PEN COMPUTING
C.C. TAPPERT
CSIS, Pace University, 861 Bedford Road, Pleasantville NY 10570
E-mail: [email protected]
This paper reviews the history of shorthand alphabets, focusing on the Roman alphabet, and
analyzes the shorthand alphabets developed for machine recognition in pen computing. We
begin with a description of a sequence of historical shorthand alphabets, then discuss
handwriting recognition, both OCR and online recognition, and finally analyze the use of
shorthand alphabets in pen computing. We find that the Graffiti recognizer in the popular
Palm OS PDA devices, notably the Palm Pilot and Handspring models, cleverly combines
known technology to produce a highly useful system. It uses a shorthand alphabet of
easy-to-learn symbols that must be written in a prescribed way and has separate writing
areas for alphabetic and numeric input.
1 Introduction
We review the history of shorthand alphabets, focusing on the Roman alphabet
and not covering numerics or punctuation, and describe three early alphabets that we
later relate to online alphabet systems. We then discuss handwriting recognition,
both OCR and online recognition, but focus on the latter and especially on the online
recognition difficulties and solution methodologies. Finally, we analyze the use of
shorthand alphabets developed for machine recognition in pen computing.
2 History of Shorthand Alphabets
Shorthand is “a method of writing rapidly by substituting characters,
abbreviations, or symbols for letters, words, or phrases” and can be traced back to
the Greeks [11]. We focus on orthographic shorthands that use conventional
spelling and have one symbol for each letter of the Roman alphabet. Here, we
present and describe three documented shorthand alphabets that are relevant to this
study: the Tironian, Stenographie, and Moon alphabets.
Marcus Tullius Tiro, Cicero’s secretary, devised the first extensive Latin
shorthand system to record speeches in the Roman Senate. Many Romans, including
Julius Caesar, favored shorthand and this system remained in use for over a
thousand years. The Tironian alphabet [11] (63 BC, Fig. 1) consists of 22 symbols
(no J, U, W, or Y) and 10 of them resemble in some respect the corresponding
current Roman alphabet symbols: C and I completely; K, V, and Z closely; A, B,
and T partially (in one stroke); and D (lowercase delta) and H are somewhat like
their lowercase counterparts.
Figure 1. Tironian Alphabet, 63 BC
During the Middle Ages shorthand became associated with witchcraft and fell
into disrepute, and it was not until the late 12th century that King Henry II revived
the use of Tironian shorthand. It is interesting that many famous writers throughout
history preferred shorthand – Cicero’s orations, Martin Luther’s sermons, and
Shakespeare’s and George Bernard Shaw’s plays were all written in a style of
shorthand.
The next alphabet presented is particularly interesting because it uses a
systematic graphical design approach of eight basic shapes in different orientations
to denote the whole alphabet: Stenographie [4] (1602, Fig. 2).
Figure 2. Stenographie Alphabet, 1602
Note that the letters C and K are represented by the same symbol and only a few of
the symbols resemble their corresponding current Roman alphabet symbols. In
contrast to cursive shorthand, this is an example of geometric shorthand, which is
based on geometrical figures such as circles, ovals, straight lines and combinations
of these [5]. The basic shapes (Fig. 3) are I, L, U, (, α, V, Z and +.
Figure 3. Stenographie Alphabet Basic Shapes and Orientations
Several shorthand systems arose in the 19th century, including the effective
phonetic alphabets of Pitman (1837), which uses stroke thickness to differentiate
some letters, and Gregg (1885), as well as the Braille (1824) system for the blind.
Several cursive shorthands, such as the 1834 Gabelsberger system [5], also originated
during this time. None of these are covered here.
The next alphabet presented is Moon Type [7] (1894, Fig. 4), named for its
English inventor William Moon. It is a system of embossed letters for the blind that
was designed to require less finger sensitivity than Braille, targeting those blinded
late in life. It consists of eight basic shapes derived from the Roman capital letters
and used in varying orientations to denote the whole alphabet – the basic shapes
(arranged by column in the figure) are V, J, C, L, I, Z, an angle shape, and O. Over
half of these symbols resemble in some respect their corresponding current Roman
alphabet symbols – eight completely and perhaps seven partially.
Figure 4. Moon Alphabet, 1894
3 Offline (OCR) Handwriting Recognition
Offline handwriting recognition, a subset of optical character recognition
(OCR), uses optical scanning equipment to convert handwriting images into bit
patterns for machine recognition. Although most OCR work has been on
machine-printed characters, considerable effort has also been devoted to handwriting. A
handwritten document can be scanned at any time after it is written, even many years
later. No special equipment is necessary during the writing process, in contrast to
online recognition. However, the information captured is static and does not contain
any of the handwriting dynamics.
Recognizing handprinted characters is considerably more difficult than
recognizing machine printed characters due to the great variation among writers in
drawing the characters. In order to reduce these variations and increase recognition
accuracy, the OCR Committee of the American National Standards Institute (ANSI)
developed a standard in 1974 [15] (Fig. 5). The extra strokes on the O and S were
introduced to reduce ambiguity. Other countries have also developed standards.
Unfortunately, however, most writers of the documents processed by these systems
do not use these standards.
Figure 5. ANSI Standard Alphabet for OCR
4 Online Handwriting Recognition
4.1 What is it?
Online or real-time handwriting recognition means that the machine recognizes
the writing while the user writes. In contrast to offline, online devices capture the
temporal or dynamic information of the writing. This information consists of the
number of strokes, the order of the strokes, the direction of the writing of each
stroke, and the speed of writing within each stroke. A stroke is the writing from pen
down to pen up. In English, uppercase handprint averages about two strokes and
lowercase about one stroke, while cursive averages less than one stroke per letter.
This dynamic information can be helpful – for example, in distinguishing the
statically similar two-stroke 5 from the one-stroke S. It can also complicate the
recognition process, because the recognizer must handle many variations of a
character, some of which, like the one-stroke 5, may be similar to other characters.
As an example of
the large number of possible variations, the letter E can be written with one (in
cursive fashion), two, three, or four strokes, and with various stroke orders and
directions, and many variations appear the same when completed. The four-stroke E
consisting of a vertical and three horizontal strokes (or, for that matter, any
four-stroke character) has 384 variations (24 different stroke orders times 16
combinations of the two possible stroke directions for each of the four strokes).
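The count of dynamic variations can be checked by brute-force enumeration. The sketch below assumes that every stroke order is possible and that each stroke's direction is chosen independently of the others, which gives n! x 2^n variations for an n-stroke character. The function name stroke_variations is illustrative only.

```python
from itertools import permutations, product

def stroke_variations(n_strokes):
    """Enumerate every (stroke order, stroke directions) pair for a character.

    Assumption: any order is allowed and each stroke can be drawn in
    either of two directions, independently of the other strokes.
    """
    orders = permutations(range(n_strokes))                 # n! stroke orders
    directions = list(product((0, 1), repeat=n_strokes))    # 2**n direction choices
    return [(order, dirs) for order in orders for dirs in directions]

variations = stroke_variations(4)
print(len(variations))  # 24 orders * 16 direction choices = 384
```

Many of these variations look identical once the character is complete, which is why they matter only to online recognition.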
Unlike offline recognition, special equipment is required during the writing
process. Tablet digitizers (electronic tablets), available since the 1950s, allow the
capture of handwriting and drawing by accurately recording the x-y coordinate data
of pen-tip movement. The advent of pen computers in the 1980s combined
digitizers and flat displays, bringing input and output onto the same surface. This
provides immediate electronic ink feedback of the digitized writing and mimics
the familiar pen and paper paradigm to provide a "paper-like" interface [12, 14, 16].
With a pen computer, or a pen-enabled computer, users can not only use the pen
(writing stylus) as a mouse but also write or draw as they do with pen and paper.
Keyboard entry can be mimicked by touching sequences of buttons on a "soft"
keyboard displayed on the screen or, alternatively, handwriting can be automatically
converted to ASCII code.
Researchers have worked on many pattern recognition problems for
handwriting and drawing on tablet digitizers. The more common problems for
English include the recognition of handprinted (discrete) characters and the
recognition of cursive writing. Available handwriting recognition products are
highly accurate on careful handprinting and some products are available that
recognize cursive script with accuracy dependent on the writing style and the
regularity and clarity of the writing. In this paper we deal only with handprinting.
4.2 History of Recognition Strategies
There are many tradeoffs in designing a handwriting recognition system. For
example, at one extreme, the designer puts no constraints on the user and attempts to
recognize the user's normal writing. At the other extreme, the writing is severely
constrained, requiring the writer to write strokes in a particular order, direction, and
graphical specification.
Because computers and tablet digitizers were rather primitive in the late 1950s
and 1960s, much of the early work employed simple character segmentation and
recognition techniques and required the user to write the characters in a prescribed
manner with respect to their stroke number, order, and direction. Later systems
tended to be more flexible in that they attempted to handle most of the ways that the
alphabet characters are usually written.
Some systems allowed the user to train the system with a user’s particular way
of drawing each character. Many of these were actually hybrid systems, allowing a
user to train the system, but also recognizing the common variations in the untrained
or walk-up mode.
4.3 Handprint Recognition Difficulties and Solution Methodologies
4.3.1 Segmentation
One difficulty of handprint recognition is character segmentation (separation).
Although this problem is extreme in cursive writing where several characters can be
made with one stroke, it remains a significant problem with handprinted characters
because they can consist of one or more strokes and it is often not clear which
strokes should be grouped together. Segmentation ambiguities include the
well-known character-within-character problem where, for example, a handprinted
lowercase d might be recognized as a cl if drawn with two strokes that are somewhat
apart from one another. This problem is alleviated somewhat, however, by the fact
that character separation can occur only at stroke ends – not at the end of every
stroke but at the end of those that end a character.
Most segmentation methods are applied prior to recognition, either under writer
control or during preprocessing (machine processing prior to recognition). The
writer-controlled techniques include an explicit signal from the user, temporal
separation usually characterized by a time-out signal, writing characters in
predefined boxes, and the use of a pen lift either alone or in combination with a
time-out [16]. The preprocessing techniques typically involve spatial separation, which
actually is also under writer control.
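As an illustration of temporal separation, the following minimal sketch groups strokes into characters whenever the pen stays up longer than a time-out. The representation of a stroke as a (pen_down_time, pen_up_time) pair and the 0.5 second threshold are assumptions made for the example, not values taken from any cited system.

```python
def segment_by_timeout(strokes, timeout=0.5):
    """Group stroke indices into characters using pen-up gaps.

    strokes: list of (pen_down_time, pen_up_time) pairs in seconds,
             sorted by time (hypothetical representation).
    timeout: pen-up gap above which a new character starts (assumed value).
    """
    groups = []
    for i, (pen_down, _pen_up) in enumerate(strokes):
        previous_pen_up = strokes[i - 1][1] if i > 0 else None
        if groups and pen_down - previous_pen_up <= timeout:
            groups[-1].append(i)   # short gap: same character
        else:
            groups.append([i])     # first stroke or long gap: new character
    return groups

# Two quick strokes, a pause, then a third stroke.
print(segment_by_timeout([(0.0, 0.2), (0.3, 0.5), (1.5, 1.7)]))  # [[0, 1], [2]]
```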
Other methods involve recognition. The stroke code method [3, 8, 16]
performs stroke-by-stroke recognition prior to segmentation. That is, each stroke is
recognized immediately upon pen lift. Then in the general case of multiple stroke
characters the resulting stroke codes are combined to obtain the recognized
characters.
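The combination step of the stroke code method can be sketched as a table lookup over sequences of stroke codes. The code table below (V for a vertical stroke, H for a horizontal one) and the greedy longest-match rule are hypothetical simplifications chosen for illustration. They are not taken from the cited patents, and a practical system would also use the spatial relationship between strokes.

```python
# Hypothetical stroke codes: "V" = vertical stroke, "H" = horizontal stroke.
CODE_TABLE = {
    ("V", "H", "H", "H"): "E",
    ("V", "H", "H"): "F",
    ("H", "V"): "T",
    ("V", "H"): "L",
    ("V",): "I",
}
MAX_STROKES = max(len(codes) for codes in CODE_TABLE)

def combine_stroke_codes(stroke_codes):
    """Combine per-stroke codes into characters by greedy longest match."""
    output = []
    i = 0
    while i < len(stroke_codes):
        # Try the longest code sequence first, then shorter ones.
        for size in range(min(MAX_STROKES, len(stroke_codes) - i), 0, -1):
            key = tuple(stroke_codes[i:i + size])
            if key in CODE_TABLE:
                output.append(CODE_TABLE[key])
                i += size
                break
        else:
            output.append("?")  # no character matches this stroke code
            i += 1
    return "".join(output)

print(combine_stroke_codes(["H", "V", "V", "H", "H", "H"]))  # TE
```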
Another method performs character segmentation in conjunction with
recognition. In this method, each stroke end (pen lift) is a potential segmentation
point and all combinations of strokes, up to the maximum number of strokes per
character, are sent to the recognition component and the results are sorted to obtain a
final output [13, 16]. This run-on method can recognize characters that run together
– those that touch or overlap or even those written on top of one another.
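The candidate groupings considered by such a method can be enumerated as follows. This is a minimal sketch that assumes the strokes of a character are written consecutively in time, so only runs of adjacent strokes are grouped. The function name and the tuple-of-indices representation are illustrative.

```python
def candidate_segmentations(n_strokes, max_per_char, start=0):
    """List every split of strokes start..n_strokes-1 into consecutive groups.

    Each group holds at most max_per_char strokes and stands for one
    candidate character to be sent to the recognizer.
    """
    if start == n_strokes:
        return [[]]
    results = []
    for size in range(1, max_per_char + 1):
        end = start + size
        if end > n_strokes:
            break
        group = tuple(range(start, end))
        for rest in candidate_segmentations(n_strokes, max_per_char, end):
            results.append([group] + rest)
    return results

for segmentation in candidate_segmentations(3, 2):
    print(segmentation)
# [(0,), (1,), (2,)]
# [(0,), (1, 2)]
# [(0, 1), (2,)]
```

A recognizer would score each candidate group and keep the segmentation with the best overall score. Setting max_per_char to 1 reduces this to the single-stroke case, where every pen lift ends a character.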
Perhaps the simplest segmentation method is to use an alphabet of only
single-stroke characters. Burr [2] uses this method, instructing his writers to write
each character with a single stroke so that every pen lift separates characters. This method
can also be viewed as the trivial case of either the stroke code method where each
stroke code is a character or the run-on method where only single strokes are sent to
the recognizer.
4.3.2 Uppercase versus Lowercase versus Digits
What makes handwritten communication possible is that differences between
different characters are more significant than differences between different drawings
of the same character, and this might be considered the fundamental property of
writing. This property holds within the subalphabets of uppercase, lowercase, and
digits, but not across them. Fig. 6 shows an example of the uppercase I, lowercase l,
and number 1 all drawn identically with a single vertical stroke, and the O and 0
drawn identically with an oval.
Figure 6. Different Characters with the Same Shape
The most general solution to this problem is to handle it the way humans do by
using context. This is usually done in a postprocessing (after recognition) phase that
uses syntax and possibly semantics to resolve ambiguities.
Simpler solutions are to use different regions of the writing surface for different
subalphabets or to use special symbols to put the system into uppercase, lowercase,
or digit mode [16].
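The writing-region approach amounts to choosing the subalphabet from where the stroke was written before interpreting its shape. In the sketch below, the shape names, the split position, and the two-area layout (letters on the left, digits on the right) are all assumptions made for illustration. The lowercase entries would be reached through a mode symbol, which is not modeled here.

```python
# Hypothetical table: one drawn shape maps to a different character per subalphabet.
SHAPE_TO_CHAR = {
    "vertical_line": {"upper": "I", "lower": "l", "digit": "1"},
    "oval": {"upper": "O", "lower": "o", "digit": "0"},
}

def area_mode(x, split_x=60.0):
    """Return the subalphabet for a stroke that starts at horizontal position x.

    Assumes a letter area to the left of split_x and a digit area to the right.
    """
    return "upper" if x < split_x else "digit"

def resolve(shape, x):
    """Pick the character for an ambiguous shape from the area it was written in."""
    return SHAPE_TO_CHAR[shape][area_mode(x)]

print(resolve("vertical_line", 10.0))  # I (letter area)
print(resolve("vertical_line", 90.0))  # 1 (digit area)
```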
4.3.3 Dynamic Writing Variation
Dynamic writing variation refers to the various ways – in stroke number, order, and
direction – that characters can be drawn. In some countries, like Japan, there is
more homogeneity in writing dynamics because children are taught to write each
character in a specific way. In other countries, like the United States, this writing variation is
greater. As we saw earlier with the various ways of writing E, dynamic variation
can be a serious problem, and it becomes even more complex when combined with
the segmentation problem.
The simplest way to handle wide dynamic variation is not to allow it. Thus,
many systems, especially the early ones, prescribe the limited ways each character
can be written and the writer must conform to these variations.
When allowing many variations, one solution is to reorder the strokes into a
normalized form, for example from left to right and top to bottom; writing
direction can be normalized similarly. For example, a four-stroke E would be
reordered into the vertical stroke followed by the three horizontal ones from top to
bottom. Another method is to incorporate the more common variations and either
inform the users as to what they are or simply let them adapt to the system by trial
and error.
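Stroke reordering can be sketched as a direction flip followed by a sort. The example assumes screen coordinates (y grows downward), strokes given as lists of (x, y) points, and a hypothetical tolerance tol for treating nearby positions as equal. The tie-breaking rule that puts narrower strokes first is an illustrative choice so that the vertical stroke of an E precedes its top bar.

```python
def normalize_direction(stroke):
    """Flip a stroke so it runs left to right, or top to bottom if mostly vertical."""
    (x0, y0), (x1, y1) = stroke[0], stroke[-1]
    if abs(x1 - x0) >= abs(y1 - y0):
        backwards = x1 < x0          # mostly horizontal stroke
    else:
        backwards = y1 < y0          # mostly vertical stroke, y grows downward
    return stroke[::-1] if backwards else stroke

def sort_key(stroke, tol=2.0):
    """Order by leftmost point, then topmost point, then width (narrow first)."""
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    return (int(min(xs) // tol), int(min(ys) // tol), max(xs) - min(xs))

def normalize(strokes):
    """Return the strokes in a canonical order and direction."""
    return sorted((normalize_direction(s) for s in strokes), key=sort_key)

# A four-stroke E drawn in an unusual order and with reversed strokes.
bottom = [(5, 10), (0, 10)]    # drawn right to left
vertical = [(0, 10), (0, 0)]   # drawn bottom to top
top = [(0, 0), (5, 0)]
middle = [(0, 5), (4, 5)]
for stroke in normalize([bottom, vertical, top, middle]):
    print(stroke)
# [(0, 0), (0, 10)]
# [(0, 0), (5, 0)]
# [(0, 5), (4, 5)]
# [(0, 10), (5, 10)]
```

After normalization, every way of drawing this E maps to the same stroke sequence, so the recognizer needs only one model for it.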
Other techniques, like stroke codes or single stroke alphabets mentioned earlier,
can also be used to solve this problem.
4.3.4 Sloppiness of the Writing
Perhaps the most difficult problem is sloppy writing, and size and slant
variation can also be included here.
The easiest solution, and that used in many systems, is to use an alphabet in
which the symbols are quite different from each other, and that is the reason many of
the shorthand alphabets are based on a small number of shapes in different
orientations. The Roman alphabet, unfortunately, does not have this property since
many letter pairs, such as the U and V, are similar.
Solving the more general problem involves sophisticated recognition
algorithms such as dynamic programming, Markov models, and affine transformations
[12, 14, 16].
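As one illustration of the dynamic programming approach, the sketch below computes an elastic matching (dynamic time warping) distance between two strokes given as point sequences. It is a generic textbook formulation, offered under the assumption that strokes are lists of (x, y) points, and is not the specific algorithm of any system cited here.

```python
import math

def dtw_distance(a, b):
    """Elastic matching distance between point sequences a and b.

    cost[i][j] is the cheapest alignment of the first i points of a
    with the first j points of b.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the best of: advance in a, advance in b, or advance in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

template = [(0, 0), (1, 0), (2, 0)]
slow_copy = [(0, 0), (1, 0), (1, 0), (2, 0)]   # same path, sampled unevenly
print(dtw_distance(template, slow_copy))  # 0.0
```

Because the alignment may stretch or compress either sequence, differences in writing speed do not count against a match; only differences in shape do.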
4.4 Future
The ideal recognition system is one that supports handwritten input of Unicode
characters and does not impose any constraints on what and where the user writes.
The user should also be able to intermingle text with graphics and gestures, and the
recognizer should be able to distinguish these kinds of input. Even though recent
advances in handwriting recognition are promising, further research is warranted
since the above goals have yet to be achieved.
Similar to the ANSI alphabet standard of constrained fonts for OCR
handwriting recognition, standards for online drawing of characters have been
suggested as appropriate for certain applications due to the higher recognition rate
that could be achieved [16].
5 Shorthand Alphabets in Pen Computing
Shorthand in the field of handwriting recognition is well known. Some of the
earliest instances were in the field of CAD/CAM applications where symbols were
used to represent various graphical items and commands. Later, shorthand was used
to represent scientific symbols and notations, and Pitman shorthand was also
implemented. Other systems used special alphabets and symbols for online
character recognition and we present and discuss several of these in this section.
Any of the three historical alphabets presented above could be used for machine
recognition and all of the symbols of those alphabets are drawn with a single stroke
except for the K in Tironian and the + in Stenographie. In addition to shape and
orientation, online systems can also use stroke direction to differentiate among
symbols. We present and discuss four alphabet systems: Organek, Allen, Goldberg,
and Graffiti. Stroke direction of online symbols is usually indicated either with a
dot at the start of the stroke or with an arrow at the end.
The Organek system uses straight lines in eight different directions and three
different lengths [9] (Fig. 7). Each of these alphabet symbols is drawn with a single
stroke (starting at the origin in the figure) and sequences of the symbols can be
connected to form words. This alphabet has only one basic shape, a straight line, in
four orientations (the same orientations used in other geometric shorthands), three
lengths, and two stroke directions to represent the first 24 letters of the alphabet.
The y and z are represented in a second wheel along with numerals and other
characters. These symbols do not resemble those of the Roman alphabet.
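A recognizer for such an alphabet only needs to quantize each straight stroke by direction and length. The sketch below assumes hypothetical length thresholds and returns a (direction, length class) pair rather than a letter, since the letter assignment is given only in the figure.

```python
import math

def classify_line(start, end, short_max=10.0, medium_max=20.0):
    """Quantize a straight stroke into one of 8 directions and 3 lengths.

    The thresholds short_max and medium_max are assumed values.
    Direction 0 points along +x; directions advance in 45 degree steps.
    """
    dx, dy = end[0] - start[0], end[1] - start[1]
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    direction = int((angle + 22.5) // 45) % 8      # nearest 45 degree sector
    length = math.hypot(dx, dy)
    if length <= short_max:
        length_class = 0
    elif length <= medium_max:
        length_class = 1
    else:
        length_class = 2
    return direction, length_class

# Eight directions times three lengths give 24 distinct symbols.
symbols = {
    classify_line((0.0, 0.0),
                  (r * math.cos(math.radians(45 * k)), r * math.sin(math.radians(45 * k))))
    for k in range(8)
    for r in (5.0, 15.0, 25.0)
}
print(len(symbols))  # 24
```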
Figure 7. Organek Alphabet
The Allen alphabet [1] (Fig. 8) consists of four basic shapes in various
orientations and stroke directions to denote the whole alphabet, and each of these
alphabet symbols is drawn with a single stroke.
Figure 8. Allen Alphabet
The four basic stroke shapes (Fig. 9) are a straight line and three
two-line-segment strokes with angle changes of 45, 90, and 135 degrees. The straight stroke
has four orientations and two possible writing directions for a total of eight
possibilities. Each of the angle change strokes has eight orientations but the writing
direction is not used as a differentiator – that is, both writing directions are used to
represent the same letter. Therefore, a total of 32 symbols can be represented with
this alphabet.
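The total can be verified by listing the symbol inventory. The tuple representation and the names used below are illustrative only.

```python
def allen_symbols():
    """Enumerate the symbol space described for the Allen alphabet."""
    symbols = []
    # Straight line: four orientations, and writing direction distinguishes symbols.
    for orientation in range(4):
        for direction in ("forward", "reverse"):
            symbols.append(("line", orientation, direction))
    # Angled strokes: eight orientations each, writing direction is ignored.
    for angle_change in (45, 90, 135):
        for orientation in range(8):
            symbols.append((f"angle_{angle_change}", orientation, None))
    return symbols

print(len(allen_symbols()))  # 4 * 2 + 3 * 8 = 32
```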
Figure 9. Allen Alphabet Basic Shapes and Orientations
Although there can be little similarity of these symbols to their Roman
counterparts, given the symbol constraints, there is some attempt in that direction:
the I matches exactly and the F and T match partially (the same as in Graffiti, see
below), as does A (but not as closely as in Graffiti). The vowels, except for A, are
made from straight strokes presumably for speed of input since vowels are high
frequency letters.
The Goldberg alphabet [6] (Fig. 10) is designed for accurate machine
recognition and speed of entry.
Figure 10. Goldberg Alphabet
Goldberg’s alphabet symbols come from five basic shapes (I, L, tight U, Z, and α)
each rotated in four different orientations (Fig. 11), and each capable of being drawn
with two stroke directions. Thus, with the five shapes, four orientations, and two
stroke directions, up to 40 different output symbols can be represented.
Figure 11. Goldberg Alphabet Basic Shapes and Orientations
The symbols are graphically well separated from each other for ease of machine
recognition and, within the design constraints, several of the alphabet symbols are
similar to their Roman, mostly lowercase, counterparts. While the specially drawn
symbols of this alphabet, and to a lesser extent the two preceding alphabets, are easy
to recognize automatically, the disadvantage is that one must remember the unique
way to draw the symbols and consistently draw each symbol accurately.
The basic shapes are simple so they can be drawn quickly, and to optimize for
writing speed the alphabet is chosen to assign the simplest shapes to the frequently
used letters. For example, straight strokes are used for the common letters a, e, i, r,
and t.
The symbols of this alphabet are single stroke lexical symbols and, as with the
Burr system [2], the single stroke per symbol recognition method avoids the
common recognition difficulties.
The Graffiti alphabet [10] (Fig. 12) is used in the popular Palm OS devices,
notably the Palm Pilot and Handspring models. However, it is not composed of a
small number of basic shapes but rather has a high correspondence with the Roman
alphabet – twenty letters match exactly (19 with uppercase and one with lowercase
Roman alphabet symbols) and six match partially. Of the partial matches, the
symbol for A (a typical first stroke of the Roman A) is that of the Tironian,
Stenographie, and Moon alphabets; the F (again the first stroke of the Roman F) and
K (basically the second stroke of the Roman K) are close to those of the Moon
alphabet; T is from Tironian; and Y is a one-stroke way to draw the Roman Y. This
high correspondence with the Roman alphabet makes it easier to learn than the
basic-shape alphabets described above, but at the sacrifice of graphical separability
and speed of entry.
Figure 12. Graffiti Alphabet
This alphabet has one symbol that requires two strokes, the X (although there is
an acceptable one-stroke variant), and the extended alphabet of this system has other
multiple stroke characters. This system uses the stroke code recognition method and
separate writing areas for the alphabetic and numeric symbols to avoid the common
recognition difficulties.
6 Conclusions
Shorthand alphabets are frequently designed for use in pen computing because
their symbols are drawn quickly with few strokes and can essentially avoid or at
least simplify the common recognition difficulties. The Graffiti system uses
symbols similar to the Roman alphabet for ease of learning. The few symbols that
do not correspond exactly to their Roman counterparts have one stroke in common
or other similarities, and several appear to have evolved from the historical
alphabets. This system uses the stroke code recognition method and separate
writing areas for subalphabets to simplify the recognition process. We conclude that
the designers have cleverly combined known techniques to create a highly useful
system.
In spite of the clever design of Graffiti, anecdotal reports suggest that most Palm
users prefer the “soft” keyboard to handwriting input, indicating that the devices are
popular more for their form factor and other attributes than for their handwriting capability.
Finally, we note that the general unconstrained handprint recognition problem is as
yet unsolved and will likely require advanced syntactic and semantic processing
capability.
7 Acknowledgements
I thank my colleagues of the graphonomics and handwriting recognition
communities for interesting and educational interactions over the years I have
worked in this area.
References
1. Allen, G., “Data input grid for computer,” U.S. Patent 5,214,428, 1993.
2. Burr, D.J., “Designing a handwriting reader,” IEEE Transactions Pattern
Analysis Machine Intelligence, Vol. PAMI-5, 1983, pp. 554-559.
3. Crane, H.D., “Process and apparatus involving pattern recognition,” U.S. Patent
4,718,102, 1988.
4. Daniels P.T. and Bright W. (eds.), The World’s Writing Systems, Oxford Press,
1996.
5. Glatte, H., Shorthand Systems of the World, Philosophical Library, 1959.
6. Goldberg, D., “Unistrokes for computerized interpretation of handwriting,”
U.S. Patent 5,596,656, 1997.
7. Gove, P.B. (ed.), Webster’s Third New International Dictionary, 1986.
8. Loh, S-C, “On-line handwritten character recognition apparatus with
non-ambiguity algorithm,” U.S. Patent 5,034,989, 1991.
9. Organek Technology Ad, “penput – Beyond character recognition!” 1991.
10. Palm Computing, “Palm Pilot: Graffiti Reference Card.”
11. Panati, C., “The Browser's Book of Beginnings,” Houghton Mifflin, 1984.
12. Schomaker L., “From handwriting analysis to pen-computer applications,”
Electronics and Communication Eng. J., 1998, pp. 93-102.
13. Sklarew, R., “Handwritten keyboardless entry computer system,” U.S. Patent
4,972,496, 1990.
14. Subrahmonia J. and Zimmerman T., “Pen computing: challenges and
applications,” Proc. 15th Intl. Conf. Pattern Recognition, Vol. 2, 2000, pp. 60-66.
15. Suen, C.Y., “Automatic recognition of handprinted characters – The state of the
art,” Proc. IEEE, Vol. 68, No. 4, 1980, pp. 469-487.
16. Tappert C.C., Suen C.Y., and Wakahara T., "The state-of-the-art in on-line
handwriting recognition," IEEE Transactions Pattern Analysis Machine
Intelligence, Vol. PAMI-12, 1990, pp. 787-808.