Documentation of Conferences with indexed data access
Master’s Thesis
at
Graz University of Technology
submitted by
Bernd Kohlmaier
F874 – 9032604
Institute for Information Processing and Computer Supported New Media
(IICM),
Graz University of Technology
A-8010 Graz, Austria
January 2000
© 2000, adhoc Hard- und Software Ges.m.b.H Nfg Kg
Advisor:
o.Univ.-Prof. Dr. Dr.h.c. Hermann Maurer
Supervisor: Dipl. Ing. Thomas Dietinger
German title: Dokumentation von Konferenzen mit indiziertem Datenzugriff
Abstract
Meetings, conferences and individual arrangements are an important part of a company’s
daily business. Whether business meetings take place in person or arrangements are made by
telephone, all the necessary information must be preserved for later retrieval. An obvious
approach is to create an audio or video recording, which yields a complete documentation and
makes it possible to retrieve the exact wording at a later time.
This thesis describes an application called ConfDoc that helps to generate such a complete
documentation of conferences or meetings and to retrieve specific information quickly within
one or more documentation sets. Using so-called index data to seek within a single document
makes it possible to find specific information quickly without reviewing the whole recording.
Porting the system to portable computers such as handhelds or palmtops makes it usable on
the road, which is important when a meeting takes place on a client’s premises.
Kurzfassung (German abstract)
Meetings, conferences and individual agreements form an essential part of everyday business.
Regardless of where these meetings take place, all relevant information needs to be recorded
for later use. An obvious approach is to create an audio or video recording so that, with the
help of this complete documentation, what was said can be reconstructed word for word at a
later time.
This thesis describes ConfDoc, a program for creating such documentation easily. In addition,
ConfDoc allows a fast search for relevant information within one or several documents.
Referencing by means of so-called index data within a document makes it possible to find
specific data quickly without having to review the entire recording each time.
A dedicated ConfDoc version for portable computers (handhelds or palmtops) has the
advantage that handy and unobtrusive equipment is always available. This is of practical use
above all when meetings take place on the client’s premises.
I hereby certify that the work presented in this thesis is my own and that work performed by
others is properly cited.
I hereby affirm that I wrote this thesis independently, used no sources other than those
indicated, and did not make use of any other unauthorized aids.
Contents
1 Introduction
2 Current documentation techniques
2.1 Handwritten notes
2.2 Audio / Video recording
2.3 Using “standard software”
2.4 Authoring on the fly
2.5 Conclusion
3 Patent search
3.1 Request for a patent search
3.1.1 Description
3.1.2 Patent claim
3.1.3 Summary
3.2 Result EP0495612 – A data access system
3.2.1 General
3.2.2 Abstract
3.2.3 Description
3.2.4 Figures
3.2.5 Patent claims
3.2.6 Comparison to our request
3.3 Result US5172281 – Video Transcript Retriever
3.3.1 General
3.3.2 Abstract
3.3.3 Patent claims
3.3.4 Comparison to our request
3.4 Conclusion
4 Requirements on a new system
4.1 Requirements made by the user: Optimal scenario
4.1.1 Recording a document
4.1.2 After the conference
4.1.3 Playback and usage of the document
4.2 Aspects seen by the developer
4.2.1 Types of information
4.2.2 Hardware / Software
4.3 Client Hardware discussion
4.3.1 Laptops, Notebooks
4.3.2 Sub-Notebook
4.3.3 Pen Computer
4.3.4 Handheld PC
4.3.5 Palmtop PC
4.3.6 Crosspad
4.3.7 Summary
4.4 Extended terms of reference
5 Developing a new system
5.1 Platform, Operating system
5.2 Additional Hardware
5.3 Functionality
5.3.1 Raw data
5.3.2 Index data
5.4 Scenario with this application
5.4.1 Recording a document
5.4.2 Playback and usage of the document
5.5 Purpose of this application
6 User guide
6.1 Running the application
6.2 Configuration
6.3 Creating a new documentation
6.3.1 The main window
6.3.2 Handwriting and Notes
6.4 Reviewing a documentation
6.4.1 Random seek
6.4.2 Accurate seek
6.5 Extending a documentation
7 The Implementation
7.1 The main parts of the program
7.2 Creating a new documentation
7.2.1 Audio
7.2.2 Markers
7.2.3 Handwritten notes
7.2.4 Textual notes
7.3 Reviewing a documentation
7.3.1 Playback
7.3.2 Searching for information
8 Making the system portable
8.1 Hardware
8.1.1 Operating system
8.1.2 Device Type
8.2 Software
8.2.1 Differences between the Win32 and Windows CE APIs
8.2.2 Memory Limitations
8.2.3 Energy Limitations
8.2.4 Hardware Characteristics
8.2.5 User Interface
8.2.6 Testing and Debugging
8.3 Porting ConfDoc
8.3.1 Preparing the implementation
8.3.2 The porting process
8.4 Porting a Windows NT ACM Driver to Windows CE
8.4.1 Motivation
8.4.2 The theoretical porting
8.4.3 The practical porting
8.5 Conclusion
9 Analysis of the new system
9.1 Stationary System
9.1.1 Hardware
9.1.2 Recording a documentation
9.1.3 Reviewing a documentation
9.1.5 Suggestions for improvement
9.2 Portable System
9.2.1 Hardware
9.2.2 Recording a documentation
9.2.3 Reviewing a documentation
9.2.5 User Interface
9.2.6 Suggestions for improvement
9.3 Combining the two systems
10 Conclusion
10.1 Further topics and future work
1 Introduction
Arranging meetings and conferences is a necessity for every company, whatever its business.
Whether these meetings take place in person or arrangements are made by telephone, all the
necessary information from them must be preserved for later retrieval.
The use of computers suggests creating a documentation system in which the audio or video
information is interlinked with additional textual notes or graphical illustrations. This
additional information can be used to index the raw audio / video data when reviewing the
recordings.
The idea was born at the company “adhoc Hard- und Software”, located in Klagenfurt, where
I am currently employed.
As a company focused on the development and production of special designs in electronics
and telecommunications, we work on several projects concurrently. During development many
questions and problems arise that need to be solved quickly. It is also common for many
agreements to be made by phone. This kind of communication results in many separate and
often short phone calls. As a result, it is hard to keep track of all those verbal arrangements
and to relate them to the right project.
So we started to look for a system to record these dialogues. We required that the product be
usable for different types of conferences, such as phone calls or meetings. Furthermore, we
wanted a system that would work at various locations, such as the company’s conference room
or, on the road, a client’s room.
This thesis describes an application called ConfDoc, which helps to generate a complete
documentation of conferences or meetings and to retrieve specific information quickly within
one or more documentation sets.
Chapter 2 outlines the documentation methods currently in use. They range from handwritten
notes on a sheet of paper to extended systems that create files with the help of computers.
Chapter 3 focuses on a patent search that was carried out. The request for the patent search is
outlined and two interesting results are described in detail.
Chapter 4 summarizes the requirements for a new system. Two aspects are examined in
particular: the requirements raised by the user and the aspects seen by the developer, which
cover the hardware and software requirements.
The process of creating the new ConfDoc application is the topic of Chapter 5. It starts with a
discussion of the hardware, covers the functionality of the application and describes a
scenario that is possible with the help of ConfDoc.
Chapter 6 contains a user guide, while the actual implementation is documented in Chapter 7.
An important feature of ConfDoc is its portability. Chapter 8 explains how the system is made
portable using a palmtop device running the Windows CE operating system. The different
device types and operating systems are discussed, and the porting of the software to
Windows CE is described.
Chapter 9 analyses the implemented ConfDoc application running on either the stationary or
the portable system. This chapter also covers hardware and software aspects and gives some
suggestions for further improvements.
Finally, Chapter 10 gives a short conclusion and an outlook on other ideas for creating a
documentation and the obvious difficulties they cause.
2 Current documentation techniques
This chapter lists the techniques currently used to document a meeting. The different types of
generated data are outlined, and the connections between these data types are shown where
any exist.
2.1 Handwritten notes
The simplest way to generate a documentation is to take notes. In general, the participants
themselves write down comments by hand during the meeting. Depending on how fast a
participant writes, the information is more or less detailed. As there is hardly time to write
down the conversation word for word, the writer must decide on the spot whether a piece of
information is important enough to write down. Using a computer avoids handwritten scrawl,
but typing is still slower than writing by hand.
In many cases, the resulting documents are incomplete: important pieces of information are
missing or the written notes are incoherent.
The best result can be obtained by creating a fair copy of the handwritten notes. This requires
reviewing the meeting and spending some time reworking its contents, which can only be done
after the meeting has ended. Because the handwritten notes are incomplete, it is up to the
participant to extend them from memory. If the meeting was long, this can be a very difficult
job.
2.2 Audio / Video recording
A more advanced method is to make an audio or video recording of the event. Since a recording
represents a complete documentation, it is possible to review the conference or retrieve the
exact wording of a conversation. This may be important for proving agreements or promises
beyond doubt.
But this method is far from perfect. It provides no way to search for a specific phrase within
the recorded data, or to quickly review the part of the audio / video recording that belongs to
a specific subject discussed during the conversation. The only way to use the recording is to
go through all of it, or to seek within it guided by additional notes or by distinguishing marks
recalled from memory.
2.3 Using “standard software”
With the help of a common personal computer and readily available software it is possible to
create multimedia documents that represent a simple documentation of a meeting or
conference.
Imagine a word processing application that can include drawings such as rectangles, circles
or lines. Popular systems can also import graphics and even audio or video files.
With these capabilities you can take notes and draw sketches during a conference to document
the proceedings directly on your computer. A multitasking system makes it possible to record
an audio or video source in the background while the word processor runs in the foreground.
These audio files can be linked to or imported into the word processing document. The result
is a multimedia document that allows a conference to be reviewed word for word by means of
the recorded audio.
The above outline gives an impression of a multimedia document that contains all the
information necessary for a complete documentation of a meeting: text, graphics and audio or
video.
But one important point is missing: the different types of information are not connected to
each other. They are more like several independent documents.
Of course, an audio recording and a text document are both parts of one documentation, but
between these separate files there is no relation from a point in one component to the
corresponding point in the other.
When examining a specific passage in the text file, it is impossible to find the corresponding
passage in the audio file directly. The only way to verify the spoken words in the audio
recording is to search manually for the passage or to listen to the whole audio document. The
latter in particular is unsatisfactory, because listening to the whole document takes a very
long time.
2.4 Authoring on the fly
The following method was not intended for documenting meetings, but the resulting documents
show some similarity with that purpose.
Authoring on the fly is a way of producing hypermedia documents to support teaching at
universities. A computer-based lecture is automatically converted into the core of a
multimedia document and linked together with papers, textbooks, animations and simulations.
The following lines from [HM96] describe the method:
“Authoring on the fly, a term coined by Maurer in 1994, refers to the possibility of
preparing a substantial piece of courseware during the process of actually
delivering a lecture.
The basic idea is simple enough: A lecture is delivered by extracting prepared
multimedia material from a Hyper-G server and projecting it with a videobeam or
LCD panel. The prepared multimedia material may just look like ordinary color
transparencies, but might also include animation, movie clips, sound effects,
simulation and other educational material. Thus, Hyper-G and the material
prepared is just used for presentation purposes, so far.
As the lecture is delivered, the voice, face (or the whole body for gestures) and
actions of the lecturer such as pointing to or highlighting certain material shown
are recorded, digitized and stored in Hyper-G together with the original
presentation material. Thus, after the lecture is finished it is available as a Hyper-G
document that can be reused at leisure at any later time: authoring of a one-hour
piece of courseware complete with everything takes just one hour! Of course, the
effort to produce the material to present or more realistically to select from the,
hopefully, comprehensive electronic library has also to be taken into account.”
An implementation developed by Thomas Ottmann and Christian Bacher uses an electronic
substitute for the blackboard and also transmits the lecture to remote locations.
The experiments carried out demonstrate that classroom lecturing, distance teaching and the
production of educational hypermedia can be successfully integrated.
The production of an AOF (authoring on the fly) document is described in [BM97] as follows:
“First, all the slides to be used in a specific lecture are selected. These slides are
PostScript documents produced as usual by standard tools such as LaTeX or
Framemaker. In order to enhance the comprehension of the lecture, any
applications such as animations or movies can be included. The slides and
applications can be provided with custom titles. These are later used to
automatically generate a table of contents for the hypermedia document. At this
stage, the recording session is ready to be started. While recording, the slides are
loaded into the whiteboard. These can be marked and annotated as usual using the
whiteboard's features. While delivering the lecture, the selected applications can
be started simultaneously. The data resulting from these actions, namely the
whiteboard data stream, the applications' start and stop commands and the audio
stream, are recorded. At the end of the recording session, the data is immediately
converted into an internally used format [OB95] to generate an integrated
document for offline use. Upon accessing the document, the built-in viewer
performs the synchronous replay of the recorded data streams as well as starting
and stopping the applications as presented during the lecture. An integrated
document handler supports the creation of collections where any additional
documents can be inserted. Operations for moving, copying or deleting documents
are also supported.”
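The synchronous replay described in the quotation can be pictured with a small sketch. It is not the AOF or SYNCVIEW implementation; it only assumes that every recorded stream is a list of time-stamped events, and the names `merge_streams` and `events_until` as well as the event layout are hypothetical choices made for illustration.

```python
import heapq
from typing import Iterable, List, Tuple

# Hypothetical event layout: (time in seconds, stream name, payload).
Event = Tuple[float, str, str]


def merge_streams(*streams: Iterable[Event]) -> List[Event]:
    """Merge several streams, each already sorted by time, into one timeline."""
    return list(heapq.merge(*streams, key=lambda event: event[0]))


def events_until(timeline: List[Event], position_s: float) -> List[Event]:
    """Return every event at or before the given replay position.

    A viewer that jumps to position_s (for example via a slider) would have
    to apply these events to reach a state consistent with the audio.
    """
    return [event for event in timeline if event[0] <= position_s]
```

Merging by time stamp keeps whiteboard actions and application start and stop commands in the order in which they occurred, which is the property a synchronized viewer relies on.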
They delivered a lecture on the computer and converted it into a hypermedia document.
The lecture was recorded on an S-VHS video tape, which was later digitized (audio and video)
with the SGI capture tools. Capturing the audio and video stream in sufficient quality
requires powerful hardware and some experience in the proper use of the software tools. The
wb output was recorded with MCASTREC¹, a novel program for recording whiteboard sessions,
and then converted into a format readable by SYNCVIEW², an external Hyper-G viewer.
Then a text file with the paths of the PostScript slides and their titles was edited.
The result is a multimedia document consisting of the sound and video of the lecturer's talk
as well as the demonstrations on the wb. The program SYNCVIEW presents this multimedia
document by synchronizing the wb actions with video and sound. It is also possible to scroll
backward and forward in the document using a slider.
¹ Available under ftp://ftp.informatik.uni-freiburg.de/pub/AOF
² See footnote 1.
Figure 2-1: screenshot of the movie and the accumulated whiteboard
Although this tool can record and play back video data and generate additional information
such as predefined graphics, text and drawings, it does not provide a way to use this
additional information as an index for the video data. The resulting hypermedia document is
useful for lectures, but lectures are linear, and the document is of little use when searching
for specific phrases.
2.5 Conclusion
These techniques show us the different types of information.
The “audio / video recording” generates raw data, which provides a complete documentation
of a meeting. This method makes it possible to retrieve the exact wording of a conversation.
This may be important in some cases, but retrieving specific information is very
time-consuming because there is no way to search within the document. The only way to get at
information is to browse through the document.
The “handwritten notes” represent so-called index data. This method requires filtering the
contents during the meeting, deciding whether each piece of information is important enough
to write down. This leads to incomplete documents. In general, exact wordings cannot be
retrieved.
The other two methods provide a complete documentation in the form of an audio or video
recording combined with additional information that is collected manually by the user. This
additional information mainly consists of text and drawings.
But these methods offer no way to create connections between the two parts. Retrieving
specific information from the complete documentation is as difficult as retrieving it from a
plain video recording, because no search method is provided.
3 Patent search
With the intention of applying for a patent, a patent search was requested.
The request yielded two interesting results, which are covered after the description of the
request.
3.1 Request for a patent search
The following request was sent to the Austrian patent office in 1999.
Note: The request was originally written in German.
3.1.1 Description
3.1.1.1 Technical field
Accelerated location of distinctive positions in multimedia data streams of video and audio
recordings
3.1.1.2 State of the art
Multimedia documentation such as video recordings, images, sound documents, and handwritten
and ASCII text originals has become quite common in modern companies owing to corresponding
leaps in computing (high processing power and storage capacity). At present, the problems
essentially comprise two aspects:
• Poor retrievability within collections of documentation
• Time-consuming searching within such documents
3.1.1.3 Innovative claim
The problems listed above are to be eliminated by the following measure:
• Logical linking (indexing) of hard-to-survey documents (audio/video recordings) with
easy-to-survey documents (handwritten records)
3.1.1.4 Equivalent patent claim
Automatic indexing of multimedia data streams by (hand)written records, characterized in
that a logical link between easy-to-survey documents (handwritten records or ASCII text) and
hard-to-survey documents (audio/video recordings) is established automatically. This is
usefully done via corresponding time markers, which are attached to individual objects of the
easy-to-survey document and refer to corresponding objects of the hard-to-survey document.
3.1.1.5 Figuren
A
B
C
D
Audio/Video-Aufzeichnung auf EDV-Datenträger mit mitlaufender Zeitfunktion
Elektronischer Notizblock mit integrierter Uhrenfunktion
(Hand)schriftliche Aufzeichnung in Realzeit
Automatisch zugeordnete Verweise über Zeitfunktion
11
PATENT SEARCH
Figure 3-1 : Abbildung für Patentantrag
3.1.1.6 Ausführliche Beschreibung
Um bestimmte, charakteristische Positionen in sequentiell aufgezeichneten multimedialen
Dokumenten wiederzufinden, wird beim gegenständlichen System bei einer solchen
Aufzeichnung eine fortlaufende Zeitmarkierung implizit mitgespeichert (Figur A)
Zum selben Zeitpunkt erstellte (hand)schriftliche Aufzeichnungen (Figur C) mit Hilfe
entsprechender elektronischer Medien (Figur B) verwenden dieselben Zeitmarkierungen und
erlauben damit - in weiterer Folge - über diese Zeitmarkierungen eine direkte Referenzierung
bestimmter Textstellen mit den, zu diesem Zeitpunkt erstellten, sequentiellen Aufzeichnungen
(Figur D).
12
PATENT SEARCH
Unter Verwendung elektronischer Aufzeichnungsmedien für A und B wird somit ein
wahlfreier Zugriff (unmittelbare Positionierbarkeit) auf ansonsten in sequentieller Form
vorliegende Dokumente möglich.
3.1.2 Patentanspruch
Automatische
Indizierung
multimedialer
Datenströme
durch
(hand)schriftliche
Aufzeichnungen gekennzeichnet dadurch, daß automatisch eine logische Verknüpfung von
übersichtlichen Dokumenten (handschriftliche Aufzeichnung oder ASCII Text) mit
unübersichtlichen Dokumenten (Audio/Video-Aufzeichnung) durchgeführt wird. Dies erfolgt
sinnvollerweise über entsprechende Zeitmarkierungen, die einzelnen Objekten des
übersichtlichen Dokumentes angeheftet werden und auf entsprechende Objekte des
unübersichtlichen Dokumentes verweisen.
3.1.3 Zusammenfassung
Automatische
Indizierung
multimedialer
Datenströme
durch
(hand)schriftliche
Aufzeichnungen dadurch, daß automatisch eine logische Verknüpfung von übersichtlichen
Dokumenten (handschriftliche Aufzeichnungen oder ASCII Text) mit unübersichtlichen
Dokumenten (Audio/Video-Aufzeichnung) durchgeführt wird. Dies erfolgt über
entsprechende Zeitmarkierungen, die einzelnen Objekten des übersichtlichen Dokumentes
angeheftet werden und auf entsprechende Objekte des unübersichtlichen Dokumentes
verweisen.
3.2 Result EP0495612 – A data access system
Because this result is very similar to our own idea, it is presented in full detail.
3.2.1 General
Inventor(s): Lamming, Michael G.
Applicant(s): XEROX CORPORATION
Issued/Filed Dates: July 22, 1992 / Jan. 14, 1992
Application Number: EP1992000300285
3.2.2 Abstract
A note-taking system based on a notepad computer with an integrated audio/video-recorder is
described. A document is created or retrieved. As the user types on the keyboard or writes
with the stylus or similar input instrument, each character or stroke that is input by the user is
invisibly time-stamped by the computer. The audio/video stream is also continuously time-stamped during recording. To play a section of recording back, the user selects part of the
note (perhaps by circling it with a stylus) and invokes a "playback selection" command. The
computer then examines the time-stamp and "winds" the record to the corresponding place in
the audio/video recording, where it starts playing - so that the user hears and/or sees what was
being recorded at the instant the selected text or strokes were input. With a graphical user
interface, the user may input key "topic" words and subsequently place check marks by the
appropriate word as the conversation topic veers into that neighborhood.
3.2.3 Description
This invention relates to a data access system, and in particular to one in which the taking of
notes usually accompanies the recording of data. The notes themselves are used to gain
selective access to the data.
3.2.3.1 The problem
At seminars, interviews or meetings, it has long been recognized that the simple act of taking
notes helps the writer memorize key facts. On the other hand, to make a note, the listener has
to divert attention from the speaker, figure out how to encode the information he has heard or
the idea he has had, and then focus on writing it down. It doesn't seem surprising that the
listener often loses the speaker's thread! Nowadays many people resort to making audio or
video recordings of important events and then spending time transcribing key ideas
afterwards. Locating the interesting parts of an audio or video recording is often very time-consuming and tedious, which reduces the likelihood that the listener will bother to transcribe
the tape, or even bother to make the recording in the first place.
In many situations the need to record is not anticipated. Only after the event, sometimes long
afterwards, does the need become evident. To overcome this problem it may eventually
become technologically possible, from a storage point of view, to record and store every
second of a person's working day. The problem of locating key pieces of information from
this huge and expanding base of recordings becomes overwhelming.
This invention builds upon these ideas by providing a semi-automatic, fine-grained,
audio/video indexing tool. Although the invention focuses on gaining access to audio and
video records, it is not limited to this application. It is envisaged that the invention could also
be applied to accessing files in a computer system, phone call records, or any other time-stamped data set.
The present invention will now be described by way of example with reference to the
accompanying drawings, in which:
• Figure 1 is a diagrammatic view of one form of the invention using a laptop computer;
• Figure 2 is a diagrammatic view of another form of the invention using a notepad
computer with stylus input, and
• Figure 3 is a diagrammatic view of one form of the architecture of the invention.
The example of the invention shown in Figure 1 combines a laptop computer 2 with a video
or audio recording device 4.
In the example of the invention shown in Figure 2, the portable computer is replaced by a
note-pad computer 6 with stylus or other graphical input device 8, connected to video or audio
recording device 4. The stylus input device may take the place of the keyboard of the portable
computer as the user's means of input. Alternatively, a keyboard 10 may be connected to the
notepad computer for input.
Either form of computer is interfaced to the recorder in such a manner that the computer can
monitor the recorder's progress during recording. To enable this, the recorder may emit time-code or make other unique index information continuously available to the computer.
Additionally the computer can operate the recorder directly, causing it to record, or play from
a specific index point as required. In any situation it is possible for the computer to find out
the time-code or index information.
In the general system shown in Figure 3, the computer presents a document editor style user
interface to the user. This interface may resemble that of a word-processor, an illustrator, or
an application specific interface, depending on the user's needs and computer hardware. The
user interface provides commands for creating a new document and starting and stopping
recording.
A new document is created, or an old one recalled, and recording commences. The editor 22
allows the user to draw, type or sketch on the blank document using keyboard 10, stylus 8, or
other graphical input device. As each mark (or indicium) 26, is added to the document (an
'indicium' is any indivisible displayable symbol created by the interface, e.g. pen-stroke,
character, graphical symbol, etc.), the editor time-stamps and stores it in an indicium-to-timestamp index 14. The video-frame time-stamper 20 also continuously notes the index
information 32 arriving from the recorder 4 at that instant, and time-stamps it and stores it in a
timestamp-to-timecode index 16. The time-stamps are not visible to the user - they are stored
with the computer's internal representation of the indicia. When the session is over, the user
stores away the document, the contents of the two indexes, and the recording for subsequent
use. Any method that allows time-stamps to be used as an index into the recording may be
used instead of the timestamp-to-timecode index, e.g. interpolation of the timecode itself.
When the user wants to recall sections of the recording, the appropriate related document is
retrieved and recalled to the display 30 by the browser 24. Using the stylus 8, or the keyboard
10, the user selects one or more indicia on the document, perhaps by circling them,
positioning a cursor, or typing an identifying name, and instructs the browser 24 to play the
associated video section. The browser 24 identifies the selected indicium or indicia, looks up
the time-stamp(s) in the indicium-to-timestamp index 14; looks up the timestamp(s) in the
timestamp-to-timecode index 16, and plays the section of the recording in the area indicated
by the resultant timecode. Thus the user sees what was recorded at the time the marks were
made on the video monitor 28.
Indicium-to-timestamp and timestamp-to-timecode indices may be combined into a single
indicium-to-timecode index.
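The lookup chain described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the names `indicium_to_timestamp`, `timestamp_to_timecode`, `timecode_for` and `combine_indexes`, the sample values, and the rule of using the latest index entry at or before a time-stamp are all assumptions made for illustration.

```python
from bisect import bisect_right

# Hypothetical sample data: indicium id -> time-stamp (seconds on the computer clock).
indicium_to_timestamp = {"stroke-1": 12.4, "char-7": 47.9, "mark-3": 95.0}

# Hypothetical sample data: (time-stamp, recorder timecode in frames), sorted by time-stamp.
timestamp_to_timecode = [(0.0, 0), (30.0, 750), (60.0, 1500), (90.0, 2250)]


def timecode_for(timestamp, index=timestamp_to_timecode):
    """Return the timecode of the latest index entry at or before `timestamp`."""
    stamps = [ts for ts, _ in index]
    pos = bisect_right(stamps, timestamp) - 1
    if pos < 0:
        raise ValueError("time-stamp lies before the start of the recording")
    return index[pos][1]


def combine_indexes(ind_to_ts, ts_to_tc):
    """Merge both indexes into a single indicium-to-timecode index."""
    return {ind: timecode_for(ts, ts_to_tc) for ind, ts in ind_to_ts.items()}


combined = combine_indexes(indicium_to_timestamp, timestamp_to_timecode)
print(combined["char-7"])  # timecode at which playback would start
```

Precomputing the combined index trades a little storage for a single dictionary lookup at playback time, which is one reading of why the patent mentions merging the two indexes.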
It is expected that users will develop shorthand notations for marking the document to
indicate an interesting idea, change of topic or speaker, and so forth. It also is expected that
users will type or handwrite key words as the topic arises, and then add additional marks
nearby each time the subject veers in the direction of that topic. Thus, to hear all sections of a
seminar associated with the same topic, it is necessary only to circle all the marks in the
location of the topic keyword.
Accordingly it will be seen that the present invention provides a simple, inexpensive,
lightweight, low-power device for quick access to data records.
3.2.4 Figures
Figure 3-2 : Figure 1 of EP0495612
3.2.5 Patent claims
1. A system for providing random access to a time-stamp identified part of a time-stamped
data set, said system comprising:
• means for capturing, filing, retrieving, displaying, and automatically time-stamping
user-generated indicia for selection by a user, said time-stamped data set and said
time-stamped indicia having a common time base;
• means for determining the time-stamp of each user selected indicia; and
• means for indexing into said data set to identify the part thereof that is time-stamp
correlated with said selected indicia.
2. A system as claimed in claim 1, including means for presenting to the user the identified
portion of the data set.
3. A system as claimed in claim 2, in which the time-stamped data set is audio and/or video
data.
4. A system as claimed in any of claims 1 - 3, which is implemented by means of a portable
computer.
5. A system as claimed in any of claims 1 - 3, which is implemented by means of a portable
video recorder.
6. A system as claimed in claim 5, in which the portable video recorder is a camcorder.
7. A system as claimed in claim 4 in which the portable computer is a notepad computer
with stylus input.
3.2.6 Comparison to our request
Although they never brought a product with this design to market, the basic idea is essentially
the same as ours. Even portability is covered by the patent claims.
The differences between the system described above and our system are the combination of a
stationary system with a portable system and the ability to search across more than one
document.
We also considered combining several documents that belong to one conference.
This patent could cause our request to fail.
3.3 Result US5172281 – Video Transcript Retriever
Since this second result is not as similar to our idea as the first result, it is not listed with all
details. Only the abstract and the patent claims are shown.
3.3.1 General
Inventor(s):
Ardis; Patrick M., Memphis, TN 38119
Markovich; Marko R., New Orleans, LA 70115
Thompson; Kevin W., New Orleans, LA 70115
Applicant(s): None
Issued/Filed Dates: Dec. 15, 1992 / Dec. 17, 1990
Application Number: US1990000628082
3.3.2 Abstract
A video transcript retriever includes a control unit, a control interface, a tape unit and a
display unit. The control unit includes a control computer having a software package
consisting of control software, text software and edit software. The control software has the
capacity to permit simultaneous operation of both the text software, which is capable of
storing and searching voluminous documents, and the edit software, which has the capacity to
operate the tape unit with precision. The text software is capable of performing a search
function that at any time can provide the exact location of a specific passage within the
searched document in terms of page and line. The edit software has the capacity to provide at
any time the timecode number pre-recorded on the videotape that corresponds to a specific
passage. The process for locating and retrieving specific information on a videotape includes
the steps of striping the videotape by assigning a numerical address for every one-thirtieth
(1/30) of a second segment of the videotape; indexing the words written in a computer
transcript to the words spoken on the videotape by assigning a timecode number to both the
computer transcript and the videotape segment where each question/answer passage begins;
and instructing the tape unit to shuttle to a precise tape location determined by the timecode
numerical address located during the search of the computer transcript.
The present invention relates to a process and apparatus for retrieving and displaying
information on videotapes and, more particularly, to a process and apparatus for quickly and
precisely retrieving and displaying specific images and testimony in a videotaped deposition
by indexing a computer-generated transcript of the deposition proceedings with a video
timecode number address on the videotape.
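The addressing scheme in the abstract, one numerical address for every 1/30 of a second, reduces to simple arithmetic. The following Python sketch is only an illustration under stated assumptions: exactly 30 addresses per second (no drop-frame correction), an HH:MM:SS:FF notation, and a small hypothetical transcript. None of the names or sample passages come from the patent.

```python
FRAMES_PER_SECOND = 30  # assumption: exactly 30 addresses per second, no drop-frame


def frame_to_timecode(frame):
    """Convert a frame address into an HH:MM:SS:FF string."""
    seconds, ff = divmod(frame, FRAMES_PER_SECOND)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


def timecode_to_frame(timecode):
    """Convert an HH:MM:SS:FF string back into a frame address."""
    hh, mm, ss, ff = (int(part) for part in timecode.split(":"))
    return ((hh * 60 + mm) * 60 + ss) * FRAMES_PER_SECOND + ff


# Hypothetical transcript: each question/answer passage carries the frame
# address at which it begins on the tape.
transcript = [
    (0, "Q: Please state your name. A: John Doe."),
    (450, "Q: Where were you on the evening in question? A: At home."),
    (1830, "Q: Did you sign the contract? A: Yes."),
]


def find_passage(phrase):
    """Return the timecode of the first passage containing `phrase`, or None."""
    for frame, text in transcript:
        if phrase.lower() in text.lower():
            return frame_to_timecode(frame)
    return None


print(find_passage("contract"))
```

A tape unit would then be told to shuttle to the returned address; that control step is hardware-specific and is left out of the sketch.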
3.3.3 Patent claims
1. A system for indexing written and video records of a deposition comprising:
• a control unit having a control computer including a software package consisting of a
text software having the capacity to store and search documents, an edit software
having the capacity to operate a tape unit with precision, and a control software having
the capacity to control said system and permit the simultaneous operation of said text
software and said edit software;
• a videotape having video and audio information thereon;
• a document consisting of said audio information in a written record format stored in
said text software;
• a control interface connected to said control unit to receive signals from said control
unit and pass signals to a tape unit;
• a tape unit having said videotape thereon and connected to said control interface to
receive signals therefrom;
• a numerical timecode address assigned to said videotape for every 1/30 of a second
segment of said videotape;
• an identical numerical timecode address assigned to each audio information segment
in said document corresponding to the identical audio information segment on said
videotape; and
• a display unit connected to receive signals from said control unit and from said tape
unit for selectively displaying the audio and video information at any numerical
address on said videotape.
2. A process for indexing each question and answer segment on a videotape deposition to
each corresponding question and answer segment on a written record document of said
deposition comprising:
• listening to the audio information on said videotape while simultaneously reading a
written record of said deposition;
• marking the beginning of each question and answer segment of said deposition on said
videotape with a numerical timecode address;
• sending said timecode addresses which identify the beginning of each question and
answer segment on said videotape through a control interface to a control unit; and
• placing at the beginning of each question and answer segment of said written record of
said deposition an identical numerical timecode address corresponding to the
numerical timecode address previously assigned to each question and answer segment
appearing on said videotape by means of a command generated in said control unit
and passing through said control interface.
3. A process for retrieving and displaying video and audio information on a videotape
comprising the steps of:
• providing a written record document of the spoken words appearing on said videotape
in question and answer segments;
• indexing the words written in said record to the identical words spoken on said
videotape by assigning a timecode numerical address for each question and answer
segment on said written record corresponding to the identical question and answer
segment on said videotape;
• searching said written record to locate a specific word passage and the timecode
numerical address assigned thereto;
• instructing a tape unit to shuttle said videotape to the precise tape location as
determined by the timecode numerical address located in said written record; and
• displaying the information on said videotape located at said timecode numerical
address.
3.3.4 Comparison to our request
This invention describes a system that preserves two types of data:
• The video stream recorded on tape
• The full text of the video recording, taken down by a stenographer
These two parts are combined to make the retrieval of information from the video recordings as
effective as possible.
Aspects of this system that are the same as in our project:
• Audio or video recording
• Textual information used for indexing inside the recording
Characteristics of our project that are not included in this system:
• Drawing capabilities and graphical information used for indexing
• Full-text search facilities
• Searching within several documents
• Portability
Given these differences, I expect that the claims of this patent pose no problem for our request.
3.4 Conclusion
The first result, “A data access system”, describes our idea more or less exactly. It is
surprising that this system has not been realized and brought to market by now.
The second system discussed is intended for use in litigation. In the field of litigation, a
deposition is a proceeding in which an attorney asks oral questions of a witness. For this
reason, the system is designed specifically for question-and-answer conversations.
Thanks to its full-text data, this system can claim to be a good documentation system. With
this database it is possible to perform a full-text search across the whole recorded video.
But as mentioned above, the system requires a stenographer to create the textual information.
In addition, the lack of any possibility to create sketches or markers restricts the user to
textual information. This system is very specific to use in litigation.
Our own patent application is currently pending, but there may be a problem because of its
similarity to patent number EP0495612.
REQUIREMENTS ON A NEW SYSTEM
4 Requirements on a new system
Because the existing methods lack an index for the recorded data, a new system that includes
this feature has to be developed.
The first question to be answered concerns the requirements for this application.
A perfect documentation system has to meet the following conditions:
• no information may be lost
This requirement can be taken for granted, because it is the basic prerequisite for retrieving
precise information at a later time.
• fast retrieval of information is required
Systems that retrieve the information less quickly are already available, but time is an
important factor in the new task. As the proverb says: “Time is money.”
• conventional documentation techniques are preferred
This means that the person documenting a conference does not have to change his habits. The
system should adapt to human habits.
The only hardware that fulfills these requirements consists of several sheets of paper
and a pen. Of course, this is not the hardware we have in mind, but it gives us a good
idea of what the hardware should look like.
With these three points in mind, let us look at a visionary scenario that describes a
perfect documentation of a conference.
4.1 Requirements made by the user: Optimal scenario
4.1.1 Recording a document
To cover the worst case, consider a conference that takes place on the client’s premises.
This means that no permanently installed hardware can be used.
The hardware on which the client application is used to make notes and drawings is a pen
device, which satisfies the requirement of conventional techniques. The pad of this device is
as large as a sheet of paper, and the pen can be moved freely, that is, it does not need a cable.
The thinner the pad, the better a sheet of paper can be simulated. The result is a pen-based
computer or a pen-based input device with a touch screen that can be used as a notepad.
Textual and graphical input is supported by the input device. The software supports online
handwriting recognition and aids the user when drawing straight lines, rectangles or other
primitives.
A small and lightweight video camera of the kind used by video conferencing systems
produces the raw data, so both video and audio are recorded. Because the camera is small and
handy, it can easily be pointed at an overhead or video projector display to capture
demonstrations or supplemental material if necessary. To avoid distracting camera handling
during the discussion, the camera is set up so that the whole scene is captured the
entire time.
Before the conference starts, some general information can be entered. This includes data
about the client, the project, time and place of this meeting, and so forth. During the
conference, notes can be made as usual with a pen and paper including leafing through the
pages.
An important thing for the recording phase is that every change, every character and every
single line of a drawing is stored. Even delete and undo actions are recorded. This allows an
exact reproduction of the particular conference.
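The requirement that every change, including delete and undo actions, is stored can be pictured as an append-only event log that is replayed to reconstruct the page at any moment. The following Python sketch is only an illustration; the classes `Event` and `EventLog`, the action names, and the snapshot-based undo are assumptions and do not describe the actual ConfDoc implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    time: float       # seconds since the start of the recording
    action: str       # "add", "delete" or "undo" (hypothetical action names)
    item_id: int = 0
    content: str = ""


@dataclass
class EventLog:
    events: list = field(default_factory=list)

    def record(self, time, action, item_id=0, content=""):
        """Append an event; nothing is ever removed from the log."""
        self.events.append(Event(time, action, item_id, content))

    def state_at(self, time):
        """Replay the log and return the items visible at `time`."""
        visible = {}
        history = []  # snapshots taken before each change, used by undo
        for ev in self.events:  # events are assumed to be in chronological order
            if ev.time > time:
                break
            if ev.action == "undo":
                if history:
                    visible = history.pop()
                continue
            history.append(dict(visible))
            if ev.action == "add":
                visible[ev.item_id] = ev.content
            elif ev.action == "delete":
                visible.pop(ev.item_id, None)
        return visible


log = EventLog()
log.record(1.0, "add", 1, "Budget")
log.record(2.5, "add", 2, "Deadline?")
log.record(4.0, "delete", 1)
log.record(5.0, "undo")
print(log.state_at(4.5))  # page as it looked after the deletion
print(log.state_at(6.0))  # page after the undo restored the deleted note
```

Because the log is never rewritten, the deleted note and the undo both remain available as index points into the recording.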
4.1.2 After the conference
When the conference has finished, the system allows a quick review of the document
including adding and changing information.
Back at the company, the client system is connected to the server and transfers the data to
it. From this time on, the server is responsible for backup, retrieval and access rights.
After the data are successfully copied to the server, the client memory can be freed for a new
documentation.
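The rule that client memory is freed only after a successful copy can be sketched as follows. This is a simplified illustration, assuming the server storage is reachable as an ordinary directory and that comparing SHA-256 checksums is an acceptable definition of "successfully copied"; the function names are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_and_free(client_file, server_dir):
    """Copy a file to the server directory; delete the local copy only if checksums match."""
    client_file = Path(client_file)
    target = Path(server_dir) / client_file.name
    shutil.copy2(client_file, target)
    if sha256_of(client_file) != sha256_of(target):
        raise OSError("copy verification failed; local data kept")
    client_file.unlink()  # free client storage for the next documentation
    return target


# Small demonstration with temporary directories standing in for client and server.
with tempfile.TemporaryDirectory() as client_dir, tempfile.TemporaryDirectory() as server_dir:
    source = Path(client_dir) / "meeting.dat"
    source.write_bytes(b"raw and index data")
    stored = transfer_and_free(source, server_dir)
    copied_ok = stored.read_bytes() == b"raw and index data"
    local_freed = not source.exists()
```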
4.1.3 Playback and usage of the document
To review a documentation, the client program is used to search within the database on the
server for a specific document. The search criterion can be a part of the general data, which
was entered at the beginning of the conference, or a phrase out of the full text.
Once the document is found, it is opened to examine the contents. Changing the document is
also possible if the user has been granted the necessary access rights.
The first way to review the recording is similar to using a video recorder.
Several controls allow the user to start playback, stop, or seek within the document. All
recorded user actions are reproduced, together with the video data.
The second method uses the system’s index capabilities. The starting point of a search
is the whole document. The user can look at the notes and drawings he made
during the meeting or use the system’s full-text search capabilities.
When the user clicks on a word or line of these notes, the video playback is synchronized with
this part of the notes. The user can now watch the video scene that was recorded at the time he
made those notes, listen to the conference word by word, and observe every gesture of the
participants.
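The second review method, finding a note and jumping to the matching point in the recording, might look like the following Python sketch. The `Note` class, the sample notes, and in particular the `lead_in` parameter (starting playback a few seconds before the note was written) are assumptions added for illustration, not features stated in the scenario.

```python
from dataclasses import dataclass


@dataclass
class Note:
    text: str
    written_at: float  # seconds from the start of the recording


# Hypothetical notes taken during a meeting.
notes = [
    Note("Kickoff: goals for release 2", 35.0),
    Note("Budget approved by client", 410.5),
    Note("Open question: delivery date", 1275.0),
]


def search_notes(phrase, notes):
    """Full-text search: return all notes containing `phrase`, ignoring case."""
    needle = phrase.lower()
    return [note for note in notes if needle in note.text.lower()]


def playback_position(note, lead_in=5.0):
    """Return the playback start in seconds, `lead_in` seconds before the note was written."""
    return max(0.0, note.written_at - lead_in)


hits = search_notes("budget", notes)
print(playback_position(hits[0]))  # position handed to the media player
```

The lead-in reflects the guess that a note is usually written slightly after the remark that prompted it; setting `lead_in=0` gives exact synchronization instead.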
To realize a system that fits the above scenario, the following considerations apply:
4.2 Aspects seen by the developer
The primary goal of this new system is to create categorized collections of documents to
generate a complete documentation of meetings, phone calls and conferences during a
development process or service.
This system should be able to generate a documentation semi-automatically, combine all the
different types of data to one coherent document, keep the links between these different types
of information, store all available material permanently and provide easy retrieval with some
search mechanism.
4.2.1 Types of information
A whole documentation consists of raw data and additional information. This additional
information, which will be called index data from now on, will be used to indicate specific
positions inside the raw data.
4.2.1.1 Raw data
The raw data should be stored unmodified and must be retrievable with random access.
Possible types of raw data are:
• Audio
• Video
As the raw data cannot easily be searched for keywords or phrases with current
technology, it is the job of the indexes to provide fast and easy access to the raw data. With
the help of the index data, the system should seek to a specified starting point within the raw
data and play back the data stream.
The only requirement this system must fulfill concerning the raw data is to record the whole
conference and play back the recorded data starting at a specified position within the whole
recordings.
Of course, recording and playback of audio or video data is not very difficult, so the key point
of interest for this system will be the generation and usage of index data.
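Random access into the raw data can be demonstrated with Python's standard `wave` module. The sketch below is an illustration only: it assumes uncompressed mono PCM audio at 8 kHz held in memory, and the helper names `make_silent_wav` and `read_from` are hypothetical.

```python
import io
import wave

SAMPLE_RATE = 8000  # Hz, assumed for this example


def make_silent_wav(seconds):
    """Create an in-memory mono 16-bit WAV file containing silence (test data)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(b"\x00\x00" * (SAMPLE_RATE * seconds))
    buffer.seek(0)
    return buffer


def read_from(wav_file, start_seconds, duration_seconds):
    """Seek to `start_seconds` and return the raw audio for `duration_seconds`."""
    with wave.open(wav_file, "rb") as wav:
        rate = wav.getframerate()
        wav.setpos(int(start_seconds * rate))  # random access by frame number
        return wav.readframes(int(duration_seconds * rate))


recording = make_silent_wav(3)
segment = read_from(recording, start_seconds=1.0, duration_seconds=0.5)
print(len(segment))  # number of bytes read from the requested position
```

For compressed audio or video the seek step depends on the codec and container, which is why the index data only needs to supply a time position and can leave the seeking to the player.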
4.2.1.2 Index data
The index data is used to point to specific positions within the raw data.
Index data can be generated in different ways. Any distinctive feature of a conference can be
used to create index data. The variety of methods may include noise level, change of subject
in a discussion, direction of the loudest noise, gestures when using video data, and so on.
The most important distinction is the generation method. We distinguish between two
different methods:
• automatic
• manual or forced
4.2.1.2.1 Automatic indexing
Automatic indexing comprises all methods in which no special action has to be taken by a
meeting participant to create a new index. The indexes are created by the system itself when
distinguishing features appear during the conference.
• time stamp
A time stamp is the simplest form of an index. The indexes are generated periodically.
The time between two index items depends on the type of the conference. When using this
index type for short conferences, the period should be shorter than for longer conferences.
It is hard to define a period that is convenient for different situations. With current
technology it is easy to realize this type of indexing when recording audio or
video, as every multimedia player is able to seek within the whole document and start
playback at a certain time position from the beginning of the file. The resolution of this
time position depends on the sample rate of the recording. Since a minimum sample rate
of 8 kHz is recommended and certainly achievable with common audio recording hardware,
the resolution is sufficient for every need.
• change of token
A new index is created when one speaker has finished and the next speaker begins. This
may be difficult to realize with current technology because it requires a real-time
analysis of the raw data. As the example of speech recognition shows, this method
requires powerful hardware. Another problem is that the recognition of fluently spoken
words yields very poor quality.
An easier way to realize this method could be a kind of direction-sensitive microphone,
which can determine the direction from which the loudest or softest noise level comes.
The 360 degrees around the microphone are divided into several sections. Every time the
direction with the loudest noise changes from one section to another, a new index entry is
generated.
• striking moments
The recorded data is examined for noticeable situations. For example, a longer pause in an
audio or video stream may hint at a coffee break, and a video stream with several identical
frames in succession may indicate an empty room.
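Two of the automatic methods above, periodic time stamps and the direction-sensitive microphone, can be sketched in a few lines of Python. The function names, the choice of eight sections, and the sample readings are assumptions for illustration; at 8 kHz one sample corresponds to 1/8000 s = 0.125 ms, which is the time resolution referred to in the text.

```python
SAMPLE_RATE_HZ = 8000  # minimum sample rate recommended in the text


def periodic_index(duration_s, period_s):
    """Time-stamp indexing: one index entry every `period_s` seconds."""
    return [step * period_s for step in range(int(duration_s // period_s) + 1)]


def sample_offset(time_s, sample_rate=SAMPLE_RATE_HZ):
    """Sample number at which playback starts for a given time position."""
    return round(time_s * sample_rate)


def sector_of(angle_deg, sectors=8):
    """Map a direction in degrees to one of `sectors` equal sections around the microphone."""
    return int((angle_deg % 360) // (360 / sectors))


def speaker_change_index(readings, sectors=8):
    """Emit an index entry whenever the loudest direction moves to another section.

    `readings` is a list of (time_s, angle_deg) pairs for the loudest direction.
    The first reading always produces an entry, marking the first speaker.
    """
    entries = []
    previous = None
    for time_s, angle in readings:
        sector = sector_of(angle, sectors)
        if sector != previous:
            entries.append((time_s, sector))
            previous = sector
    return entries


# Hypothetical direction readings from the microphone.
readings = [(0.0, 10), (2.0, 30), (5.0, 100), (9.0, 95), (12.0, 350)]
print(periodic_index(600, 120))
print(speaker_change_index(readings))
```

Small movements of a speaker within one section do not create entries, but a speaker sitting near a section boundary could trigger spurious ones; a real system would probably need some smoothing, which this sketch leaves out.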
4.2.1.2.2 Forced indexing
Forced indexing describes all methods in which the user has to take an action, such as
pressing a button or writing down some words, to initiate a new index entry.
• notes and drawings
The common method for indexing is to make notes. The advantage of this approach is that the
user needs only a very short familiarization period, because he is already used to this
method. The method may change from handwriting to typing, but the basic idea is the same.
Drawings require a more complex type of hardware or a more complex method of creating
them. In the first case, special hardware simulates a sheet of paper and the user can make
notes the way he is used to. The other method uses a common hardware device, which is
easier to handle for the system but more complicated for the user. Two examples of these
cases are a pen-driven drawing device (which is familiar to the user) and an ordinary
computer mouse. It is more difficult to draw a simple rectangle using the mouse than
using a pen.
When using the keyboard for entering text, the index-generation process is simple.
Whenever the user presses a key, a new index is created.
Selecting the best input device for this indexing method is the user’s job. A keyboard may
not be suitable because of the slower typing speed or the annoying noise of
the keyboard.
• markers
By pressing a button, the user can set a marker to mark a specific event. Usually different
markers are used to mark different events. These events may include a change of
subject, a question, a simple comment or an action to be taken. When reviewing the document,
the user can search for specific events.
Every marker can contain some additional notes to make the retrieval process easier. A
full-text search within these marker descriptions can help to find a specific marker faster.
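A minimal Python sketch of forced indexing is given below. It assumes that every key press and every marker is stored with its time position; the class names `Marker` and `ForcedIndex`, the marker kinds, and the sample entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Marker:
    time_s: float
    kind: str              # e.g. "question", "change of subject", "action"
    description: str = ""  # optional notes that ease later retrieval


class ForcedIndex:
    """Collects index entries that the user triggers explicitly."""

    def __init__(self):
        self.markers = []
        self.keystrokes = []  # (time_s, character) pairs, one index per key press

    def key_pressed(self, time_s, character):
        self.keystrokes.append((time_s, character))

    def set_marker(self, time_s, kind, description=""):
        self.markers.append(Marker(time_s, kind, description))

    def find_markers(self, kind=None, phrase=None):
        """Filter markers by kind and/or by a phrase in the description (case-insensitive)."""
        result = self.markers
        if kind is not None:
            result = [m for m in result if m.kind == kind]
        if phrase is not None:
            result = [m for m in result if phrase.lower() in m.description.lower()]
        return result


index = ForcedIndex()
for offset, char in enumerate("OK"):
    index.key_pressed(12.0 + offset * 0.3, char)
index.set_marker(95.0, "question", "Who pays for the hardware?")
index.set_marker(240.0, "action", "Send revised offer")
index.set_marker(300.0, "question", "Delivery before March?")

print([m.time_s for m in index.find_markers(kind="question")])
print([m.time_s for m in index.find_markers(phrase="offer")])
```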
4.2.2 Hardware / Software
Since many individual documents will probably belong to one large project, and many different
people will use the system, possibly at various locations, a client-server architecture can
be very helpful.
4.2.2.1 Server
The server should store the data files on its own system and is responsible for the security
services and backup processes. The use of a network connection for remote locations can be
taken for granted. The minimum requirement is a simple file server. With regard to
access control, private and public documents, backups, and so on, a Hyperwave server or
something similar is recommended.
4.2.2.2 Client
The client application must be able to record all data which are part of one documentation.
The recording shall include the raw data, all index information and the connections between
the raw data and the index information.
The client application also has to provide a mechanism that allows the user to review a
documentation, search within the index data, and listen to or view the raw data according to
the index information.
The system should enable the user to edit and change the document if he has been granted
access rights. The edit procedure is the same as the index editing functions used when
recording a documentation. This means creating new indexes, adding comments to an index and
deleting indexes.
Many different types of input devices are able to create index data. The keyboard will lead to
textual information, the mouse or a pen device is more useful for generating graphics and
drawings. Simple push buttons are easy to use with any of the above devices.
These different types of information should not be used separately; they should rather
complement one another. A simple button may lead to a marker, which can be linked
with textual or graphical information.
4.3 Client Hardware discussion
As seen above, choosing the hardware and software for the server side of the
application should be no problem. Any common server hardware should fulfill the
minimum system requirements.
Every common personal computer with a built-in sound card will meet the system
requirements for a stationary client. Since such hardware is readily available, I suggest it.
The following section therefore discusses the different types of client hardware for portable
systems only.
The goal of this part is not to list as many different devices as possible, but rather to show the
advantages and disadvantages of the different types of devices.
[PENC] is a good starting point for hardware reviews and comparison tables between
different devices. This will help in choosing the best device.
4.3.1 Laptops, Notebooks
When speaking of a notebook, we think of a standard PC with an average size of 320 *
250 mm. The display should be able to show a minimum of 1024 * 768 pixels, and the
operating system and processor are of a similar type as in a desktop PC.
Commonly available notebooks are often equipped with more powerful hardware than
many desktop systems. A built-in sound card can also be taken for granted. Assuming the
notebook runs the same operating system as the stationary system, this leads to the
following advantages:
4.3.1.1 Advantages
A notebook (and all following devices with this operating system) is a good solution for
the developer. There is no need to port the system from the stationary system to the portable
system. A simple installation of the program makes it available on the road. Another
advantage is that the user is already familiar with the operating system and does not have to
learn how to use it.
Since many companies already use notebooks, there is often no need to buy new
hardware.
4.3.1.2 Disadvantages
The main disadvantage of notebooks is not a technical one but a matter of business
politeness. It is not seen as very respectful to “hide” behind an opened notebook display while
negotiating.
A technical aspect is the dependence on batteries. Using this device on the road for a long time may
cause problems with the power supply, for example at long-lasting
conferences.
4.3.2 Sub-Notebook
A sub-notebook is slightly smaller and less powerful than a standard notebook. Its size
is about 220 * 180 mm and the display shows a maximum of 800 * 600
pixels. Many of these devices are equipped with a touch screen, which opens up new
possibilities for entering index data. [PANA] is a good example of such a device.
Devices are available with Windows 95/98/NT or Windows CE as the operating system.
Figure 4-1 : Examples for subnotebooks: [PANA] and [LIBR]
4.3.2.1 Advantages
This device is a good alternative to a standard notebook. The Windows 95/98/NT operating
system allows ConfDoc and other common applications to be installed quickly. Many
devices of this size have no built-in CD-ROM or diskette drive, but a serial or
USB connection to another PC can compensate for this.
4.3.2.2 Disadvantages
Due to the missing CD-ROM drive and the small size of the keyboard, these devices look like
underpowered notebooks. Given the lower performance at almost the same price as a
standard notebook, I think these devices will be used for special purposes
only. Why buy a sub-notebook when a standard notebook is available for the same price?
A sub-notebook also runs on batteries, so the power supply problems described above for
notebooks apply here as well.
4.3.3 Pen Computer
A pen computer looks like a notebook with the keyboard removed.
The operating system is often Windows 95/98/NT, the display is equipped with a touch
screen and input is made with a pen. A handwriting recognition system
substitutes for the keyboard. Another method is a software keyboard: a keyboard
layout is displayed on the screen and is operated by tapping the desired key with
the pen.
Figure 4-2 : Example of a pen device: [FUJI]
4.3.3.1 Advantages
Because the operating system is Windows 95/98/NT, there is no need to port the application.
These devices are designed to be used while walking around, so they are robust and lightweight.
They combine a powerful PC with a handy device.
4.3.3.2 Disadvantages
The missing keyboard makes it difficult to enter textual notes and to use standard
applications such as word processing. Although this device type runs a standard operating
system, it will therefore rarely be used as a standard PC.
The built-in handwriting recognition system is too slow to be used comfortably during a
meeting.
4.3.4 Handheld PC
The features of a handheld computer include a nearly full-featured keyboard, a half-height
VGA touch screen and Windows CE as the operating system, with some built-in standard
applications such as Pocket Word or Pocket Excel. Handheld PC Pro devices are available with a
full-size VGA screen.
Figure 4-3 : Example of a handheld PC : [JORD]
4.3.4.1 Advantages
The device contains a keyboard and a touch screen for user input. Its small size
makes it easy to take along on the road. A handheld computer is also a
lot cheaper than a standard notebook or a sub-notebook.
4.3.4.2 Disadvantages
The Windows CE operating system requires special versions of applications. Standard
Windows programs cannot be run on this device; they must be developed specifically for this
operating system.
Although a keyboard is available, its size of about 17 * 9 cm makes it difficult to enter a large
amount of text.
4.3.5 Palmtop PC
This device type represents the smallest piece of equipment. With an average size of 14 * 8.5
cm it fits in every pocket. The intention to build only a personal information manager
(PIM) without a keyboard resulted in a device equipped with just a touch screen. These Palmtop
PCs run the Windows CE operating system. Based on Windows CE and the available
software development kits (SDKs), many freeware and shareware programs have been written
around the world to make these devices more functional.
Figure 4-4 : Example of a palmsize PC: [AERO]
4.3.5.1 Advantages
The small size makes for a discreet device which can easily be carried around. It should
fit in every pocket, so no additional carrying case is necessary. Windows CE devices are
ready for use almost immediately; there is no time-consuming boot sequence as with
standard PCs.
4.3.5.2 Disadvantages
The biggest advantage can also be a great disadvantage. The small display provides only a
small user interface, and entering a large amount of text with the tip of a pen is tedious.
Palmtops are battery driven, so the power supply problems described above also apply to this device.
4.3.6 Crosspad
This device type is a little bit different. It is a pad which captures handwritten notes in
graphical form. The pad is handy enough to be carried around. After the Crosspad is connected to
a standard PC, the captured pages are transferred to the computer. The available software
makes it easy to store, search, cut and paste, and send digital notes.
Figure 4-5 : The Crosspad: see [CROSS]
4.3.6.1 Advantages
Remember the third condition for a perfect documentation system mentioned at the start of
this chapter: “conventional documentation techniques are preferred”. This is the only device
which fulfills this requirement. The user can write with a traditional pen on a sheet of paper.
The biggest advantage may be the power supply. The battery life of about 3-4 months is
unbeatable.
4.3.6.2 Disadvantages
A disadvantage is that special software is needed to import the data into
ConfDoc. Since no audio recording is provided, an additional device is needed for that
purpose. The Crosspad can therefore only be used as a supplemental device, not as a standalone tool.
4.3.7 Summary
Several different devices are available which could be used for ConfDoc. Their
performance and price vary over a wide range, which makes it almost impossible to recommend
a specific type of device in general. Each user must choose a device type
according to individual wishes and requirements. Before
purchasing a specific device, it is a good idea to look at the present habits and procedures for
documenting meetings and agreements. The goal is to find out where, when and how
ConfDoc would be used. On the basis of these findings, and taking
technical demands and financial aspects into account, a decision can be reached.
4.4 Extended terms of reference
Using audio or video data may lead to a very large amount of data. The choice of
compression methods to reduce this amount is a problem that should not be underrated. This topic is
discussed in detail in Chapter 5.3.1.1, “Audio compression”.
5 Developing a new system
As discussed above, we need a system which is capable of recording and compressing audio or
video data, creating so-called index data and linking these types of information
together.
For creating the index data, we want to use the keyboard, a pen device and a mouse.
So we have textual information entered via the keyboard, drawings or handwritten notes entered
using the pen and, additionally, markers for simply indicating significant positions within the
recording.
For use on the road, a portable device will be selected.
5.1 Platform, Operating system
Choosing the best platform and operating system for this application depends on several
different aspects.
•
Audio or video recording capabilities for the raw data.
This should be simple, because nearly all common hardware and operating systems
support the recording and compressing of audio or video data.
•
Making the system portable demands handy hardware.
Common notebooks or subnotebooks may be too large for convenient use. Smaller
hardware would be better.
•
Identical or similar programming techniques for desktop and portable system.
This is an important point, especially for a developer. Identical programming
techniques require only one development effort for all hardware platforms. In the best case, the
application is developed on one hardware platform and then ported to the other. This
shortens implementation time.
•
Availability for the developer and user
Available hardware avoids additional costs for the developer and user.
Looking at the above aspects, a personal computer in combination with a personal digital
assistant (PDA) may be the best solution. Using Windows 95/98 or Windows NT on the
personal computer and Windows CE on the PDA provides similar programming techniques.
The widely available Windows CE operating system offers a choice between many different
devices and models, ranging from handheld PCs and handheld PC Pro devices to
palm-size PCs. Selecting the best device is up to the user and depends on his preferred input
method and the main usage of the system.
For stationary documentation of phone calls, a personal computer
also fits the requirements.
If a mobile system with a keyboard is required, a notebook or handheld computer is the
best choice.
Some users prefer the smallest available device and do their documentation without a
keyboard. A palmtop device suits their needs.
5.2 Additional Hardware
To record telephone calls, it is necessary to capture the audio from the telephone. For this
purpose, special hardware is required.
The main purpose of this circuit is to connect the line-in jack of the sound card to the
a/b wires of the analog phone line with galvanic isolation (potential-free).
The audio transformer (TFÜ) is current-limited by the R-C circuit, and the two diodes clip the
output voltage to a maximum of 0.7 V. The attached variable resistor is used as a volume
control.
This simple but effective circuit is sufficient for the first tests.
Figure 5-1 : audio capture circuit for an analog phone
5.3 Functionality
The first implementation of this project has fewer requirements than the whole project. When
Windows NT or Windows 95 is used as the operating system, the server application is covered by a
simple file server, so it is not necessary to implement the server side of this application. The
client saves its information locally in files that can be stored on a server later if
necessary.
5.3.1 Raw data
For this version an audio stream represents the raw data. Audio is recorded with a sampling
rate of at least 8 kHz (8000 samples per second).
The audio stream is recorded with double buffering. This is a method in which
two memory buffers receive the audio data from the audio device in turn. While one
buffer is being filled with newly recorded data, the other buffer can be compressed, written to file
and cleared for its next use. The computer which runs this application must be fast enough
to process the second buffer completely while the first buffer is being filled with new data.
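The buffer swapping described above can be sketched as follows. This is a minimal, sequential simulation and not the ConfDoc implementation: the function name `record_double_buffered` is hypothetical, `zlib` merely stands in for a codec driver, and a real recorder would fill one buffer from the audio device while the other is processed concurrently.

```python
import zlib  # stand-in compressor (assumption); ConfDoc uses ACM codec drivers


def record_double_buffered(source_blocks, sink):
    """Alternate between two buffers: one is filled while the other is processed."""
    buffers = [bytearray(), bytearray()]
    filling = 0      # index of the buffer the audio device currently fills
    pending = None   # index of a full buffer that still has to be processed
    for block in source_blocks:
        buffers[filling][:] = block  # the device delivers new data into this buffer
        if pending is not None:
            # Compress and store the other buffer, then clear it for reuse.
            sink.append(zlib.compress(bytes(buffers[pending])))
            buffers[pending].clear()
        pending = filling
        filling = 1 - filling        # swap the roles of the two buffers
    if pending is not None:          # process the last filled buffer
        sink.append(zlib.compress(bytes(buffers[pending])))
        buffers[pending].clear()
    return sink


blocks = [b"a" * 100, b"b" * 100, b"c" * 100]
stored = record_double_buffered(blocks, [])
```

The order of the stored blocks matches the order of recording, and at no point does the loop need more than two buffers.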
5.3.1.1 Audio compression
Since plain PCM streams produce a very large amount of data, a compression algorithm is
used. Compression makes it possible to trade audio quality against file size. The API of the
Windows 95 / 98 / NT / CE operating systems provides an audio compression manager (ACM), which allows
so-called codec⁴ drivers to be used for compression and decompression.
When recording, the raw data is taken from an audio source, compressed by the codec driver
and then stored in a file. Playback is as simple as recording: the compressed
file is read, decompressed by the codec driver and then sent to the audio device for playback.
This implementation allows a specific codec driver to be chosen from the drivers available on the
system. The size of the resulting audio file depends not only on the recording time but
also, to a large degree, on the chosen codec. Be sure to select a codec driver which gives
good audio quality at an acceptable file size.
A good introduction to speech coding and commonly used coders can be found at [MMRUS].
I will not go into the details of audio compression codecs and their three classes (waveform codecs,
source codecs and hybrid codecs). As a help for choosing a suitable compression driver, some
common codecs are listed below.
5.3.1.1.1 PCM
Standard PCM streams are not compressed. Depending on the sample rate, the bit resolution
and the number of channels, the memory consumption per second (kB/s) may vary over a
wide range.
The data rate in kB/s is calculated with the following formula:
kB/s = sample rate ⋅ number of channels ⋅ (bits / 8) / 1000
The following table compares some possible combinations:
No.  Comment, usage       Sample rate [1/s]  Channels  Bits  Approx. size [kB/s]
1    CD, highest quality  48000              2         16    192
2    CD quality           44100              2         16    176
3    Radio quality        22050              1         8     22
4    Telephone quality    11025              1         8     11
5    Minimum quality      8000               1         8     8
⁴ codec: compressor and decompressor driver
Because of the very large resulting file size, this format is not recommended. There are better
codecs which produce nearly the same audio quality while needing less disk space.
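As a worked example of the PCM data rate, the following sketch computes the rate for the first and last table rows and the size of a one-hour recording at minimum quality. The function name is hypothetical, and 1 kB is taken as 1000 bytes, which is what the table values imply.

```python
def pcm_kb_per_second(sample_rate, channels, bits):
    """Uncompressed PCM data rate in kB/s (1 kB = 1000 bytes)."""
    return sample_rate * channels * bits / 8 / 1000


highest = pcm_kb_per_second(48000, 2, 16)   # 192.0 kB/s
minimum = pcm_kb_per_second(8000, 1, 8)     # 8.0 kB/s

# A one-hour meeting at minimum quality, in MB (1 MB = 1000 kB):
one_hour_mb = minimum * 3600 / 1000         # 28.8 MB
```

Even at the lowest quality an hour of uncompressed audio takes almost 29 MB, which illustrates why a compressing codec is preferred.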
5.3.1.1.2 Mobile Voice
The Mobile Voice codec compresses and decompresses audio data using an HP/CU
proprietary algorithm. This codec was developed especially for handheld and palmtop devices,
which have very little memory and disk space. The resulting data rate is below 1 kB/s. The
quality of the recorded audio is not satisfactory, so this codec is not recommended.
5.3.1.1.3 Truespeech from DSP group
TrueSpeech is a family of speech compression and decompression algorithms together with the
corresponding software. It has been designed for personal computers and personal communication devices.
With high compression ratios ranging from 15:1 to 27:1, TrueSpeech improves the storage
and transmission of digital voice information, and it can also be used in the
integration of personal computers and telephones.
This could be a suitable driver, but it is not the driver of my choice.
5.3.1.1.4 MPEG Audio
MPEG (Moving Picture Experts Group) is a group of people that meet under ISO (the
International Organization for Standardization) to generate standards for digital video and audio
compression. In particular, they define a compressed bit stream, which implicitly defines a
decompressor. However, the compression algorithms are up to the individual manufacturers,
and that is where proprietary advantage is obtained within the scope of a publicly available
international standard.
Real-time encoding with these algorithms requires very fast machines, so this codec is not
recommended.
5.3.1.1.5 ADPCM Codecs
Adaptive Differential Pulse Code Modulation (ADPCM) codecs are waveform codecs which
quantize the difference between the speech signal and a prediction of the
speech signal, instead of quantizing the speech signal directly as PCM codecs do. If the
prediction is accurate, the difference between the real and predicted speech samples
has a lower variance than the real speech samples and can be accurately quantized with
fewer bits than would be needed to quantize the original speech samples.
The lowest data rate achieved with this codec was about 4 kB/s. The next
codec does a better job.
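The variance argument can be illustrated with a small sketch. It shows only the differential principle with the simplest possible predictor (the previous sample); real ADPCM codecs additionally adapt the quantizer step size and quantize the residual, which is omitted here. The 200 Hz test tone and all names are assumptions made for the illustration.

```python
import math
from statistics import pvariance


def prediction_residuals(samples):
    """Difference between each sample and a previous-sample prediction."""
    previous = 0
    residuals = []
    for sample in samples:
        residuals.append(sample - previous)
        previous = sample
    return residuals


def reconstruct(residuals):
    """Invert prediction_residuals by accumulating the differences."""
    previous = 0
    samples = []
    for residual in residuals:
        previous += residual
        samples.append(previous)
    return samples


# Test tone: 200 Hz sine sampled at 8 kHz with amplitude 1000 (assumed values).
signal = [round(1000 * math.sin(2 * math.pi * 200 * n / 8000)) for n in range(400)]
residuals = prediction_residuals(signal)
```

For this slowly varying signal the residuals span a much smaller range than the samples themselves, so they could be stored with fewer bits, and `reconstruct` recovers the original samples exactly because nothing has been quantized.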
5.3.1.1.6 GSM 6.10
GSM 06.10 is a standardized lossy speech compression used by most European wireless
telephones. It uses RPE/LTP (regular pulse excitation / long term prediction) coding to
compress frames of 160 13-bit samples (8 kHz sampling rate, i.e. a frame rate of 50 Hz) into
260 bits.
For more details see [JS96] and [GSM610].
The quality of the algorithm is good enough for reliable speaker recognition; even music often
survives transcoding in recognizable form (given the bandwidth limitations of an 8 kHz sampling
rate).
This driver produces approximately 1.6 kilobytes of data per second, which is about five times
less than an uncompressed PCM stream at 8 kHz and 8 bits. Since this was the only codec driver
that worked correctly on a Windows NT machine, I recommend this codec as a
good compromise between audio quality and file size.
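The data rate of about 1.6 kB/s follows directly from the frame parameters given above. The short calculation below is only a check of that arithmetic; the constant names are chosen for the illustration.

```python
SAMPLE_RATE_HZ = 8000      # sampling rate used by GSM 06.10
SAMPLES_PER_FRAME = 160    # samples compressed into one frame
BITS_PER_FRAME = 260       # size of one compressed frame

frames_per_second = SAMPLE_RATE_HZ / SAMPLES_PER_FRAME   # 50 frames per second
bits_per_second = frames_per_second * BITS_PER_FRAME     # 13000 bit/s
bytes_per_second = bits_per_second / 8                   # 1625 bytes/s, about 1.6 kB/s

# Compared with 8-bit PCM at 8 kHz (8000 bytes/s):
reduction_factor = 8000 / bytes_per_second               # roughly 4.9
```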
5.3.1.2 Audio file format
To be compatible with common audio applications a standard file format is implemented.
The preferred format for multimedia files is the resource interchange file format (RIFF). The
RIFF file I/O functions work with the basic buffered and unbuffered file I/O services.
[MSDN01] describes this format in detail.
RIFF files use four-character codes to identify file elements. These codes are 32-bit quantities
representing a sequence of one to four ASCII alphanumeric characters, padded on the right
with space characters.
The basic building block of a RIFF file is a chunk. A chunk is a logical unit of multimedia
data, such as a single frame in a video clip. Each chunk contains the following fields:
•
A four-character code specifying the chunk identifier
•
A doubleword value specifying the size of the data member in the chunk
•
A data field
The following illustration shows a "RIFF" chunk that contains two subchunks.
Figure 5-2: structure of a RIFF chunk
A chunk contained in another chunk is a subchunk. The only chunks allowed to contain
subchunks are those with a chunk identifier of "RIFF" or "LIST". A chunk that contains
other chunks is called a parent chunk. The first chunk in a RIFF file must be a "RIFF"
chunk. All other chunks in the file are subchunks of the "RIFF" chunk.
"RIFF" chunks include an additional field in the first four bytes of the data field. This
additional field provides the form type of the field. The form type is a four-character code
identifying the format of the data stored in the file. For example, Microsoft waveform-audio
files have a form type of "WAVE".
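The chunk layout described above (four-character code, doubleword size, data field, and the form type at the start of the "RIFF" data field) can be read with a few lines of code. The following is a minimal sketch, not the multimedia file I/O API referenced in [MSDN01]; the function names are hypothetical. It relies on two properties of the RIFF format: sizes are stored little-endian, and a chunk with an odd data size is followed by one pad byte.

```python
import io
import struct


def read_chunks(stream, end):
    """Return (identifier, size, data offset) for each chunk up to position `end`."""
    chunks = []
    while stream.tell() + 8 <= end:
        ident, size = struct.unpack("<4sI", stream.read(8))
        offset = stream.tell()
        chunks.append((ident.decode("ascii"), size, offset))
        stream.seek(offset + size + (size & 1))  # skip data and the pad byte, if any
    return chunks


def parse_riff(data):
    """Return the form type and the top-level subchunks of a RIFF file."""
    stream = io.BytesIO(data)
    ident, size, form = struct.unpack("<4sI4s", stream.read(12))
    if ident != b"RIFF":
        raise ValueError("first chunk is not a RIFF chunk")
    return form.decode("ascii"), read_chunks(stream, 8 + size)


def make_chunk(ident, payload):
    """Build a chunk for the example: identifier, size, data, optional pad byte."""
    pad = b"\x00" if len(payload) % 2 else b""
    return ident + struct.pack("<I", len(payload)) + payload + pad


body = b"WAVE" + make_chunk(b"fmt ", b"\x00" * 16) + make_chunk(b"data", b"\x01\x02\x03")
form_type, subchunks = parse_riff(make_chunk(b"RIFF", body))
```

For the small example file, `form_type` is "WAVE" and `subchunks` lists the "fmt " and "data" chunks with their sizes and data offsets.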
5.3.2 Index data
Three different types of index data are implemented.
All indices work the same way using so-called events: an event occurs whenever an index is
created, and the current running time of the documentation is saved with this event. With this
running time, the audio playback can be positioned exactly at this event.
In addition to the recording, the finished documentation can be extended at a later time.
This is helpful when reviewing the documentation after a meeting in order to correct typing
errors, make additions or elaborate on some topics in more detail.
For this purpose the same kinds of events are created as during the recording. The
only difference is that the stored running time is the end of the documentation. This
trick makes it possible to distinguish between actions which were stored during the recording
and actions which were stored during the extension.
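The event mechanism and the end-of-documentation trick can be sketched as a small data structure. The record layout, the field names and the event kinds below are assumptions for illustration and do not describe the actual ConfDoc file format.

```python
from dataclasses import dataclass


@dataclass
class IndexEvent:
    time_ms: int   # running time of the documentation stored with the event
    kind: str      # hypothetical labels such as "key", "stroke" or "marker"
    payload: str   # event-specific data


def is_extension(event, document_length_ms):
    """Events added after the meeting carry the end-of-documentation time."""
    return event.time_ms >= document_length_ms


def split_events(events, document_length_ms):
    """Separate events stored during the recording from later extensions."""
    recorded = [e for e in events if not is_extension(e, document_length_ms)]
    extended = [e for e in events if is_extension(e, document_length_ms)]
    return recorded, extended


length_ms = 60_000
events = [
    IndexEvent(1_500, "marker", "Question"),
    IndexEvent(42_000, "key", "A"),
    IndexEvent(length_ms, "key", "x"),   # typed while extending the documentation
]
recorded, extended = split_events(events, length_ms)
```

One consequence of this convention is that an event created in the very last instant of a recording cannot be told apart from an extension event.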
5.3.2.1 Keyboard entry
The first type of index data is a camcorder-like function which records all keys pressed in an
edit window.
Pressing a key during recording causes the application to enter a subclass
function, where the current key code and the state of the modifier keys (Shift, Ctrl and Alt)
are stored in a file. All keyboard actions belonging to the edit window are stored in this
manner. This means that pressing the backspace key not only deletes one character in the edit
window, but is also stored with its key code for later playback.
The contents of the edit window itself are never stored. They are only the result of playing back the
recorded keyboard actions. When seeking to a specific position within the documentation
during playback, the edit window is cleared and all recorded keyboard actions are repeated up to
the current position. This leads to a content of the edit window which is the same as it was at
recording time.
Note that the keyboard file grows with every
keystroke, whereas the content of the edit window does not necessarily grow.
For example, pressing the letter ‘A’ and then the ‘BACKSPACE’ key results in an empty edit
window, but leads to two stored keyboard actions in the file.
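The replay-on-seek behavior can be sketched as follows. This simplified model handles only typed characters and the backspace key; cursor movement, modifier keys and cut and paste, which the real edit window also records, are left out. The function name and the event representation are assumptions.

```python
BACKSPACE = "\b"


def replay_keys(events, until_ms):
    """Rebuild the edit window content by replaying keys up to `until_ms`.

    `events` is a list of (time_ms, key) tuples sorted by time.
    """
    text = []
    for time_ms, key in events:
        if time_ms > until_ms:
            break                 # later keystrokes are not yet visible
        if key == BACKSPACE:
            if text:
                text.pop()        # delete the last character
        else:
            text.append(key)
    return "".join(text)


key_events = [(100, "A"), (200, BACKSPACE), (300, "H"), (400, "i")]
```

With these events, `replay_keys(key_events, 150)` yields "A", a seek to 250 ms yields an empty window although two keyboard actions are stored, and a seek to the end yields "Hi".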
5.3.2.2 Drawings
For this type of event a new window is created which the user can use as a sketch pad.
Drawing in the window can be done with different input devices. The only requirement is
that the device is mapped as a mouse. Some manufacturers provide good hardware to
simulate a pencil and a sheet of paper. A good example is [WAC]. The mapping as a mouse
makes the input device act like a mouse, so the application cannot and need not distinguish
between different input devices. From now on no distinction is made between the various
input devices, and all further explanations assume a mouse as the input device. So
clicking a mouse button when using a standard mouse is the same as tapping the tip of the pen
onto the tablet when using a pen device.
All lines the user draws are stored using events. For a better understanding of the storage
mechanism, let us look at the different phases of drawing a line:
•
clicking the mouse button down
•
moving the mouse around (with the button still pressed)
•
releasing the mouse button
This represents the drawing of a line, which does not have to be a straight one. It can be constructed
by joining several short straight lines into one connected line. See the figure below:
Figure 5-3: drawing a line
Every mouse movement generates a WM_MOUSEMOVE event. These events are only
processed when the mouse button is down. When the mouse button is up, the events are
ignored. Looking at the above phases, two different drawing events are generated and stored
in the file:
•
A start point of a line
•
A point inside a line (including the end point of a line)
A start point of a line is detected when the mouse button was up before the mouse movement
and is now down. Then a single point is drawn and the position is saved for the next event as
the starting point of a single line. Every following WM_MOUSEMOVE event with the mouse
button still down is stored as an inside point of a line. The line ends when the mouse button is
released.
Because recordings made during the meeting can be distinguished from recordings made while
extending the documentation, the extended strokes can be displayed in a different color.
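The detection of start points and inside points can be sketched as a small state machine. The class below is an illustration only: it is fed simplified mouse samples instead of real WM_MOUSEMOVE messages, and its names and the event tuple layout are assumptions.

```python
START, INSIDE = "start", "inside"


class StrokeRecorder:
    """Turn mouse samples into the two drawing event types described above."""

    def __init__(self):
        self.button_was_down = False
        self.events = []   # (time_ms, kind, x, y)

    def on_mouse(self, time_ms, x, y, button_down):
        if button_down:
            # Button newly pressed: start point; still pressed: inside point.
            kind = INSIDE if self.button_was_down else START
            self.events.append((time_ms, kind, x, y))
        # Movements with the button up are ignored.
        self.button_was_down = button_down


def to_lines(events):
    """Group stored events into connected lines (lists of points)."""
    lines = []
    for _, kind, x, y in events:
        if kind == START or not lines:
            lines.append([(x, y)])
        else:
            lines[-1].append((x, y))
    return lines


recorder = StrokeRecorder()
for sample in [(0, 5, 5, False), (10, 10, 10, True), (20, 12, 11, True),
               (30, 15, 15, False), (40, 30, 30, True), (50, 31, 32, True)]:
    recorder.on_mouse(*sample)
lines = to_lines(recorder.events)
```

The six samples produce two lines of two points each; the movements made while the button is up leave no trace in the event list.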
5.3.2.3 Markers
A marker can be used to mark a specific passage of the conference. Six different types of
markers are supported. These types are:
•
Remark
•
Question
•
Agree
•
Disagree
•
Action to be taken
•
Change of subject
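Searching for markers by type and by their additional notes can be sketched as follows. The enumeration mirrors the six marker types listed above; the tuple layout of a marker, the function name and the sample notes are assumptions for illustration.

```python
from enum import Enum


class MarkerType(Enum):
    REMARK = "Remark"
    QUESTION = "Question"
    AGREE = "Agree"
    DISAGREE = "Disagree"
    ACTION = "Action to be taken"
    SUBJECT_CHANGE = "Change of subject"


def find_markers(markers, marker_type=None, text=None):
    """Return the times of markers matching a type and/or a text in the notes.

    `markers` is a list of (time_ms, MarkerType, note) tuples.
    """
    hits = []
    for time_ms, mtype, note in markers:
        if marker_type is not None and mtype is not marker_type:
            continue
        if text is not None and text.lower() not in note.lower():
            continue
        hits.append(time_ms)
    return hits


markers = [
    (5_000, MarkerType.SUBJECT_CHANGE, "Budget"),
    (61_000, MarkerType.QUESTION, "Who pays for the hardware?"),
    (95_000, MarkerType.ACTION, "Send hardware offer by Friday"),
]
```

The returned times can be used directly as seek positions for the audio playback.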
5.4 Scenario with this application
The following description shows a scenario which the application should support:
The target platform for the client application is a common personal computer with Windows
95 or Windows NT as the operating system.
5.4.1 Recording a document
The audio source is recorded in the background. General information such as the current recording
time is displayed continuously.
During the conference it is possible to write down textual notes in an edit window. Every
action in this edit window is recorded, including backspace, cursor keys and cut and paste
actions.
A second window is provided which can be used as a sketch pad. The lines are drawn with
a standard mouse as the input device. Image editor tools such as an eraser or
shapes are not supported.
Additional markers can be set by pressing the corresponding push button. Every marker can
contain some extra textual information, which can be displayed and edited by clicking on the
marker. The different types of markers are displayed on a time line at the corresponding time
position with an icon characteristic of that marker type.
5.4.2 Playback and usage of the document
The playback is similar to the operation of a video recorder. Playback can be started or
stopped with buttons, and a slider, which indicates the current playing position, can be moved
within the document. Playback can then be started at the new slider position.
The index information of the markers, the drawing tool and the textual information can also
be used to reposition the slider to the corresponding time. Once again, playback can be started at this
position.
The textual information and the drawings are restored during playback. This looks as if a
ghost writer were using the keyboard and the mouse.
The markers which were set during the recording phase are displayed on the timeline and can
be clicked to show or edit the additional information.
Some search mechanisms are provided. When the slider is at the end of the document, the final
text in the edit window and all the drawings are visible. Clicking into the drawing area
repositions the slider to the time of the drawing event that is closest to the
click position. The same procedure is used for the textual information: setting the
cursor into the edit window sets the slider to the time when the word at the cursor was typed.
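The search by clicking into the drawing area amounts to a nearest-point lookup over the stored drawing events. The sketch below assumes that each drawing event is available as a (time, x, y) tuple and uses a plain linear search with squared Euclidean distance; how ConfDoc actually measures closeness is not stated in the text.

```python
def nearest_event_time(points, click_x, click_y):
    """Return the time of the stored drawing point closest to the click.

    `points` is a list of (time_ms, x, y) tuples; returns None if it is empty.
    """
    if not points:
        return None
    time_ms, _, _ = min(
        points,
        key=lambda p: (p[1] - click_x) ** 2 + (p[2] - click_y) ** 2,
    )
    return time_ms


drawing_points = [(1_000, 10, 10), (1_050, 12, 11), (30_000, 200, 150)]
```

A click at (198, 149) selects the point drawn at 30 000 ms, so the slider would be moved to that time and playback could start there.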
5.5 Purpose of this application
This application is used to test the habits of different users when documenting a conference or
meeting. The different methods to create index information should be examined and
compared to obtain a method, which is easy to use but provides a sufficient way to retrieve
information from the raw data with an acceptable effort.
Porting the application to a Windows CE device should point out the advantages and
disadvantages of a small but portable system.
6 User guide
The user guide should help users to handle this system. The chapters are arranged in
the order in which the user will need them, so the guide can be read as a complete walkthrough.
6.1 Running the application
After the program has been installed, it can be started from the Windows start menu. No
command line options are necessary to run this application successfully.
Figure 6-1 : The Main window before ...
Figure 6-2 : ... and after the configuration
When the application is started for the first time, the button for a new documentation
recording and the button to review a recording are disabled. First a correct configuration is
needed to enable these buttons. For this purpose, select “Configuration ...” from the Menu.
Figure 6-3 : The main menu to bring up the configuration
6.2 Configuration
The configuration window includes sections for the data path, the audio compression and the
colors of the drawing controls.
Figure 6-4 : The configuration window
The Data Path is the folder on the client machine where newly created
documentation files are stored. When reviewing a documentation, this folder is the default for
searching for files. The Browse Button on the right side of the data path lets the user select a
specific path without typing its name into the edit control, which avoids typing errors.
Note: The Data Path can also be a network path, but it is recommended to use a local path in
order to avoid loss of data during network problems.
The section for the audio compression lets the user select a specific audio codec for the
compression and decompression of the audio streams. Pressing the Select Button on the right
side of the line brings up a standard window to select a codec.
Figure 6-5 : Selecting a codec for the compression and decompression
The third and last section in the configuration window contains three customizable color
fields. These colors stand for:
•
Main stroke color
This is the color of the main strokes. All lines which are drawn during the playback
are drawn in this color. This should be a dark color on the white background to achieve
a good contrast.
•
Preview stroke color
The preview gives the user an overview of the whole page of the drawing. The preview
color should be a light color with a low contrast to the background, since it does not
represent the main strokes.
•
Extended stroke color
The extended strokes are not recorded during the conference, but after it. A color different
from the main stroke color lets the user quickly recognize which strokes were made during the
conference and which strokes are the result of the extension process.
The colors can easily be chosen by clicking on the colored rectangle. This opens
a standard color selection dialog in which a new color can be selected.
Figure 6-6 : Selecting a color
6.3 Creating a new documentation
To start a new documentation session, simply press the New Button in the Main Window.
Note: This Button is only active after a successful configuration.
When this button is pressed, the three windows for a documentation are displayed and a new
recording starts immediately. The three windows can be moved freely on the screen. Their
positions are stored and used for all following documentation sessions.
6.3.1 The main window
The main recording window contains:
•
six buttons for the markers,
•
some information about the current running time and space consumption of the audio
recording,
•
a slider for the current position and
•
two buttons to terminate the recording.
Figure 6-7 : The main recording window
Pressing one of the six marker buttons creates a new marker and opens a window to enter
some additional notes for this marker. The newly created marker is displayed in a rectangular
region below the slider using a small icon. This icon can be clicked to reopen the additional
marker notes window.
Figure 6-8 : The additional marker notes
In recording mode, the slider cannot be moved. This feature is only enabled in
playback mode.
Pressing the OK button causes the application to save all the data. The Cancel Button
discards the recording.
In both cases the recording is terminated and the three windows are closed.
6.3.2 Handwriting and Notes
To create the important index data, two windows are provided:
•
Handwriting
•
Notes
The Handwriting window is used like a sketch pad. The best results can be obtained using a
pen device.
In the lower right corner some buttons are available to leaf through the pages. When pressing
the next button at the last page, a new page is created automatically.
Figure 6-9 : The Handwriting window
The Notes window is used for textual information. The content can be used later for a full text
search over several documents because it is available as a simple text file, too.
All common keyboard shortcuts are provided including selecting, cut and paste.
Figure 6-10 : The Notes window
6.4 Reviewing a documentation
To review a documentation, open a previously recorded document using the Open Button in
the main window. As in the recording process, three windows are opened. The Handwriting
and Notes windows are exactly the same, but the main playback window differs from the main
recording window.
Figure 6-11 : The main playback window
When seeking within the document with any of the available methods, all the windows are
synchronized depending on the current start position after that seek:
•
The slider is set to the new position
•
The handwriting window displays the contents as they were at the recording time and
•
The Notes window shows the same text as at the recording time.
6.4.1 Random seek
Using the operator buttons on the top left of the main window the playback can be controlled.
In the following, each individual button is explained:
Seek to the beginning of the document
Seek back approximately 2 seconds
Seek back approximately 2 seconds and then start playback
Start playback at the current position
Seek forward approximately 2 seconds
Seek to the end of the document
Alternatively it is possible to move the slider to seek within the document.
Note: Seeking is only possible when the playback is stopped. During an active playback,
seek actions are rejected.
6.4.2 Accurate seek
When looking for specific information, the random seek methods would not meet the
requirements for an effective search. There are three better seek / search methods available
and these methods represent the key features of this application.
6.4.2.1 Markers
Markers that were set during the recording phase are stored with time information.
Clicking a specific marker moves the slider to the time at which the marker was set and
opens a window to review the additional marker information.
6.4.2.2 Handwriting
The Handwriting window provides a simple search method using the mouse.
First of all, the page which is used for the search should be displayed. The buttons on the
bottom of the Handwriting window can be used to leaf through the pages. When the
appropriate page is displayed, a mouse click within the window starts the seeking process.
Starting from the click position the nearest point on a line is used to get the time information
and the new start position is set to this time.
Figure 6-12 : Seeking using the Handwriting window
6.4.2.3 Notes
Seeking by using the notes requires that the part of the text you want to go to is displayed.
This can be achieved by seeking to the end of the document using the button or by moving the
slider to the right end.
Now use the mouse to set the text cursor to any position within the text in the Notes window.
Using this cursor position, the new start position is set to the time at which these letters
were typed.
6.5 Extending a documentation
An existing documentation can be extended by pressing the Extend Button.
This enters into an extending mode, which is similar to the recording phase.
In this mode, additional textual and handwritten notes can be entered in the appropriate
windows. The time information for these extended actions is set to the end of the document.
Within the documentation as a whole, it therefore looks as if the information had been
entered at the moment the recording was stopped.
In the Handwriting window, this additional information is displayed in a different color, so
when reviewing the document at a later time, it is very easy to see the extended information.
Pressing the Extend Button a second time leaves the extend mode and reenters the normal
playback mode.
THE IMPLEMENTATION
7 The Implementation
ConfDoc is programmed in C++. This chapter deals with the actual implementation and
describes some important processes with the help of state diagrams and flow charts.
7.1 The main parts of the program
Figure 7-1: main state diagram (states: Idle, Main menu, Playback, Recording, Extend)
The above state diagram shows the different modes of this application.
When the program is started, it enters the idle mode. All further program states are reached
through the main menu. Pressing the New Button causes the application to enter the recording
mode; pressing the Open Button leads to the playback mode. The Extend mode can only be
entered when the playback state is already active. Pressing the Extend button a second time
leaves this mode.
Note: The configuration window is ignored in the above illustration.
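The transitions described above can be written down as a small transition function. The following is only a sketch in portable standard C++, not the original ConfDoc code: the names Mode, Action and nextMode are hypothetical, and the return to the idle mode after OK or Cancel is taken from the User Guide (section 6.3.1). How the playback mode is left is not described here, so it is not modelled.

```cpp
// Application modes as named in the state diagram.
enum class Mode { Idle, Recording, Playback, Extend };

// User actions that can change the mode (hypothetical names).
enum class Action { PressNew, PressOpen, PressExtend, PressOkOrCancel };

// Returns the mode that follows `current` when `action` occurs.
// Actions that are not valid in the current mode leave it unchanged.
Mode nextMode(Mode current, Action action) {
    switch (current) {
    case Mode::Idle:
        if (action == Action::PressNew)  return Mode::Recording;
        if (action == Action::PressOpen) return Mode::Playback;
        break;
    case Mode::Recording:
        // OK and Cancel both terminate the recording and close the windows.
        if (action == Action::PressOkOrCancel) return Mode::Idle;
        break;
    case Mode::Playback:
        // Extend can only be entered from an active playback.
        if (action == Action::PressExtend) return Mode::Extend;
        break;
    case Mode::Extend:
        // Pressing Extend a second time re-enters the normal playback.
        if (action == Action::PressExtend) return Mode::Playback;
        break;
    }
    return current;
}
```

Keeping the transitions in one function makes it easy to check that, for instance, the Extend Button has no effect while the application is idle.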
7.2 Creating a new documentation
Because the audio recording runs in the background, the main focus is on the creation of the
three kinds of index data.
7.2.1 Audio
The audio recording is realized with double buffering. For this reason, memory is allocated
for two buffers which are filled alternately from the audio device. While one buffer is filled
with new recorded data, the second buffer can be written to file and cleared for the next usage.
The wave audio API supports a callback function, which is called from the audio device
driver whenever a buffer is full and ready to be processed.
Figure 7-2 : Initializing the recording (steps: begin recording, open wave device, prepare
buffers, send buffers to device driver, wait till callback is entered)
Figure 7-3 : Terminating the recording (steps: terminate recording, wait till device driver
is finished, unprepare buffers, close wave device)
The recording starts with an initialization procedure in which the audio device is opened, the
buffers are allocated and both buffers are sent to the audio device. Now the recording starts
and there is nothing else to do but wait until the first buffer is full and the callback
function is entered.
Figure 7-4 : Callback function of the recording (flow: begin callback; if buffer 1 is full:
save buffer 1 to file, prepare buffer 1, send buffer 1 to device driver; if buffer 2 is full:
save buffer 2 to file, prepare buffer 2, send buffer 2 to device driver; leave callback)
The callback function is responsible for saving the full buffer to file, preparing and sending it
to the audio device for the next recording phase.
The recording is terminated when the user presses the OK or Cancel Button. In this case,
the audio device is notified of the termination, and it is then necessary to wait until the
audio device has actually finished. This is essential to give the audio device enough time to
stop the recording and compress the audio data that is in the buffer so far.
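The double buffering scheme can be illustrated without any audio hardware. The sketch below is a simulation in portable standard C++, not the original implementation: the class DoubleBufferRecorder and its members are hypothetical, the "file" is an in-memory vector, and the method deliver stands in for the device driver that fills a buffer and then enters the callback. In the Win32 wave audio API the corresponding steps would be carried out with calls such as waveInPrepareHeader, waveInAddBuffer and waveInUnprepareHeader.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// One capture buffer; ownedByDriver is true while the device may fill it.
struct CaptureBuffer {
    std::vector<std::int16_t> samples;
    bool ownedByDriver = false;
};

class DoubleBufferRecorder {
public:
    explicit DoubleBufferRecorder(std::size_t samplesPerBuffer)
        : capacity_(samplesPerBuffer) {
        // Initialization: allocate both buffers and send both to the device.
        for (CaptureBuffer& b : buffers_) {
            b.samples.reserve(capacity_);
            b.ownedByDriver = true;
        }
    }

    // Stand-in for the device driver: fills the current buffer and enters
    // the callback whenever that buffer is full.
    void deliver(const std::vector<std::int16_t>& input) {
        for (std::int16_t sample : input) {
            CaptureBuffer& current = buffers_[filling_];
            current.samples.push_back(sample);
            if (current.samples.size() == capacity_) {
                filling_ = 1 - filling_;  // the driver continues with the other buffer
                onBufferFull(current);
            }
        }
    }

    // Termination: save the partially filled buffer and take both buffers back.
    void stop() {
        CaptureBuffer& current = buffers_[filling_];
        file_.insert(file_.end(), current.samples.begin(), current.samples.end());
        for (CaptureBuffer& b : buffers_) {
            b.samples.clear();
            b.ownedByDriver = false;
        }
    }

    const std::vector<std::int16_t>& file() const { return file_; }

private:
    // Callback: save the full buffer, clear it and send it to the device again.
    void onBufferFull(CaptureBuffer& full) {
        full.ownedByDriver = false;
        file_.insert(file_.end(), full.samples.begin(), full.samples.end());
        full.samples.clear();
        full.ownedByDriver = true;
    }

    std::array<CaptureBuffer, 2> buffers_;
    std::size_t capacity_;
    std::size_t filling_ = 0;
    std::vector<std::int16_t> file_;
};
```

While one buffer is being saved in the callback, the other one is already available to the device, so no samples are lost as long as saving is faster than filling.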
7.2.2 Markers
Creating a new marker requires no demanding activities. The only two actions to be taken are:
•
Get the type of the marker and
•
Retrieve the current audio position from the recording.
This is all the information that is stored with a new marker.
Figure 7-5 : Recording a marker event (steps: Marker Button was pressed, get type of marker,
get audio position, save data to file *.mrk)
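A marker entry therefore only needs two fields. The following sketch shows one possible way to write and read such an entry in portable standard C++. The actual layout of the *.mrk file is not documented in this chapter, so the record layout (one byte for the type, four little-endian bytes for the position), the millisecond unit and the names MarkerRecord, writeMarker and readMarker are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Assumed content of one marker entry.
struct MarkerRecord {
    std::uint8_t  type;        // which of the six marker buttons was pressed (0..5)
    std::uint32_t audioPosMs;  // audio position at that moment, assumed in milliseconds
};

// Appends one record: 1 byte type, then 4 bytes position (little-endian).
void writeMarker(std::ostream& out, const MarkerRecord& m) {
    out.put(static_cast<char>(m.type));
    for (int shift = 0; shift < 32; shift += 8)
        out.put(static_cast<char>((m.audioPosMs >> shift) & 0xFF));
}

// Reads one record; returns false when no complete record is left.
bool readMarker(std::istream& in, MarkerRecord& m) {
    char raw[5];
    if (!in.read(raw, sizeof raw)) return false;
    m.type = static_cast<std::uint8_t>(raw[0]);
    m.audioPosMs = 0;
    for (int i = 0; i < 4; ++i) {
        const std::uint32_t byte = static_cast<unsigned char>(raw[1 + i]);
        m.audioPosMs |= byte << (8 * i);
    }
    return true;
}
```

Writing the bytes individually keeps the file independent of the byte order and structure padding of the machine, which matters when files are exchanged between a desktop PC and a handheld device.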
7.2.3 Handwritten notes
Saving handwritten notes is done using a subclass function. This function is called whenever
the mouse is moved or a mouse button is pressed.
This requires filtering the incoming mouse events and discarding all events where no mouse
button is pressed. The remaining events fall into two cases:
•
The mouse button was pressed right now
•
The mouse button was down and the mouse was moved.
The first case represents the start of a new line; the second corresponds to a single straight
segment within a curve. See “Figure 5-3: drawing a line” for an illustration of drawing a
line.
Figure 7-6 : handwritten notes (flow: enter Subclass; on WM_LBUTTONDOWN: get cursor position,
save start point for line; on WM_MOUSEMOVE with the mouse button pressed: get cursor position,
draw line, get audio position, save data to file *.drw; leave Subclass)
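The filtering of mouse events can be sketched as follows. This is an illustration in portable standard C++ rather than the original subclass function: the enum MouseEvent stands in for the Windows messages WM_MOUSEMOVE and WM_LBUTTONDOWN, the records are collected in memory instead of being written to the *.drw file, and the names DrawRecord and StrokeRecorder are hypothetical. Storing the start point together with the audio position is also an assumption.

```cpp
#include <cstdint>
#include <vector>

// Stand-ins for the mouse messages seen by the subclass function.
enum class MouseEvent { Move, LeftButtonDown, Other };

struct Point { int x; int y; };

// One stored drawing event.
struct DrawRecord {
    bool startsLine;           // true: button was just pressed, a new line begins
    Point pos;                 // cursor position of the event
    std::uint32_t audioPosMs;  // audio position when the event occurred (assumed unit)
};

class StrokeRecorder {
public:
    // Returns true if the event was recorded, false if it was discarded.
    bool onMouse(MouseEvent ev, bool leftButtonDown, Point pos,
                 std::uint32_t audioPosMs) {
        if (ev == MouseEvent::LeftButtonDown) {
            records_.push_back({true, pos, audioPosMs});   // start of a new line
            return true;
        }
        if (ev == MouseEvent::Move && leftButtonDown) {
            records_.push_back({false, pos, audioPosMs});  // straight segment of a curve
            return true;
        }
        return false;  // mouse moved without a pressed button, or unrelated event
    }

    const std::vector<DrawRecord>& records() const { return records_; }

private:
    std::vector<DrawRecord> records_;
};
```

Because every record carries the audio position, each point of a stroke can later serve as an index into the recording.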
7.2.4 Textual notes
The textual notes are also captured using a subclass function. The function captures all
WM_KEYUP, WM_KEYDOWN, WM_SYSKEYUP and WM_SYSKEYDOWN messages.
For every event, the keycode, the keyboard flags (including the SHIFT and CTRL key states)
and the event itself must be retrieved and saved in order to reproduce the key press at a
later time. Any other event is ignored by this subclass function.
Figure 7-7 : Textual notes (flow: enter Subclass; if the event is WM_KEYDOWN or WM_KEYUP:
get keycode, get keyboard flags, get audio position, save data to file *.eky; leave Subclass)
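The data captured for each keyboard event might look like the following. This is a sketch in portable standard C++ under stated assumptions: the enum KeyMessage stands in for the four Windows messages named above, the record is kept in memory instead of the *.eky file, and the field layout and the names KeyRecord and captureKey are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Stand-ins for WM_KEYDOWN, WM_KEYUP, WM_SYSKEYDOWN and WM_SYSKEYUP;
// Other represents every message that the subclass function ignores.
enum class KeyMessage : std::uint8_t { KeyDown, KeyUp, SysKeyDown, SysKeyUp, Other };

// One stored keyboard event.
struct KeyRecord {
    KeyMessage message;        // the event itself
    std::uint16_t keyCode;     // key code of the pressed or released key
    bool shift;                // SHIFT key state at the time of the event
    bool ctrl;                 // CTRL key state at the time of the event
    std::uint32_t audioPosMs;  // audio position of the event (assumed unit)
};

// Appends the event to the log unless it is not a keyboard message.
// Returns true if the event was stored.
bool captureKey(std::vector<KeyRecord>& log, KeyMessage message,
                std::uint16_t keyCode, bool shift, bool ctrl,
                std::uint32_t audioPosMs) {
    if (message == KeyMessage::Other) return false;
    log.push_back({message, keyCode, shift, ctrl, audioPosMs});
    return true;
}
```

Storing both the key-down and the key-up event, rather than only the resulting character, allows the key press to be reproduced later exactly as it happened.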
7.3 Reviewing a documentation
Opening an existing documentation leads to two different situations.
The first is a simple playback, where the audio data is played and the previously recorded
events are faked at the appropriate running time.
The second, and the more important, is the search capability of the three different index data
types.
7.3.1 Playback
When playing back a documentation, the events are faked by the system at the appropriate
time. The key procedure for this faking is a timer event at the interval of 250 ms. In other
words, every 250 ms the keyboard and the handwriting event queues are checked for pending
events.
The marker events do not need to be faked, because all markers are visible on a timeline. The
additional marker remarks are displayed by clicking on the appropriate marker. During
playback, no actions need to be taken regarding the markers.
Figure 7-8 : Playback of a document (flow: enter Timer event; if a handwriting event is
pending: fake handwriting event, read next handwriting event; if a keyboard event is pending:
fake keyboard event, read next keyboard event; leave Timer event)
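The timer-driven replay can be sketched as follows, in portable standard C++ and without a real timer. It assumes that an event is "pending" when its recorded time is not later than the current audio position; the names TimedEvent, EventQueue and onTimerTick are hypothetical, and "faking" an event is represented by appending its payload to a vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// A recorded event with the audio position at which it occurred.
struct TimedEvent {
    std::uint32_t timeMs;
    int payload;  // placeholder for the handwriting or keyboard data
};

// Events in recording order with a read position.
class EventQueue {
public:
    explicit EventQueue(std::vector<TimedEvent> events)
        : events_(std::move(events)) {}

    // True if the next unread event is due at the given audio position.
    bool hasPending(std::uint32_t nowMs) const {
        return next_ < events_.size() && events_[next_].timeMs <= nowMs;
    }

    // Returns the next event and advances the read position.
    TimedEvent readNext() { return events_[next_++]; }

private:
    std::vector<TimedEvent> events_;
    std::size_t next_ = 0;
};

// Body of the timer event (every 250 ms in ConfDoc): replays all pending
// handwriting events, then all pending keyboard events.
// Returns the number of events replayed during this tick.
std::size_t onTimerTick(std::uint32_t audioPosMs, EventQueue& handwriting,
                        EventQueue& keyboard, std::vector<int>& replayed) {
    std::size_t count = 0;
    while (handwriting.hasPending(audioPosMs)) {
        replayed.push_back(handwriting.readNext().payload);
        ++count;
    }
    while (keyboard.hasPending(audioPosMs)) {
        replayed.push_back(keyboard.readNext().payload);
        ++count;
    }
    return count;
}
```

With a 250 ms interval, an event is replayed at most a quarter of a second after its recorded time, which is hardly noticeable when following the audio.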
7.3.2 Searching for information
Searching for information is only possible when the current playback is stopped. While a
playback is running, it is necessary to press the Stop Button before the search capabilities
are enabled.
Searching can be done with all three index data types: the markers, the textual notes and
the handwritten notes. Each index requires a different search method.
7.3.2.1 Markers
Since the time position of a marker is stored directly, searching with markers is a
straightforward method. Strictly speaking, it is more a seek than a search. Clicking on a
specific marker causes the application to seek to the position where the marker was set and
then opens the window with the additional marker information for review.
Figure 7-9 : Searching using markers (steps: mouse button clicked, get time position of
marker, fake slider event with time, open marker remark window, end)
7.3.2.2 Handwritten notes
The handwritten notes provide a visual search method. As described in the User Guide
(Chapter 6), a mouse click within the window starts the seeking process. The point nearest
to the click position is found by examining all draw events in the recorded document.
Starting at the beginning of the file, all events are examined one after another and the
distance between each event and the click position is calculated. When an event is closer to
the click position than all previous events, it is stored and the search continues. At the
end of the file, the event with the smallest distance to the click position is used to redraw
the handwriting window.
Figure 7-10 : Searching with handwritten notes (flow: mouse button clicked; event = NULL,
closest distance = as far as possible; seek to first event; until all events are processed:
read next event, save event if its position is closer than the closest yet; then redraw till
saved event; end)
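The linear search over all draw events can be sketched in a few lines. The example is written in portable standard C++ and works on a vector instead of the recorded file; the names DrawEvent and findNearestEvent are hypothetical. It compares squared distances, which gives the same ordering as the true distance and avoids a square root.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// A recorded draw event: a point on a line and the time it was drawn.
struct DrawEvent {
    int x;
    int y;
    std::uint32_t audioPosMs;
};

// Returns the index of the event closest to the click position,
// or -1 if there are no events. On equal distance the earlier event wins.
int findNearestEvent(const std::vector<DrawEvent>& events, int clickX, int clickY) {
    int best = -1;           // corresponds to "event = NULL"
    long long bestDist = 0;  // only meaningful once best >= 0
    for (std::size_t i = 0; i < events.size(); ++i) {
        const long long dx = static_cast<long long>(events[i].x) - clickX;
        const long long dy = static_cast<long long>(events[i].y) - clickY;
        const long long dist = dx * dx + dy * dy;
        if (best < 0 || dist < bestDist) {  // closer than the closest yet
            best = static_cast<int>(i);
            bestDist = dist;
        }
    }
    return best;
}
```

The audio position of the returned event is then used as the new start position, and the window is redrawn up to that event.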
7.3.2.3 Textual notes
Searching with available textual notes is the most difficult way and the only one with
ambiguities. The information which is taken for the search is the cursor position and the two
characters before and after the cursor position. For searching, the textual notes are reproduced
character by character and the text at the cursor position is tested for equality.
The following example shows the ambiguity of this search method:
Suppose the user entered some text, pressed the Backspace key to delete the last
character and then entered the same character again. The same text then exists at the same
cursor position at two different recording times. In this example the search returns the
first occurrence of this text.
Figure 7-11 : Searching using textual information (flow: mouse button clicked; get chars
before and after cursor; clear control; seek to first event; repeat: read and fake next event,
get chars before and after cursor, until the chars are the same; end)
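The replay-and-compare search, including the ambiguity from the example, can be reproduced with a simplified model. The sketch below is portable standard C++ and makes several assumptions that are not part of the original implementation: text is only typed at the end of the buffer, the only editing key is Backspace (represented by '\b'), and the text the user clicks into is the final text of the document. The names KeyEvent, applyKey, contextAt and findTimeForCursor are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified keyboard event: a typed character, or '\b' for Backspace.
struct KeyEvent {
    char ch;
    std::uint32_t audioPosMs;
};

// "Fakes" one event on the text buffer.
void applyKey(std::string& text, const KeyEvent& e) {
    if (e.ch == '\b') {
        if (!text.empty()) text.pop_back();
    } else {
        text.push_back(e.ch);
    }
}

// Up to two characters before and two after the cursor position.
std::string contextAt(const std::string& text, std::size_t cursor) {
    const std::size_t begin = cursor >= 2 ? cursor - 2 : 0;
    const std::size_t end = std::min(text.size(), cursor + 2);
    return text.substr(begin, end - begin);
}

// Returns the audio position of the first event after which the text around
// `cursor` equals the text around `cursor` in the final document, or -1.
long long findTimeForCursor(const std::vector<KeyEvent>& events, std::size_t cursor) {
    std::string finalText;
    for (const KeyEvent& e : events) applyKey(finalText, e);
    if (cursor > finalText.size()) return -1;
    const std::string wanted = contextAt(finalText, cursor);
    if (wanted.empty()) return -1;

    std::string text;  // "clear control", then replay from the first event
    for (const KeyEvent& e : events) {
        applyKey(text, e);
        if (text.size() >= cursor && contextAt(text, cursor) == wanted)
            return e.audioPosMs;
    }
    return -1;
}
```

For the event sequence a, b, c, Backspace, c the text "abc" exists twice, and the function returns the time of the first "c", which is the behaviour described above.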
MAKING THE SYSTEM PORTABLE
8 Making the system portable
Making the system portable is an essential part of this project. Even the best system cannot
be used efficiently if it is restricted to one location, so a portable version of this software
is necessary.
8.1 Hardware
Currently available handheld or palmtop computers are powerful enough to meet the hardware
requirements of this system.
8.1.1 Operating system
The operating system Windows CE seems suitable, because of the similarity to Windows
95/98. Some important aspects are:
•
Connectivity to desktop PCs
A Windows CE device is well integrated into the Windows 95/98 desktop, and the
communication between the two devices is managed by the operating systems. There is
no need to back up the recorded data manually. The ActiveSync programs can be
configured to synchronize at every connection. Using a docking station
makes this process as easy as it can be.
•
Memory expansion
Recording audio streams produces a large amount of data, but Windows CE devices have
much less RAM available than a desktop PC. In addition, most devices have no
disk drive, so this amount of data can be a problem.
With a CompactFlash adapter this problem can be solved.
CompactFlash storage cards range from 2 MB up to 320 MB and serve as supplemental
removable storage for data, including audio, images, files, video, and applications. In
addition, CompactFlash microdrives are available with 160 to 340 MB of storage.
8.1.2 Device Type
When choosing a PDA to purchase, the following characteristics and uses should be
considered. [CEWIN] recommends that users make a list of the intended uses for their PDA
and consider the following characteristics:
•
Size/Weight
Does the PDA fit in the pocket? Is it portable enough? Is it light enough to carry when
using it?
•
Input Methods
Does the PDA have a keyboard? Is handwriting recognition desired for input? Does the
PDA have a touch screen, glidepad or a trackpoint for mouse functions? Does the PDA
support an external mouse or external keyboard?
•
Features
Is a CompactFlash slot needed? Is a modem necessary - if yes then what speed? Is the
unit fast enough to meet the requirements? Does the PDA have enough RAM for running
programs and storage? Will a flash card suffice for additional storage or is the PC Card
slot occupied? What kind of connectivity is needed - Ethernet, IrDA, RAS, wireless LAN,
wireless nationwide? Which battery capacity is needed?
•
Uses
Will a large amount of data be entered? If so, is a keyboard important? How important is a
large screen? How many colors does the screen need to support - 256 colors, 65536
colors? Is image editing or printing important?
•
Applications
Are Pocket Word, Pocket Excel, Pocket Access or Pocket PowerPoint needed? If so,
consider the H/PC Pros, since the P/PCs do not offer these capabilities. Is a special
application needed?
The following table compares the built-in applications of the two common
hardware types:
Application | Palm-size PC | Handheld PC
Pocket Outlook (Calendar, Contacts, Tasks) | Yes | Yes
Pocket Applications (Excel, Word, Powerpoint, Access) | No | Yes
Notetaker/Inkwriter | Yes, Notetaker | Yes, Inkwriter
Pocket Internet Explorer (Web Browser) | No, Mobile Channels | Yes
Inbox (E-Mail) | Yes | Yes
Accessories (Calculator, Voice Recorder, World Clock) | Yes | Yes
Communications (PPP, Terminal Emulator, Ethernet, Wireless, Redirector) | PPP, Ethernet | Yes
Games | Yes | Yes
Maps & GPS | Yes - Pocket Streets only | Yes - Pocket Streets only
Voice Commands | No | No
Each of these different hardware systems includes different applications. For some systems,
3rd parties have added capabilities that exist for other platforms.
8.2 Software
Many Windows 95 applications can be ported to Microsoft Windows CE. This takes less
effort than developing them from scratch. The major issues when porting to Windows CE are
([MSDN02]):
•
Differences between the Microsoft Win32® application programming interface (API) and
Windows CE application APIs
•
Differences between the standard Microsoft Foundation Class Library (MFC) and MFC
for Windows CE. This can be ignored, because MFC is not used for this application.
•
Memory limitations and out-of-memory recovery
•
Energy limitations
•
Widely varying hardware characteristics and limitations
•
Differences in testing and debugging
8.2.1 Differences between the Win32 and Windows CE APIs
The Windows CE API differs from the Win32 API in several important respects:
•
It is smaller. Only a subset of the Win32 API is supported, and some of what is supported
has a reduced feature set. Some Win32 functions are not supported at all, and neither are
any of the 16-bit Windows functions. These functions must be replaced with alternative ones,
if available, or a work-around must be created.
•
Some Win32 functions have been substituted by Windows CE equivalents. For example,
tool and menu bars have been combined into a single Command Bar, which has a new
API.
•
Some Win32 functions are supported, but in a limited way. Some may have one or more
parameters completely disabled. Others may have parameters with a reduced range of
options.
•
Supported data types may need a modification. All necessary Win32 structures are
supported, but some members may not be used. Other structures may not accept the full
range of options.
•
Some messages are not supported—including many WM_* and EM_* messages. Some
that are supported have been modified. For example, the content of wParam or lParam
may be different. Some Windows CE–specific messages have been added—
WM_HIBERNATE, for example.
•
There are Windows CE-specific extensions. Many of these, including the touch screen
and notification, support the hardware capabilities of the various devices.
•
There are limitations concerning the use of exception handling. While there is support for
Win32-structured exception handling, Windows CE does not support C++ exception
handling.
When porting existing Win32 applications from the PC platform to Windows CE, the primary
issue will usually be the smaller API. Applications will need to accommodate the limitations
of the Windows CE API and the capabilities of the target devices.
8.2.2 Memory Limitations
Windows CE devices will, in general, have much less RAM available than a desktop PC. In
addition, most devices will have no disk drive or other mass-storage device. In most cases,
porting an application to Windows CE successfully will require reducing its size.
When porting an application to Windows CE, you should focus on the features that are used
most frequently. Microsoft Pocket Word and Microsoft Pocket Excel are examples of how to
reduce the feature set of an application while still maintaining its essential functionality.
Applications should be written so that they need as little memory and storage as
possible. They must also be able to cooperate with the system in managing memory shortages.
The amount of available memory depends on the particular device, so be aware of the
capabilities of the target platforms. Windows CE makes no distinction between using mass
storage (temporary files, for instance) and using RAM.
The usage of memory and mass storage in the Windows CE application should be minimized
and memory-intensive features like bitmapped graphics should be simplified or eliminated.
Temporary files are not necessary. In some cases, the code can be rewritten to reduce memory
usage at a cost of somewhat lower speed, which may be an acceptable tradeoff.
If memory resources become tight, Windows CE has a procedure to reduce memory usage
and restore available memory to acceptable levels.
8.2.3 Energy Limitations
Windows CE devices may have very limited energy resources. The Handheld PC (H/PC), for
instance, runs on two AA batteries. Programs should be written to minimize energy
consumption as much as possible. In order to conserve energy, many Windows CE devices
will shut down if they are not used for a certain period of time. Windows CE applications are
expected to resume where they left off following a shut down. If a critical power shortage
occurs while an application is running, that application must be able to handle the situation
gracefully.
Windows CE displays warning messages when the batteries start to run low but it does not
send any warning to the applications. For many applications, these messages may be
sufficient to ensure that the user takes appropriate action to avoid loss of data.
An active (running) CPU consumes significant quantities of energy, so avoid coding practices
that use CPU cycles unnecessarily.
However, some hardware—modems, for instance—can drain batteries rapidly. If the program
is going to place significant demands on the batteries, checking the state of the batteries first
may be very helpful. If they are too low to complete the procedure, the user can be advised to
take appropriate action.
8.2.4 Hardware Characteristics
Windows CE is designed to run on devices that are, in general, smaller and less powerful than
desktop PCs. For example:
•
Screens are typically smaller, have fewer pixels, and may not support color.
•
CPUs are slower.
•
User-interface hardware such as keyboards may be less flexible.
On the other hand, some devices may include hardware that is not standard on a PC—the
infrared transceiver found on H/PCs, for instance. In any case, do not assume that all
Windows CE–based devices are essentially similar to either a desktop PC, or each other. Keep
the hardware characteristics of the target device firmly in mind.
When porting an application to more than one class of devices, it will be necessary to find a
"lowest common denominator" to ensure that the application will work successfully on all its
target platforms. Although emulation is an important development tool, applications must
ultimately be tested on actual devices to make sure that they perform properly.
8.2.5 User Interface
The user interface may be one of the more difficult issues you will have to deal with.
Virtually all PCs have a mouse and a keyboard for input, and a screen for output that is more
or less similar on virtually all machines. A program that works on one machine will usually
work on all. Windows CE involves a class of devices that not only differ from PCs in many
ways, but may also differ significantly from each other.
In short, there are no simple rules that will work for all Windows CE devices. To port a
program successfully, one should know the peculiarities of the target platforms, and make
whatever modifications are necessary. While PC-based emulators are valuable development
tools, they have their limits. To be sure that the user interface is well designed and functional,
it is essential to test it on the actual devices it is intended for.
8.2.5.1 Creating and Managing Windows
Creating and managing windows with Windows CE is almost the same as it is with Win32.
However, one will have fewer window-style and management options.
Probably the most noticeable difference is that a user cannot resize windows. A window can
only have the size specified at the time of its creation.
When targeting devices with small screens, the application should use full-screen windows.
However, one should not code static layouts because different devices can have different
screen dimensions. Most currently available H/PCs are 240 × 480 pixels and roughly 2.5 × 4
inches. Some have 240 × 640-pixel screens that are also somewhat wider, changing both the
pitch and the aspect ratio. The GetSystemMetrics function can be called to get the relevant
screen dimensions and to define window size.
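The idea of deriving the window layout from the queried screen size instead of fixed coordinates can be sketched as follows. The example is portable standard C++, so the screen size is passed in as parameters; on a device the two values would come from GetSystemMetrics(SM_CXSCREEN) and GetSystemMetrics(SM_CYSCREEN). The arrangement itself (a main strip on top, Handwriting and Notes side by side below) and the names Rect, Layout and layoutForScreen are assumptions for illustration, not the layout used by ConfDoc.

```cpp
// Window rectangle in screen coordinates (right and bottom are exclusive).
struct Rect {
    int left;
    int top;
    int right;
    int bottom;
};

// Positions of the three documentation windows.
struct Layout {
    Rect mainWindow;
    Rect handwriting;
    Rect notes;
};

// Computes the layout from the screen dimensions so that it adapts to
// devices with different screen sizes.
Layout layoutForScreen(int screenWidth, int screenHeight) {
    const int mainHeight = screenHeight / 4;  // top quarter for the main window
    const int split = screenWidth / 2;        // lower area is divided in the middle
    Layout layout;
    layout.mainWindow  = {0, 0, screenWidth, mainHeight};
    layout.handwriting = {0, mainHeight, split, screenHeight};
    layout.notes       = {split, mainHeight, screenWidth, screenHeight};
    return layout;
}
```

Called with a 480 pixel wide and 240 pixel high screen, the Notes window spans from x = 240 to x = 480; on a 640 pixel wide screen the same code places it from x = 320 to x = 640 without any change.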
8.2.5.2 Using Windows CE Dialog Boxes
Windows CE supports both modal and modeless dialog boxes and the predefined controls
found in Windows 95. However, not all control styles are supported. Message Boxes are also
supported, although with fewer styles than for Windows 95.
Windows CE supports simple implementations of the common dialog boxes, Open and Save
As. Their on-screen appearance is similar to Windows 95, but there are fewer controls
available.
8.2.5.3 Porting User Interface Controls
Most of the standard Windows controls and common controls are supported, but there are
some limitations. One major difference is that the menu and toolbars are combined in a single
command bar that occupies the top of the window and has its own API. Tool tips are only
supported for buttons on the command bar.
In general, there is a more limited range of options available.
8.2.5.4 Porting the Graphics Device Interface
Most PC applications have a graphics device interface (GDI) that is inappropriate for
Windows CE, so that they have to be modified before being ported. To keep the footprint of
the operating system small, a number of Win32 GDI functions are not supported at all. In
addition, the limitations imposed by hardware on many devices—limited screen size, palette,
and aspect ratio, just to name a few—will require a somewhat different approach.
8.2.5.4.1 Adapting bitmaps and icons
The available palette can be quite restricted. Some devices support a color screen, although
probably with a more restricted palette than a typical PC. Many will have only grayscale
graphics. So, for instance, the first generation of H/PCs supported only two-bits-per-pixel
grayscale LCD displays.
Bitmaps and icons should be in an appropriate format for the target device. LCD screens can
be difficult to view in some settings, so keep the contrast as high as possible.
8.2.5.5 Using Unicode
Windows CE is a Unicode environment: it supports ASCII functionality to allow the
exchange of text files, but the native text format is Unicode. Some general guidelines for
converting an ASCII application to Unicode are:
•
Include Tchar.h. It has all the necessary conversions.
•
Use the Win32 string functions (lstrlen, for instance) rather than those from the C runtime library.
•
Use TCHAR, LPTSTR, and so on, for declarations. The code can then be easily compiled
for either ASCII or Unicode.
•
Use the TEXT macro for strings (for example, TEXT("Your Text")).
•
Remember that a character is no longer one byte in length, and strings end with two zeros
rather than one.
•
When incrementing an array pointer or character count, use sizeof (TCHAR) to ensure
that it is valid for either ASCII or Unicode.
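The effect of these guidelines on buffer sizes can be shown with a short example. It is only a sketch: so that it also compiles outside Windows, it emulates a Unicode build by defining TCHAR and TEXT itself when the Windows headers are not available, and it counts characters with a hand-written loop where a Windows build would use lstrlen. The function names lengthInChars and bytesNeeded are hypothetical.

```cpp
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
#include <tchar.h>
#else
// Assumption for non-Windows builds: emulate a Unicode build.
typedef wchar_t TCHAR;
#define TEXT(s) L##s
#endif

// Number of characters (not bytes) before the terminating zero character.
std::size_t lengthInChars(const TCHAR* s) {
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}

// Number of bytes needed to store the string including its terminator.
// Multiplying by sizeof(TCHAR) keeps the result valid for ASCII and Unicode.
std::size_t bytesNeeded(const TCHAR* s) {
    return (lengthInChars(s) + 1) * sizeof(TCHAR);
}
```

For TEXT("ConfDoc") the character count is 7 in both builds, while the byte count depends on sizeof(TCHAR), which is exactly the distinction the last two guidelines warn about.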
8.2.5.6 Managing Windows CE Threads
Windows CE is a multithreaded operating system, but there are some limitations relative to
the Windows 95 and Microsoft Windows NT® operating systems. Probably the biggest
difference is that semaphores are not supported. If the application uses semaphores, for
example, to manage device resources, it will need to be modified to use some other method of
coordination between threads. For example, the application can use critical sections for thread
synchronization.
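One way to replace a counting semaphore that guards a limited device resource is a counter protected by a critical section. The sketch below uses std::mutex from modern standard C++ as a portable stand-in for a Win32 CRITICAL_SECTION (EnterCriticalSection and LeaveCriticalSection); the class DevicePool is hypothetical. Unlike a semaphore, this variant does not block: a caller that gets no resource has to retry later.

```cpp
#include <mutex>

// Counts how many units of a device resource are still free.
class DevicePool {
public:
    explicit DevicePool(int available) : available_(available) {}

    // Takes one unit if any is free; returns false otherwise.
    bool tryAcquire() {
        std::lock_guard<std::mutex> lock(mutex_);  // enter the critical section
        if (available_ == 0) return false;
        --available_;
        return true;
    }                                              // left automatically on return

    // Gives one unit back.
    void release() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++available_;
    }

private:
    std::mutex mutex_;
    int available_;
};
```

The critical section only protects the counter itself, so it is held for a very short time and threads rarely have to wait for each other.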
8.2.6 Testing and Debugging
Developing applications for Windows CE is very much like developing applications for other
Win32 targets, but there are important differences concerning the testing and debugging
methods. If you are developing for a standard Windows CE target (such as the H/PC), then
much of your development and testing work can be done using the Windows CE emulation
environment provided with your development tools. If, however, you are developing an
application for a nonstandard hardware platform (such as a custom-embedded application),
then you have to consider alternate methods for verifying the correctness of your application.
The Windows CE API includes interfaces for debugging that can be used to create in-system
debugging tools. Depending on your target hardware and application, you can also use the
Remote API (RAPI) features of Windows CE to assist in debugging.
In any case, you will need to thoroughly test your application on all classes of device that the
application is expected to operate on. You should not rely on emulation environments to
provide adequate testing.
8.3 Porting ConfDoc
8.3.1 Preparing the implementation
Before starting to implement this application, I tried to find information on how to code a
Windows 95 application so that it can easily be ported to Windows CE. Very little
information was available, and development for Windows CE was still in its infancy.
According to [DDJ], the two most important tips were:
• Do not use MFC (Microsoft Foundation Classes)
The MFC classes for Windows CE are not fully functional. Using MFC for Windows 95
can lead to big problems if the classes used are not available on the Windows CE device.
• Do not use STL (Standard Template Library)
The Microsoft C++ compilers for Windows CE do not support the STL.
Some differences between the Win32 API and the Windows CE API were listed, but not all.
In particular, no information was obtainable about audio recording and the standard wave
file format.
8.3.2 The porting process
8.3.2.1 UNICODE
The Unicode environment was a bit confusing at first, but these problems were solved
quickly. The use of string functions in particular caused some problems, because the
Unicode environment requires some rethinking.
8.3.2.2 User interface
Porting the user interface was a straightforward procedure. Although no problems occurred
during the porting process, the final user interface is not as good as it could be. The user
interface for Windows CE must be completely redesigned to fit such a small display.
8.3.2.3 Windows CE API
Most functions known to be unsupported by Windows CE were avoided during coding, but some
of them were still used. They caused some additional porting effort.
The larger part of the necessary porting work was caused by functions that are not
supported by Windows CE and for which no information was available at the start of coding.
These functions include the shell extension for selecting a directory (SHBrowseForFolder),
the dialog for selecting a color (ChooseColor) in the configuration dialog, all functions of
the resource interchange file format (RIFF) services, and the buffered services for audio
recording.
Since the audio recording and the wave file functions are a big part of this project, new
classes with the same functionality were implemented.
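To give an impression of what such a replacement has to do, the following sketch builds the 44-byte header of an uncompressed PCM wave file by hand, without any RIFF helper functions. It is written in portable standard C++ and is not the class implemented for ConfDoc; the function names are hypothetical, and compressed formats (which need a larger format chunk) are not covered.

```cpp
#include <cstdint>
#include <vector>

// Appends a 16-bit value in little-endian byte order.
void putU16(std::vector<std::uint8_t>& out, std::uint16_t v) {
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
    out.push_back(static_cast<std::uint8_t>(v >> 8));
}

// Appends a 32-bit value in little-endian byte order.
void putU32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
}

// Appends a four-character chunk identifier such as "RIFF".
void putTag(std::vector<std::uint8_t>& out, const char* tag) {
    out.insert(out.end(), tag, tag + 4);
}

// Builds the header that precedes `dataBytes` bytes of PCM samples.
std::vector<std::uint8_t> makePcmWaveHeader(std::uint16_t channels,
                                            std::uint32_t sampleRate,
                                            std::uint16_t bitsPerSample,
                                            std::uint32_t dataBytes) {
    const std::uint16_t blockAlign =
        static_cast<std::uint16_t>(channels * bitsPerSample / 8);
    const std::uint32_t bytesPerSecond = sampleRate * blockAlign;

    std::vector<std::uint8_t> h;
    putTag(h, "RIFF");
    putU32(h, 36 + dataBytes);  // size of everything after this field
    putTag(h, "WAVE");
    putTag(h, "fmt ");
    putU32(h, 16);              // size of the PCM format chunk
    putU16(h, 1);               // format tag 1 = PCM
    putU16(h, channels);
    putU32(h, sampleRate);
    putU32(h, bytesPerSecond);
    putU16(h, blockAlign);
    putU16(h, bitsPerSample);
    putTag(h, "data");
    putU32(h, dataBytes);       // size of the sample data that follows
    return h;
}
```

Because the two size fields depend on the amount of recorded data, a recorder typically writes the header first with placeholder sizes and patches both fields when the recording is terminated.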
8.3.2.4 Conclusion
The project was started using Microsoft Visual C++ 5.0 and the Windows CE Toolkit for
Visual C++ 5.0. The help library for the Win32 functions did not point out anything about the
availability on the Windows CE operating system. So when coding the Win32 version I had to
keep in mind, which functions are available on Windows CE and which should not be used.
This was not very easy as the results show.
Upgrading to Microsoft Visual C++ 6.0 and the appropriate Windows CE Toolkit led to a
better help system for the developer. The included MSDN Library contains an essential part
called “Requirements” in the documentation of every function, where the availability of
this function is listed.
Figure 8-1 : Requirements from MSDN online
Unfortunately, this toolkit only became available after the porting was done.
A tip for developers: if no MSDN Library is available, the same information can be retrieved
from “MSDN online” (http://msdn.microsoft.com), which always offers the newest version of
the documentation.
8.4 Porting a Windows NT ACM Driver to Windows CE
8.4.1 Motivation
As covered in chapter 5.3.1.1 “Audio compression”, the audio data must be compressed to
avoid a huge amount of data. Since palmtop devices are equipped with little memory
(4 MB to 16 MB) and this memory is shared between main memory and file system,
compression is essential.
The first test device in particular, a Philips Nino 300, has only 4 MB of memory in total.
Dividing this memory into file system and main memory leaves a maximum of 2 MB for
data files.
A newly bought Palmtop PC has only two codecs preinstalled (refer to chapter 5.3.1.1
“Audio compression” for details):
• PCM
• MobileVoice
Remember, the lowest-quality PCM stream produces 8 kB of data per second. With the
available 2 MB of storage (2048 kB) this gives a maximum recording time of 256 seconds, or
4 minutes and 16 seconds. This is not a satisfying recording period.
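The arithmetic behind this figure can be written down as a small helper. This is only a sketch with hypothetical names; it assumes 1 MB = 1024 kB and ignores file headers and index data, which would shorten the real recording time slightly.

```cpp
#include <cstdint>

struct Duration {
    std::uint32_t minutes;
    std::uint32_t seconds;
};

// Maximum recording time in whole seconds for a given storage size (in kB)
// and a constant data rate (in kB per second).
std::uint32_t maxRecordingSeconds(std::uint32_t storageKB,
                                  std::uint32_t rateKBPerSecond) {
    return storageKB / rateKBPerSecond;
}

// Split a number of seconds into minutes and remaining seconds.
Duration toMinutesAndSeconds(std::uint32_t totalSeconds) {
    return Duration{ totalSeconds / 60, totalSeconds % 60 };
}
```

With 2 MB of storage, maxRecordingSeconds(2 * 1024, 8) yields the 256 seconds stated above. The same calculation for a 16 MB card gives 2048 seconds, or roughly 34 minutes.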
The MobileVoice codec, on the other hand, provides the best compression rates and the
longest recording times within the given memory limitations. But the quality of this
compression method is not acceptable, and the recorded words are often difficult to understand.
The goal is therefore to find a compression algorithm that meets both requirements:
• Good audio quality
• Acceptable disk space consumption
The best codec in the earlier comparison was the GSM codec, so it is the obvious choice for
the Windows CE device too. This, however, requires porting the codec from Windows NT to
Windows CE.
8.4.2 The theoretical porting
[MSDN03] describes the procedure as follows:
“Since the compression of the raw data is done via an ACM driver, this porting is necessary.
To port an ACM driver from Windows NT to Windows CE, link the driver with the
Acmdwrap.lib file, which provides the driver with the appropriate stream I/O interface. This
library not only handles all interactions with the ACM and the Device Manager, but also
passes messages from its ACM_IOControl function to the Windows NT ACM driver’s
existing DriverProc function.”
[MSDN04] and its subtopics list all registry entries that are necessary on the Windows CE
device to install the ACM codec.
8.4.3 The practical porting
So far so good.
The changes were made quickly with the help of sample source code on the Windows CE
Toolkit CD. Now the compilation could start. Two different methods were tried:
• Creating a makefile and compiling via the command line
• Creating a new project using the Developer Studio
It took several attempts and a long time to build a cegsm.dll file. But the real journey only
started at this point, with the attempt to install the brand-new codec on the device.
In theory, the installation is done by simply copying the file to the /Windows/System
directory and creating some registry entries to activate it. These registry entries are listed
below.
HKEY_LOCAL_MACHINE
  [Drivers]
    [BuiltIn]
      [CEGSM]
        SZ: Dll = cegsm.DLL
        DWORD: Order = 0x00000000 (0)
        SZ: Prefix = ACM
        DWORD: DeviceArrayIndex = 0x00000000 (0)
        DWORD: DeviceType = 0x00000000 (0)
        SZ: FriendlyName = GSM codec for WinCE
These registry entries can be created with some freeware programs or with the remote registry
editor from the Windows CE Toolkit. I tried both methods, but the device driver did not work
in either case.
The adapted source code, the compilation method and the created registry entries are three
parts of the development process that cannot be tested separately. Testing requires
performing all three steps at once. If the result does not work, it is hard to find out which
part is erroneous.
Debugging was not easy, because the device driver is loaded at boot time, that is, when the
device starts after a soft or hard reset. Since debugging is practically only possible via the
remote connection, and this connection is not available at boot time, debugging did not work.
The best thing I could do was to contact a specialist. I tried to get support from Microsoft,
but my repeated e-mails were not answered. Another source of help is the Internet. Searching
suitable newsgroups, I found a few articles from users with the same or similar problems,
but no solution. Posting my problem and waiting for help was unfortunately not successful
either.
At that point of the development I abandoned porting the device driver and bought a
CompactFlash card with 16 MB of memory. Using the installed PCM codec at a sampling rate
of 8 kHz, I achieved an average recording time of 34 minutes. This should be enough for testing.
8.5 Conclusion
Choosing Windows CE was an easy decision, because the Windows CE devices were the only
ones with multimedia support. At that time it was not possible to record audio streams using
a 3COM Palm or PSION device.
Furthermore, Windows CE runs on a wide range of different devices. Selecting a palmtop
device as the portable system led to a lightweight and handy device. The disadvantage of a
palmtop is its very small user interface, which makes it difficult to create the essential
index data.
9 Analysis of the new system
The system was tested on a stationary computer as a phone call documentation system and as
a portable system running on a Palmsize PC.
9.1 Stationary System
9.1.1 Hardware
The stationary test PC meets the following specifications:
• Intel Pentium at 350 MHz
• 64 MB RAM
• 8 GB hard disk
• SoundBlaster 16 stereo sound card with microphone and line-in jack
• Wacom tablet as pen device (WACOM Intuos A5, see
http://www.wacom.de/Products/intuos/intuosA5.htm for details) with an additional inking
pen (WACOM Inking pen, see http://www.wacom.de/Products/intuos/pens.htm#ink for details)
• Ordinary phone with special circuitry to capture the audio data (Plantronics Vista
Universal Amplifier, see [PLAN])
• Operating system Windows 95b
9.1.2 Recording a documentation
On the desktop computer, the application can be started at system boot. In idle mode, where
only the main window is active, the memory consumption and CPU usage of the program are
small enough for it to run in the background. The system is therefore quickly accessible at
any time, and there is no need to start the application manually at the beginning of a phone
call.
When an incoming or outgoing phone call begins, a new documentation can be started easily
by pressing one button. The recording starts immediately and the notes and handwriting
windows are shown.
Because the system is used for phone call documentation with an ordinary phone, the user
has only one hand free to generate the index data. For this reason, the pen device is very
helpful and the notes window is rarely used.
With an inking pen as the pen device, the notes can be written directly on a sheet of paper
placed on the tablet. The user writes in a familiar manner and does not need to look at the
screen while writing.
As seen from the tests, the procedure for recording a document is the following:
• The documentation is started with a single mouse click.
• The audio recording is done in the background; there is no need to interact.
• Using the inking pen, the notes are made in the handwriting window. Because there is
only one hand free, the textual notes are rarely used. The ability to create more than one
page is necessary, because the size of the paper on the tablet is A5.
• After the phone call the documentation can be saved using the OK button, or dismissed
using the Cancel button. This must be done manually.
• Because searching can only be done on the textual notes, some of these should be
provided. For this reason the current recording is saved, reopened and extended. Now the
recording is no longer running, and the textual notes window can be used to enter a short
summary or conclusion of the call.
9.1.3 Reviewing a documentation
The review process is started by clicking the open button in the main window. After a file
has been selected, the three windows are displayed.
During the tests, there was never any need to review the whole recording.
The handwritten and textual notes provide a good overview of the document, and the index
data allows precise access to specific parts of the recording. A simple click into the
handwriting window performs a seek to the specified location. The audio recording can then
be used to retrieve a conversation word by word.
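How a click can be turned into a position in the recording may be sketched as follows. This is only an illustration under assumptions: it supposes that every stored pen position carries the time at which it was drawn, and it picks the position nearest to the click. The structure and function names are hypothetical and do not reproduce the actual ConfDoc data format.

```cpp
#include <cstdint>
#include <vector>

// One recorded pen position together with the time it was drawn,
// in milliseconds from the start of the audio recording.
struct InkPoint {
    int x;
    int y;
    std::uint32_t timeMs;
};

// Return the timestamp of the ink point closest to the clicked position,
// or 0 (start of the recording) if no ink has been recorded.
std::uint32_t seekTimeForClick(const std::vector<InkPoint>& ink,
                               int clickX, int clickY) {
    std::uint32_t bestTime = 0;
    long long bestDist = -1;
    for (const InkPoint& p : ink) {
        const long long dx = p.x - clickX;
        const long long dy = p.y - clickY;
        const long long dist = dx * dx + dy * dy;  // squared distance suffices
        if (bestDist < 0 || dist < bestDist) {
            bestDist = dist;
            bestTime = p.timeMs;
        }
    }
    return bestTime;
}

// Convert a time offset into a byte offset within uncompressed audio data.
std::uint64_t audioByteOffset(std::uint32_t timeMs, std::uint32_t bytesPerSecond) {
    return static_cast<std::uint64_t>(timeMs) * bytesPerSecond / 1000;
}
```

For a PCM stream with 8000 bytes per second, a click near a stroke drawn 5 seconds into the call would map to byte offset 40000. With a compressed stream the offset would additionally have to be aligned to the block size of the codec.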
A search function across several documents is not directly implemented in the application.
Since the textual notes and the additional notes for markers are stored in text files, the
Windows search function can be used to search for information.
The file names in the search results of this dialog can then be used to open a document for
review. So the textual notes are used to find the correct documentation, and the handwritten
notes are used to locate specific information within this documentation.
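If such a search were built into the application instead of relying on the Windows search dialog, its core could look like the following sketch. It assumes the notes of each documentation have already been read from their text files into memory. The names are hypothetical, and the case-insensitive comparison handles ASCII only.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Lower-case copy of a string (ASCII only, sufficient for a sketch).
static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Return the file names of all documents whose notes contain the keyword,
// ignoring case. Map key: file name. Map value: text of the notes.
std::vector<std::string> findDocuments(
        const std::map<std::string, std::string>& notesByFile,
        const std::string& keyword) {
    std::vector<std::string> hits;
    const std::string needle = toLower(keyword);
    for (const auto& entry : notesByFile) {
        if (toLower(entry.second).find(needle) != std::string::npos) {
            hits.push_back(entry.first);
        }
    }
    return hits;
}
```

The returned file names play the same role as the result list of the Windows search dialog: each one identifies a documentation that can then be opened for review.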
9.1.5 Suggestions for improvement
• Starting a documentation should be possible with a hot key. On a multitasking system
several applications are running at the same time, and it is sometimes difficult to find the
Conference Documentation Application in order to start a new recording.
• To go one step further, the starting and stopping of a recording should be done
automatically.
• The tablet is too small. A bigger tablet would be better.
• The best results were achieved by making handwritten notes during the conversation and
writing a quick summary at the end of the recording. For this reason an extra dialog for
the summary should be provided at the end of a recording. The audio recording should be
stopped before this dialog is shown.
9.2 Portable System
9.2.1 Hardware
The portable version was tested on a Philips Nino 300 Windows CE Palmsize PC (see
http://nino.philips.com/ for details and datasheets). This device meets the following
specifications:
• RISC processor at 75 MHz
• 4 MB main memory
• Grayscale display with 240 x 320 pixels and four shades of gray
• Built-in microphone
• One CompactFlash card slot
9.2.2 Recording a documentation
A Windows CE device has much less memory than a desktop computer. The documentation
system therefore does not run all the time; it is started when a documentation is needed.
All devices with this operating system have four programmable buttons. The conference
documentation application can be assigned to one of these buttons, which makes it easy to
start the application when it is needed.
The two main differences between the desktop application and the portable system are the
screen size and the keyboard.
The small display with only 240 x 320 pixels provides very little space for handwriting.
The lack of a keyboard makes it very difficult to enter textual notes. Although a software
keyboard and handwriting recognition are built in to simulate a standard keyboard, using
them results in a very slow writing speed. So the main method for creating index data is the
handwriting window, which provides only a small area for notes.
9.2.3 Reviewing a documentation
The review process, including the search and seek capabilities, works in exactly the same
way as on the desktop PC. The only difference between the two systems is the size of the
devices.
Because of the small amount of available memory on the Palmsize PC, there were never more
than two or three documents on the system. Therefore it was not necessary to search for a
single documentation.
The Windows CE device also provides a standard search dialog, which can be used. The
small size of the user interface, however, makes the search dialog difficult to handle.
9.2.5 User Interface
Figure 9-1 : Main menu
The main window and the main menu are exactly the same as in the desktop version.
Figure 9-2 : Main recording window
Figure 9-3 : Handwriting window
The two illustrations above show the main recording window and the handwriting window.
Because a Palmtop device has no keyboard attached, the text input window is omitted.
Figure 9-4 : Review windows
When reviewing or recording a documentation, the two windows are displayed concurrently.
Because of the small screen size, one window has to be put in the background while the other
is used; both windows cannot be displayed entirely at the same time.
In the situation shown above, the marker line is hidden behind the handwriting window.
9.2.6 Suggestions for improvement
• The main problem of the portable system is the small size of the Windows CE device.
Creating index information is very difficult.
• The user interface must be redesigned for Windows CE.
• Since practically no textual information is stored, searching for a single document will be
very difficult.
9.3 Combining the two systems
As seen above, the portable system has significant disadvantages. The best results can be
achieved by using both systems in conjunction.
Because both systems create files of the same structure, documents recorded on the portable
system can later be reviewed on the desktop PC.
The integration of the Windows CE device into the desktop of the stationary computer allows
simple data synchronization between the two devices. The portable system can thus be used
to create a documentation on the road, and the stationary system, which is more powerful and
user friendly, can be used for searching and reviewing information from these documents.
10 Conclusion
Using this system simplifies the process of documenting meetings and conferences, but the
results depend on the field of application and the hardware equipment.
The basic idea of this system shows that the index data is the most important part of these
documents. The raw data is created in the background on both the desktop system and the
portable system, so the main attention can be turned to the creation of the index data. The
quality of the whole documentation always depends on the quality of the indices.
The desktop system provides an appropriately sized interface and sufficient possibilities and
workspace to create the index data. Depending on the type of index data, however, some
requirements have to be met. Creating textual information requires both hands free to use
the keyboard effectively, and entering handwritten information demands a pen device as a
replacement for or supplement to the mouse.
The beta tests with the desktop system have confirmed that this application can only be used
as a phone call documentation system with some additional hardware. Either a headset is used
to keep both hands free for creating the index data or, with a standard phone, a pen device is
used to create handwritten notes.
Compared with the desktop system, the portable system has two main differences:
• smaller amount of memory and
• smaller device dimensions
The smaller amount of memory will play no role: with compression algorithms and flash card
memory extensions, the available memory will be sufficient to record even long meetings.
The smaller device dimensions make the device easy to carry, but they also imply smaller
display dimensions and a smaller user interface. This makes it hard to create the necessary
index data.
10.1 Further topics and future work
The documents created by ConfDoc seem to be sufficient to retrieve specific information
within a reasonable time. Using the keyboard for notes results in textual information which
can be used directly for a full-text search over several documents. This enables a good
search across documents.
The handwriting window allows index data to be created in a very familiar and easy way by
simulating a pen and a sheet of paper. The search capabilities within a single document are
sufficient, but the handwritten notes produce no textual information even when text is
written by hand.
The main problem of this application is the user interface, especially on the portable
system.
Textual notes provide more search capabilities, but the writing speed is very slow and the
user needs both hands free for typing. The second index generation method, handwriting, is
easier to use but results in less searchable data because no textual information is generated.
On the other hand, it seems reasonable to take different hardware into consideration.
Palmsize PCs have a user interface that is too small for comfortable use. The CrossPad
introduced in chapter 4.3.6 could be an alternative. Of course this device will not solve all
problems: the user interface for the handwritten notes will be more comfortable than on the
Palmsize PC, but textual notes are still not created.
Implementing handwriting recognition suggests itself as a way to combine handwriting with
the creation of textual information. To go one step further, a voice recognition system
could do a great job of producing textual information.
List of figures:
Figure 2-1: Screenshot of the movie and the accumulated whiteboard
Figure 3-1: Illustration for patent application
Figure 3-2: Figure 1 of EP0495612
Figure 4-1: Examples for subnotebooks: [PANA] and [LIBR]
Figure 4-2: Example of a pen device: [FUJI]
Figure 4-3: Example of a handheld PC: [JORD]
Figure 4-4: Example of a palmsize PC: [AERO]
Figure 4-5: The Crosspad: see [CROSS]
Figure 5-1: Audio capture circuit for an analog phone
Figure 5-2: Structure of a RIFF chunk
Figure 5-3: Drawing a line
Figure 6-1: The main window before
Figure 6-2: ... and after the configuration
Figure 6-3: The main menu to bring up the configuration
Figure 6-4: The configuration window
Figure 6-5: Selecting a codec for the compression and decompression
Figure 6-6: Selecting a color
Figure 6-7: The main recording window
Figure 6-8: The additional marker notes
Figure 6-9: The handwriting window
Figure 6-10: The notes window
Figure 6-11: The main playback window
Figure 6-12: Seeking using the handwriting window
Figure 7-1: Main state diagram
Figure 7-2: Initializing the recording
Figure 7-3: Terminating the recording
Figure 7-4: Callback function of the recording
Figure 7-5: Recording a marker event
Figure 7-6: Handwritten notes
Figure 7-7: Textual notes
Figure 7-8: Playback of a document
Figure 7-9: Searching using markers
Figure 7-10: Searching with handwritten notes
Figure 7-11: Searching using textual information
Figure 8-1: Requirements from MSDN online
Figure 9-1: Main menu
Figure 9-2: Main recording window
Figure 9-3: Handwriting window
Figure 9-4: Review windows
References:
[AERO]
Compaq Aero 2100 palmsize PC
http://www.compaq.com/products/handhelds/2100/index.html
[BM97]
Chr. Bacher, R. Müller, Th. Ottmann, M. Will
An Integrated Environment for Mbone Session Recording and Replay
Submitted for EDMEDIA '97, the World Conference on Educational Multimedia and
Hypermedia (June 14-19, Calgary, Canada)
[CEWIN]
Chris De Herrera's Windows CE Website
http://www.cewindows.net/
“Choosing a PDA”
http://www.cewindows.net/choosingpda.htm
[CROSS]
cross pen computing group
http://www.crosspad.com
“CrossPad”
http://www.crosspad.com/products/crosspad/index.html
[DDJ]
Dr. Dobb's Journal, April 1998, p. 62ff
Bruce Radtke
“WINDOWS CE WIN32 API PROGRAMMING”
http://www.ddj.com
[FUJI]
Fujitsu personal systems, Inc. - pen computers
http://www.fpsi.fujitsu.com/
“Stylistic 2300”
http://www.fpsi.fujitsu.com/product/st2300.htm
[GSM610]
GSM 06.10 lossy speech compression
http://kbs.cs.tu-berlin.de/~jutta/toast.html
[HM96]
H. Maurer,
Hyper-G now Hyperwave: The next generation web solution,
Addison-Wesley Publishing Company, Harlow, England, 1996
ISBN 0-201-40346-3
http://www.iicm.edu/hwbook
[JORD]
Hewlett Packard
HP Jornada 680 Handheld PC
http://www.hp.com/austria/pc/jornada_680.html
[JS96]
John Scourias
Overview of the Global System for Mobile Communication
http://www.shoshin.uwaterloo.ca/~jscouria/GSM/gsmreport.html
[LIBR]
Toshiba Libretto
http://www.toshiba.com
[MMRUS]
Mobile Multimedia Research at the University of Southampton
http://www-mobile.ecs.soton.ac.uk/speech_codecs/
[MSDN01]
Microsoft Developer Network
http://msdn.microsoft.com
“Resource Interchange File Format Services” at the location
http://msdn.microsoft.com/library/psdk/multimed/mmio_2uyb.htm
[MSDN02]
“Porting Windows 95 Programs to Windows CE” at the location
http://msdn.microsoft.com/library/backgrnd/html/msdn_porting.htm
[MSDN03]
“Porting a Windows NT ACM Driver to Windows CE” at the location
http://msdn.microsoft.com/library/wcedoc/wceddk/acmdrv_10.htm
[MSDN04]
“Registry keys for Device Drivers” at the location
http://msdn.microsoft.com/library/wcedoc/wceddk/rkeys_1_2.htm
[OB95]
Thomas Ottmann, Christian Bacher:
Authoring on the Fly
J.UCS Vol. 1, No. 10, Oct 28, 1995 - Also available as a technical report (No. 72).
http://www.iicm.edu/authoring_on_the_fly
[OEPD]
Austrian patent database
http://www.patent.bmwa.gv.at/ and http://at.espacenet.com/
[PENC]
Pen Computing
covering mobile computing & communication
http://pencomputing.com/WinCE
[PANA]
Panasonic
“ToughBook 17”
http://www.panasonic.com/computer/notebook/products/toughbook17.htm
[PLAN]
Plantronics - Headsets
http://www.plantronics.com
“Vista universal amplifier M12”
http://www.plantronics.com/emea2/products/product_sheets/m12.html
[WAC]
WACOM Europe Ges.m.b.H
http://www.wacom.de
[WEBPAD]
Portable Web Access
http://www.national.com/webpad
