Documentation of Conferences with indexed data access
(German title: Dokumentation von Konferenzen mit indiziertem Datenzugriff)

Master’s Thesis at Graz University of Technology

submitted by Bernd Kohlmaier
F874 – 9032604

Institute for Information Processing and Computer Supported New Media (IICM),
Graz University of Technology
A-8010 Graz, Austria

January 2000

© 2000, adhoc Hard- und Software Ges.m.b.H Nfg Kg

Advisor: o.Univ.-Prof. Dr. Dr.h.c. Hermann Maurer
Supervisor: Dipl. Ing. Thomas Dietinger

Abstract

Meetings, conferences and individual arrangements are a very important part of a company’s life. Whether business meetings take place in person or arrangements are made by telephone, all the necessary information must be preserved for later retrieval. An obvious approach is to create an audio or video recording in order to obtain a complete documentation and thus to be able to retrieve the exact wording at a later time.

This thesis describes an application called ConfDoc that helps to generate such a complete documentation of conferences or meetings and to quickly retrieve specific information within one or more documentation sets. Using so-called index data to seek within a single document makes it possible to find specific information quickly without reviewing the whole recording. Porting the system to portable computers such as handhelds or palmtops makes it usable on the road, which is important when the meeting takes place on a client’s premises.

Kurzfassung (German abstract, translated)

Meetings, conferences and individual agreements are an essential part of everyday business life. Regardless of where these meetings take place, all relevant information must be recorded for later use. An obvious approach is to create an audio or video recording so that, with the help of this complete documentation, what was said can be reconstructed word for word at a later time.

This thesis describes ConfDoc, a program for creating such documentation easily. In addition, ConfDoc allows a fast search for relevant information within a single document or across several documents. Referencing by means of so-called index data within a document makes it possible to locate specific data quickly without having to review the entire recording. A separate ConfDoc version for portable computers (handhelds or palmtops) has the advantage that handy and inconspicuous equipment is always available. This is of practical use above all when meetings take place on the client’s premises.

I hereby certify that the work presented in this thesis is my own and that work performed by others is properly cited.

(Translated from the German statement:) I hereby declare that I have written this thesis independently, have not used any sources other than those cited, and have not otherwise made use of any unauthorized aids.
Contents

1 Introduction
2 Current documentation techniques
2.1 Handwritten notes
2.2 Audio / Video recording
2.3 Using “standard software”
2.4 Authoring on the fly
2.5 Conclusion
3 Patent search
3.1 Request for a patent search
3.1.1 Description (Beschreibung)
3.1.2 Patent claim (Patentanspruch)
3.1.3 Summary (Zusammenfassung)
3.2 Result EP0495612 – A data access system
3.2.1 General
3.2.2 Abstract
3.2.3 Description
3.2.4 Figures
3.2.5 Patent claims
3.2.6 Comparison to our request
3.3 Result US5172281 – Video Transcript Retriever
3.3.1 General
3.3.2 Abstract
3.3.3 Patent claims
3.3.4 Comparison to our request
3.4 Conclusion
4 Requirements on a new system
4.1 Requirements made by the user: Optimal scenario
4.1.1 Recording a document
4.1.2 After the conference
4.1.3 Playback and usage of the document
4.2 Aspects seen by the developer
4.2.1 Types of information
4.2.2 Hardware / Software
4.3 Client Hardware discussion
4.3.1 Laptops, Notebooks
4.3.2 Sub-Notebook
4.3.3 Pen Computer
4.3.4 Handheld PC
4.3.5 Palmtop PC
4.3.6 Crosspad
4.3.7 Summary
4.4 Extended terms of reference
5 Developing a new system
5.1 Platform, Operating system
5.2 Additional Hardware
5.3 Functionality
5.3.1 Raw data
5.3.2 Index data
5.4 Scenario with this application
5.4.1 Recording a document
5.4.2 Playback and usage of the document
5.5 Purpose of this application
6 User guide
6.1 Running the application
6.2 Configuration
6.3 Creating a new documentation
6.3.1 The main window
6.3.2 Handwriting and Notes
6.4 Reviewing a documentation
6.4.1 Random seek
6.4.2 Accurate seek
6.5 Extending a documentation
7 The Implementation
7.1 The main parts of the program
7.2 Creating a new documentation
7.2.1 Audio
7.2.2 Markers
7.2.3 Handwritten notes
7.2.4 Textual notes
7.3 Reviewing a documentation
7.3.1 Playback
7.3.2 Searching for information
8 Making the system portable
8.1 Hardware
8.1.1 Operating system
8.1.2 Device Type
8.2 Software
8.2.1 Differences between the Win32 and Windows CE APIs
8.2.2 Memory Limitations
8.2.3 Energy Limitations
8.2.4 Hardware Characteristics
8.2.5 User Interface
8.2.6 Testing and Debugging
8.3 Porting ConfDoc
8.3.1 Preparing the implementation
8.3.2 The porting process
8.4 Porting a Windows NT ACM Driver to Windows CE
8.4.1 Motivation
8.4.2 The theoretical porting
8.4.3 The practical porting
8.5 Conclusion
9 Analysis of the new system
9.1 Stationary System
9.1.1 Hardware
9.1.2 Recording a documentation
9.1.3 Reviewing a documentation
9.1.4 Searching for information
9.1.5 Suggestions for improvement
9.2 Portable System
9.2.1 Hardware
9.2.2 Recording a documentation
9.2.3 Reviewing a documentation
9.2.4 Searching for information
9.2.5 User Interface
9.2.6 Suggestions for improvement
9.3 Combining the two systems
10 Conclusion
10.1 Further topics and future work

1 Introduction

Arranging meetings and conferences is a necessity for every company, whatever its business. Whether these meetings take place in person or arrangements are made by telephone, all the necessary information from the meeting must be preserved for later retrieval.

The use of computers suggests creating a documentation system in which the audio or video information is interlinked with additional textual notes or graphical illustrations. This additional information can be used to index the raw audio / video data when reviewing the recordings.

The idea was born at the company “adhoc Hard- und Software”, located in Klagenfurt, where I am currently employed. As a company focused on the development and production of custom designs in electronics and telecommunications, we work on several projects concurrently. During development many questions and problems arise that need to be solved quickly. Many agreements are also made by phone, which results in many separate and often short phone calls. As a consequence, it is hard to keep track of all these verbal arrangements and to relate them to the right project.

So we started to look for a system to record these dialogues. We required that the product be usable for different types of conferences, such as phone calls or meetings. Furthermore, we wanted a system that would work at various locations, such as the company’s conference room or, on the road, a client’s premises.

This thesis describes an application called ConfDoc, which helps to generate a complete documentation of conferences or meetings and to quickly retrieve specific information within one or more documentation sets.

Chapter 2 outlines the documentation methods currently in use. They range from handwritten notes on a sheet of paper to more elaborate systems that create files with the help of computers.
Chapter 3 focuses on a patent search that was carried out. The request for the patent search is outlined and two interesting results are described in detail.

Chapter 4 summarizes the requirements on a new system. Two aspects are examined in particular: the requirements raised by the user and the aspects seen by the developer, which cover the hardware and software requirements.

The process of creating the new ConfDoc application is the topic of Chapter 5. It starts with a discussion of the hardware, covers the functionality of the application and describes a scenario that becomes possible with the help of ConfDoc.

Chapter 6 contains a user guide, while the actual implementation is documented in Chapter 7.

An important feature of ConfDoc is its portability. Chapter 8 explains how the system is made portable using a palmtop device running the Windows CE operating system. The different device types and operating systems are discussed, and the process of porting the software to Windows CE is described.

Chapter 9 analyzes the implemented ConfDoc application running on either the stationary or the portable system. This chapter also covers hardware and software aspects and gives some suggestions for further improvements.

Finally, Chapter 10 gives a short conclusion and an outlook on other ideas for creating a documentation and the obvious difficulties they cause.

2 Current documentation techniques

This chapter lists the techniques currently used to document a meeting. The different types of generated data are outlined, and the connections between these data types are shown where any exist.

2.1 Handwritten notes

The simplest way to generate a documentation is to take notes. In general, the participants themselves write down some comments by hand during the meeting. Depending on the participant’s handwriting speed, the information is more or less detailed.
As there is hardly time to write down the conversation word for word, the writer must decide on the spot whether a piece of information is important enough to write down. When a computer is used to avoid handwritten scrawl, typing is still slower than writing by hand. In many cases the resulting documents are incomplete: important pieces of information are missing, or the written notes are incoherent.

The best result is obtained by creating a fair copy of the handwritten notes. This requires reviewing the meeting and spending some time reworking its contents, which necessarily happens after the meeting has ended. Because the handwritten notes are incomplete, it is up to the participant to fill them in from memory. If the meeting was long, this can be very difficult.

2.2 Audio / Video recording

A more advanced method is to make an audio or video recording of the event. Since a recording represents a complete documentation, it is possible to review the conference or retrieve the exact wording of a conversation. This may be important for proving agreements or promises conclusively.

But this method is far from perfect. It does not make it possible to search for a specific phrase within the recorded data, or to quickly review the part of the recording that belongs to a specific subject discussed during the conversation. The only way to use the recordings is to go through all of the data, or to seek within the document with the help of additional notes or of distinguishing marks recalled from memory.

2.3 Using “standard software”

With a common personal computer and readily available software it is possible to create multimedia documents that represent a simple documentation of a meeting or conference. Just imagine a word processing application with the ability to include drawings such as rectangles, circles or lines.
Every popular system is of course able to import graphics and even audio or video files. With these capabilities you can make notes and draw sketches during a conference to document the proceedings directly on your computer. A multitasking system makes it possible to record an audio or video source in the background while the word processor runs in the foreground. These audio files can be linked to or imported into the word processing document. The result is a multimedia document that makes it possible to review a conference word for word using the recorded audio.

The above outline gives an impression of a multimedia document that contains all the information necessary for a complete documentation of a meeting: text, graphics and audio or video. But one important point is missing: the different types of information are not connected to each other. They are more like several independent documents. Of course, an audio recording and a text document are each part of the whole documentation, but between these files there is no relation from a point in one component to the corresponding point in the other. When examining a specific passage in the text file, it is impossible to find the corresponding passage in the audio file directly. The only way to verify the spoken words in the audio recording is to search manually for the passage or to listen to the whole audio document. The latter alternative in particular is unsatisfactory because listening to the whole document takes a very long time.

2.4 Authoring on the fly

The following method was not intended for documenting meetings, but the resulting documents show some similarity to that purpose. Authoring on the fly is a way of producing hypermedia documents to support teaching at universities.
A computer-held lecture is automatically converted into the core of a multimedia document and linked with papers, textbooks, animations and simulations. The following lines from [HM96] describe the method:

“Authoring on the fly, a term coined by Maurer in 1994, refers to the possibility of preparing a substantial piece of courseware during the process of actually delivering a lecture. The basic idea is simple enough: A lecture is delivered by extracting prepared multimedia material from a Hyper-G server and projecting it with a videobeam or LCD panel. The prepared multimedia material may just look like ordinary color transparencies, but might also include animation, movie clips, sound effects, simulation and other educational material. Thus, Hyper-G and the material prepared is just used for presentation purposes, so far. As the lecture is delivered, the voice, face (or the whole body for gestures) and actions of the lecturer such as pointing to or highlighting certain material shown are recorded, digitized and stored in Hyper-G together with the original presentation material. Thus, after the lecture is finished it is available as a Hyper-G document that can be reused at leisure at any later time: authoring of a one-hour piece of courseware complete with everything takes just one hour! Of course, the effort to produce the material to present or more realistically to select from the, hopefully, comprehensive electronic library has also to be taken into account.”

An implementation developed by Thomas Ottmann and Christian Bacher uses an electronic substitute for the blackboard and also transmits the lecture to remote locations. The experiments carried out demonstrate that classroom lecturing, distance teaching, and the production of educational hypermedia can be successfully integrated.

The production of an AOF (authoring on the fly) document is described in [BM97] as follows:

“First, all the slides to be used in a specific lecture are selected. These slides are PostScript documents produced as usual by standard tools such as LaTeX or Framemaker. In order to enhance the comprehension of the lecture, any applications such as animations or movies can be included. The slides and applications can be provided with custom titles. These are later used to automatically generate a table of contents for the hypermedia document. At this stage, the recording session is ready to be started. While recording, the slides are loaded into the whiteboard. These can be marked and annotated as usual using the whiteboard's features. While delivering the lecture, the selected applications can be started simultaneously. The data resulting from these actions, namely the whiteboard data stream, the applications' start and stop commands and the audio stream, are recorded. At the end of the recording session, the data is immediately converted into an internally used format [OB95] to generate an integrated document for offline use. Upon accessing the document, the built-in viewer performs the synchronous replay of the recorded data streams as well as starting and stopping the applications as presented during the lecture. An integrated document handler supports the creation of collections where any additional documents can be inserted. Operations for moving, copying or deleting documents are also supported.”

They delivered a lecture on the computer and converted it into a hypermedia document. The lecture was recorded on an S-VHS video tape, which was later digitized (audio and video) with the SGI capture tools. Capturing the audio and video stream in sufficient quality requires powerful hardware and some experience in the proper use of the software tools.
The wb (whiteboard) output was recorded with MCASTREC¹, a novel program for recording a whiteboard session, and then converted into a format that can be read by SYNCVIEW², an external Hyper-G viewer. Then a text file with the paths of the PostScript slides and their titles was edited. The result is a multimedia document consisting of the sound and video of the lecturer's talk as well as the demonstrations on the wb. The program SYNCVIEW presents this multimedia document by synchronizing the wb actions with video and sound. It is also possible to scroll backward and forward in the document using a slider.

¹ available under ftp://ftp.informatik.uni-freiburg.de/pub/AOF
² see footnote 1

Figure 2-1: Screenshot of the movie and the accumulated whiteboard

Although this tool provides recording and playback of video data and the generation of additional information such as predefined graphics, text and drawings, it does not provide a way to use the additional information as an index into the video data. The resulting hypermedia document is useful for lectures, but lectures proceed linearly, and the document is of little use when searching for specific phrases.

2.5 Conclusion

These techniques show us the different types of information.

The “audio / video recording” generates raw data, which provides a complete documentation of a meeting. This method makes it possible to retrieve the exact wording of a conversation. This may be important in some cases, but retrieving specific information is very time-consuming, because there is no way to search within the document. The only way to get information is to browse through the document.

The “handwritten notes” represent so-called index data. This method requires filtering the contents during the meeting to decide whether a piece of information is important enough to write down. This leads to incomplete documents. In general, retrieving exact wordings is not possible.

The other two methods provide a complete documentation in the form of a video recording combined with additional information collected manually by the user. This additional information mainly consists of text and drawings. But these methods offer no way to create connections between the two parts. Retrieving specific information from the complete documentation is as difficult as retrieving it from a video recording, because no search method is provided.

3 Patent search

With the intention of applying for a patent, a patent search was requested. The request described below yielded two interesting results, which are covered after the description of the request.

3.1 Request for a patent search

The following request was sent to the Austrian patent office in 1999.

Note: The original request was written in German.

3.1.1 Description (Beschreibung)

3.1.1.1 Technical field

Accelerated location of significant positions in multimedia data streams of video and audio recordings.

3.1.1.2 State of the art

Multimedia documentation such as video recordings, images, sound documents, and handwritten and ASCII text documents has become quite common in modern companies thanks to corresponding leaps in computing performance (high processing power and storage capacity).
At present, the problems essentially comprise two aspects:
• Poor retrievability of documents from collections of documentation
• Time-consuming searching within such documents

3.1.1.3 Innovative claim

The problems listed above are to be eliminated by the following measure:
• Logical linking (indexing) of documents that are hard to survey (audio/video recordings) with documents that are easy to survey (handwritten notes)

3.1.1.4 Equivalent patent claim

Automatic indexing of multimedia data streams by means of (hand)written notes, characterized in that a logical link is automatically established between easily surveyed documents (handwritten notes or ASCII text) and documents that are hard to survey (audio/video recordings). This is usefully done by means of corresponding time markers, which are attached to individual objects of the easily surveyed document and refer to corresponding objects of the hard-to-survey document.

3.1.1.5 Figures

A  Audio/video recording on a computer storage medium with a running time function
B  Electronic notepad with integrated clock function
C  (Hand)written notes in real time
D  Automatically assigned references via the time function

Figure 3-1: Illustration for the patent application

3.1.1.6 Detailed description

To relocate specific, characteristic positions in sequentially recorded multimedia documents, the system in question implicitly stores a continuous time marker along with such a recording (Figure A). (Hand)written notes (Figure C) created at the same time with the help of suitable electronic media (Figure B) use the same time markers and thus subsequently allow, via these time markers, direct referencing between particular text passages and the sequential recordings made at that moment (Figure D).
By using electronic recording media for A and B, random access (immediate positioning) thus becomes possible on documents that otherwise exist only in sequential form.

3.1.2 Patent claim

Automatic indexing of multimedia data streams by means of (hand)written notes, characterized in that a logical link between easily surveyed documents (handwritten notes or ASCII text) and hard-to-survey documents (audio/video recording) is established automatically. This is usefully done via corresponding time markers, which are attached to individual objects of the easily surveyed document and refer to corresponding objects of the hard-to-survey document.

3.1.3 Summary

Automatic indexing of multimedia data streams by means of (hand)written notes, in that a logical link between easily surveyed documents (handwritten notes or ASCII text) and hard-to-survey documents (audio/video recording) is established automatically. This is done via corresponding time markers, which are attached to individual objects of the easily surveyed document and refer to corresponding objects of the hard-to-survey document.

3.2 Result EP0495612 – A data access system

Because this result closely resembles our own idea, it is reproduced here in full detail.

3.2.1 General

Inventor(s): Lamming, Michael G.
Applicant(s): XEROX CORPORATION
Issued/Filed Dates: July 22, 1992 / Jan. 14, 1992
Application Number: EP1992000300285

3.2.2 Abstract

A note-taking system based on a notepad computer with an integrated audio/video-recorder is described. A document is created or retrieved. As the user types on the keyboard or writes with the stylus or similar input instrument, each character or stroke that is input by the user is invisibly time-stamped by the computer.
The audio/video stream is also continuously time-stamped during recording. To play a section of recording back, the user selects part of the note (perhaps by circling it with a stylus) and invokes a "playback selection" command. The computer then examines the time-stamp and "winds" the record to the corresponding place in the audio/video recording, where it starts playing - so that the user hears and/or sees what was being recorded at the instant the selected text or strokes were input. With a graphical user interface, the user may input key "topic" words and subsequently place check marks by the appropriate word as the conversation topic veers into that neighborhood.

3.2.3 Description

This invention relates to a data access system, and in particular to one in which the taking of notes usually accompanies the recording of data. The notes themselves are used to gain selective access to the data.

3.2.3.1 The problem

At seminars, interviews or meetings, it has long been recognized that the simple act of taking notes helps the writer memorize key facts. On the other hand, to make a note, the listener has to divert attention from the speaker, figure out how to encode the information he has heard or the idea he has had, and then focus on writing it down. It doesn't seem surprising that the listener often loses the speaker's thread!

Nowadays many people resort to making audio or video recordings of important events and then spending time transcribing key ideas afterwards. Locating the interesting parts of an audio or video recording is often very time-consuming and tedious, which reduces the likelihood that the listener will bother to transcribe the tape, or even bother to make the recording in the first place. In many situations the need to record is not anticipated. Only after the event, sometimes long afterwards, does the need become evident.
To overcome this problem it may eventually become technologically possible, from a storage point of view, to record and store every second of a person's working day. The problem of locating key pieces of information from this huge and expanding base of recordings becomes overwhelming.

This invention builds upon these ideas by providing a semi-automatic, fine-grained, audio/video indexing tool. Although the invention focuses on gaining access to audio and video records, it is not limited to this application. It is envisaged that the invention could also be applied to accessing files in a computer system, phone call records, or any other time-stamped data set.

The present invention will now be described by way of example with reference to the accompanying drawings, in which:
• Figure 1 is a diagrammatic view of one form of the invention using a laptop computer;
• Figure 2 is a diagrammatic view of another form of the invention using a notepad computer with stylus input, and
• Figure 3 is a diagrammatic view of one form of the architecture of the invention.

The example of the invention shown in Figure 1 combines a laptop computer 2 with a video or audio recording device 4. In the example of the invention shown in Figure 2, the portable computer is replaced by a note-pad computer 6 with stylus or other graphical input device 8, connected to video or audio recording device 4. The stylus input device may take the place of the keyboard of the portable computer as the user's means of input. Alternatively, a keyboard 10 may be connected to the notepad computer for input. Either form of computer is interfaced to the recorder in such a manner that the computer can monitor the recorder's progress during recording. To enable this, the recorder may emit timecode or make other unique index information continuously available to the computer.
Additionally the computer can operate the recorder directly, causing it to record, or play from a specific index point as required. In any situation it is possible for the computer to find out the time-code or index information.

In the general system shown in Figure 3, the computer presents a document editor style user interface to the user. This interface may resemble that of a word-processor, an illustrator, or an application specific interface, depending on the user's needs and computer hardware. The user interface provides commands for creating a new document and starting and stopping recording. A new document is created, or an old one recalled, and recording commences. The editor 22 allows the user to draw, type or sketch on the blank document using keyboard 10, stylus 8, or other graphical input device. As each mark (or indicium) 26 is added to the document (an 'indicium' is any indivisible displayable symbol created by the interface, e.g. pen-stroke, character, graphical symbol, etc.), the editor time-stamps and stores it in an indicium-to-timestamp index 14. The video-frame time-stamper 20 also continuously notes the index information 32 arriving from the recorder 4 at that instant, and time-stamps it and stores it in a timestamp-to-timecode index 16. The time-stamps are not visible to the user - they are stored with the computer's internal representation of the indicia. When the session is over, the user stores away the document, the contents of the two indexes, and the recording for subsequent use. Any method that allows time-stamps to be used as an index into the recording may be used instead of the timestamp-to-timecode index, e.g. interpolation of the timecode itself.

When the user wants to recall sections of the recording, the appropriate related document is retrieved and recalled to the display 30 by the browser 24.
Using the stylus 8, or the keyboard 10, the user selects one or more indicia on the document, perhaps by circling them, positioning a cursor, or typing an identifying name, and instructs the browser 24 to play the associated video section. The browser 24 identifies the selected indicium or indicia, looks up the time-stamp(s) in the indicium-to-timestamp index 14, looks up the timestamp(s) in the timestamp-to-timecode index 16, and plays the section of the recording in the area indicated by the resultant timecode. Thus the user sees what was recorded at the time the marks were made on the video monitor 28. Indicium-to-timestamp and timestamp-to-timecode indices may be combined into a single indicium-to-timecode index.

It is expected that users will develop shorthand notations for marking the document to indicate an interesting idea, change of topic or speaker, and so forth. It also is expected that users will type or handwrite key words as the topic arises, and then add additional marks nearby each time the subject veers in the direction of that topic. Thus, to hear all sections of a seminar associated with the same topic, it is necessary only to circle all the marks in the location of the topic keyword.

Accordingly it will be seen that the present invention provides a simple, inexpensive, lightweight, low-power device for quick access to data records.

3.2.4 Figures

Figure 3-2: Figure 1 of EP0495612
Figure 3-3: Figure 2 of EP0495612
Figure 3-4: Figure 3 of EP0495612

3.2.5 Patent claims

1.
A system for providing random access to a time-stamp identified part of a time-stamped data set, said system comprising:
• means for capturing, filing, retrieving, displaying, and automatically time-stamping user-generated indicia for selection by a user, said time-stamped data set and said time-stamped indicia having a common time base;
• means for determining the time-stamp of each user selected indicia; and
• means for indexing into said data set to identify the part thereof that is time-stamp correlated with said selected indicia.

2. A system as claimed in claim 1, including means for presenting to the user the identified portion of the data set.

3. A system as claimed in claim 2, in which the time-stamped data set is audio and/or video data.

4. A system as claimed in any of claims 1 - 3, which is implemented by means of a portable computer.

5. A system as claimed in any of claims 1 - 3, which is implemented by means of a portable video recorder.

6. A system as claimed in claim 5, in which the portable video recorder is a camcorder.

7. A system as claimed in claim 4 in which the portable computer is a notepad computer with stylus input.

3.2.6 Comparison to our request

Although they never brought a product with this design to market, the basic idea is the same as ours. Even portability is covered by the patent claims. Our system differs from the one described above in combining a stationary system with a portable one and in offering search across more than one document. We also considered combining several documents relating to a single conference. This patent could be a reason for our request to fail.

3.3 Result US5172281 – Video Transcript Retriever

Since this second result is less similar to our idea than the first, it is not listed in full detail. Only the abstract and the patent claims are shown.

3.3.1 General

Inventor(s): Ardis; Patrick M., Memphis, TN 38119 Markovich; Marko R.
, New Orleans, LA 70115 Thompson; Kevin W., New Orleans, LA 70115
Applicant(s): None
Issued/Filed Dates: Dec. 15, 1992 / Dec. 17, 1990
Application Number: US1990000628082

3.3.2 Abstract

A video transcript retriever includes a control unit, a control interface, a tape unit and a display unit. The control unit includes a control computer having a software package consisting of control software, text software and edit software. The control software has the capacity to permit simultaneous operation of both the text software, which is capable of storing and searching voluminous documents, and the edit software, which has the capacity to operate the tape unit with precision. The text software is capable of performing a search function that at any time can provide the exact location of a specific passage within the searched document in terms of page and line. The edit software has the capacity to provide at any time the timecode number pre-recorded on the videotape that corresponds to a specific passage. The process for locating and retrieving specific information on a videotape includes the steps of striping the videotape by assigning a numerical address for every one-thirtieth (1/30) of a second segment of the videotape; indexing the words written in a computer transcript to the words spoken on the videotape by assigning a timecode number to both the computer transcript and the videotape segment where each question/answer passage begins; and instructing the tape unit to shuttle to a precise tape location determined by the timecode numerical address located during the search of the computer transcript.
The present invention relates to a process and apparatus for retrieving and displaying information on videotapes and, more particularly, to a process and apparatus for quickly and precisely retrieving and displaying specific images and testimony in a videotaped deposition by indexing a computer-generated transcript of the deposition proceedings with a video timecode number address on the videotape.

3.3.3 Patent claims

1. A system for indexing written and video records of a deposition comprising:
• a control unit having a control computer including a software package consisting of a text software having the capacity to store and search documents, an edit software having the capacity to operate a tape unit with precision, and a control software having the capacity to control said system and permit the simultaneous operation of said text software and said edit software;
• a videotape having video and audio information thereon;
• a document consisting of said audio information in a written record format stored in said text software;
• a control interface connected to said control unit to receive signals from said control unit and pass signals to a tape unit;
• a tape unit having said videotape thereon and connected to said control interface to receive signals therefrom;
• a numerical timecode address assigned to said videotape for every 1/30 of a second segment of said videotape;
• an identical numerical timecode address assigned to each audio information segment in said document corresponding to the identical audio information segment on said videotape; and
• a display unit connected to receive signals from said control unit and from said tape unit for selectively displaying the audio and video information at any numerical address on said videotape.

2.
A process for indexing each question and answer segment on a videotape deposition to each corresponding question and answer segment on a written record document of said deposition comprising:
• listening to the audio information on said videotape while simultaneously reading a written record of said deposition;
• marking the beginning of each question and answer segment of said deposition on said videotape with a numerical timecode address;
• sending said timecode addresses which identify the beginning of each question and answer segment on said videotape through a control interface to a control unit; and
• placing at the beginning of each question and answer segment of said written record of said deposition an identical numerical timecode address corresponding to the numerical timecode address previously assigned to each question and answer segment appearing on said videotape by means of a command generated in said control unit and passing through said control interface.

3. A process for retrieving and displaying video and audio information on a videotape comprising the steps of:
• providing a written record document of the spoken words appearing on said videotape in question and answer segments;
• indexing the words written in said record to the identical words spoken on said videotape by assigning a timecode numerical address for each question and answer segment on said written record corresponding to the identical question and answer segment on said videotape;
• searching said written record to locate a specific word passage and the timecode numerical address assigned thereto;
• instructing a tape unit to shuttle said videotape to the precise tape location as determined by the timecode numerical address located in said written record; and
• displaying the information on said videotape located at said timecode numerical address.
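
The retrieval process of claim 3 can be illustrated with a short sketch. The following Python example is only an illustration under stated assumptions, not the patented implementation: it assumes exactly 30 addresses per second without drop-frame correction, and the names `Passage`, `frame_to_timecode` and `find_passages`, as well as the sample transcript, are hypothetical.

```python
from dataclasses import dataclass
from typing import List

FPS = 30  # assumption: one numerical address per 1/30 second, no drop-frame


def frame_to_timecode(frame: int, fps: int = FPS) -> str:
    """Convert a frame address into an HH:MM:SS:FF timecode string."""
    seconds, ff = divmod(frame, fps)
    minutes, ss = divmod(seconds, 60)
    hh, mm = divmod(minutes, 60)
    return f"{hh:02d}:{mm:02d}:{ss:02d}:{ff:02d}"


@dataclass
class Passage:
    """One question and answer segment of the written record (hypothetical)."""
    page: int
    line: int
    text: str
    start_frame: int  # address where the segment begins on the tape


def find_passages(transcript: List[Passage], phrase: str) -> List[Passage]:
    """Case-insensitive search of the written record for a phrase."""
    needle = phrase.lower()
    return [p for p in transcript if needle in p.text.lower()]


# Invented sample data for illustration only.
transcript = [
    Passage(1, 4, "Q: Where were you that evening? A: At home.", 0),
    Passage(1, 9, "Q: Who was with you? A: My brother.", 450),
    Passage(2, 1, "Q: When did your brother leave? A: Around ten.", 1830),
]

for hit in find_passages(transcript, "brother"):
    # A real system would now instruct the tape unit to shuttle to this address.
    print(f"page {hit.page}, line {hit.line}: seek to {frame_to_timecode(hit.start_frame)}")
```

The search runs only over the written record; the tape position follows from the address stored with each segment, which is why the process depends on a complete transcript.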

3.3.4 Comparison to our request

This invention describes a system that preserves two types of data:
• The video stream recorded on a tape
• The full text of the video recording, taken down by a stenographer

These two parts are combined to make retrieving information from the video recordings as effective as possible.

Aspects of this system that are the same as in our project:
• Audio or video recording
• Textual information used for indexing inside the recording

Characteristics of our project that are not included in this system:
• Drawing capabilities and graphical information used for indexing
• Full-text search facilities
• Searching within several documents
• Portability

Given these differences, I expect that these patent claims pose no problem for our request.

3.4 Conclusion

The first result, “A data access system”, describes our idea more or less exactly. It is surprising that this system has not been realized and marketed by now.

The second system discussed is intended for use in litigation. In the field of litigation, a deposition is a proceeding in which an attorney asks oral questions of a witness. For this reason, the system is designed specifically for question-and-answer conversations. Because of its full-text data, this system claims to be a good documentation system. With this database it is possible to perform a full-text search over the whole range of the recorded video. But as mentioned above, the system requires a stenographer to create the textual information. In addition, the lack of any way to create sketches or markers restricts the user to textual information. This system is very specific to use in litigation.

Our own patent application is currently pending, but there may be a problem because of its similarity to patent number EP0495612.

4 Requirements on a new system

Because an index for the recorded data is missing, the development of a new system including this feature is necessary.
The first question to be answered concerns the requirements for this application. A perfect documentation system must meet the following conditions:

• no information may be lost
This requirement can be taken for granted, because it is the basic prerequisite for retrieving precise information at a later time.

• fast retrieval of information is required
Systems that retrieve information less quickly are already available. But time is an important factor in the new task. As the proverb says: “Time is money”.

• conventional documentation techniques are preferred
This means that the person documenting a conference does not have to change his habits. The system should adapt to human habits.

The only hardware that fulfills these requirements consists of several sheets of paper and a pen. Of course, this is not the hardware we have in mind, but it gives us a good idea of what the hardware should look like. With these three points in mind, let's have a look at a visionary scenario that describes a perfect documentation of a conference.

4.1 Requirements made by the user: Optimal scenario

4.1.1 Recording a document

To go through the worst-case scenario, consider a conference that takes place on the client's premises. This means that no permanently installed hardware can be used.

The hardware for the client application used to make notes and drawings, which matches the conventional techniques, is realized by a pen device. The pad of this device is as large as a sheet of paper, and the pen is freely movable, which means that it does not need a cable. The thinner the pad, the better it can simulate a sheet of paper. The result is a pen-based computer or a pen-based input device with a touch screen that can be used as a notepad. The input device supports textual and graphical input. The software supports online handwriting recognition and aids the user when drawing straight lines, rectangles or other primitives.
A small and lightweight video camera of the kind also used by video conferencing systems produces the raw data, so both video and audio sources are recorded. Since the camera is small and handy, it can easily be pointed at an overhead or video projector display to capture demonstrations or supplemental material if necessary. To avoid annoying camera handling during the discussion, the camera is set up so that the whole scene is captured the entire time.

Before the conference starts, some general information can be entered. This includes data about the client, the project, the time and place of the meeting, and so forth.

During the conference, notes can be made as usual with pen and paper, including leafing through the pages. An important point for the recording phase is that every change, every character and every single line of a drawing is stored. Even delete and undo actions are recorded. This allows an exact reproduction of the particular conference.

4.1.2 After the conference

When the conference has finished, the system allows a quick review of the document, including adding and changing information. Back at the company, the client system is attached to the server and transfers the data to it. From this time on, the server is responsible for backup, retrieval and access rights. After the data has been copied to the server successfully, the client memory can be freed for a new documentation.

4.1.3 Playback and usage of the document

To review a documentation, the client program is used to search the database on the server for a specific document. The search criterion can be part of the general data entered at the beginning of the conference, or a phrase from the full text. Once the document is found, it is opened so that its contents can be examined. Changing the document is also possible if the user has been granted the necessary access rights.
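
The document search described above can be sketched as follows. This is a minimal Python illustration under assumptions, not the ConfDoc implementation: the class `ConferenceDocument`, the functions `matches` and `search`, the field names and the sample data are all hypothetical, and the server database is replaced by a plain in-memory list.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConferenceDocument:
    """One stored documentation (hypothetical structure)."""
    general: Dict[str, str]                          # client, project, place, ...
    notes: List[str] = field(default_factory=list)   # recognized note text


def matches(doc: ConferenceDocument, criterion: str) -> bool:
    """True if the criterion occurs in the general data or in the full text."""
    needle = criterion.lower()
    in_general = any(needle in value.lower() for value in doc.general.values())
    in_notes = any(needle in note.lower() for note in doc.notes)
    return in_general or in_notes


def search(docs: List[ConferenceDocument], criterion: str) -> List[ConferenceDocument]:
    """Return every document that matches the search criterion."""
    return [d for d in docs if matches(d, criterion)]


# Invented sample data for illustration only.
documents = [
    ConferenceDocument(
        {"client": "Miller Ltd", "project": "Warehouse software", "place": "Graz"},
        ["delivery date moved to March", "barcode scanner needed"],
    ),
    ConferenceDocument(
        {"client": "Huber KG", "project": "Web shop", "place": "Vienna"},
        ["payment by credit card", "delivery costs"],
    ),
]

for doc in search(documents, "delivery"):
    print(doc.general["client"], "-", doc.general["project"])
```

A simple substring match is used here for brevity; a real server would more likely rely on a full-text index and would apply the access rights before returning results.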

The first way to review the recording resembles the use of a video recorder. Several controls allow the user to start playback, stop, or seek within the document. All recorded actions of the user are reproduced, which of course includes the video data.

The second method uses the system's index capabilities. The starting point of a search is the whole document. The user can look at the notes and drawings he made during the meeting or use the system's full-text search capabilities. When he clicks on a word or line of these notes, the video playback is synchronized with this part of the notes. The user can then watch the scene that was recorded at the time he made these particular notes, listen to the conference word by word, and observe every gesture of the participants.

To realize a system that fits the above scenario, the following considerations apply:

4.2 Aspects seen by the developer

The primary goal of this new system is to create categorized collections of documents in order to generate a complete documentation of meetings, phone calls and conferences during a development process or service. The system should be able to generate a documentation semi-automatically, combine all the different types of data into one coherent document, keep the links between these different types of information, store all available material permanently, and provide easy retrieval by means of a search mechanism.

4.2.1 Types of information

A complete documentation consists of raw data and additional information. This additional information, which will be called index data from now on, is used to indicate specific positions inside the raw data.

4.2.1.1 Raw data

The raw data should be stored unmodified and must be retrievable with random access.
Possible types of raw data are:
• Audio
• Video

As the raw data cannot easily be searched for keywords or phrases with current technology, it is the job of the indexes to provide fast and easy access to the raw data. With the help of the index data, the system should seek to a specified starting point within the raw data and play back the data stream from there.

The only requirement this system must fulfill concerning the raw data is to record the whole conference and to play back the recorded data starting at a specified position within the recording. Of course, recording and playback of audio or video data is not very difficult, so the key point of interest for this system is the generation and usage of index data.

4.2.1.2 Index data

The index data is used to point to specific positions within the raw data. Index data can be generated in different ways. Any distinctive feature of a conference can be used to create index data. The variety of methods may include noise level, change of subject in a discussion, direction of the loudest noise, gestures when using video data, and so on. The most important distinction is the method of generation. We distinguish between two different methods:
• automatic
• manual or forced

4.2.1.2.1 Automatic indexing

Automatic indexing covers all methods in which no special action needs to be taken by a meeting participant to create a new index. The indexes are created by the system itself when distinctive features appear during the conference.

• time stamp
A time stamp is the simplest form of an index. The indexes are generated periodically. The time between two index items depends on the type of the conference. When this index type is used for short conferences, the period should be shorter than for longer conferences. It is hard to define a period that is convenient for different situations.
With the current state of the art, it is easy to realize this type of indexing when recording audio or video, as every multimedia player is able to seek within the whole document and start playback at a certain time position measured from the beginning of the file. The resolution of this time position depends on the sample rate of the recording. Since a sample rate of at least 8 kHz is recommended and easily achieved by common audio recording hardware, the resolution (0.125 ms per sample at 8 kHz) is sufficient for every need.

• change of token
A new index is created when one speaker has finished and the next speaker begins. This may be difficult to realize with current technology, because it requires a real-time analysis of the raw data. As the example of speech recognition shows, this method requires powerful hardware. Another problem is that the recognition of fluently spoken words yields very poor quality. An easier way to realize this method could be a kind of direction-sensitive microphone, which can determine the direction from which the loudest or softest noise level comes. The 360 degrees around the microphone are divided into several sections. Every time the direction of the loudest noise changes from one section to another, a new index entry is generated.

• striking moments
The recorded data is examined for noticeable situations. A longer pause in an audio or video stream can hint at a coffee break, and a video stream with several identical frames in succession may indicate an empty room.

4.2.1.2.2 Forced indexing

Forced indexing describes all methods in which the user has to take an action, such as pressing a button or writing down some words, to initiate a new index entry.

• notes and drawings
The common method of indexing is to make notes. The advantage of this approach is that the user needs only a very short familiarization period, because he already knows the method. The method may change from handwriting to typing, but the basic idea is the same.
Drawings require either more complex hardware or a more complex way of creating them. In the first case, special hardware simulates a sheet of paper and the user can make notes the way he is used to. The other method uses a common hardware device that is easier for the system to handle but more complicated for the user. Two examples of these cases are a pen-driven drawing device (which is familiar to the user) and an ordinary computer mouse. It is more difficult to draw a simple rectangle using the mouse than using a pen.
When the keyboard is used for entering text, the index-generation process is simple: whenever the user presses a key, a new index is created.
Selecting the best input device for this indexing method is the user's job. A keyboard may not be suitable because of the slower typing speed or the annoying noise of the keys.

• markers
By pressing a button, the user can set a marker to mark a specific event. Usually different markers are used to mark different events. These events may include a change of subject, a question, a simple comment or an action to be taken. When reviewing the document, the user can search for specific events. Every marker can contain some additional notes to make the retrieval process easier. A full-text search within these marker descriptions can help to find a specific marker faster.

4.2.2 Hardware / Software

Since many single documents will probably belong to one large project, and many different persons will use this system, possibly at various locations, a client-server architecture can be very helpful.

4.2.2.1 Server

The server should store the data files and is responsible for the security services and backup processes. The use of a network connection for remote locations can be taken for granted. The minimum requirement is a simple file server. With regard to access control, private and public documents, backups, and so on,
a Hyperwave server or something similar is recommended.

4.2.2.2 Client

The client application must be able to record all data that is part of one documentation. The recording shall include the raw data, all index information and the connections between the raw data and the index information.

The client application also has to provide a mechanism that allows the user to review a documentation, search within the index data and listen to or view the raw data depending on the index information. The system should enable the user to edit and change the document if he has been granted access rights. The edit procedure is the same as the index editing functions used when recording a documentation. This means creating new indexes, adding comments to an index and deleting indexes.

Many different types of input devices are able to create index data. The keyboard produces textual information, while the mouse or a pen device is more useful for generating graphics and drawings. Simple push buttons are easy to use together with any of the above devices. These different types of information should not be used separately; rather, they should complement one another. A simple button may produce a marker, which can be linked with textual or graphical information.

4.3 Client Hardware discussion

As seen above, choosing the hardware and software for the server side of the application should be no problem. I expect that any common server hardware will fulfill the minimum system requirements.

Any common personal computer with a built-in sound card will meet the system requirements for a stationary client. Since such hardware is readily available, I recommend it.

The following section discusses the different types of client hardware for portable systems only. The goal of this part is not to list as many different devices as possible, but rather to show the advantages and disadvantages of the different types of devices.
[PENC] is a good starting point for hardware reviews and comparison tables of different devices. This will help in choosing the best device.

4.3.1 Laptops, Notebooks

When speaking of a notebook, we mean a standard PC with an average size of 320 × 250 mm. The display should be able to show at least 1024 × 768 pixels, and the operating system and processor are of a similar type as in a desktop PC.

Commonly available notebooks are often equipped with more powerful hardware than many desktop systems. A built-in sound card can also be taken for granted. Assuming the notebook runs the same operating system as the stationary system, this leads to the following advantages:

4.3.1.1 Advantages

A notebook (and all following devices with this operating system) is a good solution for the developer. There is no need to port the system from the stationary system to the portable system. A simple installation of the program makes it available on the road.

Another advantage is that the user is familiar with the operating system, so he does not have to learn how to use it. Since many companies already use notebooks for other purposes, there is no need to buy new hardware.

4.3.1.2 Disadvantages

The disadvantage of using notebooks is not a technical one but a matter of business etiquette. It is not seen as very respectful to “hide” behind an opened notebook display while negotiating.

A technical aspect is the use of batteries. Using this device on the road for a long time may cause problems with the power supply. For example, these problems may occur at long-lasting conferences.

4.3.2 Sub-Notebook

A sub-notebook is a little smaller and less powerful than a standard notebook. Its size is about 220 × 180 mm and the display can show a maximum of 800 × 600 pixels. Many of these devices are equipped with a touch screen, which opens up new possibilities for entering index data.
[PANA] is a good example of such a device. Devices are available with Windows 95/98/NT or Windows CE as the operating system.

Figure 4-1 : Examples for subnotebooks: [PANA] and [LIBR]

4.3.2.1 Advantages

This device is a good alternative to a standard notebook. The operating system Windows 95/98/NT allows ConfDoc and other common applications to be installed quickly. Many devices of this size have no built-in CD-ROM or diskette drive, but a serial interface or USB connection to another PC will help.

4.3.2.2 Disadvantages

Due to the missing CD-ROM drive and the small size of the keyboard, these devices look like underpowered notebooks. Given the lower power and almost the same price in comparison to standard notebooks, I think these devices will be used for special purposes only. Why buy a sub-notebook when you can get a standard notebook for the same price?

A sub-notebook also runs on batteries, so the same power supply problems as described above for notebooks may occur.

4.3.3 Pen Computer

A pen computer looks like a notebook with the keyboard removed. The operating system is often Windows 95/98/NT, the display is equipped with a touch screen and input is made with a pen. A handwriting recognition system substitutes for the keyboard. Another method is a software keyboard: a keyboard layout is displayed on the screen and can be operated with the pen by simply tapping the right button.

Figure 4-2 : Example of a pen device: [FUJI]

4.3.3.1 Advantages

Because the operating system is Windows 95/98/NT, there is no need to port the application. These devices are made for use while walking, so they are robust and lightweight. They are a good combination of a powerful PC and a handy device.

4.3.3.2 Disadvantages

The missing keyboard makes it difficult to enter textual notes.
A further problem due to the lack of a keyboard is the use of standard applications like word processing. Although this device type runs a standard operating system, it will rarely be used as a standard PC, because it is hard to use, for example, a word processor without a keyboard.

The built-in handwriting recognition system is too slow to use comfortably during a meeting.

4.3.4 Handheld PC

The features of a handheld computer include a nearly full-featured keyboard, a half-height VGA touch screen and Windows CE as the operating system, with some built-in standard applications such as Pocket Word or Pocket Excel. Handheld PC Pro devices are available with a full-size VGA screen.

Figure 4-3 : Example of a handheld PC : [JORD]

4.3.4.1 Advantages

The device contains a keyboard and a touch screen for user input. Its small size makes it easy to take along on the road. A handheld computer is also a lot cheaper than a standard notebook or a sub-notebook.

4.3.4.2 Disadvantages

The operating system Windows CE requires special versions of applications. Standard Windows programs cannot be run on this device; they must be developed especially for this operating system.

Although a keyboard is available, its size of about 17 * 9 cm makes it difficult to enter a large amount of text.

4.3.5 Palmtop PC

This device type represents the smallest piece of equipment. With an average size of 14 * 8.5 cm it will fit in every pocket. The intention to build only a personal information manager (PIM) without a keyboard resulted in devices equipped with just a touch screen. These Palmtop PCs run the operating system Windows CE. Based on Windows CE and the available software development kits (SDKs), many freeware and shareware programs have been written around the world to make these devices more functional.
Figure 4-4 : Example of a palmsize PC: [AERO]

4.3.5.1 Advantages

The small size offers a discreet device which can easily be carried around. The device should fit in every pocket, so no additional carrying case is necessary. The nature of Windows CE makes the device ready to use quickly: there is no time-consuming boot sequence as on standard PCs.

4.3.5.2 Disadvantages

The biggest advantage can also be a great disadvantage. The small display provides only a small user interface, and entering a large amount of text with the tip of a pen can be tedious.

Palmtops are battery driven, so the power supply problems described above also apply to this device.

4.3.6 Crosspad

This device type is a little different. It is a pad which captures handwritten notes in graphical form. The pad is handy enough to carry around. After connecting the Crosspad to a standard PC, the captured pages are transferred to the computer. The available software makes it easy to store, search, cut and paste, and send digital notes.

Figure 4-5 : The Crosspad: see [CROSS]

4.3.6.1 Advantages

Remember the third condition for a perfect documentation system mentioned at the start of this chapter: “conventional documentation techniques are preferred”. This is the only device which fulfills this request. The user can write with a traditional pen on a sheet of paper.

The biggest advantage may be the power supply. A battery life of about 3-4 months is unbeatable.

4.3.6.2 Disadvantages

One disadvantage is that special software is needed to import the data into ConfDoc. Since no audio recording is provided, an additional device is needed for that purpose. The Crosspad can therefore only be used as a supplemental device, not as a standalone tool.

4.3.7 Summary

Several different devices are available which could be used for ConfDoc.
The devices' power and price vary over a wide range, which makes it almost impossible to recommend one specific type of device in general. Each user must choose a device type for himself. The best choice depends on individual wishes and requirements.

A good idea before purchasing a specific device is to look at the present habits and procedures for documenting meetings and agreements. The goal is to find out where, when and how ConfDoc would be used. On the basis of these findings, and taking technical demands and financial aspects into account, a decision can be reached.

4.4 Extended terms of reference

Using audio or video data may lead to a very large amount of data. Compression methods to reduce the amount of data are therefore a topic that should not be underrated. This topic is discussed in detail in Chapter 5.3.1.1, “Audio compression”.

5 Developing a new system

As discussed above, we need a system which is capable of recording and compressing audio or video data, creating so-called index data and linking these types of information together.

For creating the index data, we want to use the keyboard, a pen device and a mouse. So we have textual information entered via keyboard, drawings or handwritten notes entered using the pen, and additionally markers for simply indicating significant positions within the recording.

For use on the road, a portable device will be selected.

5.1 Platform, Operating system

Choosing the best platform and operating system for this application depends on several different aspects.

• Audio or video recording capabilities for the raw data. This should be simple, because nearly all common hardware and operating systems support recording and compressing audio or video data.
• Making the system portable demands handy hardware. Common notebooks or sub-notebooks may be too large for convenient use. Smaller hardware would be better.
• Identical or similar programming techniques for the desktop and the portable system. This is an important point especially for a developer. Identical programming techniques require only one development for all hardware platforms. In the best case, the application is developed on one hardware platform and ported to the other. This shortens implementation time.
• Availability for the developer and user. Hardware that is already available avoids additional costs for the developer and user.

Looking at the above aspects, a personal computer in combination with a personal digital assistant (PDA) may be the best solution. Using Windows 95/98 or Windows NT for the personal computer and Windows CE for the PDA provides similar programming techniques. The widely available operating system Windows CE allows a choice between many different devices and models. The range extends from handheld PCs and handheld Pro PCs to palm-size PCs.

Selecting the best device is up to the user and depends on his preferred input method and the main use of this system. For stationary documentation of phone calls, a personal computer will fit the requirements. If a mobile system with keyboard is required, a notebook or handheld computer will be the best choice. Some users prefer the smallest available device and do their documentation without a keyboard. A palmtop device will suit their needs.

5.2 Additional Hardware

To record telephone calls, it is necessary to capture the audio from the telephone. For this purpose, special hardware is required. The main goal of this circuit is to connect the line-in jack of the sound card potential-free to the a and b lines of the analog phone line.

The audio transformer (TFÜ) is current-limited by the R-C circuit, and the two diodes limit the output voltage to a maximum of 0.7 V. The attached variable resistor is used as a volume control.
This simple but effective circuit will do the job for the first tests.

Figure 5-1 : audio capture circuit for an analog phone

5.3 Functionality

The first implementation of this project has fewer requirements than the whole project. When using Windows NT or Windows 95 as the operating system, the server application is covered by a simple file server, so it is not necessary to implement the server side of this application. The client saves its information locally in files, which can later be stored on a server if necessary.

5.3.1 Raw data

For this version an audio stream represents the raw data. The audio is recorded with a sampling rate of at least 8 kHz (8000 samples per second).

The recording of the audio stream is done with double buffering. This is a method where two memory buffers are used to receive the audio data from the audio device. While one buffer is being filled with newly recorded data, the second buffer can be compressed, written to file and cleared for the next use. The computer which runs this application must be fast enough to process the second buffer while the first buffer is being filled with new data.

5.3.1.1 Audio compression

Since simple PCM streams lead to a very large amount of data, a compression algorithm is used. Compression allows audio quality to be traded against file size. The API of the Windows 95 / 98 / NT / CE operating systems provides an audio compression manager (ACM), which allows so-called codec drivers (codec: compressor and decompressor driver) to be used for compression and decompression. When recording, the raw data is taken from an audio source, compressed by the codec driver and then stored in a file.

Playback is as simple as recording. The compressed file is read, decompressed by the codec driver and then sent to the audio device for playback.

This implementation allows a specific codec driver to be chosen from the drivers available on the system.
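The double buffering scheme from 5.3.1 can be illustrated with a small, platform-neutral simulation. This is only a sketch under stated assumptions: `FakeAudioDevice`, `FakeFile` and `recordDoubleBuffered` are hypothetical names that do not come from ConfDoc, the "device" simply delivers a counting pattern, and the steps run one after another instead of concurrently as they would with a real audio driver.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Buffer = std::vector<std::uint8_t>;

// Hypothetical stand-in for the audio device: delivers a counting pattern.
struct FakeAudioDevice {
    std::uint8_t next = 0;
    void fill(Buffer& buf) {
        for (std::uint8_t& sample : buf) sample = next++;
    }
};

// Hypothetical stand-in for "compress and write to file".
struct FakeFile {
    Buffer stored;
    void write(const Buffer& buf) {
        stored.insert(stored.end(), buf.begin(), buf.end());
    }
};

// Records `rounds` buffers of `bufSize` samples using two alternating buffers.
FakeFile recordDoubleBuffered(std::size_t bufSize, std::size_t rounds) {
    assert(bufSize > 0);
    std::array<Buffer, 2> buffers{Buffer(bufSize), Buffer(bufSize)};
    FakeAudioDevice device;
    FakeFile file;
    if (rounds == 0) return file;

    std::size_t filling = 0;
    device.fill(buffers[filling]);  // the first buffer fills before any processing
    for (std::size_t r = 0; r < rounds; ++r) {
        const std::size_t full = filling;
        filling = 1 - filling;
        // In a real system the device fills this buffer concurrently while
        // the full one is processed; here the two steps run in sequence.
        if (r + 1 < rounds) device.fill(buffers[filling]);
        file.write(buffers[full]);                                 // store the full buffer
        std::fill(buffers[full].begin(), buffers[full].end(), 0);  // clear for reuse
    }
    return file;
}
```

Calling `recordDoubleBuffered(4, 3)` stores twelve samples in their original order, which shows that alternating between the two buffers loses no data as long as each full buffer is processed before the device needs it again.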
The size of the resulting audio file depends not only on the recording time but to a large extent on the chosen codec. Be sure to select a codec driver which gives good audio quality and an acceptable file size.

A good introduction to speech coding and commonly used coders can be found at [MMRUS]. I will not go into the details of audio compression codecs and the three classes (waveform codecs, source codecs and hybrid codecs).

As a help for choosing a suitable compression driver, some common codecs are listed below.

5.3.1.1.1 PCM

Standard PCM streams are not compressed. Depending on the sample rate, the bit resolution and the number of channels, the data rate in kB/s varies over a wide range. It is calculated with the following formula:

$$\text{kB/s} = \frac{1}{1000} \cdot \text{sample rate} \cdot \text{number of channels} \cdot \frac{\text{bits}}{8}$$

The following table compares some possible combinations:

NR  Comment, Usage        Sample rate [1/sec]  Channels  Bits  Approx. size [kB/s]
1   CD, highest quality   48000                2         16    192
2   CD quality            44100                2         16    176
3   Radio quality         22050                1         8     22
4   Telephone quality     11025                1         8     11
5   Minimum quality       8000                 1         8     8

Due to the very large resulting file size, this format is not recommended. There are better codecs which produce nearly the same audio quality while needing less disk space.

5.3.1.1.2 Mobile Voice

The Mobile Voice codec compresses and decompresses audio data using an HP/CU proprietary algorithm. This codec was developed especially for handheld and palmtop devices, which have very little memory and disk space. The resulting data rate is below 1 kB/s. The quality of the recorded audio is not satisfactory, so this codec is not recommended.

5.3.1.1.3 TrueSpeech from DSP Group

TrueSpeech is a family of speech compression and decompression algorithms and the accompanying software. It has been designed for personal computers and personal communication devices.
With high compression ratios ranging from 15:1 to 27:1, TrueSpeech improves the storage and transmission of digital voice information, and it can also be used in the integration of personal computers and telephones. This could be a suitable driver, but it is not the driver of my choice.

5.3.1.1.4 MPEG Audio

MPEG (Moving Picture Experts Group) is a group of people that meets under ISO (the International Organization for Standardization) to generate standards for digital video and audio compression. In particular, they define a compressed bit stream, which implicitly defines a decompressor. However, the compression algorithms are up to the individual manufacturers, and that is where proprietary advantage is obtained within the scope of a publicly available international standard.

Real-time encoding with these algorithms requires very fast machines, so this codec is not recommended.

5.3.1.1.5 ADPCM Codecs

Adaptive Differential Pulse Code Modulation (ADPCM) codecs are waveform codecs which, instead of quantizing the speech signal directly as PCM codecs do, quantize the difference between the speech signal and a prediction of the speech signal. If the prediction is accurate, the difference between the real and predicted speech samples has a lower variance than the real speech samples and can be accurately quantized with fewer bits than would be needed to quantize the original speech samples.

The lowest data rate achieved with this codec was about 4 kB/s. The next codec does a better job.

5.3.1.1.6 GSM 6.10

GSM 06.10 is a standardized lossy speech compression used by most European wireless telephones. It uses RPE/LTP (regular pulse excitation / long term prediction) coding to compress frames of 160 13-bit samples (8 kHz sampling rate, i.e. a frame rate of 50 Hz) into 260 bits. For more details see [JS96] and [GSM610].
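The data rate of GSM 06.10 follows directly from the frame figures given above. The short sketch below only reproduces that arithmetic; the constant and function names are chosen here for illustration, and container overhead of the audio file is not counted.

```cpp
#include <cassert>

// Figures taken from the text above.
constexpr int kSampleRateHz = 8000;    // samples per second
constexpr int kSamplesPerFrame = 160;  // samples in one GSM 06.10 frame
constexpr int kBitsPerFrame = 260;     // compressed size of one frame

constexpr int framesPerSecond() { return kSampleRateHz / kSamplesPerFrame; }
constexpr int bitsPerSecond() { return framesPerSecond() * kBitsPerFrame; }
constexpr double bytesPerSecond() { return bitsPerSecond() / 8.0; }

// Approximate size in bytes of a recording of the given length.
double recordingBytes(int seconds) {
    assert(seconds >= 0);
    return bytesPerSecond() * seconds;
}
```

With these values the codec yields 50 frames and 13000 bits per second, i.e. 1625 bytes per second, so `recordingBytes(3600)` gives 5850000 bytes for a one-hour meeting.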
The quality of the algorithm is good enough for reliable speaker recognition; even music often survives transcoding in recognizable form (given the bandwidth limitations of an 8 kHz sampling rate).

This driver produces approximately 1.6 kilobytes of data per second. This is about five times smaller than an uncompressed PCM stream at the minimum quality of 8 kB/s.

Since this was the only codec driver which worked correctly on a Windows NT machine, I recommend this codec as a good compromise between audio quality and file size.

5.3.1.2 Audio file format

To be compatible with common audio applications, a standard file format is implemented. The preferred format for multimedia files is the resource interchange file format (RIFF). The RIFF file I/O functions work with the basic buffered and unbuffered file I/O services. [MSDN01] describes this format in detail.

RIFF files use four-character codes to identify file elements. These codes are 32-bit quantities representing a sequence of one to four ASCII alphanumeric characters, padded on the right with space characters.

The basic building block of a RIFF file is a chunk. A chunk is a logical unit of multimedia data, such as a single frame in a video clip. Each chunk contains the following fields:

• A four-character code specifying the chunk identifier
• A doubleword value specifying the size of the data member in the chunk
• A data field

The following illustration shows a "RIFF" chunk that contains two subchunks.

Figure 5-2: structure of a RIFF chunk

A chunk contained in another chunk is a subchunk. The only chunks allowed to contain subchunks are those with a chunk identifier of "RIFF" or "LIST". A chunk that contains other chunks is called a parent chunk. The first chunk in a RIFF file must be a "RIFF" chunk. All other chunks in the file are subchunks of the "RIFF" chunk.

"RIFF" chunks include an additional field in the first four bytes of the data field. This additional field provides the form type of the file.
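The chunk layout described above (identifier, size, data, with a form type at the start of the "RIFF" chunk) can be walked with a few lines of code. The following is a minimal sketch that works on a byte buffer already held in memory; `Chunk`, `readU32`, `readFourCC` and `listSubchunks` are illustrative names and not part of ConfDoc or of the Windows multimedia API. It relies on two properties of the RIFF format: sizes are stored in little-endian byte order, and a chunk with an odd data size is followed by one pad byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

struct Chunk {
    std::string id;          // four-character code, e.g. "fmt "
    std::uint32_t size;      // size of the data field in bytes
    std::size_t dataOffset;  // where the data field starts in the buffer
};

// Reads a 32-bit little-endian value (RIFF stores sizes this way).
std::uint32_t readU32(const Bytes& b, std::size_t pos) {
    assert(pos + 4 <= b.size());
    return std::uint32_t(b[pos]) | (std::uint32_t(b[pos + 1]) << 8) |
           (std::uint32_t(b[pos + 2]) << 16) | (std::uint32_t(b[pos + 3]) << 24);
}

// Reads a four-character code as a string.
std::string readFourCC(const Bytes& b, std::size_t pos) {
    assert(pos + 4 <= b.size());
    return std::string(b.begin() + pos, b.begin() + pos + 4);
}

// Lists the direct subchunks of the top-level "RIFF" chunk and reports the
// form type. Returns false if the buffer is not a well-formed RIFF file.
bool listSubchunks(const Bytes& b, std::string& formType, std::vector<Chunk>& out) {
    if (b.size() < 12 || readFourCC(b, 0) != "RIFF") return false;
    const std::size_t end = 8 + std::size_t(readU32(b, 4));
    if (end < 12 || end > b.size()) return false;
    formType = readFourCC(b, 8);  // first four bytes of the RIFF data field
    std::size_t pos = 12;
    while (pos + 8 <= end) {
        Chunk c{readFourCC(b, pos), readU32(b, pos + 4), pos + 8};
        if (c.dataOffset + c.size > end) return false;
        out.push_back(c);
        pos = c.dataOffset + c.size + (c.size & 1u);  // skip pad byte after odd sizes
    }
    return true;
}
```

For a waveform-audio file the function would typically report the form type "WAVE" and list subchunks such as "fmt " and "data".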
The form type is a four-character code identifying the format of the data stored in the file. For example, Microsoft waveform-audio files have a form type of "WAVE".

5.3.2 Index data

Three different types of index data are implemented. All indices work the same way using so-called events: an event occurs whenever an index is created, and the current running time of the documentation is saved with this event. With this running time, the audio playback can be positioned exactly at this event.

In addition to the recording, the finished documentation can be extended at a later time. This is helpful when reviewing the documentation after a meeting to correct typing errors, make additions or describe some topics in more detail. For this purpose the same kinds of events are created as during the recording. The only difference is that the stored running time is the end of the documentation. With this trick it is possible to distinguish between actions which were stored during the recording and actions which were stored during the extension.

5.3.2.1 Keyboard entry

The first type of index data is a camcorder-like function which records all keys pressed in an edit window.

Pressing a key during recording causes the application to enter a subclass function, where the current key code and the status of the modifier keys (Shift, Ctrl and Alt) are stored to a file. All keyboard actions belonging to the edit window are stored in this manner. This means that pressing the backspace key not only deletes one character in the edit window but is also stored with its key code for later playback. The contents of the edit window itself are never stored; they are only the result of playing back the recorded keyboard actions.

When seeking to a specific position within the documentation during playback, the edit window is cleared and all recorded keyboard actions are repeated up to the current position.
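The seek behavior for keyboard events can be sketched as follows. This is a simplified illustration, not the ConfDoc implementation: `KeyEvent` and `replayUntil` are hypothetical names, the key code is reduced to a single character with '\b' standing for backspace, and modifier keys, cursor movement and cut and paste are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct KeyEvent {
    std::uint32_t timeMs;  // running time of the documentation at the key press
    char key;              // simplified key code: a character, or '\b' for backspace
};

// Rebuilds the edit window content as it was at `positionMs` by replaying
// all stored key events up to that time. Events must be in chronological order.
std::string replayUntil(const std::vector<KeyEvent>& events, std::uint32_t positionMs) {
    const bool sorted = std::is_sorted(
        events.begin(), events.end(),
        [](const KeyEvent& a, const KeyEvent& b) { return a.timeMs < b.timeMs; });
    assert(sorted);
    (void)sorted;

    std::string text;  // the edit window is cleared before the replay
    for (const KeyEvent& e : events) {
        if (e.timeMs > positionMs) break;
        if (e.key == '\b') {
            if (!text.empty()) text.pop_back();  // backspace removes one character
        } else {
            text.push_back(e.key);
        }
    }
    return text;
}
```

Because only the events are stored, the same function serves every seek position: the content for any time is obtained by replaying from the start.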
Replaying the recorded keyboard actions up to the seek position reproduces the content the edit window had at recording time.

It follows that, unlike the content of the edit window, the keyboard file grows with every keystroke. For example, pressing the letter ‘A’ and then the ‘BACKSPACE’ key results in an empty edit window, but leads to two stored keyboard actions in the file.

5.3.2.2 Drawings

For this type of event a new window is created which the user can use as a sketch pad. Drawing in the window can be done with different input devices. The only important thing is that the device has to be mapped as a mouse. Some manufacturers provide good hardware to simulate a pencil and a sheet of paper. A good example is [WAC].

Mapping as a mouse causes the input device to act like a mouse, so the application cannot and need not distinguish between different input devices. From now on no distinction is made between the input devices, and all further explanations assume a mouse as the input device. Clicking a mouse button when using a standard mouse is thus the same as tapping the tip of the pen onto the tablet when using a pen device.

All lines the user draws are stored using events. For a better understanding of the storage mechanism, let us look at the different phases of drawing a line:

• pressing the mouse button down
• moving the mouse around (with the button still pressed)
• releasing the mouse button

This represents the drawing of a line, which does not have to be a straight one. It can be constructed by joining several short straight lines into one connected line. See the figure below:

Figure 5-3: drawing a line

Every mouse movement generates a WM_MOUSEMOVE event. These events are only processed when the mouse button is down. When the mouse button is up, the events are ignored.
Looking at the above phases, two different drawing events are generated and stored in the file:

• A start point of a line
• A point inside a line (including the end point of a line)

A start point of a line is detected when the mouse button was up before the mouse movement and is now down. A single point is then drawn, and the position is saved as the starting point of a line segment for the next event. Every following WM_MOUSEMOVE event with the mouse button still down is stored as an inside point of a line. The line ends when the mouse button is released.

Because recordings made during the meeting can be distinguished from recordings made while extending the documentation, the extended strokes can be displayed in a different color.

5.3.2.3 Markers

A marker can be used to mark a specific passage of the conference. Six different types of marker are supported:

• Remark
• Question
• Agree
• Disagree
• Action to be taken
• Change of subject

5.4 Scenario with this application

The following description shows a scenario which the application should enable. The target platform for the client application is a common personal computer with Windows 95 or Windows NT as the operating system.

5.4.1 Recording a document

The audio source is recorded in the background. General information such as the current recording time is displayed continuously.

During the conference it is possible to write down textual notes in an edit window. Every action in this edit window is recorded, including backspace, cursor keys and cut and paste actions.

A second window is provided which can be used as a sketch pad. The lines are drawn using a standard mouse as the input device. Image-editor tools such as an eraser or shapes are not supported.

Additional markers can be set by pressing the corresponding push button, and every marker can contain extra textual information which can be displayed and edited by clicking on the marker.
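The two drawing event types from 5.3.2.2 (start point of a line, point inside a line) can be sketched as a small recorder that is fed with mouse samples. The names `StrokeEvent` and `StrokeRecorder` are assumptions made for this illustration; in the real application the samples would come from WM_MOUSEMOVE messages and the time from the running recording.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class StrokeEventType { LineStart, LinePoint };

struct StrokeEvent {
    StrokeEventType type;
    int x;
    int y;
    std::uint32_t timeMs;  // running time of the documentation
};

class StrokeRecorder {
public:
    // Feed one mouse sample. Samples with the button up are ignored;
    // the first sample after the button goes down starts a new line.
    void onMouse(bool buttonDown, int x, int y, std::uint32_t timeMs) {
        assert(events_.empty() || timeMs >= events_.back().timeMs);
        if (buttonDown) {
            const StrokeEventType type =
                wasDown_ ? StrokeEventType::LinePoint : StrokeEventType::LineStart;
            events_.push_back(StrokeEvent{type, x, y, timeMs});
        }
        wasDown_ = buttonDown;
    }

    const std::vector<StrokeEvent>& events() const { return events_; }

private:
    bool wasDown_ = false;  // button state at the previous sample
    std::vector<StrokeEvent> events_;
};
```

During playback, a `LineStart` event would draw a single point and every `LinePoint` event would draw a short straight segment from the previous point, which reproduces the connected line described above.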
The different types of markers are displayed on a time line at the corresponding time position with a characteristic icon for each marker type.

5.4.2 Playback and usage of the document

Playback is similar to the operation of a video recorder. It can be started or stopped with buttons, and a slider, which indicates the current playing position, can be moved within the document. Playback can then be started at the new slider position.

With the index information of the markers, the drawing tool and the textual information, the slider is repositioned to the corresponding time. Once again, playback can be started at this position.

The textual information and the drawings are restored during playback. It looks as if a ghost writer were using the keyboard and the mouse. The markers which were set during the recording phase are displayed on the time line and can be clicked to show or edit the additional information.

Some search mechanisms are provided. When the slider is at the end of the document, the final text in the edit window and all the drawings are visible. By clicking into the drawing area, the slider is repositioned to the time of the drawing event that is closest to the clicked position. The same procedure is used with the textual information: by placing the cursor in the edit window, the slider is set to the time when the word at that position was typed.

5.5 Purpose of this application

This application is used to test the habits of different users when documenting a conference or meeting. The different methods of creating index information should be examined and compared to obtain a method which is easy to use but still allows information to be retrieved from the raw data with acceptable effort.

Porting the application to a Windows CE device should point out the advantages and disadvantages of a small but portable system.

6 User guide

The user guide should help users to handle this system.
The chapters are arranged in the order in which the user will need them, so the guide can be read as a complete walkthrough.

6.1 Running the application

After installation, the program can be started from the Windows start menu. No command line options are necessary to run this application successfully.

Figure 6-1 : The Main window before ...
Figure 6-2 : ... and after the configuration

When the application is started for the first time, the button for a new documentation recording and the button to review a recording are disabled. A correct configuration is needed first to enable these buttons. For this purpose, select “Configuration ...” from the menu.

Figure 6-3 : The main menu to bring up the configuration

6.2 Configuration

The configuration window includes sections for the data path, the audio compression and the colors of the drawing controls.

Figure 6-4 : The configuration window

The Data Path is the folder on the client machine where newly created documentation files are stored. When reviewing a documentation, this folder is the default location for searching for files. The Browse Button on the right side of the data path lets the user select a specific path without typing its name into the edit control, which avoids typing errors.

Note: The Data Path can also be a network path, but it is recommended to use a local path in order to avoid loss of data during network problems.

The section for the audio compression lets the user select a specific audio codec for the compression and decompression of the audio streams. Pressing the Select Button on the right side of the line brings up a standard window to select a codec.

Figure 6-5 : Selecting a codec for the compression and decompression

The third and last section in the configuration window contains three customizable color fields. These colors stand for:

• Main stroke color: This is the color of the main strokes.
All lines which are drawn during playback are drawn in this color. It should be a dark color to achieve good contrast on the white background.
• Preview stroke color: The preview gives the user an overview of the whole page of the drawing. The preview color should be a light color with low contrast to the background, since it does not represent the main strokes.
• Extended stroke color: The extended strokes are not recorded during the conference, but after it. A color different from the main stroke color lets the user quickly recognize which strokes were made during the conference and which strokes are the result of the extension process.

The colors can easily be chosen by clicking on the colored rectangle. This opens a standard color selection dialog in which a new color can be selected.

Figure 6-6 : Selecting a color

6.3 Creating a new documentation

To start a new documentation session, simply press the New Button in the Main Window.

Note: This Button is only active after a successful configuration.

When this button is pressed, the three windows for a documentation are displayed and a new recording starts immediately. Feel free to move the three windows around on the screen. Their positions are stored and used for all following documentation sessions.

6.3.1 The main window

The main recording window contains:

• six buttons for the markers,
• some information about the current running time and space consumption of the audio recording,
• a slider for the current position and
• two buttons to terminate the recording.

Figure 6-7 : The main recording window

Pressing one of the six marker buttons creates a new marker and opens a window to enter additional notes for this marker. The newly created marker is displayed as a small icon in a rectangular region below the slider. This icon can be clicked to reopen the additional marker notes window.
Figure 6-8 : The additional marker notes

In recording mode the slider cannot be moved. This feature is only enabled in playback mode.

Pressing the OK button causes the application to save all the data. The Cancel Button discards the recording. In both cases the recording is terminated and the three windows are closed.

6.3.2 Handwriting and Notes

To create the important index data, two windows are provided:

• Handwriting
• Notes

The Handwriting window is used like a sketch pad. The best results are obtained with a pen device. In the lower right corner some buttons are available to leaf through the pages. When the next button is pressed on the last page, a new page is created automatically.

Figure 6-9 : The Handwriting window

The Notes window is used for textual information. The content can be used later for a full text search over several documents because it is also available as a simple text file. All common keyboard shortcuts are supported, including selecting, cut and paste.

Figure 6-10 : The Notes window

6.4 Reviewing a documentation

To review a documentation, open a previously recorded document using the Open Button in the main window. As in the recording process, three windows are opened. The Handwriting and the Notes windows are exactly the same, but the main playback window differs from the main recording window.

Figure 6-11 : The main playback window

When seeking within the document with any of the available methods, all the windows are synchronized to the new start position after that seek:

• The slider is set to the new position,
• the Handwriting window displays the contents as they were at the recording time and
• the Notes window shows the same text as at the recording time.

6.4.1 Random seek

Playback can be controlled using the operator buttons at the top left of the main window.
Each button is explained below:

• Seek to the beginning of the document
• Seek back approximately 2 seconds
• Seek back approximately 2 seconds and then start playback
• Start playback at the current position
• Seek forward approximately 2 seconds
• Seek to the end of the document

Alternatively it is possible to move the slider to seek within the document.

Note: Seeking is only possible when playback is stopped. During active playback, seek actions are rejected.

6.4.2 Accurate seek

When looking for specific information, the random seek methods do not meet the requirements for an effective search. Three better seek / search methods are available, and these methods represent the key features of this application.

6.4.2.1 Markers

Markers which were set during the recording phase are stored with time information. Clicking a specific marker moves the slider to the time at which the marker was set and opens a window to review the additional marker information.

6.4.2.2 Handwriting

The Handwriting window provides a simple search method using the mouse. First of all, the page which is to be searched should be displayed. The buttons at the bottom of the Handwriting window can be used to leaf through the pages. When the appropriate page is displayed, a mouse click within the window starts the seeking process. Starting from the click position, the nearest point on a line is used to get the time information, and the new start position is set to this time.

Figure 6-12 : Seeking using the Handwriting window

6.4.2.3 Notes

Seeking by using the notes requires that the part of the text you want to go to is displayed. This can be achieved by seeking to the end of the document using the button for seeking to the end or by moving the slider to the right end. Now use the mouse to set the text cursor at any position within the text in the Notes window.
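For readers interested in how the handwriting seek from 6.4.2.2 can work internally, the following sketch looks up the stored point closest to the click and returns its recording time. It is a simplification under stated assumptions: `TimedPoint` and `seekTimeForClick` are hypothetical names, only the stored points of the displayed page are compared (not every position along the connecting segments), and distances are compared as squared values to avoid a square root.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

struct TimedPoint {
    int x;
    int y;
    std::uint32_t timeMs;  // running time at which the point was drawn
};

// Returns the recording time of the stored point closest to the click.
// `pagePoints` holds the points of the currently displayed page.
std::uint32_t seekTimeForClick(const std::vector<TimedPoint>& pagePoints,
                               int clickX, int clickY) {
    assert(!pagePoints.empty());
    const TimedPoint* best = &pagePoints.front();
    long long bestDist = std::numeric_limits<long long>::max();
    for (const TimedPoint& p : pagePoints) {
        const long long dx = static_cast<long long>(p.x) - clickX;
        const long long dy = static_cast<long long>(p.y) - clickY;
        const long long dist = dx * dx + dy * dy;  // squared distance
        if (dist < bestDist) {
            bestDist = dist;
            best = &p;
        }
    }
    return best->timeMs;
}
```

The returned time would then be used as the new start position, after which all three windows are synchronized as described in 6.4.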
The new start position is then set to the time at which the letters at this cursor position were typed.

6.5 Extending a documentation

An existing documentation can be extended by pressing the Extend Button. This enters the extend mode, which is similar to the recording phase. In this mode additional textual and handwritten notes can be entered in the appropriate windows. The time information for these extended actions is set to the end of the document, so for the documentation as a whole it looks as if the information was entered at the moment the recording was stopped. In the Handwriting window this additional information is displayed in a different color, so the extended information is easy to recognize when the document is reviewed later.

Pressing the Extend Button a second time leaves the extend mode and re-enters the normal playback mode.

7 The Implementation

ConfDoc is programmed in C++. This chapter deals with the actual implementation and describes some important processes with the help of state diagrams and flow charts.

7.1 The main parts of the program

Figure 7-1: Main state diagram (states: Idle, Main menu, Playback, Recording, Extend)

The above state diagram shows the different modes of this application. When the program is started, it enters the idle mode. All further program states are reached through the main menu. Pressing the New Button causes the application to enter the recording mode; pressing the Open Button leads to the playback mode. The extend mode can only be entered when the playback state is already active. Pressing the Extend Button a second time leaves this mode.

Note: The configuration window is ignored in the above illustration.

7.2 Creating a new documentation

Because the audio recording is done in the background, only the creation of the three types of index data is described in detail.

7.2.1 Audio

The audio recording is realized with double buffering.
For this reason, memory is allocated for two buffers, which are filled alternately by the audio device. While one buffer is being filled with newly recorded data, the other can be written to file and cleared for its next use. The wave audio API supports a callback function, which is called by the audio device driver whenever a buffer is full and ready to be processed.

Figure 7-2: Initializing the recording (begin recording, open wave device, prepare buffers, send buffers to device driver, wait until callback is entered)

Figure 7-3: Terminating the recording (terminate recording, wait until device driver is finished, unprepare buffers, close wave device)

The recording starts with an initialization procedure in which the audio device is opened, the buffers are allocated and both buffers are sent to the audio device. Now the recording starts, and there is nothing else to do but wait until the first buffer is full and the callback function is entered.

Figure 7-4: Callback function of the recording (begin callback; if buffer 1 is full: save buffer 1 to file, prepare buffer 1, send buffer 1 to device driver; otherwise, if buffer 2 is full: save buffer 2 to file, prepare buffer 2, send buffer 2 to device driver; leave callback)

The callback function is responsible for saving the full buffer to file, then preparing it and sending it back to the audio device for the next recording phase.

The recording is terminated when the user presses the OK or Cancel Button. In this case the audio device is notified of the termination, and it is then necessary to wait until the audio device has actually finished. This gives the audio device enough time to stop the recording and compress the audio data held in the buffer so far.

7.2.2 Markers

Creating a new marker requires no demanding activities. Only two actions have to be taken:
• Get the type of the marker and
• Retrieve the current audio position from the recording.
This is all the information that is stored with a new marker.

Figure 7-5: Recording a marker event (Marker Button was pressed, get type of marker, get audio position, save data to file *.mrk)

7.2.3 Handwritten notes

Handwritten notes are saved by a subclass function. This function is called whenever the mouse is moved or a mouse button is pressed. The incoming mouse events must therefore be filtered, and all events during which no mouse button is pressed are discarded. The remaining events fall into two cases:
• The mouse button has just been pressed.
• The mouse button was down and the mouse was moved.

The first case represents the start of a new line; the second corresponds to a single straight segment within a curve. See “Figure 5-3: drawing a line” for an illustration of drawing a line.

Figure 7-6: Handwritten notes (enter subclass; on WM_LBUTTONDOWN: get cursor position, save start point for line; on WM_MOUSEMOVE with the mouse button pressed: get cursor position, draw line; in both cases: get audio position, save data to file *.drw; leave subclass)

7.2.4 Textual notes

The textual notes are also captured using a subclass function. The function captures all WM_KEYUP, WM_KEYDOWN, WM_SYSKEYUP and WM_SYSKEYDOWN messages. For every event, the key code, the keyboard flags (including the SHIFT and CTRL key states) and the event itself are retrieved and saved so that the key press can be reproduced later. Any other event is ignored by this subclass function.

Figure 7-7: Textual notes (enter subclass; on WM_KEYDOWN or WM_KEYUP: get keycode, get keyboard flags, get audio position, save data to file *.eky; leave subclass)

7.3 Reviewing a documentation

Opening an existing documentation leads to two different situations. The first is a simple playback, in which the audio data is played and the previously recorded events are faked at the appropriate running time.
The second, and more important, is the search capability of the three different index data types.

7.3.1 Playback

When a documentation is played back, the events are faked by the system at the appropriate time. The key mechanism for this faking is a timer event with an interval of 250 ms. In other words, every 250 ms the keyboard and handwriting event queues are checked for pending events. The marker events need not be faked, because all markers are visible on a timeline. The additional marker remarks are displayed by clicking the appropriate marker. During playback, no action needs to be taken regarding the markers.

Figure 7-8: Playback of a document (enter timer event; while a handwriting event is pending: fake handwriting event, read next handwriting event; while a keyboard event is pending: fake keyboard event, read next keyboard event; leave timer event)

7.3.2 Searching for information

Searching for information is only possible when the playback is stopped. If a playback is running, the Stop Button must be pressed before the search capabilities are enabled. Searching can be done with all three index data types: the markers, the textual notes and the handwritten notes. Each index requires a different search method.

7.3.2.1 Markers

Since the time position of a marker is stored directly, searching with markers is straightforward. Strictly speaking, it is more a seek than a search. Clicking a marker causes the application to seek to the position where the marker was set and then opens the window with the additional marker information for review.

Figure 7-9: Searching using markers (mouse button clicked, get time position of marker, fake slider event with time, open marker remark window, end)

7.3.2.2 Handwritten notes

The handwritten notes provide a visual search method.
As described in the User Guide (Chapter 6), a mouse click within the window starts the seeking process. The point nearest to the click position is found by examining all draw events in the recorded document. Starting at the beginning of the file, the events are examined one after another, and the distance between each event and the click position is calculated. When an event is closer to the click position than all previous events, it is stored and the search continues. At the end of the file, the event with the smallest distance to the click position is used to redraw the Handwriting window.

Figure 7-10: Searching with handwritten notes (mouse button clicked; event = NULL, distance as far as possible; seek to first event; until all events are processed: read next event and, if its position is closer than the closest yet, save the event; finally redraw up to the saved event; end)

7.3.2.3 Textual notes

Searching with textual notes is the most difficult method and the only one with ambiguities. The information used for the search is the cursor position and the two characters before and after the cursor position. For searching, the textual notes are reproduced character by character, and the text at the cursor position is tested for equality.

The following example illustrates the ambiguity of this search method: suppose the user entered some text, pressed the Backspace key to delete the last character and then entered the same character again. The same text then exists at the same cursor position at two different recording times. In this case the search returns the first occurrence of the text.

Figure 7-11: Searching using textual information (mouse button clicked; get chars before and after cursor; clear control; seek to first event; repeat: read and fake next event, get chars before and after cursor, until the chars are the same; end)

8 Making the system portable

Making the system portable is an essential part of this project. Even the best system cannot be used efficiently if it is restricted to one location, so a portable version of this software is necessary.

8.1 Hardware

The currently available handheld or palmtop computers are powerful enough to meet the hardware requirements of this system.

8.1.1 Operating system

The operating system Windows CE seems suitable because of its similarity to Windows 95/98. Some important aspects are:

• Connectivity to desktop PCs
A Windows CE device is well integrated into the Windows 95/98 desktop, and the communication between the two devices is managed by the operating systems. There is no need to back up the recorded data manually. The ActiveSync software can be configured to synchronize at every connection. Using a docking station makes this process as easy as it can be.

• Memory expansion
Recording audio streams produces a large amount of data, but Windows CE devices have much less RAM available than a desktop PC. In addition, most devices have no disk drive, so this amount of data can be a problem. A CompactFlash adapter solves this problem. CompactFlash storage cards range from 2 MB up to 320 MB and serve as supplemental removable storage for data, including audio, images, files, video, and applications. In addition, CompactFlash microdrives are available with 160 to 340 MB of storage.

8.1.2 Device Type

When choosing a PDA to purchase, [CEWIN] recommends that users make a list of the intended uses for their PDA and consider the following characteristics:

• Size/Weight
Does the PDA fit in a pocket? Is it portable enough? Is it light enough to carry while using it?

• Input Methods
Does the PDA have a keyboard?
Is handwriting recognition desired for input? Does the PDA have a touch screen, glidepad or trackpoint for mouse functions? Does the PDA support an external mouse or external keyboard?

• Features
Is a CompactFlash slot needed? Is a modem necessary, and if so, at what speed? Is the unit fast enough to meet the requirements? Does the PDA have enough RAM for running programs and for storage? Will a flash card suffice for additional storage, or is the PC Card slot occupied? What kind of connectivity is needed: Ethernet, IrDA, RAS, wireless LAN, wireless nationwide? What battery capacity is needed?

• Uses
Will a large amount of data be entered? If so, is a keyboard important? How important is a large screen? How many colors does the screen need to support: 256 colors, 65536 colors? Is image editing or printing important?

• Applications
Is Pocket Word, Pocket Excel, Pocket Access or Pocket PowerPoint needed? If so, consider the H/PC Pros, since the P/PCs do not offer these capabilities. Is a special application needed?

The following table compares the built-in applications of the two common hardware types:

Application | Palm-size PC | Handheld PC
Pocket Outlook (Calendar, Contacts, Tasks) | Yes | Yes
Pocket Applications (Excel, Word, PowerPoint, Access) | No | Yes
Notetaker/Inkwriter | Yes, Notetaker | Yes, Inkwriter
Pocket Internet Explorer (Web Browser) | No, Mobile Channels | Yes
Inbox (E-Mail) | Yes | Yes
Accessories (Calculator, Voice Recorder, World Clock) | Yes | Yes
Communications (PPP, Terminal Emulator, Ethernet, Wireless, Redirector) | PPP, Ethernet | Yes
Games | Yes | Yes
Maps & GPS | Yes, Pocket Streets only | Yes, Pocket Streets only
Voice Commands | No | No

Each of these hardware systems includes different applications. For some systems, third parties have added capabilities that exist for other platforms.

8.2 Software

Many Windows 95 applications can be ported to Microsoft Windows CE.
This takes less effort than developing them from scratch. The major issues when porting to Windows CE are ([MSDN02]):
• Differences between the Microsoft Win32® application programming interface (API) and the Windows CE APIs
• Differences between the standard Microsoft Foundation Class Library (MFC) and MFC for Windows CE. This point can be ignored, because MFC is not used for this application.
• Memory limitations and out-of-memory recovery
• Energy limitations
• Widely varying hardware characteristics and limitations
• Differences in testing and debugging

8.2.1 Differences between the Win32 and Windows CE APIs

The Windows CE API differs from the Win32 API in several important respects:
• It is smaller. Only a subset of the Win32 API is supported, and some of what is supported has a reduced feature set. Some Win32 functions are not supported at all, and neither are any of the 16-bit Windows functions. These functions must be replaced with alternatives, if available, or a workaround must be created.
• Some Win32 functions have been replaced by Windows CE equivalents. For example, tool and menu bars have been combined into a single Command Bar, which has a new API.
• Some Win32 functions are supported, but in a limited way. Some may have one or more parameters completely disabled. Others may have parameters with a reduced range of options.
• Supported data types may need modification. All necessary Win32 structures are supported, but some members may not be used. Other structures may not accept the full range of options.
• Some messages are not supported, including many WM_* and EM_* messages. Some that are supported have been modified. For example, the content of wParam or lParam may be different. Some Windows CE-specific messages have been added, WM_HIBERNATE for example.
• There are Windows CE-specific extensions.
Many of these, including the touch screen and notification, support the hardware capabilities of the various devices.
• There are limitations concerning the use of exception handling. While Win32 structured exception handling is supported, Windows CE does not support C++ exception handling.

When porting existing Win32 applications from the PC platform to Windows CE, the primary issue will usually be the smaller API. Applications will need to accommodate the limitations of the Windows CE API and the capabilities of the target devices.

8.2.2 Memory Limitations

Windows CE devices will, in general, have much less RAM available than a desktop PC. In addition, most devices will have no disk drive or other mass-storage device. In most cases, porting an application to Windows CE successfully will require reducing its size.

When porting an application to Windows CE, you should focus on the features that are used most frequently. Microsoft Pocket Word and Microsoft Pocket Excel are examples of how to reduce the feature set of an application while still maintaining its essential functionality.

Applications should be written so that they need as little memory and storage as possible. They must also be able to cooperate with the system in managing memory shortages. The amount of available memory depends on the particular device, so be aware of the capabilities of the target platforms.

Windows CE makes no distinction between using mass storage (temporary files, for instance) and using RAM. The use of memory and mass storage in a Windows CE application should therefore be minimized, and memory-intensive features such as bitmapped graphics should be simplified or eliminated. Temporary files should be avoided. In some cases, the code can be rewritten to reduce memory usage at the cost of somewhat lower speed, which may be an acceptable tradeoff.
If memory resources become tight, Windows CE has a procedure to reduce memory usage and restore available memory to acceptable levels.

8.2.3 Energy Limitations

Windows CE devices may have very limited energy resources. The Handheld PC (H/PC), for instance, runs on two AA batteries. Programs should be written to minimize energy consumption as much as possible.

In order to conserve energy, many Windows CE devices shut down if they are not used for a certain period of time. Windows CE applications are expected to resume where they left off after a shutdown. If a critical power shortage occurs while an application is running, that application must be able to handle the situation gracefully. Windows CE displays warning messages when the batteries start to run low, but it does not send any warning to the applications. For many applications, these messages may be sufficient to ensure that the user takes appropriate action to avoid loss of data.

An active (running) CPU consumes significant quantities of energy, so avoid coding practices that use CPU cycles unnecessarily. Moreover, some hardware (modems, for instance) can drain batteries rapidly. If the program is going to place significant demands on the batteries, checking the state of the batteries first may be very helpful. If they are too low to complete the procedure, the user can be advised to take appropriate action.

8.2.4 Hardware Characteristics

Windows CE is designed to run on devices that are, in general, smaller and less powerful than desktop PCs. For example:
• Screens are typically smaller, have fewer pixels, and may not support color.
• CPUs are slower.
• User-interface hardware such as keyboards may be less flexible.

On the other hand, some devices may include hardware that is not standard on a PC, for instance the infrared transceiver found on H/PCs.
In any case, do not assume that all Windows CE-based devices are essentially similar to either a desktop PC or each other. Keep the hardware characteristics of the target device firmly in mind. When porting an application to more than one class of devices, it will be necessary to find a "lowest common denominator" to ensure that the application works successfully on all its target platforms. Although emulation is an important development tool, applications must ultimately be tested on actual devices to make sure that they perform properly.

8.2.5 User Interface

The user interface may be one of the more difficult issues you will have to deal with. Virtually all PCs have a mouse and a keyboard for input, and a screen for output that is more or less similar on virtually all machines. A program that works on one machine will usually work on all. Windows CE involves a class of devices that not only differ from PCs in many ways, but may also differ significantly from each other.

In short, there are no simple rules that will work for all Windows CE devices. To port a program successfully, one should know the peculiarities of the target platforms and make whatever modifications are necessary. While PC-based emulators are valuable development tools, they have their limits. To be sure that the user interface is well designed and functional, it is essential to test it on the actual devices it is intended for.

8.2.5.1 Creating and Managing Windows

Creating and managing windows with Windows CE is almost the same as it is with Win32. However, there are fewer window-style and management options. Probably the most noticeable difference is that a user cannot resize windows. A window can only have the size specified at the time of its creation.

When targeting devices with small screens, the application should use full-screen windows. However, one should not code static layouts, because different devices can have different screen dimensions.
Most currently available H/PCs are 240 × 480 pixels and roughly 2.5 × 4 inches. Some have 240 × 640-pixel screens that are also somewhat wider, changing both the pitch and the aspect ratio. The GetSystemMetrics function can be called to get the relevant screen dimensions and to define the window size.

8.2.5.2 Using Windows CE Dialog Boxes

Windows CE supports both modal and modeless dialog boxes and the predefined controls found in Windows 95. However, not all control styles are supported. Message boxes are also supported, although with fewer styles than in Windows 95. Windows CE supports simple implementations of the common dialog boxes Open and Save As. Their on-screen appearance is similar to Windows 95, but there are fewer controls available.

8.2.5.3 Porting User Interface Controls

Most of the standard Windows controls and common controls are supported, but there are some limitations. One major difference is that the menu and toolbars are combined in a single command bar that occupies the top of the window and has its own API. Tool tips are only supported for buttons on the command bar. In general, there is a more limited range of options available.

8.2.5.4 Porting the Graphics Device Interface

Most PC applications use the graphics device interface (GDI) in a way that is inappropriate for Windows CE, so they have to be modified before being ported. To keep the footprint of the operating system small, a number of Win32 GDI functions are not supported at all. In addition, the limitations imposed by the hardware of many devices (limited screen size, palette, and aspect ratio, to name just a few) will require a somewhat different approach.

8.2.5.4.1 Adapting bitmaps and icons

The available palette can be quite restricted. Some devices support a color screen, although probably with a more restricted palette than a typical PC. Many will have only grayscale graphics.
The first generation of H/PCs, for instance, supported only two-bits-per-pixel grayscale LCD displays. Bitmaps and icons should be in an appropriate format for the target device. LCD screens can be difficult to view in some settings, so keep the contrast as high as possible.

8.2.5.5 Using Unicode

Windows CE is a Unicode environment: it supports ASCII functionality to allow the exchange of text files, but the native text format is Unicode. Some general guidelines for converting an ASCII application to Unicode are:
• Include Tchar.h. It has all the necessary conversions.
• Use the Win32 string functions (lstrlen, for instance) rather than those from the C runtime library.
• Use TCHAR, LPTSTR, and so on, for declarations. The code can then be easily compiled for either ASCII or Unicode.
• Use the TEXT macro for strings (for example, TEXT("Your Text")).
• Remember that a character is no longer one byte in length, and strings end with two zeros rather than one.
• When incrementing an array pointer or character count, use sizeof (TCHAR) to ensure that it is valid for either ASCII or Unicode.

8.2.5.6 Managing Windows CE Threads

Windows CE is a multithreaded operating system, but there are some limitations relative to the Windows 95 and Microsoft Windows NT® operating systems. Probably the biggest difference is that semaphores are not supported. If the application uses semaphores, for example to manage device resources, it will need to be modified to use some other method of coordination between threads. For example, the application can use critical sections for thread synchronization.

8.2.6 Testing and Debugging

Developing applications for Windows CE is very much like developing applications for other Win32 targets, but there are important differences concerning the testing and debugging methods.
If you are developing for a standard Windows CE target (such as the H/PC), then much of your development and testing work can be done using the Windows CE emulation environment provided with your development tools. If, however, you are developing an application for a nonstandard hardware platform (such as a custom embedded application), then you have to consider alternative methods for verifying the correctness of your application.

The Windows CE API includes interfaces for debugging that can be used to create in-system debugging tools. Depending on your target hardware and application, you can also use the Remote API (RAPI) features of Windows CE to assist in debugging.

In any case, you will need to thoroughly test your application on all classes of device that the application is expected to operate on. You should not rely on emulation environments to provide adequate testing.

8.3 Porting ConfDoc

8.3.1 Preparing the implementation

Before starting to implement this application, I tried to find information on how to code a Windows 95 application so that it can easily be ported to Windows CE. Very little information was available, and development for Windows CE was still in its infancy. According to [DDJ], the two most important tips were:
• Do not use MFC (Microsoft Foundation Class Library)
The MFC classes for Windows CE are not fully functional. Using MFC for Windows 95 can lead to big problems if the classes used are not available on the Windows CE device.
• Do not use the STL (Standard Template Library)
The Microsoft C++ compilers for Windows CE do not support the STL.

Some differences between the Win32 API and the Windows CE API were listed, but not all. In particular, no information was obtainable about audio recording and the standard wave file format.

8.3.2 The porting process

8.3.2.1 UNICODE

The Unicode environment was a bit confusing at first, but these problems were solved quickly.
The use of string functions in particular caused some problems, because the Unicode environment requires some rethinking.

8.3.2.2 User interface

Porting the user interface was a straightforward procedure. Although no problems occurred during the porting process, the final user interface is not as good as it could be. The user interface for Windows CE must be completely redesigned to fit such a small display.

8.3.2.3 Windows CE API

Most of the functions known to be unsupported by Windows CE were avoided during the coding process, but some of them were still used and caused additional porting effort.

The larger part of the necessary porting work was caused by functions that are not supported by Windows CE and for which no information was available when coding started. These functions include the shell extensions for selecting a directory (SHBrowseForFolder) and for choosing a different color (ChooseColor) in the configuration dialog, all functions of the resource interchange file format (RIFF) services, and the buffered services for audio recording. Since the audio recording and the wave file functions are a big part of this project, new classes with the same functionality were implemented.

8.3.2.4 Conclusion

The project was started using Microsoft Visual C++ 5.0 and the Windows CE Toolkit for Visual C++ 5.0. The help library for the Win32 functions gave no indication of their availability on the Windows CE operating system. So when coding the Win32 version I had to keep in mind which functions are available on Windows CE and which should not be used. This was not very easy, as the results show.

Upgrading to Microsoft Visual C++ 6.0 and the corresponding Windows CE Toolkit led to a better help system for the developer. In the included MSDN Library, the documentation of every function contains an essential part called “Requirements”, where the availability of the function is listed.
Figure 8-1: Requirements from MSDN online

Unfortunately this toolkit only became available after the porting was done.

A tip for developers: if no MSDN Library is available, the information can also be retrieved from “MSDN online” (http://msdn.microsoft.com), which always offers the newest version of the documentation.

8.4 Porting a Windows NT ACM Driver to Windows CE

8.4.1 Motivation

As covered in chapter 5.3.1.1 “Audio compression”, it is necessary to compress the audio data to avoid a huge amount of data. Since palmtop devices are equipped with a small quantity of memory (4 MB to 16 MB) and this memory is shared between main memory and file system, compression is essential. The first test device in particular, a Philips NINO 300, has a total of 4 MB of memory. Dividing this memory between file system and main memory leaves a maximum of 2 MB of space for data files.

A newly bought Palmtop PC has only two codecs preinstalled (refer to chapter 5.3.1.1 “Audio compression” for details):
• PCM
• MobileVoice

Remember that the lowest-quality PCM stream produces 8 kB of data each second. With the available 2 MB (2048 kB) of storage this leads to a maximum recording time of 2048 kB / 8 kB/s = 256 seconds, or 4 minutes and 16 seconds. This is not a satisfactory recording period.

MobileVoice, on the other hand, provides the best compression rates and the longest recording times within the given memory limitations. But the quality of this compression method is not acceptable, and the recorded words are often difficult to understand.

The goal is therefore to find a compression algorithm that meets both needs:
• Good audio quality
• Acceptable disk space consumption

The best codec in the earlier comparison was the GSM codec, so it is the obvious choice for the Windows CE device too. This, however, requires porting the codec from Windows NT to Windows CE.
8.4.2 The theoretical porting

[MSDN03] describes the procedure as follows:

“Since the compression of the raw data is done via an ACM driver, this porting is necessary. To port an ACM driver from Windows NT to Windows CE, link the driver with the Acmdwrap.lib file, which provides the driver with the appropriate stream I/O interface. This library not only handles all interactions with the ACM and the Device Manager, but also passes messages from its ACM_IOControl function to the Windows NT ACM driver’s existing DriverProc function.”

[MSDN04] and its subtopics list all registry entries that are necessary on the Windows CE device to install the ACM codec.

8.4.3 The practical porting

So far so good. The changes were made quickly with the help of sample source code from the Windows CE Toolkit CD. Now the compilation could start. Two different methods were tried:
• Creating a makefile and compiling via the command line
• Creating a new project using the Developer Studio

It took several attempts and a long time to build a cegsm.dll file. But the real journey only started now, with the attempt to install this brand-new codec on the device.

The installation should be done by simply copying the file to the /Windows/System directory and creating some registry entries to activate it. These registry entries are listed below.

HKEY_LOCAL_MACHINE
  [Drivers]
    [BuiltIn]
      [CEGSM]
        SZ: Dll = cegsm.DLL
        DWORD: Order = 0x00000000 (0)
        SZ: Prefix = ACM
        DWORD: DeviceArrayIndex = 0x00000000 (0)
        DWORD: DeviceType = 0x00000000 (0)
        SZ: FriendlyName = GSM codec for WinCE

These registry entries can be created with some freeware programs or with the remote registry editor from the Windows CE Toolkit. I tried both methods, but the device driver did not work.

The adapted source code, the compilation method and the created registry entries are three parts of the development process. These parts cannot be tested separately.
Testing requires performing all three steps at once. If the result does not work, it is hard to find out which part is erroneous. Debugging was not easy because the device driver is loaded at boot time, that is, the driver is activated when the device starts from a soft or hard reset. Since debugging is practically only possible via the remote connection, and this connection is not available at boot time, debugging was not possible.

The best thing I could do was to contact a specialist. I tried to obtain support from Microsoft, but my repeated e-mails were not answered. Another source of help is the Internet. Searching suitable newsgroups, I found a few articles from users with the same or similar problems, but no solution. Posting my problem and waiting for help was unfortunately not successful either.

At that point of development I abandoned porting the device driver and bought a CompactFlash card with 16 MB of memory. Using the installed PCM codec at a sampling rate of 8 kHz, I achieved an average recording time of 34 minutes. This should be enough for testing.

8.5 Conclusion

Choosing Windows CE was an easy decision, because the Windows CE devices were the only ones with multimedia support. At that time it was not possible to record audio streams using a 3COM Palm or PSION device. Furthermore, Windows CE supports a wide range of different devices. Selecting a palmtop as the portable system led to a lightweight and handy device. The disadvantage of a palmtop is its very small user interface, which makes it difficult to create the essential index data.

9 Analysis of the new system

The system was tested on a stationary computer as a phone call documentation system and as a portable system running on a Palmsize PC.
9.1 Stationary System

9.1.1 Hardware

The stationary test PC meets the following specifications:
• Intel Pentium at 350 MHz
• 64 MB RAM
• 8 GB Hard Disk
• SoundBlaster 16 Stereo Sound Card with microphone and line in jack
• Wacom Tablet as Pen Device⁶ with an additional inking pen⁷
• Ordinary phone with a special circuitry to capture the audio data⁸
• Operating System Windows 95b

⁶ WACOM Intuos A5, see http://www.wacom.de/Products/intuos/intuosA5.htm for details
⁷ WACOM Inking pen, see http://www.wacom.de/Products/intuos/pens.htm#ink for details
⁸ Plantronics Vista Universal Amplifier, see [PLAN]

9.1.2 Recording a documentation

On the desktop computer, the application can be started at system boot time. In idle mode, where only the main window is active, the memory consumption and CPU usage of the program are small enough for it to run in the background. The system is therefore quickly accessible at any time, and there is no need to start the application manually at the beginning of a phone call.

When an incoming or outgoing phone call begins, a new documentation can be started easily by pressing one button. The recording starts immediately and the notes and handwriting windows are shown.

Because the system is used for phone call documentation with an ordinary phone, the user has only one hand free to generate the index data. For this reason the pen device is very helpful and the notes window is rarely used. With an inking pen, the notes can be written directly on a sheet of paper placed on the tablet, so the user writes in a familiar manner and does not need to look at the screen while writing.

The tests showed the following procedure for recording a document:
• Starting the documentation with a single mouse click.
• The audio recording is done in the background; there is no need to interact.
• Using the inking pen, the notes are made in the handwriting window.
Because there is only one hand free, the textual notes are rarely used. The ability to create more than one page is necessary, because the size of the paper on the tablet is A5.
• After the phone call the documentation can be saved using the OK button or dismissed using the Cancel button. This must be done manually.
• Because searching can only be done using the textual notes, some of these should be provided. For this reason the current recording is saved, reopened and extended. The recording is then no longer running, and the textual notes window can be used to enter a short summary or conclusion of the call.

9.1.3 Reviewing a documentation

The review process is started by clicking on the open button in the main window. After a file has been selected, the three windows are displayed. During the tests there was never a need to review the whole recording. The handwritten and textual notes provide a good overview of the document, and the index data allows precise access to specific parts of the recording. A simple click into the handwriting window performs a seek to the specified location. The audio recording can then be used to retrieve a conversation word by word.

9.1.4 Searching for information

This function is not directly implemented in the application. Since the textual notes and the additional notes for markers are stored in text files, the Windows search function can be used to search for information. The filenames listed in the search results identify the documents to review. So the textual notes are used to find the correct documentation, and the handwritten notes are used to locate specific information within this documentation.

9.1.5 Suggestions for improvement

• Starting a documentation should be done with a hot key. On a multitasking system, more than one application is running at the same time.
Sometimes it is difficult to find the Conference Documentation Application to start a new recording.
• To go one step further, the starting and stopping of a recording should be done automatically.
• The tablet is too small. A bigger tablet would be better.
• The best results were achieved by making handwritten notes during the conversation and creating a quick summary at the end of the recording. For this reason an extra dialog for the summary should be provided at the end of a recording. The audio recording should be stopped before this dialog is shown.

9.2 Portable System

9.2.1 Hardware

The portable version was tested on a Philips Nino 300⁹ Windows CE Palmsize PC. This device meets the following specifications:
• RISC processor at 75 MHz
• 4 MB main memory
• grayscale display with 240 x 320 pixels and four shades of gray
• built in microphone
• One CompactFlash Card slot

⁹ PHILIPS Nino 300, see http://nino.philips.com/ for details and datasheets

9.2.2 Recording a documentation

A Windows CE device does not have as much memory as a desktop computer. The documentation system is therefore not running all the time; it is started when a documentation is needed. All devices with this operating system have four programmable buttons. The conference documentation application can be assigned to one of these buttons, which makes it easy to start the application when it is needed.

The two main differences between the desktop application and the portable system are the screen size and the keyboard. The small display with only 240 x 320 pixels provides very little space for handwriting. The lack of a keyboard makes it very difficult to enter textual notes. Although a software keyboard and handwriting recognition are built in to simulate a standard keyboard, using them results in a very slow writing speed. So the main method for creating index data is the handwriting window, which provides only a small area for making notes.
9.2.3 Reviewing a documentation

The review process, including the search and seek capabilities, works in exactly the same way as on the desktop PC. The only difference between the two systems is the size of the devices.

9.2.4 Searching for information

Because of the small amount of available memory on the Palmsize PC, there were never more than two or three documents on the system. Therefore it was not necessary to search for a single documentation. The Windows CE device also provides a standard search dialog, which can be used, but the small size of the user interface makes this dialog difficult to handle.

9.2.5 User Interface

Figure 9-1 : Main menu

The main window and the main menu are exactly the same as in the desktop version.

Figure 9-2 : Main recording window

Figure 9-3 : Handwriting window

The above two illustrations show the main recording window and the handwriting window. Because a Palmtop device has no keyboard attached, the text input window is omitted.

Figure 9-4 : Review windows

When a documentation is reviewed or recorded, the two windows are displayed concurrently. Because of the small screen size, one window must be put in the background while the other is used; the two windows cannot be displayed entirely at the same time. In the above situation, the marker line is hidden behind the handwriting window.

9.2.6 Suggestions for improvement

• The main problem of the portable system is the small size of the Windows CE device. Creating index information is very difficult.
• The user interface must be redesigned for Windows CE.
• Since practically no textual information is stored, searching for a single document will be very difficult.

9.3 Combining the two systems

As seen above, the portable system has substantial disadvantages. The best results are achieved when both systems are used in conjunction.
Because both systems create files of the same structure, documents recorded on the portable system can later be reviewed on the desktop PC. The integration of the Windows CE device into the desktop of the stationary computer allows simple data synchronization between the two devices. So the portable system can be used to create a documentation on the road, and the stationary system, which is more powerful and user friendly, can be used for searching and reviewing information from these documents.

10 Conclusion

Using this system simplifies the process of documenting meetings and conferences, but the results depend on the field of application and the hardware equipment.

The basic idea of this system shows that the index data is the most important part of these documents. The raw data is created in the background on both the desktop system and the portable system, so the user's attention can be focused on creating the index data. The quality of the whole documentation always depends on the quality of the indices.

The desktop system provides an appropriately sized interface and sufficient possibilities and workspace to create the index data. Depending on the type of index data, however, some requirements must be met. Creating textual information requires both hands free to use the keyboard effectively, and entering handwritten information demands a pen device as a replacement for or supplement to the mouse.

The beta tests with the desktop system have confirmed that this application can only be used as a phone call documentation system with some additional hardware. Either a headset is used to keep both hands free for creating the index data or, with a standard phone, a pen device is used to create handwritten notes.
Compared with the desktop system, the portable system differs in two main respects:
• smaller amount of memory and
• smaller device dimensions

The smaller amount of memory will play no role: with compression algorithms and flash card memory extensions, the available memory will be sufficient to record even long meetings.

The smaller device dimensions make the device convenient to carry, but they also imply a smaller display and a smaller user interface. This makes it hard to create the necessary index data.

10.1 Further topics and future work

The documents created by ConfDoc seem to be sufficient to retrieve specific information within a reasonable time. Using the keyboard for notes results in textual information which can be used directly for a full-text search over several documents. This enables a good search across documents. The handwriting window allows index data to be created in a very familiar and easy way by simulating a pen and a sheet of paper. The search capabilities within a single document are sufficient, but the handwritten notes produce no textual information, even when they contain handwritten text.

The main problem of this application is the user interface, especially on the portable system. With the textual notes, which provide more search capabilities, the writing speed is very slow and the user needs both hands free for typing. The second index generation method, handwriting, is easier to use but results in less searchable data because no textual information is generated.

It therefore seems reasonable to take different hardware into consideration. Palmsize PCs offer a user interface that is too small for comfortable use. The CrossPad introduced earlier (see Chapter 4.3.6) could be an alternative. Of course this device will not solve all problems.
The user interface for the handwritten notes will be more comfortable than on the Palmsize PC, but textual notes are still not created. An obvious next step is to implement handwriting recognition, which would combine handwriting with the creation of textual information. To go one step further, a voice recognition system could be very useful for producing textual information.

List of figures:

Figure 2-1 : Screenshot of the movie and the accumulated whiteboard
Figure 3-1 : Illustration for patent application
Figure 3-2 : Figure 1 of EP0495612
Figure 3-3 : Figure 2 of EP0495612
Figure 3-4 : Figure 3 of EP0495612
Figure 4-1 : Examples for subnotebooks: [PANA] and [LIBR]
Figure 4-2 : Example of a pen device: [FUJI]
Figure 4-3 : Example of a handheld PC: [JORD]
Figure 4-4 : Example of a palmsize PC: [AERO]
Figure 4-5 : The Crosspad: see [CROSS]
Figure 5-1 : Audio capture circuit for an analog phone
Figure 5-2 : Structure of a RIFF chunk
Figure 5-3 : Drawing a line
Figure 6-1 : The Main window before ...
Figure 6-2 : ... and after the configuration
Figure 6-3 : The main menu to bring up the configuration
Figure 6-4 : The configuration window
Figure 6-5 : Selecting a codec for the compression and decompression
Figure 6-6 : Selecting a color
Figure 6-7 : The main recording window
Figure 6-8 : The additional marker notes
Figure 6-9 : The Handwriting window
Figure 6-10 : The Notes window
Figure 6-11 : The main playback window
Figure 6-12 : Seeking using the Handwriting window
Figure 7-1 : Main state diagram
Figure 7-2 : Initializing the recording
Figure 7-3 : Terminating the recording
Figure 7-4 : Callback function of the recording
Figure 7-5 : Recording a marker event
Figure 7-6 : Handwritten notes
Figure 7-7 : Textual notes
Figure 7-8 : Playback of a document
Figure 7-9 : Searching using markers
Figure 7-10 : Searching with handwritten notes
Figure 7-11 : Searching using textual information
Figure 8-1 : Requirements from MSDN online
Figure 9-1 : Main menu
Figure 9-2 : Main recording window
Figure 9-3 : Handwriting window
Figure 9-4 : Review windows

References:

[AERO] Compaq Aero 2100 palmsize PC
  http://www.compaq.com/products/handhelds/2100/index.html
[BM97] Chr. Bacher, R. Müller, Th. Ottmann, M. Will: An Integrated Environment for Mbone Session Recording and Replay. Submitted for EDMEDIA '97, the World Conference on Educational Multimedia and Hypermedia (June 14-19, Calgary, Canada)
[CEWIN] Chris De Herrera's Windows CE Website
  http://www.cewindows.net/
  “Choosing a PDA”, http://www.cewindows.net/choosingpda.htm
[CROSS] Cross pen computing group
  http://www.crosspad.com
  “CrossPad”, http://www.crosspad.com/products/crosspad/index.html
[DDJ] Dr. Dobbs Journal, April 1998, p. 62ff, Bruce Radtke: “WINDOWS CE WIN32 API PROGRAMMING”
  http://www.ddj.com
[FUJI] Fujitsu personal systems, Inc. - pen computers
  http://www.fpsi.fujitsu.com/
  “Stylistic 2300”, http://www.fpsi.fujitsu.com/product/st2300.htm
[GSM610] GSM 06.10 lossy speech compression
  http://kbs.cs.tu-berlin.de/~jutta/toast.html
[HM96] H. Maurer: Hyper-G now Hyperwave: The next generation web solution. Addison-Wesley Publishing Company, Harlow, England, 1996, ISBN 0-201-40346-3
  http://www.iicm.edu/hwbook
[JORD] Hewlett Packard HP Jornada 680 Handheld PC
  http://www.hp.com/austria/pc/jornada_680.html
[JS96] John Scourias: Overview of the Global System for Mobile Communication
  http://www.shoshin.uwaterloo.ca/~jscouria/GSM/gsmreport.html
[LIBR] Toshiba Libretto
  http://www.toshiba.com
[MMRUS] Mobile Multimedia Research at the University of Southampton
  http://www-mobile.ecs.soton.ac.uk/speech_codecs/
[MSDN01] Microsoft Developer Network, http://msdn.microsoft.com
  “Resource Interchange File Format Services”, http://msdn.microsoft.com/library/psdk/multimed/mmio_2uyb.htm
[MSDN02] Microsoft Developer Network, http://msdn.microsoft.com
  “Porting Windows 95 Programs to Windows CE”, http://msdn.microsoft.com/library/backgrnd/html/msdn_porting.htm
[MSDN03] Microsoft Developer Network, http://msdn.microsoft.com
  “Porting a Windows NT ACM Driver to Windows CE”, http://msdn.microsoft.com/library/wcedoc/wceddk/acmdrv_10.htm
[MSDN04] Microsoft Developer Network, http://msdn.microsoft.com
  “Registry keys for Device Drivers”, http://msdn.microsoft.com/library/wcedoc/wceddk/rkeys_1_2.htm
[OB95] Thomas Ottmann, Christian Bacher: Authoring on the Fly. J.UCS Vol. 1, No. 10, Oct 28, 1995. Also available as a technical report (No. 72).
  http://www.iicm.edu/authoring_on_the_fly
[OEPD] Austrian patent database
  http://www.patent.bmwa.gv.at/ and http://at.espacenet.com/
[PENC] Pen Computing, covering mobile computing & communication
  http://pencomputing.com/WinCE
[PANA] Panasonic “ToughBook 17”
  http://www.panasonic.com/computer/notebook/products/toughbook17.htm
[PLAN] Plantronics - Headsets
  http://www.plantronics.com
  “Vista universal amplifier M12”, http://www.plantronics.com/emea2/products/product_sheets/m12.html
[WAC] WACOM Europe Ges.m.b.H
  http://www.wacom.de
[WEBPAD] Portable Web Access
  http://www.national.com/webpad