ad-hoc shared state for web applications
AD-HOC SHARED STATE FOR WEB APPLICATIONS
Literature Study into applicable existing technology
Jack Jansen
Vrije Universiteit <[email protected]>
student number 0197254
Fall 2008

Literature Study, VU Computer Science, Jack Jansen

On the cover picture: this shows an ad-hoc shared space for a paint application. It was the closest thing I could find to the subject matter that was not a boring diagram. Used without permission, from www.tynedale.gov.uk.

I. Introduction

PROBLEM STATEMENT

Web applications are usually structured in one of two ways: centralized or client-server. The first case is exemplified by shopping sites such as Amazon¹, social networking sites like MySpace², sharing sites such as Flickr³, and many more. In this model, the business logic runs on the central infrastructure and the local web client provides only the user interface, possibly augmented with some auxiliary non-critical functionality (such as MySpace widgets). An example of the second case is Google Earth⁴, where the main logic runs in the browser, using the central server as a data repository. Applications that run completely in the browser, such as some Dashboard widgets⁵ (calculators, for example), can be seen as a special case of the latter.

With the availability of many devices that are internet-connected and able to run software such as a web browser, it is becoming more and more common for people to have simultaneous access to a number of such devices: not only a traditional (laptop or desktop) computer but also a mobile phone, PDA, portable media player, home entertainment center, etc. If web applications running on these devices could easily find each other and share information, this could lead to a richer experience than is currently possible.

¹ http://www.amazon.com
² http://www.myspace.com
³ http://www.flickr.com
⁴ http://earth.google.com
⁵ http://developer.apple.com/macosx/dashboard.html

For the purpose of this study I am particularly interested in applications where the main logic is expressed in a declarative web language such as XForms⁶, SMIL⁷, SCXML⁸ or VoiceXML⁹. The reason for this pruning is that distributed applications written in a procedural language like JavaScript fit the general distributed computing model, and there is ample research in that field. Two other requirements are important: the solution must be robust, because components may be unavailable or become unavailable, and it needs some form of ad-hoc addressing.

EXAMPLE APPLICATIONS

Let us now examine some example application areas, starting with some work I was personally involved in. In [4], [7] and [6] the authors sketch a personal remote control device for use while watching television in a home setting. The idea is that while everyone in the room watches the same main content, on the family television set connected to a central set top box or home entertainment system, there is the additional option of using personal devices. These personal devices (handhelds, mobile phones, tablet PCs) communicate with the central home system, and allow personal interaction without disrupting the shared experience. This interaction could be in the form of viewing additional content, such as background information on the program being watched, closed captions or voiceover in a different language. It could also be additional interactive functionality, such as enabling the user to take personal notes, or to share a pointer to the program being watched with a friend who is not present.

In [22] another application area is sketched: web applications that consist of a number of relatively loosely coupled components that communicate through a shared data space.
The paper sketches a video-based tourist guide application, shown in figure 1, consisting of a multimedia playback component, a map viewer component, an interaction form and some dynamic content. The components interact by storing and retrieving values in the shared data space. For example, the multimedia component stores GPS coordinates which are then picked up by the mapping component to show the current location of the presentation on the map. The global architecture is shown in figure 2. This paper does not discuss distributed applications, but from the diagram it is easy to envision the extension in that direction.

⁶ http://www.w3.org/TR/xforms/
⁷ http://www.w3.org/TR/SMIL2/
⁸ http://www.w3.org/TR/scxml/
⁹ http://www.w3.org/TR/voicexml20/

Figure 1. Tourist guide application (labelled parts: SMIL plugin, customized content, XForms form, mapping applet)

Figure 2. Guided tour document model (labelled parts: HTML, SMIL, XForms, map applet, glue, data model)

Forms are a paradigm that is shared among many web applications, and it is an area that could also benefit from ad-hoc sharing. XForms provides a good starting point here, because its model-view-controller architecture already separates the rendering from the underlying data model. In [21] the authors sketch a multimodal extension to XForms whereby the user can provide data either through the traditional XForms visual interface or through spoken dialogs. Such spoken dialogs would be a prime candidate to run on a separate handheld device such as a mobile phone.

A final application area is ad-hoc access to local services. The Universal Remote Console¹⁰ is an example of such an application: an architecture that allows users to carry a single device that will interface to many different services. The device is tailored to the user, so it can cater for specific needs of that user, such as speech input/output for a blind person or one-button operation for someone with motor disabilities.
The local services advertise their availability and functionality, allowing either user to operate anything from a television to an elevator (and allowing them to tell the difference :-). There are ample examples of other local services, such as museum or shopping guides. The Cooltown project [24] is an architecture for such services, or see [26] for an overview of projects in this area.

¹⁰ http://myurc.org/

OUTLINE

The remainder of this paper looks at existing research, projects and technologies that could be applied to the design of an architecture to enable the types of applications sketched above. It is broadly structured along the lines of the computer science areas from which we expect help. Due to the breadth of this study it does not go very deep into any particular area, instead pointing out starting points for further research, as well as directions that turned out to be dead ends.

In section 2 we will look at language research, to determine what solutions are available for similar languages. In section 3 we will look at distributed computing paradigms that fit the problem space of shared state for declarative web languages. In section 4 we look at XML databases. In section 5 we look at ubiquitous computing, where we expect to find some help for locating services, authentication and fault tolerance.

II. Language research

The languages we want to extend with shared data are all declarative, so it may be interesting to look at how other declarative languages have integrated shared data.

In the area of functional languages, most current research into shared data is done in the context of Haskell. The seminal paper, [15], discusses the novel concept of Composable Memory Transactions, a way to easily allow nested transactions in a functional language.
Transactional memory is discussed in depth in the next section; here it suffices to understand that it is a way to enable a mechanism similar to database transactions in a programming language. The basis for their work is the observation that “a purely-declarative language is a perfect setting for transactional memory” because of the implicit distinction between operations with and without side effects, and the relatively few accesses to mutable variables. Disallowing operations with side effects inside a transaction allows transactions to be nested. In addition, an orElse construct allows alternative transactions, where the second one is attempted when the first one aborts.

A follow-on paper, [11], explores how traditional concurrent algorithms using locks can be expressed with composable memory transactions, and what the resulting performance is. Thiemann [35] examines how the model can be used in real-world applications, and how it contrasts with the traditional database ACID transaction model.

I have also looked at other categories of declarative languages, but this turned out to be much less successful. In the field of logic languages, [31] is an overview of concurrent logic programming languages, and how they relate to one another. The concurrency constructs discussed, however, are firmly rooted in the backtracking semantics of logic languages, and do not seem to be applicable to our problem area.

The languages we want to extend can be seen as examples of Domain Specific Languages (DSLs), so it is interesting to see whether there are any common paradigms to add shared variables to DSLs. [33] and [28] enumerate a number of common design patterns for DSLs, but these turned out to be too abstract to be applicable directly. The latter also contains references to a large number of example DSLs, and another such list was obtained from [39].
These languages were cursorily inspected for potential overlap with our application area, but the few matches (insofar as they have not been discussed elsewhere in this study) turned out not to be applicable.

SYNTHESIS

Transactional memory operations are well-suited for embedding in the types of languages we are interested in: the declarative nature of the hosting language probably allows for a relatively simple distinction between operations that are allowed inside a transaction and those that are not. The structured nature of our target languages should allow us to expose a transaction mechanism to document authors in a reasonable way.

III. Communication Paradigms

Communication paradigms for distributed programming fall into one of three broad categories: message passing, shared memory, and separate coordination languages for controlling communication.

MESSAGE PASSING

Message passing, being a procedural concept, is used almost exclusively in procedural languages. The main exception to this statement is the Erlang [40] language and some of its descendants: Erlang is a functional language where all concurrency is derived from message passing. Unfortunately, it turns out this has little applicability to our problem area: all concurrency happens within a single program (whether distributed or parallel), and for this study I am specifically interested in cooperating applications, which have different requirements for service location, failure semantics, etc.

SHARED MEMORY

Shared memory is a more promising area: our target languages all have some concept of variables, which would provide a good starting point for data sharing. Our target languages also share a high degree of abstraction, so we are mainly interested in concepts that match this abstraction level.

With shared variables, the first design issue that comes to mind is concurrency control: it can be either implicit or explicit, and in the latter case it may take the form of locking or transactions. An area that seems well-matched to our abstraction level is that of transactional memory.

Transactional memory was introduced by Herlihy and Moss in [20], as a hardware concept for allowing lock-free shared data structures on multiprocessors. They define a small number of new machine instructions, which basically work by recording the whole transaction in local cache. Load-transactional reads shared data, load-transactional-exclusive reads shared data with the intent of rewriting it later, and store-transactional writes shared data (to the cache only). These instructions operate on the local CPU cache, which also stores the transaction state of each memory location. Three more instructions handle committing changes: commit attempts to write the update set back to global memory (failing if the state in the cache is inconsistent), abort discards the update set, and validate does only the consistency check (allowing an early abort for a long transaction).

A software implementation, Software Transactional Memory (STM), was demonstrated by Shavit in [32]. It requires only standard load-linked and store-conditional instructions, and basically works by recording all original values during a transaction and comparing these to current values during the commit. Methods are also provided to forestall starvation, etc.

A distributed implementation of transactional memory is outlined in [19]. The keystone of this work is that the items taking part in a transaction are migrated to the host doing the transaction, together with a distributed cache coherence protocol that implements this efficiently. For well-connected networks simpler implementations may be possible, see for example [25].
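
To make the idea of validating at commit time concrete, here is a minimal Python sketch of an optimistic software transaction. It is an illustration under stated assumptions, not the algorithm of [32] or the instruction set of [20]: the names TVar, Transaction and atomically are hypothetical, a per-variable version counter stands in for comparing recorded original values, and a single global lock held only during commit stands in for load-linked/store-conditional.

```python
import threading

_commit_lock = threading.Lock()  # assumption: one global lock, held only while committing


class TVar:
    """A shared variable plus a version counter (hypothetical name)."""

    def __init__(self, value):
        self.value = value
        self.version = 0


class Transaction:
    """Buffers writes locally and remembers which versions were read."""

    def __init__(self):
        self.read_versions = {}  # TVar -> version seen at first read
        self.writes = {}         # TVar -> buffered new value

    def read(self, tvar):
        if tvar in self.writes:          # read-your-own-writes
            return self.writes[tvar]
        # Record the version before reading the value, so a commit that
        # slips in between is detected during validation.
        self.read_versions.setdefault(tvar, tvar.version)
        return tvar.value

    def write(self, tvar, value):
        self.writes[tvar] = value        # nothing is visible to others yet

    def commit(self):
        with _commit_lock:
            # Validate: every variable read must be unchanged since it was read.
            for tvar, seen in self.read_versions.items():
                if tvar.version != seen:
                    return False         # conflict: discard the update set
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1
            return True


def atomically(action):
    """Run action(tx) repeatedly until its transaction commits."""
    while True:
        tx = Transaction()
        result = action(tx)
        if tx.commit():
            return result


# Usage: move 30 units between two shared variables as one atomic step.
source, target = TVar(100), TVar(0)


def transfer(tx):
    tx.write(source, tx.read(source) - 30)
    tx.write(target, tx.read(target) + 30)


atomically(transfer)
```

Because writes are buffered until commit, an aborted transaction leaves no trace and can simply be retried, which is the property that makes the model attractive for side-effect-free declarative languages.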

COORDINATION LANGUAGES

Shared memory and message passing attempt to graft parallel and distributed programming onto existing language constructs with (hopefully) minimal impact on the underlying language. Coordination languages, introduced by Carriero and Gelernter in [13], take a different approach: they separate the computation model and the coordination model. In its purest form, the computation model is a set of pure sequential side-effect free activities. The coordination model is the glue that binds these activities together and to their environment. This separation enables not only parallelism, distribution and fault-tolerance, but also heterogeneity in the computational components (because communication is handled in the coordination model).

Linda [5] is probably the earliest example of a coordination model, and many data sharing models are based on it. The central concept in Linda is a tuple space, a shared associative data store. The tuple space has three main operations:
• out(v1, v2, v3, ...) inserts a new tuple into the space,
• in(t1, t2, t3, ...) searches for any matching tuple (where each ti can be either a value to match or a wildcard) and destructively reads it, potentially blocking until a matching tuple becomes available,
• rd(t1, t2, t3, ...) is like in() but non-destructive.

Tuple spaces are suitable for a whole range of problems, from tightly-coupled parallel programs to loosely coupled ad-hoc data sharing. The tuple space concept has since been extended and integrated in many other languages and platforms, see for example JavaSpaces¹¹ or T Spaces [41].

A number of people have looked at the integration of tuple spaces in the XML world. In [37], Tolksdorf et al. describe XMLSpace, a tuple space that can hold XML documents as tuple field data, along with various ways for matching these.
Aside from the obvious match (two identical XML documents match) it is also possible to match on XML query expressions such as XPath. In a later work, [36], this work is extended under the name XMLSpaces.NET. Besides the port from Java to C#/.NET, which is not very interesting in the context of this study, the underlying model is also extended. Tuples are allowed as values in tuple fields, giving rise to a tupletree. It is then shown that the whole tuple space is itself such a tupletree, and is representable in XML. This gives rise to new ways of matching, including shallow and deep matching.

¹¹ http://java.sun.com/developer/products/jini/

Similar to XMLSpaces.NET is xSpace [2], but the latter is set in the web services field. This results in xSpace being XML-document centric (as opposed to object-centric), with more emphasis on matching with XPath-like regular expressions. xSpace does not treat the tuple space as a single XML document but as a collection of XML documents.

Papadopoulos and Arbab have provided an overview of coordination models and languages in [29]. They separate the playing field into two areas: data-driven coordination and control-driven coordination. Of these, the first one is interesting to our problem. Most of the data-driven coordination languages and models they discuss are based on tuple spaces, with the remainder geared towards massive parallelism more than ad-hoc data sharing.

SYNTHESIS

There are two prime candidate models for adding distributed shared data to our target languages:
• simple shared XML data, designed along the lines of software transactional memory, fits our problem space and is implementable; and
• integrating tuple space access into XML is another option, and it can be done in a way that matches the XML mindset fairly well.
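
The three Linda operations described in this section can be sketched in a few lines of Python. This is a minimal single-process illustration, not the API of JavaSpaces, T Spaces or XMLSpaces: the names TupleSpace and ANY are hypothetical, in is spelled in_ because in is a reserved word in Python, and the timeout parameter is an addition for convenience. The example tuple mirrors the tourist guide of [22], where the multimedia component publishes GPS coordinates for the mapping component; the coordinate values are made up.

```python
import threading

ANY = object()  # wildcard marker for template fields (hypothetical name)


class TupleSpace:
    """A shared associative store with Linda-style out, in and rd."""

    def __init__(self):
        self._tuples = []
        self._changed = threading.Condition()

    def out(self, *values):
        """Insert a new tuple and wake up any blocked readers."""
        with self._changed:
            self._tuples.append(tuple(values))
            self._changed.notify_all()

    @staticmethod
    def _matches(template, candidate):
        return len(template) == len(candidate) and all(
            t is ANY or t == c for t, c in zip(template, candidate)
        )

    def _find(self, template):
        for candidate in self._tuples:
            if self._matches(template, candidate):
                return candidate
        return None

    def _get(self, template, remove, timeout):
        with self._changed:
            # Block until a matching tuple exists (or the timeout expires).
            found = self._changed.wait_for(
                lambda: self._find(template) is not None, timeout
            )
            if not found:
                return None
            match = self._find(template)
            if remove:
                self._tuples.remove(match)
            return match

    def in_(self, *template, timeout=None):
        """Destructive read: remove and return one matching tuple."""
        return self._get(template, True, timeout)

    def rd(self, *template, timeout=None):
        """Non-destructive read: return one matching tuple, leave it in place."""
        return self._get(template, False, timeout)


# Usage: one component publishes a position, another picks it up.
space = TupleSpace()
space.out("gps", 52.33, 4.86)          # multimedia component (values made up)
position = space.rd("gps", ANY, ANY)   # mapping component
```

Note that the two components never address each other; they only agree on the shape of the tuple, which is what makes the model a natural fit for loosely coupled applications.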

These two models have very different characteristics from an application point of view, so whether one, the other or both should be used requires further study.

IV. XML Databases

Another area that we may learn things from is the field of XML databases. Specifically, concurrency control in the form of locking or transactions is an area where we may be able to find relevant work. Much of the work in this area concerns efficient locking in an XML front-end whose data storage back-end has only limited XML-awareness, but in the last few years there have also been some interesting papers on how XML semantics can be used to enable more parallelism.

In [23], the authors describe the XPath Locking Protocol (XLP), aimed specifically at locking data that is accessed through XPath expressions. They observe that the various nodes involved in an XPath-based update or retrieval do not all require the same locking semantics. For example, if a node is locked because one transaction’s XPath expression traverses it, the only operation that is disallowed is deletion of that node. They proceed by defining 5 types of locks (pass-by, read, write, insert and delete), a matrix of how these locks exclude each other, and how XLP implements this. XLP is then compared with other concurrency control mechanisms such as two-phase locking and tree-locking. It performs well with respect to document size, read-write ratio and XPath length alike.

Helmer et al. [18] compare various locking strategies for XML documents, from simple whole-document R/W locks to various ways of locking at the node or arc level. They then come up with 4 types of locks (shared, exclusive, modify, traverse) and show how well their solution performs. Unfortunately, their solution cannot be directly compared to XLP, because they use DOM access methods (as opposed to XPath expressions) as their basic operations.
They do make two observations that should be interesting to XPath-based locking too:
• if direct node references by xml:id are allowed, this leads to serious complications and requires extra locks;
• knowledge of the DTD allows extra parallelism (because some tree-modifying operations can be known not to conflict with some traversal operations, for example).

Finally, [17] compares 11 XML database locking protocols for performance. The winner, taDOM3+, is also by far the most complex, with 20 lock modes.

SYNTHESIS

What we can learn from XML databases for our problem area is that if our underlying data is XML structured, this allows for optimizations that are not possible in the general case of unstructured data. However, we have to be aware that we are not targeting thousands of transactions per second on terabyte databases, so we should decide where the sweet spot in the tradeoff between performance and complexity lies.

V. Ubiquitous Computing

Our area of interest has a number of overlaps with ubiquitous computing:
• heterogeneity of platforms and languages needs to be catered for,
• applications (or components of an application) need to find one another to be able to communicate,
• fault tolerance is important, because components may disappear (or not be available at all), and
• authentication and security issues need to be addressed.

The Cooltown project [24] aims at bridging web technology and ubiquitous computing, and as such is relevant to our scope. Cooltown uses beacons to make users aware of the services available in the vicinity: these beacons are infrared transceivers that will provide an entry point URL upon request. From this URL, the user can then get at other related services.
The URL may contain a capability that allows access to the local services through a “reverse proxy”, so services are also available if your network connection happens to be non-local (through your mobile phone carrier, for example). As an aside, they note that the standardized data format, interfaces and middleware on the web should solve most of the heterogeneity issues.

The Intentional Naming System (INS) [1] is a service location system that is (in their own words) expressive, responsive, robust and easily configurable. The expressiveness comes from the naming scheme, which is a hybrid between attribute-value naming and hierarchical naming. Figure 3 shows an example of this: the name for a publicly accessible camera in the oval office with some specific resolution. Their system also allows for different types of resolution: early binding (traditional lookup, which is then followed by communication), intentional anycast (late binding at the time of message transmission, sending to any one recipient) and intentional multicast (same, but sending to all easily accessible recipients).

Figure 3 - INS name

Edwards, in [12], gives an overview of service discovery issues for ubiquitous systems. Three of the current systems discussed seem applicable to our situation: UPnP SSDP¹², Bonjour¹³ and SLP¹⁴. These share some characteristics, such as being able to operate using multicast only. SLP and Bonjour can also use a directory service, when available. SSDP and SLP use URIs as identifiers, with SLP providing an additional LDAP-like search capability on attributes; Bonjour uses DNS-style dotted names. All three provide a way for clients to be notified of changes in service availability.
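
The difference between early binding, intentional anycast and intentional multicast can be illustrated with a small Python sketch. It is a loose illustration of attribute-based naming with hierarchical nesting, not the INS resolver of [1]: the names NameResolver, matches and WILDCARD, the nested-dictionary representation of a name, the attribute names in the example, and the choice of "first match" for anycast are all assumptions made for this sketch.

```python
WILDCARD = "*"  # assumption: matches any value for an attribute


def matches(query, advert):
    """True if every attribute in `query` occurs at the same place in
    `advert` with an equal value (or a wildcard), recursively.

    A name is a dict: attribute -> (value, nested name).
    """
    for attribute, (value, children) in query.items():
        if attribute not in advert:
            return False
        advert_value, advert_children = advert[attribute]
        if value != WILDCARD and value != advert_value:
            return False
        if not matches(children, advert_children):
            return False
    return True


class NameResolver:
    """Keeps advertised names and resolves queries against them."""

    def __init__(self):
        self._services = []  # list of (name, handler) pairs

    def advertise(self, name, handler):
        self._services.append((name, handler))

    def lookup(self, query):
        """Early binding: return the matching handlers; the caller
        communicates with them later, outside the resolver."""
        return [h for name, h in self._services if matches(query, name)]

    def anycast(self, query, message):
        """Late binding: resolve at send time and deliver to one match
        (here simply the first; returns None if nothing matches)."""
        handlers = self.lookup(query)
        return handlers[0](message) if handlers else None

    def multicast(self, query, message):
        """Late binding: resolve at send time and deliver to all matches."""
        return [handler(message) for handler in self.lookup(query)]


# Usage: a public camera advertises itself; a client asks for any camera,
# whatever its resolution. Attribute names and values are illustrative.
resolver = NameResolver()
camera = {
    "room": ("oval-office", {}),
    "service": ("camera", {"resolution": ("640x480", {})}),
    "accessibility": ("public", {}),
}
resolver.advertise(camera, lambda message: "camera got " + message)
query = {"service": ("camera", {"resolution": (WILDCARD, {})})}
reply = resolver.anycast(query, "snapshot")
```

With late binding the query is re-evaluated for every message, so a service that disappears or appears between two messages is handled without the client noticing, at the cost of a lookup per message.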

¹² http://www.upnp.org/
¹³ http://developer.apple.com/networking/bonjour/
¹⁴ http://www.ietf.org/rfc/rfc2608.txt

SYNTHESIS

There are various options for a naming scheme through which our components can find one another, each with its own strengths and weaknesses and, therefore, application areas for which it is best suited. A searchable namespace may work best for loosely coupled web applications, whereas more tightly coupled components may be better off with a hard mapping. A similar reasoning may be true for early or late binding: a more loosely coupled shared data architecture may benefit from late binding.

Investigating decoupling of access privileges from physical network connection is worthwhile, due to the nature of the devices we envision.

VI. Conclusion

This study has provided some very useful starting points for further research into an architecture for enabling ad-hoc shared web applications:
• Tuple spaces are a paradigm that seems to be applicable to declarative languages, and would work well in the case of loosely coupled web applications. Various attribute-based discovery schemes are also available, with the possibility of doing late binding (which would again benefit loosely coupled applications).
• Software transactional memory is another paradigm that maps well to declarative languages, and seems easier to integrate into an XML-centric setting. Therefore, STM may be more applicable to web applications where the participating components are more tightly coupled, or at least have a more integrated view of the data model.
• If we want to do shared data with locking, there are various optimizations possible because we use XML-structured data.
• Future research will need to determine the boundaries of the application area beforehand (such as expected data size and levels of concurrency and access rates), to forestall going overboard on complexity with little benefit.
• Physical proximity does not equal network proximity. Therefore, decoupling authentication and security from physical network connection, if possible, would be beneficial.

VII. References

1. Adjie-Winoto et al. The design and implementation of an intentional naming system. ACM SIGOPS Operating Systems Review (1999) vol. 33 (5) pp. 186-201
2. Bellur et al. xSpace: a tuple space for XML & its application in orchestration of web services. SAC '06: Proceedings of the 2006 ACM symposium on Applied computing (2006) pp. 766-772
3. Boyer. Interactive office documents: a new face for web 2.0 applications. DocEng '08: Proceeding of the eighth ACM symposium on Document engineering (2008) pp. 8-17
4. Bulterman et al. Enabling Pro-Active User-Centered Recommender Systems: An Initial Evaluation. Ninth IEEE International Symposium on Multimedia Workshops (ISMW 2007) (2007) pp. 195-200
5. Carriero et al. Linda in context. Communications of the ACM (1989) vol. 32 (4) pp. 444-458
6. Cesar et al. Enhancing Social Sharing of Videos: Fragment, Annotate, Enrich, and Share. ACM MM'08 (2008) pp. ???
7. Cesar et al. Social Sharing of Television Content: An Architecture. Ninth IEEE International Symposium on Multimedia Workshops (ISMW 2007) (2007) pp. 145-150
8. Chawathe et al. Change Detection in Hierarchically Structured Information. Proceedings of the ACM SIGMOD International Conference on Management of Data (1996) pp. 493-504
9. Dekeyser et al. Conflict scheduling of transactions on XML documents. ADC '04: Proceedings of the 15th Australasian database conference (2004) vol. 27 pp. 93-101
10. Dekeyser et al. Peer-to-peer form based web information systems. ADC '06: Proceedings of the 17th Australasian Database Conference (2006) vol. 49 pp. 79-88
11. Discolo et al. Lock Free Data Structures Using STM in Haskell. Lecture Notes in Computer Science (2006) vol. 3945 pp. 65-80
12. Edwards. Discovery Systems in Ubiquitous Computing. IEEE Pervasive Computing (2006) vol. 5 (2) pp. 70-77
13. Gelernter et al. Coordination languages and their significance. Communications of the ACM (1992) vol. 35 (2) pp. 97-107
14. Grabs et al. XMLTM: efficient transaction management for XML documents. CIKM '02: Proceedings of the eleventh international conference on Information and knowledge management (2002) pp. 142-152
15. Harris et al. Composable memory transactions. Proceedings of the tenth ACM SIGPLAN symposium on Principles … (2005) pp. 48-60
16. Harris et al. Lightweight object-oriented shared variables for distributed applications on the Internet. OOPSLA '98: Proceedings of the 13th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications (1998) pp. 296-309
17. Haustein et al. Contest of XML lock protocols. VLDB '06: Proceedings of the 32nd international conference on Very large data bases (2006)
18. Helmer et al. Lock-based protocols for cooperation on XML documents. Proceedings of Workshop on Web Based Collaboration (DEXA’03) (2003) pp. 230-
19. Herlihy et al. Distributed transactional memory for metric-space networks. Distrib. Comput. (2007) vol. 20 (3) pp. 195-208
20. Herlihy et al. Transactional Memory: Architectural Support for Lock-Free Data Structures. Proceedings of the 20th Annual International Symposium on Computer Architecture (1993) pp. 289-300
21. Honkala et al. Multimodal interaction with xforms. ICWE '06: Proceedings of the 6th international conference on Web engineering (2006) pp. 201-208
22. Jansen et al. Enabling adaptive time-based web applications with SMIL state. DocEng '08: Proceeding of the eighth ACM symposium on Document engineering (2008) pp. 18-27
23. Jea et al. Concurrency control in XML document databases: XPath locking protocol. Proceedings of the 9th International Conference on Parallel and Distributed Systems (ICPADS'02) (2002) pp. 551
24. Kindberg et al. People, Places, Things: Web Presence for the Real World. Mobile Networks and Applications (2002) vol. 7 (5) pp. 365-376
25. Kotselidis et al. Designing a Distributed Software Transactional Memory System. ACACES '07: 3rd International Summer School on Advanced Computer Architecture and Compilation for Embedded Systems (2007)
26. Krüger et al. Adaptive Mobile Guides. Lecture Notes in Computer Science (2007) vol. 4321 pp. 521-549
27. Luttenberger et al. XML Language Binding Support for Pervasive Communication in Distributed Virtual Shared Information …. Second IEEE International Conference on Pervasive Computing and Communications Workshops pp. 181-
28. Mernik et al. When and how to develop domain-specific languages. ACM Computing Surveys (CSUR) (2005) vol. 37 (4) pp. 316-344
29. Papadopoulos et al. Coordination Models And Languages. CWI Report (1998) (SEN-R9834)
30. Raper. Applications of location–based services: a selected review. Journal of Location Based Services (2007)
31. Shapiro. The family of concurrent logic programming languages. ACM Computing Surveys (CSUR) (1989) vol. 21 (3) pp. 413-510
32. Shavit et al. Software transactional memory. PODC '95: Proceedings of the fourteenth annual ACM symposium on Principles of distributed computing (1995) pp. 204-213
33. Spinellis. Notable design patterns for domain-specific languages. The Journal of Systems & Software (2001) vol. 56 (1) pp. 91-99
34. Tatarinov et al. Updating XML. SIGMOD '01: Proceedings of the 2001 ACM SIGMOD international conference on Management of data (2001) pp. 413-424
35. Thiemann. User-level transactional programming in Haskell. Haskell '06: Proceedings of the 2006 ACM SIGPLAN workshop on Haskell (2006) pp. 84-95
36. Tolksdorf et al. XMLSpaces.NET: An Extensible Tuplespace as XML Middleware. .NET Technologies (2004)
37. Tolksdorf et al. XMLSpaces for coordination in web-based systems. Enabling Technologies: Infrastructure for Collaborative Enterprises, 2001. WET ICE 2001. Proceedings. Tenth IEEE International Workshops on (2001) pp. 322-327
38. Trewin et al. Abstract user interface representations: how well do they support universal access?. Proceedings of the 2003 conference on Universal usability (2003) pp. 77-84
39. van Deursen et al. Domain-specific languages: an annotated bibliography. ACM SIGPLAN Notices (2000) vol. 35 (6) pp. 26-36
40. Wikstrom. Distributed programming in Erlang. Proceedings of PASCO'94 - First International Symposium on Parallel Symbolic Computation (1994) pp. 412-421
41. Wyckoff et al. T Spaces. IBM Systems Journal (1998) vol. 37 (3) pp. 454-474