Trends in Information Management (TRIM) is a biannual journal of the Department of Library and Information Science, University of Kashmir, India, which aims to publish original papers on various facets of information and knowledge management. The journal welcomes original articles, reviews of books, professional news, etc., relevant to its focus.

TRIM | ISSN: 0973-4163 | July - Dec 2011 | Vol. 7 No. 2
© 2011, Department of Library and Information Science
Frequency: Biannual (June & December)
Subscription (Annual): India: INR 500.00; Foreign: USD 20.00

EDITOR
Professor S.M. Shafi, Department of Library and Information Science, University of Kashmir, Srinagar, J&K, India 190 006

ASSOCIATE EDITOR
Dr. Sumeer Gul, Assistant Professor, Department of Library and Information Science, University of Kashmir, Srinagar, J&K, India 190 006

Online access to TRIM is available through EBSCO HOST's database Business Source Complete (www.ebscohost.com). TRIM is indexed in ulrichsweb.com(TM) -- The Global Source for Periodicals, and in the Cabell Directory: Educational Technology & Library Science database.

Editorial Board
Professor P.B. Mangla, Former Head, Department of Library and Information Science, University of Delhi, Delhi, India
Professor Shabahat Husain, Department of Library and Information Science, Aligarh Muslim University, Aligarh, U.P., India
Professor N. Laxman Rao, Department of Library and Information Science, Osmania University, Hyderabad, Andhra Pradesh, India
Dr. Vittal S. Anantatmula, Associate Professor and Director, Graduate Programs in Project Management, College of Business, Western Carolina University, North Carolina, United States
Dr. Deva E. Reddy, Associate Professor, Texas A&M University, University Libraries, College Station, Texas, USA
Dr. Muhammed Salih, Medical Reference Librarian, National Medical Library, Faculty of Medicine and Health Sciences, United Arab Emirates University, Al-Ain, United Arab Emirates
Dr. Jagdish Arora, Director, Information and Library Network (INFLIBNET), Ahmedabad, Gujarat, India
Professor M.P. Satija, Department of Library and Information Science, GND University, Amritsar, Punjab, India
Professor Jagtar Singh, Dean, Faculty of Education and Information Science, Punjabi University, Patiala, Punjab, India
Professor I.V. Malhan, Head, Department of Library and Information Science, Central University of Himachal Pradesh, H.P., India

Editorial

'Open access' and 'collaboration' may help societies to grow and prosper through the free exchange of ideas and opinions, resulting in the establishment and evolution of an informed society. Open Source Software (OSS) systems have derived much support from this premise and become a global phenomenon, fuelling development and research in wide areas of application across academic, professional and social initiatives. These are emerging and maturing parallel to commercial endeavors, with varied levels of success and failure. Among many fields, Library and Information Studies has a natural synergy with the Open Source Movement, and consequently libraries are frequent users of OSS, though staff may often be unaware of the utilities derived from such options. Hence the Department of Library and Information Science (University of Kashmir), in collaboration with the Department of Computer Sciences (University of Kashmir), decided to organize a National Seminar on the theme, to foster wider understanding and demonstration of its far-reaching implications in the field and its applications to the mainstream library landscape.
We received more than 100 papers and posters for presentation at the three-day National Seminar on Open Source Software Systems: Challenges and Opportunities (OSSS 2011) (20-22 June, 2012), which were deliberated upon in different technical sessions. Later, it was decided to publish select papers in the present special issue of TRIM. These papers can be broadly categorized into three groups, throwing light on philosophy, technical developments and applications. Another cluster is devoted to focused developments around open journals and repositories - a blossom of the convergence of the Internet and the open digital library - in an open access mode.

It will not be out of place to lay down here a brief background on the development of OSS systems and applications, especially in the Digital Library (DL) landscape, to help the reader navigate this issue purposefully. The Open Source Initiative, which holds the trademark on the term OSS, defines it not only in terms of the availability of source code but also of free distribution and of future use with modification and derived works, without any discrimination, under an approved license. Hence, Stallman (2011) has rightly identified four kinds of freedom for open source applications, supported by licensing: the freedom to run the program for any purpose, to adapt it to serve one's needs, to redistribute copies, and to improve the program and release improvements to the community. This extensive emphasis on freedom has led to a recent shift to the acronym FLOSS (Free/Libre and Open Source Software) (Ghosh, 2002), but OSS still seems the more familiar acronym among many initiatives and users.

OSS licenses, which make software available to others, are of particular concern, and one needs to understand them before committing to use the software for any project. The most common license is the GNU General Public License (2007) (GPL), which is based on the concept of "copyleft", an attempt to negate copyright for the purpose of collaborative software development. 'Creative Commons' (n.d.) licensing is similar in spirit to the GPL but is meant for creative works like research papers, and is often utilized within software projects. The others include (a) the GNU Lesser General Public License (2007) (LGPL), (b) the Berkeley Software Distribution (BSD) License (Open Source Initiative, 2010), (c) the Mozilla Public License (MPL) (2011), (d) the Netscape Public License (NPL) (2011), and (e) the OCLC Research Public License (2002).

Besides, the Internet has become so ubiquitous that the greatest participation in open software development occurs over it. Some of the pillars of Internet computing, such as the Sendmail mail server and BIND, the software that runs the Internet's Domain Name System (DNS), are OSS applications. Apache, the most popular web server in the world, is both maintained and enhanced through the open source model. Of all OSS systems available, it is Linux that is most recognizable, identified as the poster child of OSS (Rhyno, 2004). In addition, libraries have a natural synergy with the open source movement, for they serve a wide community of users on a non-profit, publicly funded basis and, like most organizations, are frequent users of open source software. In the present era, the emergence of digital libraries has become a main bridge connecting open source with the shared intellectual property that is the main collection of libraries. It poses tribulations but empowers one to be part of the knowledge society.
Hence, much work and research has emerged, ranging from sorting out content for digital libraries to authoring tools, protocols for exchange purposes, long-term preservation, and more. The content of digital libraries varies greatly, particularly in media type, but one format that cuts across media is XML, which has turned out to be a key enabler, along with metadata, for realizing the value of digital libraries, and has paved the way for the development of semantic web tools like the Resource Description Framework (RDF) and Topic Maps. Important protocols, and the OSS options for using them, help digital libraries communicate with many external systems. It is the Hypertext Transfer Protocol (HTTP) that powers the web to exchange files (text, graphic images, sound, video and other multimedia files). Other distinguished ones include OAI-PMH (the Open Archives Initiative Protocol for Metadata Harvesting), which serves mainly as a transport mechanism between digital libraries, as well as Z39.50, SOAP, RSS, Atom, Shibboleth, etc.

Authoring tools have developed in different dimensions for creating digital versions of objects - mainly image tools and editors that address graphics realistically, with colour depth and compression. Among commercial packages the flagship image tool is Adobe Photoshop, a full-featured and powerful image editor, but a reasonable alternative is GIMP (the GNU Image Manipulation Program). Other OSS tools in this space include ImageMagick, GNU Paint, SANE, Sweep, SoX, Blender, etc. Most interesting and challenging from an OSS perspective is OpenOffice, especially for archiving and preserving the contents of documents, even those produced in other word-processing packages. Open source has also made much headway in other areas, such as relational databases, where MySQL has been the most popular and long-lasting application and runs on all major platforms; others include PostgreSQL, Berkeley DB, etc. It has served well in programming languages too, like Perl, PHP and Python, besides underpinning many public systems such as digital libraries through popular and well-known packages like DSpace, Greenstone, Fedora and EPrints. Many more open systems, like OJS, OCS, OMP and OHS, are offered through PKP (Public Knowledge Project, 2011), breaking ground for managing and freely disseminating scholarly publications at different levels, eventually turning into a boon for academic and research endeavors.
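Of these protocols, OAI-PMH is the simplest to see in action: a repository or an OJS journal exposes its metadata at a base URL, and a harvester issues plain HTTP GET requests against it. The following is a minimal sketch in TypeScript (Node 18+ or Deno); the endpoint is a placeholder rather than a real repository, and a production harvester would parse the XML properly and follow resumption tokens:

    // Minimal OAI-PMH harvest sketch. The base URL is a placeholder;
    // any OAI-PMH-compliant repository or OJS journal exposes one.
    const BASE = "https://repository.example.org/oai";

    async function listRecords(metadataPrefix = "oai_dc"): Promise<void> {
      // Every OAI-PMH request is an ordinary HTTP GET with query parameters.
      const url = `${BASE}?verb=ListRecords&metadataPrefix=${metadataPrefix}`;
      const xml = await (await fetch(url)).text();

      // A regex is enough to sketch the idea; real harvesters XML-parse
      // the response and follow <resumptionToken> elements.
      const titles = [...xml.matchAll(/<dc:title>([^<]*)<\/dc:title>/g)]
        .map((m) => m[1]);
      console.log(`Harvested ${titles.length} Dublin Core titles`);
      titles.slice(0, 5).forEach((t) => console.log(" -", t));
    }

    listRecords().catch(console.error);

The same verb-and-prefix pattern underlies the harvesting that populates aggregators and union catalogues built over digital libraries.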
I hope that more seminars and symposia will concentrate on these challenges and explore opportunities to address key issues in the emerging society, making services more transparent, responsive and user friendly. I am grateful to all my colleagues, especially Dr. Sumeer Gul, Mr. Nadim Akhtar Khan and Mr. Tariq Ahmad Shah, without whom it would not have been possible to organize the seminar and bring out this issue. They burnt the midnight oil coordinating activities and later selecting and editing papers for the present issue. My sincere gratitude goes to the University of Kashmir authorities, the J&K Government (IT Department) and J&K Bank Ltd. for sponsoring the event, which also proved supportive in bringing out the present publication.

S. M. SHAFI

References
Creative Commons. (n.d.). About the licenses. Creative Commons. Retrieved from http://creativecommons.org/licenses/
Ghosh, R. A. (2002). Workshop report on advancing the research agenda on free / open source software. University of Maastricht, The Netherlands: International Institute of Infonomics. Retrieved from http://flossproject.org/report/workshopreport.htm
GNU Lesser General Public License. (2007). GNU Operating System. Retrieved from www.gnu.org/copyleft/lesser.html
General Public License (GNU). (2007). GNU Operating System. Retrieved from http://www.gnu.org/licenses/gpl.html
Mozilla Public License (MPL). (2011). Mozilla. Retrieved from http://www.mozilla.org/MPL/
Netscape Public License. (2011). Amendments. Retrieved from http://www.mozilla.org/MPL/NPL/1.1/
OCLC Research Public License. (2002). OCLC. Retrieved from http://www.oclc.org/research/activities/software/license/v2final.pdf
Open Source Initiative. (2010). The BSD 2-Clause License. Open Source Initiative. Retrieved from http://www.opensource.org/licenses/bsd-license.php
Public Knowledge Project. (2011). Retrieved from http://pkp.sfu.ca/
Rhyno, A. (2004). Using open source systems for digital libraries. Westport, Conn.: Libraries Unlimited.
Stallman, R. (2011). Richard Stallman's personal home page. Retrieved from http://stallman.org/

CONTENTS
Editorial (i-v) - S. M. Shafi
Changing Designer-Function in Open Source - Gagan Deep Kaur (74)
Open Access Journals in Library and Information Science: The Story so Far - Reyaz Rufai, Sumeer Gul and Tariq Ahmad Shah (87)
Graph Based Framework for Time Series Prediction - Vivek Yadav and Durga Toshniwal (98)
Quality Practices in Open Source Software Development Affecting Quality Dimensions - Sheikh Umar Farooq and S. M. K. Quadri (108)
Open Source Tools for Varied Professions - Nadim Akhtar Khan (127)
Analysis of Operating Systems and Browsers: A Usage Metrics - Mohammad Ishaq Lone and Zahid Ashraf Wani (139)
Institutional Repositories: An Evaluative Study - Tabasum Hashim and Tariq Rashid Jan (152)
Open Source Code Doesn't Always Help: Case of File System Development - Wasim Ahmad Bhat and S.M.K. Quadri (160)
A New Approach of CLOUD: Computing Infrastructure on Demand - Kamal Srivastava and Atul Kumar (170)
Einstein's Image Compression Algorithm: Version 1.00 - Yasser Arafat, Mohammed Mustaq and Mohammed Mothi (179)
Open Source Software (OSS): Realistic Implementation of OSS in School Education - Gunjan Kotwani and Pawan Kalyani (186)
Measurement of Processes in Open Source Software Development - Parminder Kaur and Hardeep Singh (196)
Open Source Systems and Engineering: Strengths, Weaknesses and Prospects - Javaid Iqbal, S.M.K. Quadri and Tariq Rasool (206)
Appraisal and Dissemination of Open Source Operating Systems and Other Utilities - Satish S. Kumbhar, Santosh N. Ghotkar and Ashwin K. Tumma (216)
Morphological Analysis from the Raw Kashmiri Corpus Using Open Source Extract Tool - Manzoor Ahmad Chachoo and S. M. K. Quadri (225)
Open Access Research Output of the University of Kashmir - Asmat Ali, Tariq Ahmad Shah and Iram Zehra Mirza (237)
Book Review - Iram Zehra Mirza (244)
News Scan - Sumeer Gul (246)

Changing Designer-Function in Open Source
Gagan Deep Kaur *

Abstract
Purpose: Open source, since its arrival in the 1990s, has been instrumental in challenging the copyright regime of traditional texts and narratives. The dissolution of the author-function heralded by post-modern philosophers like Roland Barthes has been fully materialized by open source technology.
However, along with textual narratives, a similar dissolution can be seen in pictorial representations as well, such as the design of a website. Open source makes available technologies which place not only the content but also the presentation of a website, for instance, in an equal stage of dissolution. This paper explores this trend for the presentation part and argues that, like the author-function, the designer-function is in jeopardy.
Design/Methodology/Approach: For this purpose, two open-source Mozilla Firefox extensions, AddArt and ShiftSpace, have been used to bring home their impact in altering the presentation of various websites.
Findings: The tools have been found to be significantly effective in modifying the way content is presented on a website without the owner's permission. The modifications effected by these tools are presented at the end of the paper in the form of various snapshots.
Research Implications: The implications of these tools can range from the purely web-ethical to the political. The tools not only raise ethical concerns of privacy and of ownership of the way information is to be presented by its owners, but politically they also present a trespassing avenue through which capitalist ideas of ownership of information through copyright can be subverted.
Originality/Value: The paper attempts to bring to the forefront the ethico-political concerns associated with innocuous-looking open-source browser extensions like AddArt and ShiftSpace. Considering these concerns can provide further criteria for evaluating such extensions, apart from the purely technical ones which are the norm at present.
Keywords: Designer-Function, AddArt, ShiftSpace, Web Aesthetics, Open Source
Paper Type: Research

* Research Scholar. Indian Institute of Technology, Bombay, India. e-mail: [email protected]

Section 1: Salvaging the Author-Function
Disruptive technologies like the internet led to the subversion of power structures on the one hand, and of the mechanisms of knowledge production and access on the other. Stallman's Free Software movement, in the 1980s, opened up a new horizon where knowledge access, generation and distribution became a collective concern. With the Open Source (OS) Initiative opening its wings in the early 1990s, users marveled at and appreciated the easy access to knowledge sources, which came to be seen as societal property instead of one individual's or corporation's intellectual property (Vainio & Vaden, 2007). The benefits of the freedom to view source code, to modify and redistribute derivative works, to integrate and experiment with different products, and to do away with huge licensing hassles captured the collective imagination, and collaborative sharing came up as a new model of production. It quickly seeped into non-technical sharing as artistic works like texts, sound recordings, songs, movies and paintings came to be shared, modified and distributed online. Creative Commons (CC) supported the free access and distribution of original and derivative works in these spheres through a range of licenses. At the same time, however, we see apprehensions growing in non-OS [1] quarters regarding the disappearance of the 'author' from the scene, as derivative works could easily be made from the original, at low cost, without prior permission, and commercially distributed as well.

[1] 'Author' is read here as writer, coder or programmer, as related to the particular kind of work.
The post-modern philosophy of authorship enunciated by Barthes and Foucault was a major precursor of free software initiatives, and consequently of such apprehensions. Prior to the emergence of cyberspace, Barthes (1977) persuasively argued for the death of the author in narratives, as a text is continually re-interpreted by its readers. The author was reduced to an author-function following Foucault's (1979) proposal, since an author draws heavily on the background cultural context in creating anything. These ideas acted as catalysts for doing away with the copyright regime in the debate that ensued between the advocates of free software and of proprietary software. The emergence of open source materialized these apprehensions. Since open source is a platform for sharing and distributing intellectual material freely, without proprietary rights attached to authors as strictly as in copyright regimes, the apprehension of the death of the original author, who by default holds proprietary rights over his intellectual property, comes out as inevitable. In the case of non-textual, aesthetic works on the web, such as paintings or website designs, open source renders the designer of the original work (painting or webpage) virtually non-existent. In analogy to the author-function, it is the designer-function that is at a loss in web aesthetics.

It is argued in this context that the apprehension of the author's disappearance from the scene is by and large a myth regarding open source, as the author is very much at the centre of this model of knowledge production. Though open source does allow the freedom of accessing and modifying text-based works, narratives or software programs, it does not do so at the cost of deleting the original author from the scene. This is proved by the fact that only those materials are left for free accessing and modifying which already bear the stamp of the author, i.e. which are already protected under copyright, as shown by Liang (2007). Even when released under a Creative Commons license, derivative works and fresh manuscripts put the author at the centre and grant a series of protective filters, which can be granted as rights to users regarding the use of these products. These filters grant users various rights, ranging from the right of free access, to copying, to modifying, to all of them. This means that if a work has been released under a license allowing users only to access the work freely, any modification and distribution done by them would be considered illegal. How far the author is comfortable with users' use of the work depends entirely on the author himself. A cursory glance at the various licenses shows this:

Creative Commons Attribution License (CCAL) - Authors retain ownership of the work, but freedom to access, modify and distribute the work is allowed without prior permission. CC itself attaches a range of freedoms to its licenses, from mere access to full use of the work by the user. The range includes:
- Attribution (cc by) - Full rights to the user, including distribution, remixing, tweaking and commercial use, provided credit is given to the original author.
- Attribution no derivatives (cc by-nd) - Allows redistribution, commercial or non-commercial, as long as the work is passed along unchanged and credit is given to the original author.
- Attribution non-commercial (cc by-nc) - Allows access, distribution and building upon the work, but only for non-commercial usage.
Derivative works made from a user's modified work need not carry the non-commercial condition.
- Attribution non-commercial share alike (cc by-nc-sa) - All the above freedoms, with the condition of non-commercial usage; derivative works must be licensed on the same terms and cannot be used commercially either.
- Attribution non-commercial no derivatives (cc by-nc-nd) - The strictest of all. Allows redistribution with credit to the author, but no derivative works or commercial use is allowed (Licenses, 2010).

GNU GPLv3 - Freedom to access, modify and redistribute the code, without commercial gain (GNU General Public License, 2007). This is generally applied to software code, programs, etc.

Free Art License - Allows the user to access, modify and redistribute creative works of art as long as proper credit is given to the original creator/designer/author. It is in line with the Berne Convention for the Protection of Literary and Artistic Works. (See Akinci, 2007 for a detailed discussion of these licenses.)

Thus, the original author is preserved in the scene by way of appropriate credits, and the credits of the authors of derivative works are preserved similarly. The author-function is not at a loss even if the work is released commercially via open source. Rather, the author is now more secure, as it is the author who decides what range of freedom he wishes users to have regarding the appropriation of his work. The broad range of authors' rights for creative works provided in the Creative Commons licenses testifies to this security.
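Viewed this way, the protective filters amount to a small permission matrix. A minimal illustrative sketch in TypeScript, under deliberately simplified terms (the legal code of each license, not this table, is authoritative):

    // Illustrative only: the CC "protective filters" as a permission matrix.
    // Terms are simplified; attribution is required by every license listed.
    type Use = "share" | "derive" | "commercial";

    const ccPermissions: Record<string, Set<Use>> = {
      "cc-by":       new Set<Use>(["share", "derive", "commercial"]),
      "cc-by-nd":    new Set<Use>(["share", "commercial"]), // no derivatives
      "cc-by-nc":    new Set<Use>(["share", "derive"]),     // non-commercial only
      "cc-by-nc-sa": new Set<Use>(["share", "derive"]),     // share-alike, non-commercial
      "cc-by-nc-nd": new Set<Use>(["share"]),               // strictest
    };

    function allows(license: string, use: Use): boolean {
      // Anything not expressly granted stays with the author.
      return ccPermissions[license]?.has(use) ?? false;
    }

    console.log(allows("cc-by", "commercial"));    // true
    console.log(allows("cc-by-nc", "commercial")); // false

The point of the encoding is simply that every row keeps the author in the scene: the user's rights are whatever the author has switched on, nothing more.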
Section 2: Designer-Function in Cyberspace
The designer-function, pertaining to non-textual and especially visual arts, is represented by webpage designing, graphics and image generation, and is analogous to the author-function for textual content. Contemporary culture is highly visual-symbol centric (Thorlacius, 2007a), and visual symbols are fast replacing information presented via textual narratives. A simple case of news websites illustrates this, where it is almost "impossible to read history from an internet archive [for instance] without constant intrusions by the latest news and banners… the event is grasped visually and there is nothing to comprehend or interpret in it". Banners, thumbnails, creeping lines with breaking news and photo galleries are dominant elements of visual aesthetics on a news portal. Similar stories can be seen in mass-serving portals like shopping, entertainment and consumer-product sites. Web aesthetics pays crucial attention to these visual symbols in the design of such portals and is a fast-emerging and important aspect of Human-Computer Interaction (Tractinsky, 2005). Indeed, on any website visual aesthetics are considered on a par with the site's functionality, user-friendliness, etc.

However, rapidly increasing user control over visually presented content like images and advertisements is replacing the designer with a designer-function. This is because not only the content but the form of the site as well is coming under the user's hold. Earlier, it was the owner of the site who decided what information was to be displayed and how. The opening of the code empowered the widely distributed network of users to access, modify and redistribute the what-part, or content, of the software. The how-part relates to the way information is presented on a single webpage in terms of its layout, design, etc., which was earlier by and large in the control of the web designer at the back. Modifications to this form were delegated to the designer. A bit of customizing ability would be given to users, mostly related to changing the default colors or background themes, as exemplified in email accounts or social networking profiles. Still, what kind of themes or colors a user could use on a site was by and large decided by the designer-cum-owner of the site at the back. Moreover, this customization was facilitated for users' personal accounts only, not the general form of the website. For example, users are not permitted to change how the Gmail webpage looks! A steady form over the years builds an image of the product in the minds of users, as compared to sites that change their look every two months. A broad blue horizontal strip with a wide patch of world map dotted with orange busts on the left instantly connects users to Facebook. Since the form is designed by the site-owner-cum-site-designer, the user has to agree to whatever form he is presented with. Even if a large body of users does not like a particular form, they have no choice but to yield. They cannot decide, for example, what images should appear on a webpage, how the page logo should appear, what color pattern should be adopted, and so on. Ideologically, the complete ownership of form by the designer presents a dictatorial regime where traffic is unidirectional - from one designer to many users, from a closed top to an expanding bottom in a pyramid.

ShiftSpace, an open source Firefox extension, inverts this style by allowing users to be participatory designers in this power structure. By allowing users to alter the way a web page appears to them, it challenges the monopoly of the owners as far as the form of the work is concerned. In the process, it replaces the traditional notion of the designer (the webpage designer at the back) with a designer-function, as any lay user is given the opportunity as well as the resources to design the form for himself and share it with others. It empowers the lay user to be a designer of form as well as of content not owned by him. The ideology behind this trend is to allow users more freedom in regulating their web experience. Piggybacking on a Greasemonkey script, ShiftSpace is a JavaScript program that, once invoked, modifies the web page's code and installs its own functions, which enable the user to perform various actions on the web page such as highlighting text, creating sticky notes and swapping images. These functions are called Spaces, and the content generated by them is called Shifts, which are publicly available. That means other users with the same script installed can see your Shifts. By overturning the balance in favor of the user, ShiftSpace makes the web experience literally interactive compared with earlier times, when web pages were given to users by designers and users reacted to them passively, i.e. users had no authority over how they wished to see a particular web page; that was decided by the owners at the back. Co-founder Mushon Zer-Aviv explains ShiftSpace's ideology thus: "While the internet's design is widely understood to be open and distributed, control over how users interact online has given us largely centralized and closed systems. ShiftSpace attempts to subvert this trend by providing a new public space on the web" (Zer-Aviv, 2007). ShiftSpace is thus supposed to be a utilitarian add-on with the explicit aim of giving the user an edge over the webpage form.
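What follows is not ShiftSpace's actual code but a minimal TypeScript sketch of the kind of client-side DOM rewriting a Greasemonkey-style Image Swap performs. It runs entirely in the visitor's browser after the page loads, needing no cooperation from the server - which is precisely why the site owner's design offers no defense against it:

    // Sketch of a Greasemonkey-style "Image Swap" (not ShiftSpace's code).
    // Alt+click once to grab an image, Alt+click again to swap it in.
    let grabbed: string | null = null;

    function attachSwapHandles(): void {
      document.querySelectorAll<HTMLImageElement>("img").forEach((img) => {
        img.addEventListener("click", (ev) => {
          if (!ev.altKey) return;
          if (grabbed === null) {
            grabbed = img.src;   // "Grab": remember this image's source
          } else {
            img.src = grabbed;   // "Swap": overwrite the clicked image
            grabbed = null;
          }
          ev.preventDefault();
        });
      });
    }

    attachSwapHandles();

Persisting such swaps to a shared server, as ShiftSpace's Shifts do, is only a small additional step once the page itself can be rewritten at will.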
However, its intrusion into web space looks more like the overthrow of privately-owned capitalist space by forcible open-source socialism. It allows the user to create spaces barging into the content provided by the webpage owners, altering its form. A user may add personal comments in these spaces and see others' personal comments. It is its Image Swap function that alters the form of the page, by allowing visual elements like images to be swapped with any other images on the web. Once the Image Swap function is invoked, the top left corner of every image on the page, be it a simple image, a logo or an advertisement, gives the user the options Swap and Grab. With Grab you pick an image, and with Swap you paste it down over any other image. The end may see the user viewing the web page in a completely different way. The user is more in control of the aesthetic elements of the page than before.

The aesthetic experience is undoubtedly enriched; however, Image Swapping poses potential harm to various stakeholders. First, most obviously, it is an intrusion into the traditional designer's zone, which was earlier inaccessible. It is more like trespassing into a forbidden area, whatever the ideology behind it may be. As discussed earlier, open source is not against copyrighted material per se, nor about making it open without the consent of its legal owners. It is against the proprietorship of knowledge - the restrictions posed by the copyright regime which restrain the free distribution of knowledge. If form and content both make a work complete, sanctity should be preserved in form as well. Whereas the author-function does not trespass into private arenas, courtesy of open source licenses, the designer-function does, as the trespassing is into a protected zone without permission, and the derivatives made are shared with other users under the Shift creator's ID - without any credit to the original designer, of course!

Secondly, where the images relate to advertising, Image Swapping comes into direct conflict with advertisers. AddArt, another open source Firefox extension, for example, replaces the advertising images on a web page with artistic images provided routinely by its developers. With Adblock Plus in its entourage, the advertising images are first blocked and subsequently replaced by AddArt. The aim is to empower the user to regulate what sort of material he would like to see on a webpage. Once the extension is installed, every ad-image is replaced with an artistic image by AddArt. Together, ShiftSpace and AddArt, with their image swapping, lead to the dissolution of the designer-function. It is argued here that AddArt does its so-called utilitarian duty without prior permission either from the owners of the website or from the advertisers. Consequently, it turns out to be more an intrusive than a utilitarian tool. Being an open source product, it creates a conflict between the ideology of OS as a movement and proprietary regimes, and makes websites a battle zone of their conflicts. Not only advertisers but also hosting page-owners stand in opposition. Who would like to advertise on pages where the ads are to be replaced by AddArt, or swapped by ShiftSpace? OS aims at the free distribution of works and allows commercial distribution of derivatives, provided permission is obtained from the original authors and appropriate credit is given.
It opposes the proprietorship of software, but does not forcibly prohibit its commercial sale. AddArt does precisely that, albeit indirectly. It forcibly inhibits the advertisements appearing on others' webpages in the name of providing a greater aesthetic experience to the user. Replacing ad-images with their works prevents them from reaching viewers, and thus the potential sale of the advertised products. This stands in contradiction to OS as a movement. Apprehending flak from the advertisers' side regarding the potential harm to the advertising industry, the developers of AddArt defend their right of intrusion by saying:

"Oh stop, you flatter us! Seriously, here's some things to consider: You downloaded the page, and you own it. It's yours and you can do whatever you want to it. Just like if you get a free newspaper, you can read it, or cut it up, or burn it. It's your life and you have no legal obligation to look at every ad presented to you. People that use Ad Blocking software are not people that click on ads or even respond favorably to them. There is no loss in the market when these users block ads. If we extend the logic of Ad Blockers destroying the free iternet then online ad blocking pales in comparison to the number of people destroying the television industry by going to the bathroom during commercial breaks, thereby stealing that content from the television companies. Don't waste your time with us and go complain to them. Add-Art just replaces blocked ads with art. We didn't write the code that blocks ads, we just piggyback onto it …and enthusiastically support it. If you want to complain about Ad Blockers, talk to the people at AdBlock Plus. But know that they've heard it all before and after hearing all those areguments they still don't agree, so you might just save your energy and do something else." [Spelling mistakes in original] (AddArt, n.d.)

Their newspaper argument is unwarranted because, even if we own the newspaper copy, we are not entitled to paste our own news items or pieces onto it under its banner and re-circulate them. What we are entitled to is ownership of the "physical medium", as the German philosopher Gottlieb Fichte (1791) says, and not ownership of the "content".
The choice of colors, the layout, the positioning is all specifically designed according to the kind of advertisement and its placement on the site alongwith the site-owner’s public profile. The principles for this are neatly governed by web aesthetics. Further, the aesthetic effects should be, “adapted to the target audience. A presentation site targeting a young audience must be designed in accordance with the contemporary trends in visual aesthetics and should TRIM 7 (2) July - Dec 2011 81 Changing Designer-Function in Open Source Kaur differ from a presentation site that targets the general adult population.” (Thorlacius, 2007b, p.67). AddArt replacements outrageously disregard all these aspects of web aesthetics, namely image-content harmony, the owner’s image, genre and target audience. With apparent incompatibility between original work of art (in terms of shape, coloring, layout of the original image) that AddArt seeks to replace and the ad-space wherein that to be replaced, the result is almost nauseated as is shown in Fig. 2. Fig. 2: Aaj Tak website with replaced images by AddArt (Date: 07-09-2010) Note: The incompatibility between by replaced images and the Ad-space. Consequently, the page is rendered more hideous than what it originally was with advertisement. Advertisements had one stakeholder having gains at least, AddArt leaves no-one, at least not the users! Unlike AddArt, paradoxically, it is advertisements that serve a utilitarian function by giving information about a product even if somebody chooses not to buy. The often disgusting images chosen by AddArt have what? Though majority of the websites are created with the parameters of functionality and user-ease, yet significance of aesthetics, esp. visual aesthetics, can’t be downplayed in the information they aim to convey. This is because of the increasing role the visual symbols play in contemporary culture as discussed above. Moreover, the sites whose primary objective is not functionality, but aesthetics itself, can be TRIM 7 (2) July - Dec 2011 82 Changing Designer-Function in Open Source Kaur seriously damaged by tools like these. The sites in this genre include, cultural & heritage sites, art galleries, art museums sites etc whose main purpose is not to give the factual information to the user as news or shopping portals do, but visually present the cultural repertoire of a specific region or ethnic group. AddArt degrades the user’s experience horribly in those sites (Fig. 3). Fig. 3: Royal Academy of Arts Museum, UK website with replaced images by AddArt. (Dated 10-09-2010) As regards the site-owners, it is a clear intrusion into their rights to ownership of form. Without their permission, contents removed and page distorted in the process. This subversion explicitly goes against the open source ideology. Open source is about creating platform for free sharing of sources, creating spaces and not forcibly trespassing into the others’ private zones. Whereas ShiftSpace creates spaces forcibly into the form not owned by the users, AddArt distorts that form. Image Swap done by both of them is an intrusion into the designer’s right to the form created by him. With designer dissolving into designer-function, the latter too comes to naught with ShiftSpace and AddArt. This dissolution is beyond salvage unlike author-function because the replacements, content-generation and swaps can be observed globally and shared with other users via ShiftSpace Server. 
That is, the Shifts created by one user on the BBC's website, for example, whether notes or swaps, can be interactively shared by other users of ShiftSpace, thereby providing an altogether new experience of the site (Fig. 4 & Fig. 5). Once logged in, one can see all the Shifts created on a particular site by all the users of this tool. Unlike AddArt, ShiftSpace empowers users to customize image replacement and also enables them to share their adventure with other users by saving their Shifts on the ShiftSpace server.

Fig. 4: Shifts created by the author and other users on the BBC site, along with an Image Swap by ShiftSpace (Date: 09-09-2010)

Fig. 5: Original BBC page before invoking ShiftSpace (Date: 09-09-2010)

Further, there is an ideological conflict between the aims of AddArt and its workings. Its philosophy is to spare the user unwanted advertisements by replacing them with art images. Since the user is not in control of the kind of art he is going to see, an equal rejoinder can be made to AddArt that, because of it, users are made to see the art the developers want to show, as the artistic images are not selected by the users themselves. If their ideology is to relieve the user of forcible and annoying advertisements, their artistic images fall equally into the same trap, for these too are forcible and annoying most of the time. The conflict is not just between the ideologies of these tools and the OS movement as such, but among the tools as well. The two tools can come into mutual conflict when used one after the other. Of what use is the entire effort if AddArt replaces an ad with an artistic image, and ShiftSpace then changes that with its Image Swap - say, with the same image picked from another website! This cycle seems to be of nobody's use in the end (Fig. 6).

Fig. 6: AddArt artistic image swapped by ShiftSpace Image Swap - a double replacement. First AddArt replaces the ad-image with its art image, then ShiftSpace replaces that art image with another art image (Date: 07-09-2010)

Conclusion
Though as an OS tool ShiftSpace contributes to creating space over the web, thus serving a utilitarian purpose, the same cannot be said about its Image Swap function, which is more an intrusion into others' privately owned space. This intrusion is gross and complete with AddArt, another OS tool. Together, both of these lead to the dissolution of the designer-function in cyberspace, a practice which serves nothing of utility to either the designer or the end user. On the one hand they distort the form of the webpage completely; on the other, they ruin the aesthetic experience of the user as well.

References
AddArt. (n.d.). F.A.Q's. Retrieved from http://add-art.org/f-a-q-s
Akinci, I. O. (2007). Politics of copyleft: How do recent movements altering copyright in software and in art differ from each other. Artciencia.com, II (6), 1-19. Retrieved from http://www.artciencia.com/index.php/artciencia/issue/view/11
Barthes, R. (1977). Image, music, text. Steven Heath (Trans.). UK: Fontana Press.
Fichte, G. (1791). Proof of the illegality of reprinting: A rationale and a parable. Martha Woodmansee (Trans.). Retrieved from www.case.edu/affil/sce/authorship/Fichte,_Proof.doc
Foucault, M. (1979). What is an author? In Josue V. Harari (Ed.), Textual strategies: Perspectives in post-structuralist criticism (pp. 141-160). USA: Cornell University Press. (Originally published in 1970)
GNU General Public License, Version 3. (2007). Retrieved from http://www.opensource.org/licenses/gpl-3.0.html
Liang, L. (2007). Free/open source software open content (e-primer). UNDP Asia-Pacific Development Information Programme. Retrieved from http://www.apdip.net/publications/fosseprimers/fossopencontent-nocover.pdf
Licenses. (2010). Retrieved from http://creativecommons.org/about/licenses/
Tractinsky, N. (2005). Does aesthetics matter in human computer interaction? Mensch & Computer 2005, pp. 29-42. Retrieved from http://mc.informatik.uni-hamburg.de/konferenzbaende/mc2005/konferenzband/muc2005_02_tractinsky.pdf
Thorlacius, L. (2007a). The role of aesthetics in web design. Nordicom Review, 28, pp. 63-76.
Thorlacius, L. (2007b). The role of aesthetics in web design. Nordicom Review, 28, p. 67.
Vainio, N., & Vaden, T. (2007). Free software philosophy and open source. In K. S. Amant & B. Still (Eds.), Handbook of open source software: Technological, economic and social perspectives (pp. 1-11). USA: Information Science Reference.
Zer-Aviv, M. (2007). ShiftSpace: Thesis paper. Retrieved from http://itp.nyu.edu/projects_documents/1178942414_ShiftSpace_thesis_paper.pdf

Open Access Journals in Library and Information Science: The Story so Far
Reyaz Rufai *, Sumeer Gul **, Tariq Ahmad Shah ***

Abstract
Purpose: The Internet has triggered the growth of scholarly publications, and every discipline is witnessing unremitting growth in the scholarly market. Open access, a product of the Internet, has also captured disciplines globally. Library and Information Science is likewise witnessing dramatic growth in the open access field. The study explores the status of open access titles in the field of Library and Information Science (LIS) and features various characteristics of such titles.
Design/Methodology/Approach: A systematic method for characterizing open access titles in the field of Library and Information Science was carried out by extracting data from the Directory of Open Access Journals (DOAJ), Open J-Gate, and Ulrichsweb.com.
Findings: The results clearly reveal an expanding growth of open access titles in the field of Library and Information Science. Commercial publishers have also joined hands as open access market players. The indexing policies of OA titles in LIS need to be restructured, and low-income nations have yet to emerge in the OA bazaar.
Research Implications: The study will be helpful to researchers in exploring open access titles in the field of LIS. Furthermore, it can act as an eye opener for the scholarly world regarding the real status of open access titles in the field.
Future Research: Future research can be carried out to trace innovative trends in LIS open access journals.
Keywords: Open Access; Library and Information Science; Open Access Journals; Open Access-Growth-Development
Paper Type: Research

Introduction
Scientific publishing is undergoing significant changes due to the growth of online publications and increases in the number of open access journals (Voronin, Myrzahmetov & Bernstein, 2011). The concept of open access (OA), which opened new dimensions in the information communication cycle, has been widely accepted all over the world.
Open access, which provides free access to information content, is widely expanding its domain because of the enormous benefits accrued from it.

* Librarian. Allama Iqbal Library, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]
** Assistant Professor. Dept. of Library and Information Science, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]
*** Research Scholar. Dept. of Library and Information Science, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]

It is a blessing for everyone involved in the information communication process, and the growth and development of OA journals has been one of the World Wide Web's success stories. With only five journals offering open access to their contents in 1992 and 1,200 in 2004 (Falk, 2004), the number had reached more than 7,000 as of December 1, 2010 (Directory of Open Access Journals, 2010). Different authorities have highlighted this budding concept in different ways. One of the most lucid definitions of open access is provided by the Budapest Open Access Initiative, which states that open access is the free availability of articles on the public internet, permitting any user to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself (Budapest Open Access Initiative, 2002). The Association of Research Libraries (ARL, 2007) defines open access as any dissemination model created with no expectation of direct monetary return which makes works available online at no cost to readers. Suber (2003), an important and well-renowned authority on open access, defines it as the free online availability of scholarly literature. Lynch (2006) describes open access as an increased elimination of barriers to the use of the scholarly literature by anyone interested in making such use. McCulloch (2006) sees the open access movement as an attempt to reassert control over publicly funded research in order to achieve "the best value" and make such research output transparent and freely accessible. Nicholas, Huntington and Rowlands (2005) elaborate on the value of such activity by stressing that it makes it possible to "read, download, copy, distribute and print articles and other materials freely".

The free availability of research is tempting researchers to embrace the open access revolution warmly. A number of advantages, ranging from wider visibility to higher citation, have made open access so popular among researchers that the pace of open access publishing is accelerating day by day. Highly ranked publications like Nature, the Wall Street Journal and The Scientist all ranked open access among their top stories in 2003 (Willinsky, 2006). Initially, strong resentment was seen from the publishing industry, which regarded open scholarship as a great threat to its business. But with the passage of time, leading publishers also joined the open access bandwagon because of the innumerable potentialities attached to it. Leading publishers like Elsevier, Oxford, Taylor and Francis, Sage, Springer and many more made some of their content freely available to readers. Projects like HINARI, AGORA and OARE, which made scholarly content freely available to developing economies, also helped to propagate the cause of open access, i.e. information for all.
Projects like HINARI, AGORA, and OARE etc that made the scholarly content freely available to developing economies also helped to propagate the cause of TRIM 7 (2) July - Dec 2011 88 Open Access Journals in LIS… Rufai, Gul & Shah open access, i.e. information for all. Scholarly and scientific journals are now enjoying flavours of open access and are growing at an escalating rate day by day. Open access journals have in this relatively shorter span of time won the hearts of the elements associated with the rim of open access. With leading publishers and reputed universities their count is growing at a very fast rate. The serial crisis that was the outcome of spurting economy has also been solved by open access platform. However, open access is gaining popularity day by day and every subject has been positively affected by it. Social Sciences, which deal with the various facets of society in relation to man, are also embracing this concept with open arms. Scholars in the various fields of Social Sciences, including Library and Information Science are contributing to open access journal revolution because of innumerable benefits adhered to it. Review of Literature A number of studies have been carried that highlight various facets of open access. Falk (2004) studied that 1200 open access journals were available on the Web as compared to a total of only five in 1992. Deals between publishers can be one of the catalytic forces in the increase of open access journals. Development of open access journal publishing has also been researched by Laakso, Welling, Bukvova, Nyman, Björk & Hedlund, 2011). A steady rate of increase of the open access journals has also been witnessed by number of authorities. Many carry on studies were also conducted to trace the growth and development of open access journals (Wells, 1999; Crawford, 2002; Gustaffson 2002 (as cited in Laakso, Welling, Bukvova, Nyman, Björk & Hedlund, 2011; Morris, 2006; Dramatic Growth of Open, 2007; Gul, Wani & Majeed, 2008; Ware & Mabe, 2009) A study by McVeigh (2004) documents that the number of open access journals in the citation indexes provided by ISI Thomson™ is growing, both in terms of creating new titles and conversion of established titles. Open access journal publishing in different fields is also studied by Borgman (2007). The open access platform provided by publishers has also been studied by Dallmeier-Tiessen, et al, (2010). Recent studies have explored a dramatic growth of open access journals (Happy, 2012…, 2011; Provençal, 2011; The challenges of success…, 2011; Illustrations of the global…, 2012). Problem Millions of scholarly articles are appearing on the Web but due to number of restrictions, access to them can’t be availed every time. Out of them, a large number of articles are useful for LIS research and TRIM 7 (2) July - Dec 2011 89 Open Access Journals in LIS… Rufai, Gul & Shah development that appear in different journals from time to time. Open access journals that provide free access to the research have made their debut to provide ease in access to the research. Day by day, these journals increase at a very fast rate on the Web. The study will encompass the development of open access journals in the field of LIS. Objectives The main objective was to study how open access journals in the field of LIS are experimenting with features like publishing origin, publishing models, language usage, visibility, article processing, and status concerns. 
Scope
The study was undertaken to visualize the position of the LIS field in this epoch of open access, which has revolutionized the entire world in a short duration of time since it started with a meeting convened by the Open Society Institute at Budapest, Hungary, in December 2001 (Budapest Open Access Initiative, 2002).

Methodology
In order to ascertain the number of OA journals published in the field of Library and Information Science (LIS), three authoritative and authentic databases were consulted: Lund University's Directory of Open Access Journals (DOAJ), Serials Solution's Ulrichsweb.com, and Informatics India Private Limited's Open J-Gate. As on June 10, 2011, DOAJ indexed 117 titles in the field of LIS, Ulrichsweb.com 93, and Open J-Gate 66 peer-reviewed journals. The titles from the three databases were clubbed together and repeated titles were removed, in order to avoid the risk of duplication and to achieve an accurate and realistic number. Each title was further checked manually on its respective website, and a number of discrepancies were found in the lists of Open J-Gate and Ulrichsweb.com, such as:

Wrong Classification
Journals that belong to the fields of Computers and Education were tagged by Open J-Gate under the field of LIS:
- International Journal of Peer to Peer Networks (original subject: Computers)
- Transformations: Liberal Arts in the Digital Age (original subject: Computers)
- International Journal of Educational Technology (original subjects: Computers & Education)
- Journal of Research on Technology in Education (original subjects: Computers & Education)
- Current Issues in Education (original subject: Education)
- Turkish Online Journal of Distance Education (original subject: Education)

Trade Journal instead of Scholarly
By open access we mean scholarly, peer-reviewed publications, not trade journals. Open J-Gate tagged one journal - Idaho Librarian (ISSN: 2151-7738) - as OA when its contents were of a trade rather than scholarly nature.

Embargo Period / Access to Select Issues Only
An embargo period, which denotes a time lag between the most current issue or volume published and the content freely available on the public web, is against the very spirit of the open access movement. OA journals provide free access not only to the current issue or volume but also to back issues. However, the Journal of the University Librarians Association of Sri Lanka (ISSN: 1391-4081) provides free access to back issues only; the current issue is available up to abstract level. Tushu Zixun Xuekan (parallel title: Bulletin of Library and Information Science, ISSN: 1023-2125), which is tagged by Ulrichsweb.com as an open access journal, also provides free access to back issues only. Besides, Law Library Journal (ISSN: 1024-6444) does not provide free access to all issues; users are supposed to subscribe to access its archive.

When all these doubtful titles were removed, a total of 144 OA journals in the field of Library and Information Science was obtained. Among these, 32 journals are indexed by all three databases, while 29 titles are indexed only by DOAJ, 11 only by Ulrichsweb.com, and 16 only by Open J-Gate (Fig. 1).

Fig 1: Comparative strength of LIS titles across the three databases
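The merge-and-deduplicate step can be pictured as keying every record on a normalized ISSN and recording which databases index it; the overlap counts in Fig. 1 then fall out directly. A sketch in TypeScript, with hypothetical record fields rather than the databases' actual export schemas:

    // Sketch of the deduplication step; field names are hypothetical.
    interface JournalRecord {
      issn: string;
      title: string;
      source: "DOAJ" | "Ulrichsweb" | "OpenJGate";
    }

    function mergeTitles(records: JournalRecord[]): Map<string, Set<string>> {
      // Map each journal (by normalized ISSN) to the databases indexing it.
      const merged = new Map<string, Set<string>>();
      for (const r of records) {
        const key = r.issn.replace(/[^0-9Xx]/g, "").toUpperCase();
        if (!merged.has(key)) merged.set(key, new Set());
        merged.get(key)!.add(r.source);
      }
      return merged;
    }

    // e.g. journals indexed by all three databases (32 in this study):
    function indexedByAllThree(merged: Map<string, Set<string>>): number {
      return [...merged.values()].filter((s) => s.size === 3).length;
    }

Titles lacking an ISSN would need a normalized-title key instead, which is presumably why each candidate was also verified manually.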
Results & Discussion

Country of Publication
The 144 OA LIS journals are published from 37 countries. A maximum of 45 titles are published in the United States (31.25%), followed by 12 in Brazil (8.33%) and 10 in Spain (6.95%). At the other extreme, five countries publish two journals each, while 20 countries, including India, publish a single journal each. If the countries are classified according to the four economic zones of The World Bank, i.e. High-income, Upper-Middle-income, Lower-Middle-income and Low-income (Country and Lending Groups, 2011), 20 of the publishing countries fall under the High-income zone, 12 under the Upper-Middle-income zone and 5 under the Lower-Middle-income zone, while countries in the Low-income zone have yet to publish any OA journal in the field of Library and Information Science.

Publisher Account
129 publishers take an active part in the publication of OA LIS journals. The Informing Science Institute, USA publishes a maximum of 7 titles, followed by the American Library Association (USA), which publishes 5 titles, while 2 titles each are published by National Taiwan University (Taiwan), Universidad Complutense de Madrid (Spain), the Australian Library and Information Association (Australia), the Chartered Institute of Library and Information Professionals (UK), and the International Consortium for the Advancement of Academic Publication (Canada). The remaining 122 publishers publish one title each. When it comes to the nature of the publishing body, universities are the leading publishers, with 55 titles (38.19 per cent of the total), followed by library associations with 32 titles (22.22%) and research centers and institutes with 22 (15.28%). Commercial publishers offer 9 (6.25%) journals, while 5 (3.47%) titles are the result of individual efforts. The remaining 21 (14.58%) titles are an endeavour of societies, consortia and others.

Lingual Assessment
When it comes to content language(s), 72.92 per cent of the journals (105) are unilingual and 19.44 per cent (28) bilingual, while 4.17 per cent (6) publish in three languages, 2.78 per cent (4) in four languages, and a single title (0.69%) in a maximum of five languages. Overall, OA LIS journals are represented in 22 different languages. English is the content language preferred by the majority of journals (114, 79.17%), followed by Spanish (23, 15.97%) and Portuguese (15, 10.42%). At the other end, 2 journals each are published in Catalan, Danish, Romanian and Swedish, and one journal each in Arabic, Bulgarian, Croatian, Czech, Indonesian, Lithuanian, Polish, Norwegian, Slovak and Slovene (Table 1).

Table 1: Lingual assessment of OA LIS journals (top languages)
Rank | Language | No. of Journals | Percentage
1 | English | 114 | 79.17
2 | Spanish | 23 | 15.97
3 | Portuguese | 15 | 10.42
4 | French | 11 | 7.64
5 | German | 7 | 4.86
6 | Italian | 6 | 4.17
7 | Turkish | 3 | 2.08
7 | Chinese | 3 | 2.08
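Note that the percentages in Table 1 sum to well over 100 because a journal published in several languages is counted once under each. A short sketch of this multi-label tally in TypeScript, with made-up sample records standing in for the 144 titles:

    // Multi-label language tally; sample records are made up.
    const journals: { title: string; languages: string[] }[] = [
      { title: "Journal A", languages: ["English"] },
      { title: "Journal B", languages: ["English", "Spanish"] },
      { title: "Journal C", languages: ["Spanish", "Portuguese"] },
    ];

    const counts = new Map<string, number>();
    for (const j of journals) {
      for (const lang of j.languages) {
        counts.set(lang, (counts.get(lang) ?? 0) + 1);
      }
    }

    for (const [lang, n] of counts) {
      // Percentages are taken over journals, not language slots, so
      // multilingual titles make the percentage column exceed 100 in total.
      const pct = ((n / journals.length) * 100).toFixed(2);
      console.log(`${lang}: ${n} (${pct}%)`);
    }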
Article Processing Charges / Handling Fee
By OA we mean that the journal is freely available to the user on the public web; the publisher, however, may require authors to pay article processing or handling charges. Managing a journal is a costly affair, and studies have shown that the peer review process alone costs on average USD 400 per article (Rowland, 2002). Of the 144 journals, only 6 charge their authors article processing charges or a handling fee. Authors have to pay USD 1900 to publish in the Journal of Medical Internet Research (ISSN: 1438-8871), USD 550 in the International Journal of Library and Information Science (ISSN: 2141-2537) and USD 50 in the South African Journal of Information Management (ISSN: 1560-683X). However, the fee charged by Anales de Documentación (ISSN: 1575-2437), Hipertext.net (ISSN: 1695-5498), and Infodiversidad (ISSN: 1514-514X) could not be traced.

Status
Managing a journal is not an easy task. Like other ventures, it requires the active participation of experts (human expertise), material (research contributions) and money (finance). 134 journals (93%) have sustained their existence and are regularly being published. The remaining 10 titles have ceased publication, and among these, four titles are continued under other journal names (Table 2).

Table 2: Ceased titles and their continuations
Ceased Title — Continued by
Journal of Southern Academic and Special Librarianship — Electronic Journal of Academic and Special Librarianship
Medizin-Bibliothek-Information — GMS Medizin-Bibliothek-Information
Journal of Library Science — Journal of Library and Information Studies
Bulletin of the Medical Library Association — Journal of the Medical Library Association

Conclusion and Discussion
The sustainability of open access journals in the field of LIS is evident from the study. Countries falling in the low-income economic zone have yet to come onto the open access canvas. Use of Open Journal Systems (OJS) can be one of the best solutions in times of economic crisis, especially for those nations which are endemically short of the financial resources needed to cope with changing technologies (Gul & Shah, 2011). Though commercial publishers have joined hands in the open access market, much effort is still needed on their side to remove the economic barrier that has always hindered researchers from quality research in the LIS field. Universities should not be the only pioneers in highlighting LIS research; research institutes and centers, societies and other bodies associated with research should also take an active part in the research output. Journals offering hybrid or fee-based modes should try to cut author processing charges so that article publication becomes affordable. Assigning the job of article processing on a volunteer basis, together with reduced costs, can help in the elevation of OA articles, which in turn can benefit readers to a greater extent. Content availability in more languages, with English as one of them, can help remove the language barrier between the two ends of the information communication process. Indexing the journals in more sources can help increase the content visibility of OA journals in the field, and a proper archiving policy in indexing sources can help in the long-term preservation of open digital content.
To achieve long-term sustainability, the elements associated with scholarly publication need to work in a more coordinated manner, as researched by Legace (as cited in Gul & Shah, 2010). Marketing the scholarly content in a more organized and coordinated manner can also help in the long-term sustainability of the journals. Application of Web 2.0 tools for content promotion, and inclusion in different subject forums and boards, can also help sustain the journals in the present dynamic and ever-changing digital environment.

References
Association of Research Libraries (ARL). (2007). Retrieved from www.arl.org/osc/models/oa.html
Borgman, C. L. (2007). Scholarship in the digital age: Information, infrastructure, and the Internet (p. 186). Cambridge, MA: MIT Press.
Budapest Open Access Initiative. (2002). Read the Budapest Open Access Initiative. Retrieved from http://www.soros.org/openaccess/read
Country and Lending Groups. (2011). The World Bank. Retrieved from http://data.worldbank.org/about/country-classifications/country-and-lending-groups
Dallmeier-Tiessen, S., et al. (2010). Open access publishing - models and attributes (p. 62). Max Planck Digital Library/Informationsversorgung.
Directory of Open Access Journals. (2010). Retrieved from http://www.doaj.org/
Dramatic growth of open access series. (2007). The Imaginary Journal of Poetic Economics. Retrieved from http://poeticeconomics.blogspot.com/2006/08/dramatic-growth-of-open-access-series.html
Falk, H. (2004). Open access gains momentum. The Electronic Library, 22(6), 527-530. doi:10.1108/02640470410570848
Gul, S., Wani, Z. A., & Majeed, I. (2008). Open access journals: A global perspective. Trends in Information Management, 4(1), 1-19.
Gul, S., & Shah, T. A. (2011). Managing knowledge repository in Kashmir: Leap towards a knowledge based society. Trends in Information Management, 7(1), 41-55.
Happy 2012 open access movement! December 31, 2011 dramatic growth of open access. (2011). The Imaginary Journal of Poetic Economics. Retrieved from http://poeticeconomics.blogspot.com/2011/12/happy-2012-open-access-movement.html
Illustrations of the global reach of the open access movement. (2012). The Imaginary Journal of Poetic Economics. Retrieved from http://poeticeconomics.blogspot.com/2012/01/illustrations-of-global-reach-of-open.html
Laakso, M., Welling, P., Bukvova, H., Nyman, L., Björk, B.-C., & Hedlund, T. (2011). The development of open access journal publishing from 1993 to 2009. PLoS ONE, 6(6), e20961. doi:10.1371/journal.pone.0020961
Lynch, C. (2006). Improving access to research results: Six points. ARL Bimonthly Report, 248, October, pp. 5-7. Retrieved from http://www.arl.org/bm~doc/arlbr248sixpoints.pdf
McCulloch, E. (2006). Taking stock of open access: Progress and issues. Library Review, 55(6), 337-343. doi:10.1108/00242530610674749
McVeigh, M. E. (2004). Open access journals and the ISI citation database: Analysis of impact factors and citation patterns. Thomson Scientific Whitepaper. Retrieved November 12, 2010 from www.thomsonisi.com/media/presentrep/essayspdf/openaccesscitations2.pdf
Morris, S. (2006). Personal view: When is a journal not a journal - a closer look at the DOAJ. Learned Publishing, 19. doi:10.1087/095315106775122565
Nicholas, D., Huntington, P., & Rowlands, I. (2005). Open access journal publishing: The views of some of the world's senior authors. Journal of Documentation, 61(4), 497-519. doi:10.1108/00220410510607499
Provençal, J. (2011). Scholarly journal publishing in Canada: Annual industry report 2010-2011. Canada: Canadian Association of Learned Journals. Retrieved from http://www.caljacrs.ca/docs/CALJ_%20IndustryReport_2011.pdf
Rowland, F. (2002). The peer-review process. Learned Publishing, 15(4), 247-258. doi:10.1087/095315102760319206
Seadle, M. (2011). Archiving in the networked world: Open access journals. Library Hi Tech, 29(2), 394-404. doi:10.1108/07378831111138251
Suber, P. (2003). How should we define open access? SPARC Open Access Newsletter, 64. Retrieved from http://www.earlham.edu/~peters/fos/newsletter/08-04-03.htm
The challenges of success: Dramatic growth of open access early year-end edition. (2011). The Imaginary Journal of Poetic Economics. Retrieved from http://poeticeconomics.blogspot.com/2011/12/challenges-of-success-dramatic-growth.html
Voronin, Y., Myrzahmetov, A., & Bernstein, A. (2011). Access to scientific publications: The scientist's perspective. PLoS ONE, 6(11), e27868. doi:10.1371/journal.pone.0027868
Ware, M., & Mabe, M. (2009). The STM report - An overview of scientific and scholarly journals publishing (p. 68). International Association of Scientific, Technical and Medical Publishers.
Willinsky, J. (2006). The access principle - The case for open access to research and scholarship. Cambridge, MA: The MIT Press.

Graph Based Framework for Time Series Prediction
* Vivek Yadav
** Durga Toshniwal

* Department of Electronics & Computer Engineering, IIT Roorkee. email: [email protected]
** Assistant Professor, Department of Electronics & Computer Engineering, IIT Roorkee. email: [email protected]

Abstract
Purpose: A time series comprises a sequence of observations ordered in time. A major task of data mining with regard to time series data is predicting future values. In time series analysis there is a general notion that some aspect of the past pattern will continue in the future. Existing time series techniques fail to capture the knowledge present in databases to make good estimates of future values.
Design/Methodology/Approach: The paper applies a graph matching technique to time series data.
Findings: The study found that the use of graph matching techniques on time-series data can be useful for finding hidden patterns in a time series database.
Research Implications: The study motivates mapping time series data to graphs, using existing graph mining techniques to discover patterns from the data, and employing the derived patterns for making predictions.
Originality/Value: The study maps time-series data to graphs and uses graph mining techniques to discover knowledge from time series data.
Keywords: Data Mining; Time Series Prediction; Graph Mining; Graph Matching
Paper Type: Conceptual

Introduction
Data mining is the process of extracting meaningful and potentially useful patterns from large datasets. Nowadays, data mining is becoming an increasingly important tool for modern business processes, transforming data into business intelligence and giving businesses an informational advantage, so that strategic decisions can be based on past observed patterns rather than on intuition or belief (Clifton, 2011).
The graph based framework for time series prediction is a step towards an efficient new approach in which predictions are based on patterns observed in the past. Time series data consists of sequences of values or events obtained over repeated instances of time. Mostly these values or events are collected at equally spaced, discrete time intervals (e.g., hourly, daily, weekly, monthly, yearly). When observations over time are made on only one variable, the series is called a univariate time series. Data mining on time-series data is popular in many applications, such as stock market analysis, economic and sales forecasting, budgetary analysis, utility studies, inventory studies, yield projections, workload projections, process and quality control, observation of natural phenomena (such as atmosphere, temperature, wind, earthquakes), scientific and engineering experiments, and medical treatments (Han & Kamber, 2006).

A time series dataset consists of values {Y1, Y2, Y3, …, Yt}, where each Yi represents the value of the variable under study at time i. One of the major goals of data mining on time series is forecasting, i.e., predicting the future value Yt+1. Successive observations in a time series are statistically dependent on time, and time series modeling is concerned with techniques for the analysis of such dependencies. In time series analysis, a basic assumption is that some aspect of the past pattern will continue in the future. Under this assumption, time series prediction is based on the past values of the main variable Y. The prediction can be useful in planning and in measuring the performance of the predicted value against the actual observed value of Y. Time series modeling is advantageous because it can be used easily for forecasting: the historical sequences of observations on the main variable are readily available, being recorded as past observations, and can be purchased or gathered from published secondary sources. In time series modeling, the prediction of values for future periods is based on the pattern of past values of the variable under study, but the model does not generally account for explanatory variables which may have affected the system. There are two reasons for resorting to such time models. First, the system may not be understood, and even if it is, it may be extremely difficult to measure the cause-and-effect relationships of the parameters affecting the time series. Second, the main concern may be only to predict the next value and not to explain why it was observed (Box, Jenkins & Reinsel, 1976).

Time series analysis identifies four major components that characterize time-series data (Madsen, 2008). First, the Trend component indicates the general direction in which a time series is moving over a long interval of time, denoted by T. Second, the Cyclic component refers to the cycles, that is, the long-term oscillations about a trend line or curve, which may or may not be periodic, denoted by C. Third, the Seasonal component captures systematic or calendar-related variation, denoted by S.
Fourth, the Random component characterizes the sporadic motion of the time series due to random or chance events, denoted by R. Time-series modeling is also referred to as the decomposition of a time series into these four basic components. The time-series variable Y at time t can be modeled as the product of the four components at time t, i.e., Yt = Tt × Ct × St × Rt, using the multiplicative model proposed by Box, Jenkins and Reinsel (1970), where Tt is the trend component at time t, Ct the cyclic component, St the seasonal component and Rt the random component. As an alternative, an additive model (Balestra & Nerlove, 1966; Bollerslev, 1987) can also be used, in which Yt = Tt + Ct + St + Rt, with Yt, Tt, Ct, St and Rt having the same meanings as above. Since the multiplicative model is the most popular, we use it for the time series decomposition. An example of time series data is the airline passenger data set (Fig. 1), in which the main variable Y, the number of passengers (in thousands) of an airline, is recorded monthly from January 1949 to December 1960. Clearly, the time series is affected by an increasing trend as well as seasonal and cyclic variations.

Fig. 1: Time series data of the airline passenger data set from 1949 to 1960, represented on a monthly basis.

Review of Literature
In time series analysis there is an important notion of de-seasonalizing the time series (Box & Pierce, 1970). It makes the assumption that if the time series exhibits a seasonal pattern of L periods, then by taking the moving average Mt over L periods we obtain the mean value for the year, which is free of seasonality and contains little randomness (owing to averaging). Thus Mt = Tt × Ct (Box, Jenkins & Reinsel, 1976). To determine the seasonal component, one simply divides the original series by the moving average, i.e., Yt/Mt = (Tt × Ct × St × Rt)/(Tt × Ct) = St × Rt. Averaging over months eliminates randomness and yields the seasonality component St. The de-seasonalized time series can then be computed as Yt/St. The approach described in Box, et al. (1976) for predicting the time series uses regression to fit a curve to the de-seasonalized series by the least squares method; to predict values, the model projects the de-seasonalized series into the future using regression and multiplies it by the seasonal component. The least squares method is explained in Johnson and Wichern (2002). Exponential smoothing, proposed in Shumway and Stoffer (1982) as an extension of the above method for more accurate predictions, weights the most recent observation Yt by α and the most recent forecast Ft by (1 − α), where 0 ≤ α ≤ 1; the forecast is thus given by Ft+1 = α·Yt + (1 − α)·Ft. The optimal α is chosen as the one yielding the smallest MSE (mean square error) during training. ARIMA (Auto-Regressive Integrated Moving Average) models have also been proposed (Box, et al., 1970, 1976; Hamilton, 1989). An ARIMA model is categorized as ARIMA(p, d, q), where p denotes the order of autoregression, d the order of differencing and q the order of the moving average. The model tries to find the values of p, d and q that best fit the data; Zhang (2003) proposed a hybrid ARIMA and neural network model in which a neural network helps to determine them. A minimal sketch of the classical moving average and exponential smoothing baselines reviewed above follows.
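To make the two classical baselines concrete, the sketch below computes a 12-term moving average, derives monthly seasonal indices, de-seasonalizes the series, and applies simple exponential smoothing. It is a minimal sketch, not the authors' code; the sample values are the first two years of the airline passenger series, and α = 0.3 is an arbitrary choice:

public class ClassicalBaselines {

    // 12-term moving average (a simple stand-in for a properly centred MA).
    // Returns NaN where the window does not fit.
    static double[] movingAverage(double[] y, int L) {
        double[] m = new double[y.length];
        java.util.Arrays.fill(m, Double.NaN);
        for (int t = L / 2; t < y.length - L / 2; t++) {
            double sum = 0;
            for (int j = t - L / 2; j < t + L / 2; j++) sum += y[j];
            m[t] = sum / L;                       // Mt ≈ Tt × Ct
        }
        return m;
    }

    // Seasonal index per month: average of Yt / Mt over all valid years.
    static double[] seasonalIndices(double[] y, double[] m, int L) {
        double[] s = new double[L];
        int[] count = new int[L];
        for (int t = 0; t < y.length; t++) {
            if (!Double.isNaN(m[t])) {
                s[t % L] += y[t] / m[t];          // St × Rt; averaging removes Rt
                count[t % L]++;
            }
        }
        for (int i = 0; i < L; i++) s[i] /= count[i];
        return s;
    }

    // Simple exponential smoothing: F(t+1) = a*Y(t) + (1-a)*F(t).
    static double smoothForecast(double[] y, double alpha) {
        double f = y[0];                           // initialize the forecast
        for (int t = 0; t < y.length; t++)
            f = alpha * y[t] + (1 - alpha) * f;
        return f;                                  // forecast for the next period
    }

    public static void main(String[] args) {
        // First two years (1949-1950) of the airline passenger data set.
        double[] y = {112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
                      115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140};
        double[] m = movingAverage(y, 12);
        double[] s = seasonalIndices(y, m, 12);
        double[] deseasonalized = new double[y.length];
        for (int t = 0; t < y.length; t++) deseasonalized[t] = y[t] / s[t % 12];
        System.out.println("Next-period forecast (deseasonalized): "
                + smoothForecast(deseasonalized, 0.3));
    }
}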
Proposed Work: Graph Based Framework for Time Series Prediction
In this paper we propose a graph based framework for time series prediction. The motivation for using graphs is to capture the tacit historical pattern present in the dataset. The idea behind creating a graph over a time series is to exploit two facts. First, some aspect of the time series pattern will continue in the future, and a graph is a data structure well suited to modeling a pattern. Second, similarity can be calculated between graphs to identify similar patterns and their order of occurrence. Thus, a graph is created with the motivation of storing a pattern over the time series and making predictions based on the similarity of the observed pattern to historical data, as an alternative to regression and curve fitting. The major shortcoming of regression and curve fitting is that they require expert knowledge of the curve equation and the number of parameters in it: with too many parameters the model overfits, and with too few it underfits (Han & Kamber, 2006). The complete pattern in a time series is not known initially, and it is affected by the random component, which makes regression harder; hence deciding on the curve equation and its number of parameters is a major issue.

To further explore the concept of a pattern, let there be a monthly time series of N years whose first observation falls in the first month of year m: Data = {Y1(k) Y2(k) … Y12(k), Y1(k+1) Y2(k+1) … Y12(k+1), …, Y1(k+N) Y2(k+N) … Y12(k+N)}, where Y1(k) denotes the value of the variable under study in the first month of year k and Y12(k+N) its value in the twelfth month of year k+N; note that m ≤ k ≤ (m+N). In general, let d be the time interval that makes up a pattern: if a pattern is to be stored yearly and data is available monthly, d = 12; if data is available quarterly, d = 4; and so on. The successor of an observation Yij (month i, year j), with 1 ≤ i ≤ 12 and k ≤ j ≤ (k+N), is Yi'j', where i' = i+1 and j' = j if i < 12, and i' = 1 and j' = j+1 otherwise. A graph over each successive window of d observations is created to store the pattern; this is called the 'last-pattern-observed-graph'. To make predictions, we also store in a graph the knowledge of how the last observed pattern affects the next observation; this is called the 'knowledge-graph'.

Example
Given the data {Y1(k) Y2(k) … Y12(k), Y1(k+1) Y2(k+1) … Y12(k+1), …, Y1(k+N) Y2(k+N) … Y12(k+N)}, the last-pattern-observed-graph for January of year (k+1) is generated from {Y1(k) Y2(k) … Y12(k)}, and the knowledge-graph of January for year (k+1) is generated from {Y1(k) Y2(k) … Y12(k), Y1(k+1)}. The knowledge-graph is created with the intuition of capturing how the variable under study changed over the last d observations and how this affected the (d+1)th observation. In time series data, the graph is created with the motivation of modeling each observation as a vertex and representing the effect of variation between observations over time in the form of edges. The number of vertices in a graph equals the time interval over which a pattern is to be stored, and the edges take into account the effect of each observation on the others. Since past values affect future values but not vice versa, edges are created from each vertex to all subsequent observations, each edge measuring the change in angle with the horizontal. The generated graphs can be represented in computer memory using either an adjacency matrix or an adjacency list (Cormen, 2001). We have used the adjacency list representation to save memory: each graph has n(n−1)/2 edges, so the space required with an adjacency list is n(n−1)/2, as compared to n² with an adjacency matrix. A minimal sketch of this construction follows.
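The construction just described can be sketched as below. The class name, the field layout and the use of atan2 for the angle with the horizontal (one time step = one horizontal unit) are illustrative assumptions, not the authors' implementation:

import java.util.ArrayList;
import java.util.List;

// One pattern over a window of consecutive observations, stored as an
// adjacency list of directed edges {from, to, angle}.
class PatternGraph {
    final double[] values;                          // vertex i holds observation i
    final List<double[]> edges = new ArrayList<>();

    PatternGraph(double[] window) {
        this.values = window.clone();
        int n = window.length;
        // Past values influence future ones, never the reverse, so edges run
        // only from each vertex to all subsequent vertices: n(n-1)/2 edges.
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double angle = Math.atan2(window[j] - window[i], j - i);
                edges.add(new double[]{i, j, angle});
            }
        }
    }
}

A knowledge-graph is then simply the same construction applied to a window of d+1 observations, so that the edges into the last vertex record how the observed pattern led to the next value.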
The dataset of N tuples is partitioned into two sets: the first, of m tuples, for training, and the second, of (N−m) tuples, for validation of the model. During the training phase, a knowledge-graph is generated over each successive window of d+1 training observations Yi(k) Y(i+1)(k) … Y(i+12)(k) Y(i+13)(k), where 1 ≤ i ≤ 12 and the index wraps into the next year when it exceeds 12, for all m tuples in the training dataset; thus m−12 knowledge-graphs are generated. These graphs are partitioned into d sets (d = 12), each graph being stored under the interval whose knowledge it captures (i.e., the graphs for all Januaries are stored together, those for all Februaries together, etc.). To implement this we have used an array of size d of linked lists of graphs, each list storing the knowledge-graphs of its interval. The graphs are partitioned to ease the search: when making a prediction, the model queries for all patterns observed w.r.t. a particular month, and since the graphs are already stored in partitioned form, this query takes O(1) time.

To predict the next value in the time series, the model takes the last d known observations preceding the month for which the prediction is to be made and computes the last-pattern-observed-graph. It then searches the partition for that month for the knowledge-graph most similar to the last-pattern-observed-graph, considering only as many vertices of the knowledge-graph as the last-pattern-observed-graph has. To compute the similarity between two graphs, the graph edit distance technique is used (Brown, 2004; Bunke & Riesen, 2008). The key idea of the graph edit distance approach is to model structural variation by edit operations reflecting modifications in structure and labeling; a standard set of edit operations comprises insertions, deletions, and substitutions of both nodes and edges. For time-series graphs g1 (source) and g2 (destination), only substitutions of edges (changes in angle) in g2 are required to make it similar to g1, and the costs incurred by the edit operations are summed. The graph with the least edit cost is the most similar and is selected as the basis of the prediction. To make the prediction, the model takes into account the structural difference between the two graphs in a vertex-ordered, weighted-average manner: every vertex in g1 (the last-pattern-observed-graph) predicts the angle between itself and the value to be predicted using the knowledge of g2 (the most similar knowledge-graph), taking into account the difference between its edges and those of its corresponding vertex in g2, with edge differences at vertices closer to the value being predicted given more weight (a technique that carries exponential smoothing over into the graph based approach). The predicted value is the average of the values predicted by the individual vertices. Once the actual value becomes known, a knowledge-graph capturing the pattern of this latest observation is generated, and in this way the model learns iteratively. A minimal sketch of the matching and prediction steps is given below.
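Using the illustrative PatternGraph class sketched earlier, the matching and prediction steps might look as follows. Here the edit cost reduces to the summed absolute angle differences over corresponding edges, and the linear weights in predictNext are one plausible reading of the weighted-average step, not the authors' exact scheme:

import java.util.List;

class GraphMatcher {
    // Edit cost between two graphs over the same number of vertices: only
    // edge substitutions are needed, so the cost is the summed absolute
    // difference of the corresponding edge angles.
    static double editDistance(PatternGraph g1, PatternGraph g2) {
        double cost = 0;
        for (int e = 0; e < g1.edges.size(); e++)
            cost += Math.abs(g1.edges.get(e)[2] - g2.edges.get(e)[2]);
        return cost;
    }

    // Pick the stored knowledge-graph whose first d vertices form the
    // pattern most similar to the last-pattern-observed-graph.
    static PatternGraph mostSimilar(PatternGraph last, List<PatternGraph> stored) {
        int d = last.values.length;
        PatternGraph best = null;
        double bestCost = Double.MAX_VALUE;
        for (PatternGraph kg : stored) {
            PatternGraph prefix =
                new PatternGraph(java.util.Arrays.copyOf(kg.values, d));
            double c = editDistance(last, prefix);
            if (c < bestCost) { bestCost = c; best = kg; }
        }
        return best;
    }

    // Each vertex of the last-pattern graph casts a prediction by reusing the
    // matched knowledge-graph's angle towards its (d+1)th vertex; predictions
    // are combined in a weighted average, vertices nearer the predicted value
    // weighing more (the linear weights are an assumption).
    static double predictNext(PatternGraph last, PatternGraph kg) {
        int d = last.values.length;               // kg has d+1 vertices
        double sum = 0, weightSum = 0;
        for (int i = 0; i < d; i++) {
            double angle = Math.atan2(kg.values[d] - kg.values[i], d - i);
            double estimate = last.values[i] + Math.tan(angle) * (d - i);
            double w = i + 1;
            sum += w * estimate;
            weightSum += w;
        }
        return sum / weightSum;
    }
}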
Experimental Results
The code implementing the graph based time series prediction approach discussed above is written in Java. The approach was applied to the airline passenger data set, first used by Brown (1962) and later in Box, et al. (1976), which records the number of airline passengers (in thousands) observed monthly between January 1949 and December 1960. Two years of data (1949 and 1950) were used for training, and the remaining data were estimated on a monthly basis, the model learning iteratively as each observation was recorded. Fig. 2 shows the actual and predicted numbers of passengers when the framework is applied to the raw time series, and Fig. 3 the corresponding monthly percentage error; the average percentage error on the raw series is 7.05. Fig. 4 shows the actual and predicted numbers when the framework is applied to the de-seasonalized time series (using the moving average concept), and Fig. 5 the corresponding monthly percentage error; the average percentage error on the de-seasonalized series is 5.81.

Fig. 2: Actual and predicted number of passengers using the graph based framework applied to the time series of the airline passenger data set (APTS).
Fig. 3: Percentage error between actual and predicted values using the graph based framework applied to the time series of the airline passenger data set (APTS).
Fig. 4: Actual and predicted number of passengers using the graph based framework applied to the de-seasonalized time series of the airline passenger data set (APTS).
Fig. 5: Percentage error between actual and predicted values using the graph based framework applied to the de-seasonalized time series of the airline passenger data set (APTS).

Conclusion & Discussion
A new graph based approach for time series prediction has been proposed and implemented. The results show 94.19 per cent accuracy when the framework is applied to the de-seasonalized time series of the airline passenger data (computed using the moving average concept) and 92.95 per cent accuracy when it is applied to the raw time series.
The accuracy on the de-seasonalized series is better because that series retains only two components, the cyclic and trend factors, which leads to a lower error rate than direct application of the proposed approach to the raw time series, which contains all four components (cyclic, trend, seasonal and random) and is therefore harder to predict. Thus the graph based framework, applied in conjunction with the moving average, offers good accuracy. The approach incorporates the concepts of exponential smoothing, moving average and graph mining to enhance its accuracy, and it is a good alternative to regression: no domain expert knowledge is needed to specify the curve equation or its number of parameters. The results validate that the new approach has a good accuracy rate.

References
Balestra, P., & Nerlove, M. (1966). Pooling cross section and time series data in the estimation of a dynamic model: The demand for natural gas. Econometrica, 34(3), 585-612.
Bollerslev, T. (1987). A conditionally heteroskedastic time series model for speculative prices and rates of return. The Review of Economics and Statistics, 69(3), 542-547.
Box, G. E. P., Jenkins, G. M., & Reinsel, G. C. (1970). Time series analysis. Oakland, CA: Holden-Day.
Box, G. E. P., Jenkins, G. M., & Reinsel, G. C. (1976). Time series analysis: Forecasting and control (Vol. 16). San Francisco, CA: Holden-Day.
Box, G. E. P., & Pierce, D. A. (1970). Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. Journal of the American Statistical Association, 65(332), 1509-1526.
Brown, R. G. (2004). Smoothing, forecasting and prediction of discrete time series. Mineola, NY: Dover Publications.
Brown, R. G. (1962). Smoothing, forecasting and prediction of discrete time series. Englewood Cliffs, NJ: Prentice Hall.
Bunke, H., & Riesen, K. (2008). Graph classification based on dissimilarity space embedding. In N. da Vitoria Lobo, T. Kasparis, F. Roli, J. Kwok, M. Georgiopoulos, G. Anagnostopoulos & M. Loog (Eds.), Structural, Syntactic, and Statistical Pattern Recognition (Vol. 5342, pp. 996-1007). Berlin/Heidelberg: Springer.
Clifton, C. (2011). Data mining. In Encyclopaedia Britannica. Retrieved from http://www.britannica.com/EBchecked/topic/1056150/data-mining
Cormen, T. H. (2001). Introduction to algorithms. Cambridge, MA: The MIT Press.
Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica, 57(2), 357-384.
Han, J., & Kamber, M. (2006). Data mining: Concepts and techniques. Morgan Kaufmann.
Johnson, R. A., & Wichern, D. W. (2002). Applied multivariate statistical analysis (Vol. 5). Upper Saddle River, NJ: Prentice Hall.
Madsen, H. (2008). Time series analysis. Boca Raton: Chapman and Hall/CRC Press.
Shumway, R. H., & Stoffer, D. S. (1982). An approach to time series smoothing and forecasting using the EM algorithm. Journal of Time Series Analysis, 3(4), 253-264.
Zhang, G. P. (2003). Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50, 159-175.
doi:10.1016/S0925-2312(01)00702-0

Quality Practices in Open Source Software Development Affecting Quality Dimensions
Sheikh Umar Farooq
S. M. K. Quadri

Research Scholar, P. G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]
Head, P. G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]

Abstract
Purpose: The quality of open source software has been a matter of debate for a long time, since there is little concrete evidence to justify it. The main concern is that many quality attributes such as reliability, efficiency, maintainability and security need to be carefully checked, and that fixing software defects pertaining to such quality attributes under the OSDM (Open Source Development Model) can never be fully guaranteed. In order to diminish such concerns, we need to look at the practices which affect these quality characteristics in OSS (Open Source Software) negatively. This paper presents an exploratory study of the quality dimensions, quality practices and problems in the OSDM. Insight into these problems can serve as a starting point for improvements in the quality assurance of open source software.
Design/Methodology/Approach: A survey was administered based on existing literature. On the basis of this survey, those practices in the OSDM which affect quality attributes of OSS negatively are described.
Findings: The quality characteristics which should be taken into consideration to select or evaluate OSS are presented. Furthermore, quality practices in the OSDM which affect the quality of OSS negatively are highlighted.
Research Implications: Further research is suggested to identify other quality problems not covered in this paper and to evaluate the impact of different practices on project quality.
Originality/Value: As a first step in the development of practices and processes to assure and further improve quality in open source projects, existing quality practices and quality problems, in addition to quality attributes, have to be clearly identified. This paper can serve as a starting point for improvements in the quality assurance of open source software.
Keywords: Open Source Software; Software Quality; Quality Practices; Quality Problems
Paper Type: Survey Paper

Introduction
There are more than a hundred thousand open source software projects of varying quality. The OSS model has not only led to the creation of significant software; many of these systems show levels of quality comparable to or exceeding that of software developed in a closed, proprietary manner (Halloran & Scherlis, 2002; Schmidt & Porter, 2001). However, open source software also faces certain challenges that are unique to this model. For example, due to the voluntary nature of open source projects, it is impossible to fully rely on project participants (Michlmayr & Hill, 2003). This issue is further complicated by their distributed nature, which makes it difficult to identify volunteers who are neglecting their duties and to decide where more resources are needed (Michlmayr, 2004).
While most research on open source has focused on, and hyped, popular and successful projects such as Apache (Mockus, Fielding & Herbsleb, 2002) and GNOME (Koch & Schneider, 2002), there is an increasing awareness that not all open source projects are of high quality. SourceForge, currently the most popular hosting site for free software and open source projects with over 95,000 projects, is not only a good resource for finding well-maintained free software applications; it also hosts a large number of abandoned projects and software of low quality (Howison & Crowston, 2004). Some of these low-quality and abandoned projects may be explained in terms of a selection process, given that more interesting projects with higher potential will probably attract a larger number of volunteers, but it has also been suggested that project failures might be related to a lack of project management skills (Senyard & Michlmayr, 2004). Nevertheless, large and successful projects also face important quality-related problems (Michlmayr & Hill, 2003; Michlmayr, 2004; Villa, 2003). To ensure that open source remains a feasible model for the creation of mature, high-quality software suitable for corporate and mission-critical use, open source quality assurance has to take these challenges and other quality problems into account and find solutions to them. As a first step in the development of practices and processes to assure and further improve quality in open source projects, existing quality practices and quality problems have to be clearly identified. To date, however, only a few surveys of quality-related activities in open source projects (and mostly in successful ones at that) have been conducted (Zhao & Elbaum, 2000; Zhao, 2003). This paper presents an exploratory study of the quality dimensions, quality practices and problems in open source software based on existing literature.

I. Software Quality and its Characteristics
Software quality is imperative for the success of a software project. Boehm (1984) defines software quality as "achieving high levels of user satisfaction, portability, maintainability, robustness and fitness for use". Jones (1985) refers to quality as "the absence of defects that would make software either stop completely or produce unacceptable results". These definitions of software quality cannot be applied directly to OSS: unlike CSS (closed source software), user requirements are not formally available in OSS. We can, however, evaluate a project and its program on a number of important attributes, including functionality, reliability, usability, efficiency, maintainability, and portability. The benefits, drawbacks, and risks of using a program can be determined by examining these attributes. The attributes are the same as for proprietary software, of course, but the way we should evaluate them for OSS is often different. In particular, because the project and code are completely exposed to the world, we can (and should!) take advantage of this information during evaluation. We can divide OSS into two major categories. Type-1: projects developed to replicate and replace existing CSS; and Type-2: projects initiated to create new software that has no existing CSS equivalent. Linux is an example of Type-1 software, originally developed as a replacement for UNIX.
Protégé, an ontology development tool, is an example of Type-2 software. Existing quality models provide a list of quality-carrying characteristics that are responsible for the high (or otherwise) quality of software. Software quality is an abstract concept that is perceived and interpreted differently based on one's personal views and interests. To dissolve this ambiguity, ISO/IEC 9126 provides a framework for the evaluation of software quality. ISO/IEC 9126 is the standard quality model for evaluating a single piece of software (Software Engineering-Product Quality-Part 1, 2001; Software Engineering-Product Quality-Part 2, 2001). It defines six software quality attributes, often referred to as quality characteristics, along with various sub-characteristics, as shown in Fig. 1.

Fig. 1: ISO 9126 Software Quality Model

Functionality
Functionality refers to the capability of the software product to provide functions which meet stated and implied needs when the software is used under specified conditions. Functionality means that the functions available in the software must fulfil the minimum usage criteria of the user (Raja & Barry, 2005). The ISO 9126 model describes the functionality attribute as "a set of attributes that bear on the existence of a set of functions and their specified properties. The functions are those that satisfy stated or implied needs". This set of attributes characterizes what the software does to fulfil needs, whereas the other sets mainly characterize when and how it does so (International Organization for Standardization, 1991). It is a fundamental characteristic of software development and is close to the property of correctness (Fenton, 1993). The specific functions we need obviously depend on the kind of program and our specific needs. However, there are also general functional issues that apply to all programs. In particular, we should consider how well a program integrates and is compatible with the components we already have. If there are relevant standards, does the program support them? If we exchange data with others using those standards, how well does it do so? For example, MOXIE (Microsoft Office - Linux Interoperability Experiment) downloaded a set of representative files in Microsoft Office format and compared how well different programs handle them (Venkatesh et al, 2011). For Type-1 OSS there are no formal functionality requirements, yet there will be a certain level of expectation in terms of functionality compared to the existing CSS: Type-1 OSS will be considered of high quality, and new users will adopt it, if it provides the basic functionality of its CSS equivalent. In the case of Type-2 OSS, there is no existing software from which to derive functional requirements, so new users will define such requirements according to their own needs. The sub-characteristics of the functionality attribute specified by Punter, Solingen & Trienekens (1997) are:

Accuracy
This refers to the correctness of the functions, i.e., providing the right or agreed results or effects with the needed degree of precision; e.g., an ATM may provide a cash-dispensing function, but is the amount correct?

Compliance
Where appropriate, certain industry (or government) laws and guidelines need to be complied with, e.g., SOX. This sub-characteristic addresses the compliance capability of the software.
Interoperability
A given software component or system does not typically function in isolation. This sub-characteristic concerns the ability of a software component to interact with other components or systems.

Security
This sub-characteristic relates to unauthorized access to the software functions (programs) and data.

Suitability
This refers to the appropriateness (to specification) of the functions of the software.

Reliability
Reliability refers to the capability of the software product to maintain its level of performance under stated conditions for a stated period of time. The reliability factor is concerned with the behavior of the software: the extent to which it performs its intended functions with the required precision. The software should behave as expected in all possible states of the environment. Although OSS is available free of cost, such software still needs a minimum operational reliability to be useful for any application. Many open source projects do not have the resources to dedicate to thorough testing or inspection, so the reliability of their products must rely on the community's reports of failures. These reports, stored in so-called bug tracking systems, are uploaded by the community and moderated by internal members of the open source project. Reports are archived with various pieces of information, including the date of upload and a description of the failure. What information can be collected from these repositories, and how to mine them for reliability analysis, is still an open issue (Li, Herbsleb & Shaw, 2005; Godfrey & Whitehead, 2009). Problem reports are not necessarily a sign of poor reliability: people often complain about highly reliable programs, because high reliability leads both customers and engineers to extremely high expectations. Indeed, the best way to measure reliability is to try the software on a "real" workload. Reliability has a significant effect on software quality, since the user acceptability of a product depends upon its ability to function correctly and reliably (Samoladas & Stamelos, n.d.). ISO 9126 defines reliability as "a set of attributes that bear on the capability of software to maintain its performance level under stated conditions for a stated period of time" (International Organization for Standardization, 1991). The sub-characteristics of the reliability attribute stated by Punter, Solingen & Trienekens (1997) are:

Fault Tolerance
The ability of software to withstand failure and maintain a specified level of performance in case of software faults.

Maturity
The capability of the software product to avoid failures resulting from faults in the software. It is refined into the attribute Mean Time To Failure (MTTF).

Recoverability
The ability to bring a failed system back to full operation, including data and network connections.

Efficiency
Efficiency refers to the capability of the software product to provide appropriate performance, relative to the amount of resources used, under stated conditions. According to the ISO model, efficiency is "a set of attributes that bear on the relationship between the software's performance and the amount of resources used under stated conditions" (International Organization for Standardization, 1991).
Efficiency implies that the software should respond quickly to any input. The sub-characteristics of the efficiency attribute are (Punter, Solingen & Trienekens, 1997):

Resource Behavior
The amount and type of resources used, and the duration of such use, in performing the software's function. It involves a complexity attribute computed by a metric involving size (space for the resources used and time spent using them).

Time Behavior
The capability of the software product to provide appropriate response times, processing times and throughput rates when performing its function under stated conditions. It is an attribute that can be measured for each function of the system.

Usability
Usability refers to the capability of the software product to be understood, learned, used and found attractive by the user when used under specified conditions (the effort needed for use). ISO 9126 describes the usability attribute as "a set of attributes that bear on the effort needed for use and on the individual assessment of such use by a stated or implied set of users" (International Organization for Standardization, 1991). The usability of open source software is often regarded as one reason for its limited distribution. The usability problem in most OSS arises for the following reasons:
- Developers are not users, so they usually do not take user perception into consideration.
- Usability experts do not get involved in OSS projects.
- The incentives in OSS work better for improvement of functionality than of usability.
- Usability problems are harder to specify and distribute than functionality problems.
- Design for usability really ought to take place in advance of any coding.
- Open source projects lack the resources to undertake high-quality usability work.
- OSS development is inclined to promote power over simplicity.

It is important to note that, to improve usability, many OSS programs are intentionally designed in at least two parts: an "engine" that does the work and a GUI that lets users control the engine through a familiar point-and-click interface. This division into two parts is considered an excellent design approach; it generally improves reliability and makes it easier to enhance each part. Sometimes these parts are even divided into separate projects: the engine's creators may provide a simple command line interface, while most users are expected to use one of the GUIs available from another project. Thus, it can be misleading to look only at an OSS project that creates the engine; be sure to include the project that manages the GUI, if that happens to be a separate sister project. In many cases an OSS user interface is implemented through a web browser. This has a number of advantages: the user can usually use nearly any operating system or web browser, users do not need to spend time installing the application, and users will already be familiar with how their web browser works (simplifying training). However, web interfaces can be good or bad, so it is still necessary to evaluate the interface's usability. The sub-characteristics of the usability attribute are (Punter, Solingen & Trienekens, 1997):

Learnability
The learning effort for different users, i.e., novice, expert, casual, etc.

Operability
The ability of the software to be easily operated by a given user in a given environment.
Understandability
Determines the ease with which the system's functions can be understood; relates to user mental models in Human-Computer Interaction methods.

Portability
Portability refers to the capability of the software product to be transferred from one environment to another. The environment may include the organizational, hardware or software environment. The ISO 9126 model defines the portability attribute as "a set of attributes that bear on the ability of software to be transferred from one environment to another (including the organizational, hardware, or software environment)" (International Organization for Standardization, 1991). Portability is a major issue today, and with respect to it, open source software can run and give good results on different platforms (Ioannis & Stamelos, 2011). From its early days, portability has been a central issue in OSS development; various OSS systems have as their first priority the ability of their software to be used on platforms with different architectures. Here we have to stress an important fact which originates from the nature of OSS and helps portability, namely the availability of the source code of the destination software. If the source code is available, a potential developer can port an existing OSS application to a platform different from the one for which it was originally designed. Perhaps the most famous OSS, the Linux kernel, has been ported to various CPU architectures other than its original one, the x86. In the end, evaluating portability requires hands-on testing. The sub-characteristics of the portability attribute are (Punter, Solingen & Trienekens, 1997):

Adaptability
Characterizes the ability of the system to change to new specifications or operating environments.

Installability
Characterizes the effort required to install the software in a specified environment.

Replaceability
The capability of the software product to be used in place of another specified software product for the same purpose in the same environment.

Maintainability
Maintainability refers to the capability of the software product to be modified. Modifications may include corrections, improvements or adaptations of the software to changes in the environment and in the requirements and functional specifications (the effort needed for modification). Maintainability in general refers to the ability to maintain the system over a period of time, including the ease of detecting, isolating and removing defects. Additionally, factors such as the ease of adding new functionality, interfacing with new components, the programmers' ability to understand existing code, and the test team's ability to test the system (because of facilities like test instructions and test points) enhance the maintainability of a system. ISO 9126 defines it as "a set of attributes that bear on the effort needed to make specified modifications (which may include corrections, improvements, or adoptions of software to environmental changes and changes in the requirements and functional specifications)" (International Organization for Standardization, 1991). Maintainability of OSS projects was one of the first factors to be investigated in the OSS literature, mainly because OSS development emphasizes the maintainability of the software released.
Making software source code available over the Internet allows developers from all over the world to contribute code, adding new functionality (parallel development), improving existing functionality, and submitting bug fixes to the current release (parallel debugging). A part of these contributions is incorporated into the next release, and the loop of release, code submission/bug fixing, and incorporation of the submitted code into the current and new releases continues. This circular manner of OSS development essentially implies a series of frequent maintenance efforts for debugging existing functionality and adding new functionality to the system; these two forms of maintenance are known as corrective and perfective maintenance respectively. Maintenance is a huge cost driver in software projects. OSS is downloaded and used by a global community of users, and there are no face-to-face interactions among the maintainers of the software; they have to rely upon the documentation within the source code and on communication through message boards. Therefore OSS is required to be highly maintainable. Lack of proper interface definitions, structural complexity and insufficient documentation in an existing version of OSS can discourage new contributions. Since participation is voluntary, low maintainability will generate minimal participation by active users and hence have a negative effect on quality. The sub-characteristics of maintainability are (Punter, Solingen & Trienekens, 1997):

Changeability
The capability of the software product to enable a specified modification to be implemented. It also characterizes the amount of effort needed to change a system.

Stability
The capability of the software product to avoid unexpected effects from modifications of the software (the risk of unexpected effects of modifications).

Testability
Characterizes the effort needed to verify (test) a system change.

Analyzability
Characterizes the ability to identify the root cause of a failure within the software.

Different users have different expectations of the same software, and users' expectations of software evolve with time. For instance, some users may view performance and reliability as the key features of a piece of software, while others may consider ease of installation and maintenance its key features. Therefore, software applications today must do more than just meet technical specifications; they must be flexible enough to meet the varying needs of a diverse user base and provide reasonable expectations of future enhancements. The last five characteristics are not related to the task performed by the software and are therefore regarded as non-functional attributes. In many cases, though, software requirements and testing methodologies are mostly focused on functionality and pay little if any attention to non-functional requirements. Since non-functional requirements affect the perceived quality of software (quality in use), failure to meet them often leads to late changes and increased costs in the development process. For example, reliability is a non-functional requirement that needs to be addressed in every software project; badly written software may be functional, but not reliable.
II. Quality Problems under the Open Source Model
Although many high-profile cases of successful OSSD projects exist (e.g., Apache, OpenOffice, PHP), the harsh reality is that the majority of OSS projects are of low quality. While open source practices have been a remarkable success, as can be seen in some successful OSS, we believe there are several areas with opportunities for improvement. A commonly cited reason for the failure of OSS projects to reach maturity lies in the coordination of developers and project management, leading to duplication of effort by multiple developers, inefficient allocation of time and resources, and lack of attention to software attributes such as ease of use, documentation, and support, all of which affect conformance to specifications. Only a few projects have explicit documentation describing ways of contributing to and joining the project. A further critical problem arising from the voluntary nature of open source is that reliance on project participants can never be guaranteed (Michlmayr & Hill, 2003). Owing to its distributed nature, issues such as identifying who gets what done, or deciding where more resources are needed to break a bottleneck, have to be examined (Michlmayr, 2004). The following issues usually lead to low-quality software under the OSDM:

Missing or Incomplete Documentation
Documentation is necessary for every project. Programmers and users have always criticized projects which lack documentation on development practices (Michlmayr, Hunt & Probert, 2005). A study of QA practices reveals that over 84% of respondents prepare a "TODO" list of pending features and open bugs, 62% write installation and building guidelines, 32% of projects have design documents, and 20% have documents planning releases, including dates and content (Zhao, 2003). Most open source projects have little or no documentation, although some projects with a large number of contributors have good documentation about coding styles and code commits (Michlmayr, Hunt & Probert, 2005). Lack of documentation reduces the motivation of new users and programmers, because they confront the difficulty of understanding the project, whether in order to use it or to improve it. New developers who would like to participate in a project first have to understand a part of it well enough (Ankolekar, Herbsleb & Sycara, 2003). Volunteers may wish to contribute in an area but may not know how or where to start without proper documentation. The lack of developer documentation also means there is no assurance that everyone follows the same techniques and procedures. At the very beginning of the Mozilla project, the community had trouble attracting new developers, which slowed the project down; after more well-formed documentation and tutorials were provided, the number of participants rose significantly (Mockus, Fielding & Herbsleb, 2002). Given the nature of open source, low attractiveness to users and developers may lead to a low-quality product or even abandonment of the project (Zhao, 2003). A survey exploring QA activities in open source concluded that OS projects regularly start without planning (Zhao & Elbaum, 2000). When there is no specific definition of the program, the program varies continually during the development process; worse, those changes are usually poorly recorded in documentation.
Undocumented planning and program changes make measurement and validation of the end product impossible.

Problems in Collaboration
Software development is an interactive activity, often with tight integration and interdependencies between modules, and therefore requires a substantial amount of coordination and communication between developers if they are to collaborate on features (Ankolekar, Herbsleb & Sycara, 2003). Strong user involvement and participation throughout a project is a central tenet of OSSD. In some projects there are problems with coordination and communication, which can have a negative impact on project quality. It is more difficult to achieve coordination and agreement on goals in OSS development than in closed source software development. Sometimes it is not clear who is responsible for a particular area, and therefore things cannot be communicated properly. There may also be duplication of effort and a lack of coordination related to the removal of critical bugs. Some features may, for example, be duplicated under open-source development because there is some chance that developers with the same needs will not meet, or will not agree on their objectives and methods when they meet, and will end up developing the same types of features independently (forking). In a traditional development team, developers can work together effectively as long as the team members understand each other; thanks to convenient communication, such teams tend to advance efficiently (Thayer & McGettrick, 1993). Since team members may cooperate on a module or on a single feature, being aware of the activities of cooperating members is important (Ankolekar, Herbsleb & Sycara, 2003). Individuals and small teams enjoy the advantages of convenient communication and simpler decision-making. In any case, the potential of collaborative and group maintenance for successfully resolving serious quality assurance issues is obvious, and its importance and prominence in successful projects, in one form or another, seems highly likely (Michlmayr & Hill, 2003).

Lack of Global View of System Constraints
Large-scale open-source projects often have a large number of contributors from the user community (i.e., the periphery). When these users encounter problems, they may examine the source code, propose or apply fixes locally, and then submit the results back to the core team for possible integration into the source base. Often these users in the periphery have much less knowledge of the entire architecture of an open-source software system than the core developers. As a result, they may lack a global view of broader system constraints that can be affected by any given change, so their suggested fixes may be inappropriate.

Dependence on Participants
No participant in OSS can be held responsible; the strong reliance on individual developers therefore becomes a quality assurance concern. There is an inherent conflict in a project expecting predictability and reliability from participants who bear no formal responsibility for it (Raymond, 1999). A large user group is usually the foundation of an open source project (Zhao, 2003). Without new volunteers a project can hardly proceed, because from the moment a project begins it also starts losing participants. No member is obligated to contribute until the end of the project (Raymond, 1999); developers are free to decide whether to stay with the project or simply leave.
For an open source project, a regular inflow of new developers keeps it proceeding steadily. A problem some projects face, especially those that are not very popular, is attracting volunteers. A study has confirmed that unlike big and mature projects, small projects may not receive much feedback from developers and co-users (Mockus, Fielding & Herbsleb, 2002). There are usually many ways of contributing to a project, such as coding, testing or triaging bugs. However, many projects only find prospective members who are interested in developing new source code. As a result, developers have to spend a large portion of their time on tasks other people could easily handle. Few contributors are interested in helping with testing, documentation and other activities. These are vital activities, particularly as projects mature and need to be maintained and updated by new cohorts of developers. Good documentation, tutorials, development tools, and a reward and recognition culture facilitate the creation of a sustainable community.

Unsupported Code
One of the unsolved problems is how to handle code that was contributed in the past but is now unmaintained. A contributor might submit source code to implement a specific feature or a port to an obscure hardware architecture. As changes are made by other developers, this particular feature or port has to be updated so that it will continue to work. Unfortunately, some of the original contributors may disappear, and the code is left unmaintained and unsupported. Lead developers face the difficult decision of how to handle this situation.

Release Problems
Release management is one of the most important controls for ensuring the quality of open source software, yet release management guidelines have remained remarkably informal since the beginning of open source development (Erenkrantz, 2003). Carefully defined criteria are needed to regulate release management. Often, release managers are appointed in the decentralized open source model to cope with rapidly growing project dimensions (Zhao, 2003). Under open source, it is recommended to release early and release often (Raymond, 1999). The argument behind this principle is that users will take on the responsibility of finding bugs, and it has been confirmed that a good part of the debugging effort is indeed shifted to users (Zhao, 2003). But when new versions are frequently released after poor testing by the core team, users shoulder most of the debugging work, and as such ad hoc testing activity increases, the quality of the program gets worse (Hendrickson, 2001). Though software quality investments can reduce overall software cycle costs by minimizing rework later on, many software manufacturers sacrifice quality in favour of other objectives such as shorter development cycles and meeting time constraints. As one manager put it, "I would rather have it wrong than have it late" (Paulk, Weber, Curtis & Chrissis, 1994). In contrast, the traditional conception of software quality is centred on a product-centric, conformance view of quality (Prahalad & Krishnan, 1999). The absence of static testing on the developer side lets through far more bugs than the user base can usually catch, and it often turns out to be impossible for developers to keep up with the resulting mass of bug reports. Releases may be performed frequently provided that every version claimed to be stable fulfils the settled release qualifications; otherwise, it must be labelled an unstable version.
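Such settled release qualifications can be made concrete as an automated gate. The following is a minimal Python sketch, assuming three hypothetical criteria (regression-suite pass rate, open release-blocking bugs, and up-to-date release notes); neither the thresholds nor the inputs come from this paper or from any particular project.

```python
# Hypothetical release-qualification gate: a candidate build is
# labelled "stable" only if every settled criterion is met.
def qualify_release(pass_rate: float, open_critical_bugs: int,
                    release_notes_ready: bool) -> str:
    """Return 'stable' or 'unstable' for a candidate release."""
    criteria = [
        pass_rate >= 0.95,        # regression-suite pass rate (assumed threshold)
        open_critical_bugs == 0,  # no known release-blocking bugs
        release_notes_ready,      # documentation for the release is in place
    ]
    return "stable" if all(criteria) else "unstable"

if __name__ == "__main__":
    print(qualify_release(0.97, 0, True))   # stable
    print(qualify_release(0.97, 2, True))   # unstable
```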
It can be hard, however, to ensure consistent quality of open-source software due to the short feedback loops between users and core developers, which typically result in frequent "beta" releases, e.g., several times a month. Although this schedule satisfies end-users who want quick patches for bugs they found in earlier betas, it can be frustrating to other end-users who want more stable, less frequent software releases. In addition to our own experiences, Gamma describes how the length of the release cycles in the Eclipse framework affected user participation and eventually the quality of the software (Gamma, 2005).

Version Authorization
The many different commercial versions of Linux already pose a substantial problem for software providers developing for the Linux platform, as they have to write and test applications for these various versions. The availability of source code often encourages an increase in the number of options for configuring and subsetting the software at compile time and runtime. Although this flexibility enhances the software's applicability for a broad range of use cases, it can also exacerbate QA costs due to a combinatorial increase in the QA space. Moreover, since open-source projects often run on a limited QA budget due to their minimal or non-existent licensing fees, it can be hard for core developers to validate and support large numbers of versions and variants simultaneously, particularly when regression tests and benchmarks are written and run manually. Smith reports an exchange with an IT manager in a large Silicon Valley firm who lamented, "Right now, developing Linux software is a nightmare, because of testing and QA – how can you test for 30 different versions of Linux?" (Feller, et al, 2005).

Testing and Bug Reporting
A study of 200 OSS projects discovered that fewer than 20 percent of OSS developers use test plans; only 40 percent of projects use testing tools, although this increases when testing tool support is widely available for a language, such as Java; and less than 50 percent of OSS systems use code coverage concepts or tools. Larger projects do not spend more time on testing than smaller projects. OSS development clearly does not follow structured testing methods. The methodology an OSS project adopts will depend largely on the available expertise, resources, and sponsorship. Formal testing techniques and test automation are expensive and require sponsorship. Some high-profile open source projects can achieve this, but most do not, so the user base is often the only choice (Aberdour, 2007). As more users with few technical skills use free software, developers see an increase in useless or incomplete bug reports. In many cases, users do not include enough information in a bug report, or they file duplicate bug reports. Such reports take unnecessary time away from actual development work. Some projects have tried to write better documentation about reporting bugs, but they found that users often do not read the instructions before reporting a bug. Many popular open-source projects (such as GNU GCC, CPAN, Mozilla, the Visualization Toolkit, and ACE+TAO) distribute regression test suites that end users can run to evaluate the success of an installation on the user's platform. Users can – but frequently do not – return the test results to project developers.
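One low-cost remedy is for the distributed test suite to bundle the tested configuration with the results, so that any report a user does return is self-describing. The Python smoke test and JSON report format below are hypothetical illustrations of that idea, not the mechanism used by any of the projects named above.

```python
# Sketch of a user-runnable installation test whose report records the
# configuration it ran on. The test and report format are illustrative.
import json
import platform
import unittest

class InstallationSmokeTest(unittest.TestCase):
    def test_standard_library_available(self):
        # Stand-in check; a real suite would exercise the installed package.
        self.assertTrue(hasattr(json, "loads"))

def run_and_report() -> str:
    suite = unittest.TestLoader().loadTestsFromTestCase(InstallationSmokeTest)
    outcome = unittest.TextTestRunner(verbosity=0).run(suite)
    report = {
        "os": platform.platform(),
        "python": platform.python_version(),
        "tests_run": outcome.testsRun,
        "failures": len(outcome.failures),
        "errors": len(outcome.errors),
    }
    # A real project would let the user email or upload this report.
    return json.dumps(report, indent=2)

if __name__ == "__main__":
    print(run_and_report())
```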
Even when results are returned to core developers, however, the testing process is often undocumented and unsystematic; e.g., core developers have no record of what configurations were tested, how they were tested, or what the results were, which loses crucial QA-related information. Moreover, many QA configurations are executed redundantly by thousands of users (e.g., on popular versions of Linux or Windows), whereas others are never executed at all (e.g., on less widely used operating systems).

Configuration Management
Many free software and open source projects offer a high level of customization. While this gives users much flexibility, it also creates testing problems. It is very difficult or impossible for the lead developer to test all combinations, so only the most popular configurations tend to be tested. It is quite common that, when a new release is made, users report that the new version broke their configuration. Well-written open-source software (e.g., based on GNU autoconf) can be ported easily to a variety of OS and compiler platforms. In addition, since the source is available, end-users can modify and adapt their source base readily to fix bugs quickly or to respond to new market opportunities with greater agility. Support for platform independence, however, can yield the daunting task of keeping an open-source software base operational despite continuous changes to the underlying platforms. In particular, since developers in the core may only have access to a limited number of OS/compiler configurations, they may release code that has not been tested thoroughly on all platform configurations on which users want to run the software. The sketch below illustrates how quickly this configuration space grows.
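The following small Python sketch makes the combinatorial growth concrete; the option names and the "popular configurations" list are invented for illustration and do not describe any particular project.

```python
# Illustration of the combinatorial configuration space: even four
# small option sets yield 72 combinations. Option values are invented.
from itertools import product

options = {
    "os": ["linux", "windows", "macos", "freebsd"],
    "compiler": ["gcc", "clang", "msvc"],
    "database": ["postgres", "mysql", "sqlite"],
    "ssl": ["on", "off"],
}

full_space = list(product(*options.values()))
print(f"Full configuration space: {len(full_space)} combinations")  # 4*3*3*2 = 72

# In practice only the most popular configurations get tested:
tested = [
    ("linux", "gcc", "postgres", "on"),
    ("windows", "msvc", "sqlite", "on"),
    ("macos", "clang", "sqlite", "off"),
]
print(f"Tested: {len(tested)} of {len(full_space)} "
      f"({len(tested) / len(full_space):.1%} of the space)")
```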
Although in some cases OSS seems to do better than closed source software, there are many things that need to be improved and further expanded so that we avoid the typical problems arising from practices usually employed in OSS. To achieve maturity and produce high-quality open source software, one should also employ, in a beneficial manner, the proven practices and methods usually employed in closed source software development. Aberdour (2007) compares quality management practices in open source and closed source software development, as shown in Table 1. We should strive to employ these proven practices in all types of projects, whether small or large, to achieve high-quality, mature Open Source Software.

Table 1: Quality Management in Open Source and Closed Source

Closed Source | Open Source
Well-defined development methodology | Development methodology often not defined or documented
Extensive project documentation | Little project documentation
Formal, structured testing and quality assurance methodology | Unstructured and informal testing and quality assurance methodology
Analysts define requirements | Programmers define requirements
Formal risk assessment process, monitored and managed throughout the project | No formal risk assessment process
Measurable goals used throughout the project | Few measurable goals
Defect discovery from black-box testing as early as possible | Defect discovery from black-box testing late in the process
Empirical evidence regarding quality routinely used to aid decision making | Empirical evidence regarding quality isn't collected
Team members are assigned work | Team members choose work
Formal design phase carried out and signed off before programming starts | Projects often go straight to programming
Much effort put into project planning and scheduling | Little project planning or scheduling

Conclusion and Future Work
OSS quality is an open issue, and the community should continue striving for even better quality levels if OSS is to outperform traditional, closed source development and target corporate and safety-critical systems. The quality of selected software and the standards for evaluating the quality of OSS are often wrongly defined. Therefore, this paper has also presented the quality characteristics that should be taken into consideration when selecting or evaluating OSS. The paper also presents insights into quality practices of open source software projects that affect the quality of OSS negatively. Avoiding such practices and using proven quality management practices can result in high-quality OSS. Further research is suggested to identify other quality problems not covered in this paper and to evaluate the impact of different practices on project quality.

References
Aberdour, M. (2007). Achieving quality in open source software. IEEE Software, 24 (1), 58-64. doi: 10.1109/MS.2007.2
Ankolekar, A., Herbsleb, J. D., & Sycara, K. (2003). Addressing challenges to open source collaboration with the semantic web. Retrieved from http://www.cs.cmu.edu/~anupriya/papers/icse2003.pdf
Boehm, B. W. (1984). Software engineering economics. IEEE Transactions on Software Engineering, 10 (1), 4-21. doi: 10.1109/TSE.1984.5010193
Erenkrantz, J. R. (2003). Release management within open source projects. Retrieved from http://www.erenkrantz.com/Geeks/Research/Publications/ReleaseManagement.pdf
Feller, J., et al. (Eds.) (2005). Perspectives on free and open source software. Cambridge, Mass: MIT Press.
Fenton, N. E. (1993). Software metrics: A rigorous approach. London: Chapman and Hall.
Gamma, E. (2005). Agile, open source, distributed, and on-time: Inside the Eclipse development process. Retrieved from http://www.inf.fu-berlin.de/inst/ag-se/teaching/SBSE/034_Eclipse-process.pdf
Godfrey, M. W., & Whitehead, J. (2009). Proceedings of the 2009 6th IEEE International Working Conference on Mining Software Repositories, Vancouver, Canada, May 16-17.
Halloran, T. J., & Scherlis, W. L. (2002). High quality and open source software practices. Retrieved from http://flosshub.org/system/files/HalloranScherlis.pdf
Hendrickson, E. (2001).
Better testing – worse quality? In International Conference on Software Management & Applications of Software Measurement, February 12-16, 2001, San Diego, CA, USA.
Howison, J., & Crowston, K. (2004). The perils and pitfalls of mining SourceForge. Retrieved from http://msr.uwaterloo.ca/papers/Howison.pdf
International Organization for Standardization. (1991). Information technology - Software product evaluation: Quality characteristics and guidelines for their use. Berlin: Beuth-Verlag: ISO/IEC.
Jones, C. L. (1985). A process-integrated approach to defect prevention. IBM Systems Journal, 24 (2), 150-167. doi: 10.1147/sj.242.0150
Koch, S., & Schneider, G. (2002). Effort, cooperation and coordination in an open source software project: GNOME. Information Systems Journal, 12 (1), 27-42. doi: 10.1046/j.1365-2575.2002.00110.x
Li, P. L., Herbsleb, J., & Shaw, M. (2005). Forecasting field defect rates using a combined time-based and metrics-based approach: A case study of OpenBSD. In 16th IEEE International Symposium on Software Reliability Engineering (ISSRE) (pp. 193-202). Washington, DC, USA: IEEE Computer Society. doi: 10.1109/ISSRE.2005.19
Michlmayr, M. (2004). Managing volunteer activity in free software projects. In Proceedings of the 2004 USENIX Annual Technical Conference (pp. 39-33), FREENIX Track, Boston, MA: USENIX Association. Retrieved from http://dl.acm.org/citation.cfm?id=1247415.1247454
Michlmayr, M., & Hill, B. M. (2003). Quality and the reliance on individuals in free software projects. In Proceedings of the 3rd Workshop on Open Source Software Engineering (pp. 105-109). Portland, OR, USA: ICSE.
Michlmayr, M., Hunt, F., & Probert, D. (2005). Quality practices and problems in free software projects. In Proceedings of the First International Conference on Open Source Systems, Geneva, 11-15 July (pp. 24-28).
Mockus, A., Fielding, R. T., & Herbsleb, J. D. (2002). Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11 (3), 309-346. doi: 10.1145/567793.567795
Paulk, M. C., Weber, C. V., Curtis, B., & Chrissis, M. B. (1994). The Capability Maturity Model: Guidelines for improving the software process. Reading, Mass: Addison-Wesley.
Prahalad, C. K., & Krishnan, M. S. (1999, September). The new meaning of quality in the information age. Harvard Business Review, 77 (5), 109-118. Retrieved from http://hbr.org/1999/09/the-new-meaning-of-quality-in-the-information-age/ar/1
Punter, T., Solingen, R. V., & Trienekens, J. (1997). Software product evaluation. In 4th Conference on Evaluation of Information Technology (30-31 Oct. 1997), Eindhoven, Netherlands.
Raja, U., & Barry, E. (2005). Investigating quality in large-scale open source software. USA: Texas A&M University.
Raymond, E. S. (1999). The cathedral and the bazaar. Sebastopol, CA: O'Reilly & Associates.
Samoladas, I., & Stamelos, I. (n.d.). Assessing free/open source software quality. Retrieved from http://ifipwg213.org/system/files/samoladasstamelos.pdf
Schmidt, D. C., & Porter, A. (2001). Leveraging open-source communities to improve the quality & performance of open-source software. In Proceedings of the 1st Workshop on Open Source Software Engineering. Toronto, Canada: ICSE.
Senyard, A., & Michlmayr, M. (2004). How to have a successful free software project. In Proceedings of the 11th Asia-Pacific Software Engineering Conference (pp. 84-91). Busan, Korea: IEEE Computer Society.
Software Engineering - Product Quality - Part 1: Quality Model. (2001, June). ISO/IEC 9126-1.
Software Engineering - Product Quality - Part 2: External Metrics. (2001, June). ISO/IEC 9126-2.
Thayer, R. H., & McGettrick, A. D. (Eds.). (1993). Software engineering: A European perspective. Los Alamitos, CA: IEEE Computer Society Press.
Venkatesh, C., et al. (2011). Quality prediction of open source software for e-governance project. Retrieved from www.csi-sigegove.org/emerging_pdf/16_142-151.pdf
Villa, L. (2003). Large free software projects and Bugzilla. In Proceedings of the Linux Symposium (July 23-26, 2003), Ottawa, Canada, pp. 447-456.
Zhao, L. (2003). Quality assurance under the open source development model. Journal of Systems and Software, 66 (1), 65-75. doi: 10.1016/S0164-1212(02)00064-X
Zhao, L., & Elbaum, S. (2000). A survey on quality related activities in open source. SIGSOFT Software Engineering Notes, 25 (3), 54-57. doi: 10.1145/505863.505878

Open Source Tools for Varied Professions
Nadim Akhtar Khan *

Abstract
Purpose: The popularity of open source software in the contemporary world, with the emergence of a globally distributed base of developers, contributors and users, has given a new identity to the software development industry, with growing use of freely available software tools, along with their source code, by non-profit organisations, universities and commercial establishments to suit their varied requirements. The success of such software tools is evident from the growing number of downloads and users from diverse professions. The present paper attempts to explore some of the most prominent open source tools used for highly specialised professional tasks in the fields of Business Management and Health.
Design/Methodology/Approach: The six most prominent open source tools available in each of the two categories have been identified and discussed, based on popularity (number of downloads) and on the prominent features that attract users to such software.
Implication: These tools not only provide means for managing resources in a more sophisticated manner but also provide ample opportunity for non-profit organisations and commercial establishments to attain their goals by taking advantage of the prominent features and utilities of such software.
Research Limitations: The paper only highlights prominent open source software tools available in two fields, based on their specialised utilities best suited to professional requirements and operations. The scope can be further extended to reveal user satisfaction by analysing experiences of working with such software in different setups.
Keywords: Open Source Software, Business Management, Health.
Paper Type: Article

Introduction
The open source software movement has gained momentum over time and has revolutionized software development approaches throughout the world, especially with its distributed developer base and frequent updates.

* Assistant Professor, Department of Library and Information Science, University of Kashmir, Hazratbal, Srinagar, 190006, Jammu and Kashmir, India.
email: [email protected]

The availability of source code to tailor and customize the software to suit the needs and requirements of users in different setups has given new dimensions to software development approaches and, as such, has captured the attention of software developers and of information and computer professionals throughout the world. A large number of open source applications are already in the market, deriving support and adoption from world bodies like UNESCO and WHO, besides many open source organizations and forums coming together to conduct research for enhancing the features and functionalities of Open Source Software systems. Multinational corporations, non-profit research institutes, university libraries, and individual organizations are all using open source software to gather, organize and provide access to information. Open Source software has brought powerful information management tools within reach of organizations that could never have afforded to purchase comparable commercial products (Dunlap, 2006). Open Source advocates argue that OSS is primarily a development methodology grounded in the philosophy of making source code open and free to all who want it. Users and developers co-exist in a community where software grows and expands based on personal needs. These enhancements make the project more globally desirable as it fits more and more requirements. Linus Torvalds, the epitome of the open source developer, says:
- Release early and often
- Delegate everything you can
- Be open
(Raymond, 2001, as cited in Grodzinsky, Miller & Wolf, 2003)

Open source software (OSS) products have rapidly acquired notable importance among consumers and firms all over the world. They are mostly developed and distributed through online social networks. However, their innovation and development must contend with free-riders, who can benefit from the knowledge developed in the online social network; identifying the factors that moderate such opportunistic behaviour in OSS development and distribution is therefore important for facilitating OSS innovation (Casaló, Flavián & Guinalíu, 2008). Open source software has the seemingly useful feature that, at any point, anyone with appropriate technical skills can modify the code and take the project in a direction that diverges from the direction others are taking it (called 'code forking'). Grodzinsky, Miller and Wolf (2003) stress that open source project leaders and developers must show a great willingness to take in new ideas, evaluate them thoughtfully, and respond constructively in order to nurture both the idea and the developer of the idea. User participation is indeed both direct and indirect in the OSS development context. Some users actively take part in the development work by commenting on existing solutions, which has been identified as a typical form of user participation in OSS development; others have acquired a consultative role in the development work (Iivari, 2009). The European Commission's (2001) (as cited in Spinello, 2003) open source study declared that this software "permits a greater rate of innovation, with greater efficiency."

Objective
The study is undertaken to identify and describe the most popular open source tools in Business Management and Health that best suit the professional demands of the two fields.
Scope
Open Source Software has found success in almost all disciplines and specializations. However, the present study is confined to Open Source Software tools in the fields of Business Management and Health.

Open Source Tools in Business Management

Magnolia CMS (http://www.magnolia-cms.com/)
Magnolia CMS is an open-source Web Content Management System that focuses on providing an intuitive user experience in an enterprise-scale system. Its combination of ease of use and a flexible, standards-based Java architecture has attracted enterprise customers throughout the globe, and it is widely used by both government and private enterprises in more than 100 countries. Magnolia CMS is distributed as two web applications, one acting as the authoring instance and the other as the public environment. This allows for better security, with one application inside your firewall and one outside; it also enables clustering configurations. The author instance is where all authors work. It typically resides in a secure location, such as behind a corporate firewall, inaccessible from the Internet. The author instance publishes content to public instances. A public instance receives the published content and exposes it to visitors on the Web. It resides in a public, reachable location. You can have more than one public instance, serving the same or different content. Public instances that receive the activated content are known as subscribers, and any number of subscribers can subscribe to a single author instance. Subscribers are key to building high-availability, load-balanced environments. Magnolia CMS stores all content (web pages, images, documents, configuration, data) in a content repository.

SugarCRM (http://www.sugarcrm.com/crm/)
SugarCRM is open source Customer Relationship Management software for companies of all sizes. It can easily be customized and integrated with other software to allow companies to build and maintain flexible systems. Its core functionality includes sales force automation, marketing campaigns, support cases, project management, leads, opportunities and accounts. Ideal for small and medium-sized companies, large enterprises and government organizations, Sugar can run in the cloud or on-site. It comes in different editions: Sugar Ultimate, Sugar Enterprise, Sugar Corporate, Sugar Professional and Sugar Community Edition. With over five million downloads and more than 500,000 users, SugarCRM has been recognized for its success and innovation by CRM Magazine, InfoWorld, Customer Interaction Solutions and Intelligent Enterprise. SugarCRM comes with complete marketing and sales force automation features. It helps to share data across individuals and teams while monitoring business performance, and it provides a central hub to manage and share all customer service issues so that customer cases are handled efficiently and effectively. The open-source CRM platform lets users quickly and easily customize the system to streamline business processes to match specific requirements.

Tomato CMS (http://www.tomatocms.com/)
TomatoCMS is an impressive open source Content Management System powered by the Zend Framework, jQuery and the 960 Grid System. It allows themes to be customized rapidly and easily without any knowledge of HTML.
Its flexible module system helps in choosing the best and most needed components. A drag-and-drop widget system helps construct a website in minutes; like the module system, TomatoCMS provides a suite of widgets that can be flexibly customized. It is designed to operate under many server structures: shared hosting, dedicated servers and, above all, cluster servers. It provides many solutions to optimise a website: opcode caching (eAccelerator, XCache, APC), database caching (memcached, file cache), database balancing (replication, sharding), web balancing (LVS, BIG-IP), and dedicated resource and static servers.

CiviCRM: A Free and Open Source eCRM Solution (http://civicrm.org/)
CiviCRM is a free, libre and open source constituent relationship management solution. It is web-based, internationalized, and designed specifically to meet the needs of advocacy, non-profit and non-governmental groups. It allows an organization to record and manage information and to execute transactions, conversations, events or any type of correspondence with each constituent, storing it all in one easily accessible and manageable source. It is designed for the civic sector and integrates directly into the popular open source content management systems Drupal and Joomla. Registration and visitor interactions are logged directly into the system, including end-user maintenance of their own addresses and custom fields. It can store data in many localized formats and has been translated into a number of languages, including French, Spanish, German, Dutch, and Portuguese. It is affordable and cost-effective.

Opentaps (http://opentaps.org/)
Opentaps Open Source ERP + CRM is a fully integrated application suite that helps manage a business more effectively. It spans e-commerce, Customer Relationship Management, Warehouse and Inventory Management, Supply Chain Management and Financial Management through to Business Intelligence and mobility integration. It supports physical products, digital and downloadable products, variant products and configurable products. It provides integration with major payment gateways and a browser-based email server, and it supports customer service and case management. It manages marketing campaigns, including outbound emails and call management, with tracking-code reporting and management facilities. It integrates with the Asterisk open source Voice over IP (VoIP) system and, via a module, with GetResponse email marketing. It also supports Value Added Tax (VAT) through a VAT module.

Joomla (http://www.joomla.org/)
Joomla is an award-winning content management system (CMS) which enables one to build web sites and powerful online applications. Many aspects, including its ease of use and extensibility, have made it the most popular web site software available. Best of all, it is an open source solution that is freely available to everyone. It is the most popular open source CMS currently available, as evidenced by a vibrant and growing community of friendly users and talented developers. Its roots go back to 2000, and it now has over 200,000 community users and contributors. Its powerful application framework makes it easy for developers to create sophisticated add-ons that extend its power in virtually unlimited directions.
The core Joomla framework enables developers to quickly and easily build inventory control systems, data reporting tools, application bridges, custom product catalogues, integrated e-commerce systems, complex business directories, reservation systems and communication tools.

Open Source Tools in Health

OpenEMR (http://www.oemr.org/)
OpenEMR is a certified electronic health records and medical practice management application with fully integrated electronic health records, practice management, scheduling, electronic billing and interoperability. OpenEMR is licensed under the GNU General Public License (GPL). It is a free open source replacement for medical applications such as Medical Manager, Health Pro, and Misys, and it supports EDI billing using ANSI X12. Its main features include multi-language support; free upgrades and online support; electronic billing (including Medicare); document management; integrated practice management; ePrescribing; insurance tracking (3 insurances); easy customization; easy installation; voice recognition readiness (MS Windows operating systems); web-based operation (secure access with SSL certificates); integration with the external general accounting program SQL-Ledger; a built-in scheduler; multi-facility capability; and prescriptions by printed script, fax or email.

Hospital OS Software (http://www.hospital-os.com/en/)
Hospital OS is a Hospital Information System for managing hospital operations. It is client-server software in which the server works as a central unit that stores all of the information, and the clients are the units that feed information into the server. The Hospital OS server uses the Linux operating system and PostgreSQL as the database; both Linux and PostgreSQL are open source programs available for download on the Internet. The client software is developed in Java and can be used with Windows XP, Windows 7, MacOS, Ubuntu and other operating systems that have the Java Virtual Machine installed. Hospital OS is designed to support registration, medical records, a patient screening counter, X-ray, laboratory, pharmacy, medical statistics, an IPD cashier, a one-stop service point and system administration.

OpenMRS (http://openmrs.org/)
Open Medical Record System (OpenMRS) was created in 2004 as an open source medical record system platform for developing countries. It is a software platform and a reference application which enables the design of a customized medical records system with no programming knowledge (although medical and systems-analysis knowledge is required). It is a common platform upon which medical informatics efforts can be built. The system is based on a conceptual database structure which does not depend on the actual types of medical information to be collected or on particular data collection forms, and so can be customized for different uses. Its main features include a central concept dictionary, security, privilege-based access, a patient repository, multiple identifiers per patient, data entry, data export, standards support, a modular architecture, patient workflows, cohort management, relationships, patient merging, localization/internationalization, support for complex data, reporting tools, and person attributes.

Connect (http://www.connectopensource.org/)
CONNECT is an open source software solution that supports health information exchange both locally and at the national level.
CONNECT uses Nationwide Health Information Network standards and governance to make sure that health information exchanges are compatible with other exchanges being set up throughout the country. This software solution was initially developed by federal agencies to support their health-related missions, but it is now available to all organizations and can be used to help set up health information exchanges and share data using nationally recognized interoperability standards.

PHYAURA EHR (https://www.phyaura.com/)
PHYAURA EHR community edition is free and open source software which allows healthcare practitioners in the United States to document clinical notes, schedule office visits, and bill for medical services, all without any vendor lock-in. The PHYAURA community and open source software were built to create a collaborative platform for healthcare practitioners, developers, vendors and staff members, aimed at improving healthcare technology experiences and ultimately patient care. The PHYAURA community is a quick and easy way to read answers to commonly asked questions and to post questions that have not already been addressed. Its core practice management software and electronic medical records software is written in open source code conforming to the GNU General Public License. This is one of the most efficient ways to collaborate on and provide a community-based EMR.

OsiriX radiologist workstation (http://www.osirix-viewer.com/)
Another example of open-source software success is the OsiriX radiologist workstation. This full-featured radiology viewing and interpretation system integrates 3D and web-access features that are rarely included in commercial workstations costing tens of thousands of dollars each. OsiriX has been specifically designed for navigation and visualization of multimodality and multidimensional images: 2D Viewer, 3D Viewer, 4D Viewer (3D series with a temporal dimension, for example Cardiac-CT) and 5D Viewer (3D series with temporal and functional dimensions, for example Cardiac-PET-CT). The 3D Viewer offers all modern rendering modes: Multiplanar Reconstruction (MPR), Surface Rendering, Volume Rendering and Maximum Intensity Projection (MIP). All these modes support 4D data and can produce image fusion between two different series (with PET-CT and SPECT-CT display support). The OsiriX open-source approach encourages doctors to write their own extensions for image analysis and workflow automation. Because radiology workstations are regulated as medical devices by the FDA, a number of commercial vendors now offer FDA-registered versions of the free open-source OsiriX for a fraction of what proprietary workstations cost.

Conclusion
Open Source has been growing in popularity owing to its lower cost of development and its ease of downloading and installation with no licensing issues. Google, Facebook, Sun Microsystems, and Red Hat are just a few very successful companies using the collaboration of open source software in their products and services (Open Source Technology, n.d.). Open-source software offers incredible benefits in all fields of human progress, including ethical advantages, access, innovation, cost, interoperability, integration, standardization, support and safety.
Business Management and Health are no exception to this scenario. Huge amounts are spent on implementing electronic health record systems, owing to EMR licence prices and the maintenance of commercial content management systems, and these costs tend to recur, making the financial advantage of Open Source Software obvious. Secondly, OSS, being generally supported by worldwide users, enables companies to reach a broader user base. With more reputed organizations like WHO, UNESCO and other companies adhering to OSS for carrying out their vital business and professional operations, the success of OSS is becoming ever more evident.

References
Casaló, L. V., Flavián, C., & Guinalíu, M. (2008). Towards loyalty development in the e-banking business. Journal of Systems and Information Technology, 10 (2), 120-134. doi: 10.1108/13287260810897756
Dunlap, I. H. (2006). Open source database driven web development: A guide for information professionals (pp. 11-24). Oxford: Chandos Pub.
Grodzinsky, F. S., Miller, K., & Wolf, M. J. (2003). Ethical issues in open source software. Journal of Information, Communication and Ethics in Society, 1 (4), 193-205. doi: 10.1108/14779960380000235
Iivari, N. (2009). Constructing the users in open source software development: An interpretive case study of user participation. Information Technology & People, 22 (2), 132-156. doi: 10.1108/09593840910962203
Open Source Technology. (n.d.). PHYAURA. Retrieved from https://www.phyaura.com/resources-2/open_source/
Spinello, R. A. (2003). The future of open source software: Let the market decide. Journal of Information, Communication and Ethics in Society, 1 (4), 217-233. doi: 10.1108/14779960380000237

Analysis of Operating Systems and Browsers: A Usage Metrics
Mohammad Ishaq Lone
Dr. Zahid Ashraf Wani

Abstract
Purpose: The purpose of this paper is to examine the growth of FOSS and proprietary operating systems and browsing software used in computers and various types of mobile phone devices around the world.
Design/Methodology/Approach: The data is gathered from StatCounter (http://gs.statcounter.com), one of the biggest web analytics services. The collected data is analysed keeping the objectives of the study in view.
Findings: The study offers a thorough insight into the yearly and cumulative growth of the software industry. As far as the OS market is concerned, Mac OSX and Linux have increased their share; Linux has increased from 0.69% in 2009 to 0.78% in 2010. Year-wise growth of mobile operating systems shows iOS losing market share, dipping to 25.48% in 2010 from 34.01% in 2009, whereas BlackBerry and Android have increased their share by 8.34% and 6.41% respectively. The browser Internet Explorer (IE) shows a declining trend, with a 52.77% share in May 2010 against 44.52% in April 2011, whereas Firefox maintained a steady trend over the same period, with a 31.64% share in May 2010 and a slight depreciation (29.67%) in May 2011. However, in the mobile browser arena all browsers show a declining trend in 2010 compared to 2009 except Android, BlackBerry, Samsung and NetFront. BlackBerry has increased by 8.15%, and Android, an open source mobile browser, has increased its market share by 6.63%, which augurs well for the FOSS movement.
Originality/Value: The paper explores the market share of FOSS in OSs and browsers.
It details FOSS growth and increasing market share and can help stakeholders decide on a future course of action in this arena.
Keywords: FOSS, Proprietary Software, Operating Systems, Web Browsers - Mobile, Web Browsers - Computer
Paper Type: Research paper

Mohammad Ishaq Lone: PhD Scholar, Department of Library & Information Science, University of Kashmir, Jammu & Kashmir. email: [email protected]
Dr. Zahid Ashraf Wani: Assistant Professor, Department of Library & Information Science, University of Kashmir, Jammu & Kashmir. email: [email protected]; [email protected]

Introduction
Open Source software can be analysed as a process innovation: a new and revolutionary process of producing software based on unconstrained access to source code, as opposed to the traditional closed and property-based approach of the commercial world. The production of Open Source software is a form of intellectual gratification with an intrinsic utility similar to that of a scientific discovery, involving elements other than financial remuneration (Perkins, 1999). Emerging as it does from the university and research environment, the movement adopts the motivations of scientific research, transferring them into the production of technologies that have a potential commercial value. The sharing of results enables researchers both to improve their results through feedback from other members of the scientific community and to gain recognition, and hence prestige, for their work. The same thing happens when source code is shared: other members of the group provide feedback that helps to perfect it, while the fact that the results are clearly visible to everyone confers a degree of prestige which expands in proportion to the size of the community. In the new paradigm of development, programmers frequently rediscover the pleasure of creativity, which is being progressively lost in the commercial world, where the nightmare of delivery deadlines is transforming production into an assembly line. Proprietary software is primarily perceived as not being very reliable. Produced by a restricted group of programmers in obedience to market laws, it stands in diametric opposition to the law expressed by Raymond (1999): "Given enough eyeballs, all bugs are shallow". So it can safely be concluded that intellectual gratification, aesthetic sense, and an informal work style are all recurrent features of the set of different motivations underlying the invention of Open Source. Over the years, the use of free and open source software has increased considerably in every sphere of human activity, such as education, industry, business, medicine and agriculture. It has given strong competition to proprietary software and is also encouraged at government level in different developing and emerging economies of the world due to its umpteen benefits. FOSS advocacy groups around the globe promote its use and motivate programmers to develop new applications for the human good. SourceForge.net is one such platform, which has united programmers from different countries to develop and improve FOSS in a range of areas. Even after umpteen efforts by volunteers, the proprietary software industry occupies the lion's share of the marketplace and is expected to remain a dominant player; nevertheless, the endeavours of FOSS advocacy groups could be very important for weaker economies and underdeveloped societies and should therefore be encouraged.
Problem
The study is an endeavour to understand and appraise the use of different open source and proprietary browsers and operating systems used in computer and mobile phone devices around the globe.

Scope
The scope of the study is confined to assessing the growth and use of proprietary and FOSS operating systems and browsers in computer and mobile phone devices around the globe. The study covered the years 2009 and 2010 to gauge cumulative growth, and April 2010 to May 2011 for analysing the latest trend.

Objectives
- To understand the use and growth of proprietary and FOSS computer operating systems and browsers.
- To assess the use and growth of proprietary and FOSS mobile operating systems and browsers.
- To measure the cumulative growth of these software systems during 2009 and 2010.

Methodology
The data were gathered, in the form of .csv files, from StatCounter (http://gs.statcounter.com), one of the biggest web analytics services. The collected data were then analysed and summarised keeping the objectives of the study in view; an illustrative sketch of this analysis step is given below.

Limitations of the study
The StatCounter tracking code is installed on more than 3 million sites globally, and every month more than 15 billion hits to these sites are recorded; even so, the data collected do not claim to be the sole representation of the whole Internet user community.
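As a minimal sketch of the kind of analysis described in the Methodology, the Python snippet below reads a StatCounter-style .csv export and computes yearly average shares. The file name and column layout (a "Date" column plus one share column per product) are assumptions about the export format made for illustration, not documented fields.

```python
# Sketch: yearly average market share from a StatCounter-style CSV.
# Assumes a "Date" column (e.g. "2010-05") and one share column per
# product; these layout details are assumptions for illustration.
import csv
from collections import defaultdict

def yearly_average_share(csv_path: str) -> dict:
    sums = defaultdict(lambda: defaultdict(float))
    month_counts = defaultdict(int)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            year = row["Date"][:4]
            month_counts[year] += 1
            for column, value in row.items():
                if column != "Date":
                    sums[year][column] += float(value)
    return {year: {name: total / month_counts[year]
                   for name, total in shares.items()}
            for year, shares in sums.items()}

if __name__ == "__main__":
    for year, shares in sorted(yearly_average_share("os_share.csv").items()):
        print(year, {name: round(share, 2) for name, share in shares.items()})
```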
Related work
Lehman et al. have built the largest and best-known body of research on the evolution of large, long-lived software systems (Lehman & Belady, 1985; Lehman et al., 1997; Lehman, Perry & Ramil, 1998; Turski, 1996). Lehman's laws of software evolution, which are based on his case studies of several large software systems, suggest that open source systems are growing in size. Turski's (1996) statistical analysis of these case studies suggests that system growth (measured in terms of the number of source modules and the number of modules changed) is usually sub-linear, slowing down as the system gets larger and more complex. Kemerer and Slaughter (1999) have presented an excellent survey of research on software development; they also note that there has been relatively little empirical research on software evolution. Parnas (1994) has used the metaphor of decay to describe how and why software becomes increasingly brittle over time. Eick et al. (2001) extend the ideas suggested by Parnas by characterizing software "decay" in ways that can be detected and measured, using a large telephone switching system as a case study. They suggest, for example, that if defect fixes commonly require changes to large numbers of source files, then the software system is probably poorly designed. Their metrics are predicated on the availability of detailed defect-tracking logs that allow, for example, a user to determine how many defects have resulted in modifications to a particular module. We note that no such detailed change logs were available for our study of Linux. Perry (1994) presented evidence that the evolution of a software system depends not only on its size and age but also on factors such as the nature of the system itself (i.e., its application domain), previous experience with the system, and the processes, technologies, and organizational frameworks employed.

Pfaffman (2008) observes that though many educators are unaware or dismissive of Free/Open Source Software, the number of FOSS tools continues to grow. He notes that Netscape released the source code of its Netscape Communicator package; Netscape's decision resulted in Mozilla, a full-featured suite of software, and subsequently the Firefox web browser. These Open Source programs continue to benefit Netscape's commercial products. Similarly, Google's servers run the FOSS Linux operating system; when Google's programmers find problems and their solutions, those solutions are given back to the community so that all may benefit from them. Pearson (2000) concludes that the Linux operating system has now reached the stage where it is being adopted commercially by the big computer manufacturers, as a competitor to the proprietary Microsoft Windows operating system in the server market. Other important open source software includes Mozilla; Apache, which runs a majority of Internet servers; the Sendmail Internet e-mail software; and Perl, the standard Internet scripting language. One variant of UNIX, the Berkeley BSD Unix, has been open source for many years. The market share of Windows NT increased from 25.6% in 1996 to 41.9% in 2003, while the market share of Linux also increased, from a mere 6.5% in 1996 to 38.0% in 2003. Indeed, the record of open source software speaks for itself in the busy world of information technology: the Apache web server has over 60% market share, while its nearest rival, Microsoft's IIS server, has only a 25% share (Bitzer, 2004). With a reputation for speed, reliability, and efficiency, GNU/Linux now has more than 12 million users worldwide and an estimated growth rate of 40% per year (www.linux.org). With more than one-half of Fortune 500 companies now using GNU/Linux instead of Microsoft's proprietary software, the market threat of F/OSS to Microsoft is evident. With the recent surge in the use of GNU/Linux by individuals and companies, is it possible that users of F/OSS could eventually surpass those using Microsoft's proprietary software? (Elliott & Scacchi, 2008)

Analysis and discussion

Operating Systems Growth - Global Scenario

Operating Systems - Monthly Use
With the launch of the new version of the Windows OS, i.e. Windows 7, the share of Windows XP declined from 58.02% in May 2010 to 46.57% in April 2011, while that of Windows 7 increased from 14.84% in May 2010 to 31.91% in April 2011. The share of Windows Vista declined, while that of Mac OS remained stable over the same period. Linux, an open source operating system, has shown a slightly declining trend.

Operating Systems - Yearly Use
Use of Windows XP decreased from 69.57% in 2009 to 56.11% in 2010. Similarly, use of Windows Vista also decreased, while Windows 7, obviously, increased its usage. Mac OSX and Linux increased their share, with Linux growing from 0.69% in 2009 to 0.78% in 2010.

Operating Systems - Cumulative Use (2009-2010)
The overall use of operating systems shows that Windows XP is the dominant OS in the market, with a 59.84% market share, followed by Windows Vista with 19.33%. The newly arrived Windows 7 is in 3rd place (13.49%) within a few months of arrival, and Mac OSX is in 4th place.
Linux, with a meagre 0.75% share for this period, is little used throughout the world but is nevertheless increasing its share. The dominance of Windows-based OSs is quite vivid and may persist for a long time to come, owing to user-friendly features and, partly, strong promotional marketing.

Mobile Operating Systems - Global Scenario

Mobile Operating Systems - Monthly Use
Monthly use of mobile operating systems shows that the Symbian operating system (OS) is maintaining steady growth. It is followed by iOS, which shows a declining trend, having had a 29.01% share in May 2010 against a 23.34% share in April 2011. BlackBerry OS shows fluctuation: it increased from 14.15% in May 2010 to 19.25% in November 2010 and then decreased to 13.54%. The open source operating system Android has shown tremendous growth, jumping from a 3.94% share in May 2010 to a 16.05% share in April 2011. Android surpassed BlackBerry in February 2011 and is the only open source operating system among those ranked at the top. A vivid picture is provided in Fig. 2.1.

Mobile Operating Systems - Yearly Use
Symbian OS, the most used mobile OS, lost market share from 35.49% to 32.29% due to the entry of new OSs into the market. Year-wise growth of mobile operating systems shows that iOS is losing market share, dipping to 25.48% in 2010 from 34.01% in 2009. BlackBerry and Android increased their shares by 8.34% and 6.41% respectively. The growth of the other mobile OSs can be seen in Fig. 8. The increased market share of Android is quite encouraging and is expected to go further up in the near future, given that more mobile companies are keen to adopt Android for their upcoming smartphones.

Mobile Operating Systems - Cumulative Use (2009-2010)
Symbian OS has the highest market share among all mobile OSs, with a total share of 32.65%. It is followed by iOS and BlackBerry, with 26.46% and 15.54% respectively. Android, the open source OS, retained 4th place with a total share of 8.08% for 2009 and 2010. Sony Ericsson and Samsung hold the 6th and 7th spots respectively. The other open source mobile OS on the list is Linux, but it has a negligible share of 0.01%. With the presence of two variants of open source OSs, the growth of FOSS mobile operating systems can be expected to improve.

Web Browsers - Global Scenario

Web Browsers - Monthly Use
The global scenario of web browser use from May 2010 to April 2011 shows that Internet Explorer (IE) has a declining trend, with a 52.77% share in May 2010 against 44.52% in April 2011, whereas Firefox maintained a steady trend over the same period, with a 31.64% share in May 2010 and a slight depreciation (29.67%) in May 2011. The open source browser Firefox is followed by Chrome, which shows a huge increase in usage, growing from a meagre 8.61% to 18.29%. Safari and Opera hold the 4th and 5th positions with steady growth. Since all browsers are free of cost, heavy use of proprietary browsers backed by strong marketing is not an unusual phenomenon, while the open source browser Firefox occupying about one-third of the market is a good sign for the promoters of the open source movement.
Web Browsers – Yearly Use
IE was used more in 2009 (59.71%) than in 2010 (51.45%), whereas Firefox was used more in 2010, with 31.27% of users worldwide against 30.48% in 2009, indicating steady growth. Likewise, Chrome increased its users from 3.27% (2009) to 10.25% (2010). The figure below shows the usage of the other browsers in 2009 and 2010 as well.

Web Browsers – Cumulative Use (2009–2010)
IE (53.74%) was the most used browser over 2009 and 2010. Firefox (31.24%) holds the 2nd spot, followed by Chrome (8.32%) and Safari (3.97%), while Opera has a 2.14% share. Among the other open source browsers, SeaMonkey (0.03%), Flock (0.02%), Camino (0.01%), Konqueror (0.01%) and Minefield (0.01%) also show their presence in the list, but with meagre use. Firefox's strong presence as an open source browser augurs well, and it is expected to increase its share owing to its fast accessibility and regular updates.

Mobile Browsers – Global Scenario

Mobile Browsers – Monthly Use
The share of the Opera browser, the leading mobile web browser, declined from 26.68% in May 2010 to 21.9% in April 2011. iPhone and Nokia maintained a steady trend, but the growth of BlackBerry is declining after touching its peak in November 2010. Interestingly, Android, the only influential open source mobile browser, increased its share from a meagre 6.3% in May 2010 to 15.49%, beating BlackBerry and inching toward Nokia and iPhone.

Mobile Browsers – Yearly Use
Use of the Opera browser declined in 2010 (23.9%) compared to 2009 (25.33%), but it still retains the top spot. Almost all the other browsers show a declining trend in 2010 compared to 2009, except Android, BlackBerry, Samsung and NetFront. BlackBerry increased by 8.15%, while Android, the open source mobile browser, increased by 6.63%, showing good promise for the near future.

Mobile Browsers – Cumulative Use (2009–2010)
For the last two years, Opera has maintained the first position with an 18.63% share, while iPhone, Nokia and BlackBerry hold the 2nd, 3rd and 4th places. Android is the lone open source browser among the top seven mobile browsers, as shown in Fig. 4.3.

Conclusion
The growth of FOSS operating systems and browsers in the global market augurs well, though there is plenty of scope for open source software to expand its reach across the length and breadth of the software industry, given that Microsoft holds the dominant market share among operating systems and does not appear to face tough competition in the near future. The same cannot be said about browsers, however, as the open source Firefox already occupies approximately one-third of the market, giving proprietary software companies a tough time. Proprietary software remains dominant among mobile operating systems, with only a marginal presence of Android; but given the hype and success Android has gained in a short period, and experts' predictions that Android will dominate the future mobile operating system market, the outlook is promising. The same cannot yet be said about mobile browsers, where the free mobile browser Opera reigns supreme, with promising growth from BlackBerry and the open source Android.

References
Bitzer, Jürgen (2004). Commercial versus open source software: The role of product heterogeneity in competition.
Economic Systems, 28, 369–381. Retrieved April 5, 2011 from http://www.sciencedirect.com/science/article/pii/S0939362505000026
Dalle, J., & Jullien, N. (1999). NT vs. Linux, or some explanations into the economics of free software. Paper presented at "Applied Evolutionary Economics", Grenoble, June 7-9.
Eick, S. G., et al. (2001). Does code decay? Assessing the evidence from change management data. IEEE Trans. on Software Engineering, 27(1). Retrieved April 25, 2011 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.37.9674&rep=rep1&type=pdf
Elliott, Margaret S., & Scacchi, Walt (2008). Mobilization of software developers: The free software movement. Information Technology & People, 21(1), 4-33. Retrieved April 5, 2011 from www.emeraldinsight.com/0959-3845.htm
Pearson, Hilary E. (2000). Open source — the death of proprietary systems? Computer Law & Security Report, 16(3), 151-156. Retrieved April 5, 2011 from http://www.sciencedirect.com/science/article/pii/S0267364900889062
Pfaffman, Jay (2008). Transforming high school classrooms with Free/Open Source Software: It's time for an open source software revolution. The High School Journal, 91(3), 25-31. Retrieved April 5, 2011 from http://muse.jhu.edu/journals/high_school_journal/v091/91.3pfaffman.html
Kemerer, C. F., & Slaughter, S. (1999). An empirical approach to studying software evolution. IEEE Trans. on Software Engineering, 25(4). Retrieved April 12, 2011 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.61.1953&rep=rep1&type=pdf
Lehman, M., et al. (1997). Metrics and laws of software evolution—the nineties view. In Proc. of the Fourth Intl. Software Metrics Symposium (Metrics'97), Albuquerque, NM. Retrieved April 11, 2011 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.26.1214&rep=rep1&type=pdf
Lehman, M. M., & Belady, L. A. (1985). Program Evolution: Processes of Software Change. Academic Press.
Lehman, M., Perry, D. E., & Ramil, J. F. (1998). Implications of evolution metrics on software maintenance. In Proceedings of the 1998 International Conference on Software Maintenance (ICSM'98), Bethesda, Maryland, November. Retrieved April 15, 2011 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.38.3720&rep=rep1&type=pdf
Parnas, D. L. (1994). Software aging. In Proc. of the 16th Intl. Conf. on Software Engineering (ICSE-16), Sorrento, Italy, May. Retrieved April 14, 2011 from http://libresoft.es/grex/seminarios_files/parnas-sw-agingromera.pdf
Perkins, G. (1999). Culture clash and the road of word dominance. IEEE Software, 16(1), 80-84. Retrieved April 25, 2011.
Perry, D. E. (1994). Dimensions of software evolution. In Proc. of the 1994 Intl. Conf. on Software Maintenance (ICSM'94). Retrieved April 11, 2011 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.9725&rep=rep1&type=pdf
Raymond, E. (1999). The cathedral and the bazaar. Retrieved April 5, 2011 from http://www.redhat.com/redhat/cathedral-bazar/
Turski, W. M. (1996). Reference model for smooth growth of software systems. IEEE Trans. on Software Engineering, 22(8).
Retrieved April 7, 2011 from http://www.computer.org/portal/web/csdl/doi/10.1109/TSE.1996.10007

Institutional Repositories: An Evaluative Study

* Tabasum Hashim
** Tariq Rashid Jan

Abstract
Purpose: The paper examines five web-based open access repositories for the purpose of identifying their strengths and limitations, using pre-defined standard parameters.
Design/Methodology/Approach: The study used the Directory of Open Access Repositories (OpenDOAR) as a base for the collection of data.
Findings: The analysis found that the repositories are credible and are equipped with rich sets of functionalities to facilitate depositing, accessing and retrieving scholarly materials.
Originality/Value: The paper highlights the credibility-related issues of institutional repositories in the present web-based information retrieval environment.
Keywords: Institutional Repositories (IR); Evaluation; Open Access Repositories
Paper Type: Research

Introduction
The availability of open-source IR systems has encouraged a proliferation of institutional repositories (IRs) worldwide, particularly among academic and research institutions. Judging by the number of institutional repositories established over the past few years, the IR service appears to be quite attractive and compelling to institutions. IRs are beneficial for access to knowledge and the development of science: they provide a permanent record of the research output of an institution and maximize the visibility, usage and impact of its research through global access. Institutional repositories have become a platform for researchers and other academicians worldwide and have helped researchers break the chains of time and space. Exposure of research and long-term preservation have tempted institutions to accept repository technology with open arms.

* Senior Professional Assistant, P.G. Department of Chemistry, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]
** Associate Professor, P.G. Department of Statistics, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]

With the first academic institutional repository projects, the EPrints archive at Southampton (founded in 2001, and now internationally renowned as e-Prints Soton) and the DSpace initiative at MIT (2002), which began in parallel with the Open Access Initiative (Cullen & Chawner, 2011), the growth of institutional repositories has been ceaseless, as is evident from sources like OpenDOAR (http://www.opendoar.org/) and ROAR (http://roar.eprints.org/). Institutional repositories have been introduced successfully because of the innumerable benefits associated with them; indeed, they provide a solution to concerns about the system of scholarly publishing (Cullen & Chawner, 2011). They have fostered progress for institutions in general and the research community in particular.
The development of institutional repositories emerged as a new strategy that allows universities to apply serious, systematic leverage to accelerate changes taking place in scholarship and scholarly communication, both moving beyond their historic, relatively passive role of supporting established publishers in modernizing scholarly publishing through the licensing of digital content, and also scaling up beyond ad-hoc alliances, partnerships, and support arrangements with a few select faculty pioneers exploring more transformative new uses of the digital medium (Lynch, 2003). All repositories share a similar mission: to disseminate the research output of the scholarly community. The success of a repository depends on the quality of its content and the services it provides. It is therefore important to evaluate features such as acquisition, access to various materials, and the associated policies and issues.

Review of Literature
A sizable literature is available on the evaluation of repositories. A study by Fernandez (2006) reflects the status of open access repositories across India. Bertot and McClure (1998) evaluated nine open access repositories in the field of Computer Science and Information Technology; the repositories were evaluated using content, preservation policies, rights management, promotion and advertisement, services, feedback and access status as the important parameters. Lynch (2003) has discussed the infrastructure of institutional repositories, also visualizing future developments. Carpenter, Graybill, Offord, Jr., and Piorun (2011) have likewise envisioned new features in the institutional repository world. Workflow patterns in institutional repositories have been researched by Hanlon and Ramirez (2011). The shifting landscape of institutional repositories is well knit together by various authors (Shreeves & Cragin, 2008; Nykanen, 2011). Repository management has been well researched by a number of authorities (Bide, 2002; Genoni, 2004; Medeiros, 2003; Poynder, 2006; Markey, Rieh, St. Jean, Kim, & Yakel, 2007; McDowell, 2007). Metadata issues in institutional repositories have been researched by Dunsire (2008) and Goldsmith and Knudson (2006).

Scope
The scope of the study is limited to five web-based open access repositories.

Objective(s)
The main objective of the study is to evaluate the various features of the institutional repositories using standard parameters identified for this purpose.

Methodology
After reviewing the existing literature, useful and relevant information about the evaluation activities of institutional repositories was studied. The literature proved extremely useful in identifying the main elements and issues. Five repositories were randomly selected using Tippett's (1927) Random Number Table. These were:
1. The Sydney eScholarship Repository, University of Sydney
2. University of Melbourne Digital Repository
3. Digital Repository of the University of Wolverhampton
4. National Aerospace Laboratories Institutional Repository (NAL)
5. OpenMED@NIC
Questionnaires were sent via e-mail to repository administrators to ascertain the content management policies of the repositories.
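Incidentally, the random-selection step can be mimicked programmatically. The sketch below, in C, draws five distinct indices from a hypothetical pool of N candidate repositories; it is an illustration only, since the study itself used a printed random number table, and N is a placeholder, not a figure from the paper.

```c
/* Sketch: drawing K distinct random indices from a pool of N candidates,
   a programmatic stand-in for a printed random number table. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 1900   /* hypothetical size of the candidate pool */
#define K 5      /* repositories to select */

int main(void)
{
    int chosen[K];
    int count = 0;

    srand((unsigned)time(NULL));          /* seed the generator */
    while (count < K) {
        int r = rand() % N;               /* candidate index 0 .. N-1 */
        int seen = 0;
        for (int i = 0; i < count; i++)   /* reject duplicates */
            if (chosen[i] == r) { seen = 1; break; }
        if (!seen)
            chosen[count++] = r;
    }
    for (int i = 0; i < K; i++)
        printf("selected repository index: %d\n", chosen[i]);
    return 0;
}
```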
Results & Discussion
The collective discussion about the evaluation of the repositories under study is given under the different headings chosen for the purpose.

Overview
Of the five repositories selected for evaluation, the software used includes DSpace, DigiTool, Open Repository and EPrints. The repositories are generally maintained by their information management/web teams; OpenMED@NIC is maintained by the Bibliographic Information Division, National Informatics Centre (NIC). Most of the institutional repositories hold records in the thousands, but one repository holds records in the hundreds. The contents of the selected repositories comprise mainly journal and magazine articles, books, book chapters, conference papers, research datasets, patents, preprints, presentations, research reports, technical reports, theses, multimedia files, creative works, patent documents and digitized versions of library collections. The repositories provide monthly usage statistics covering uploads, downloads, etc.

Visual Interface
The repositories have a clear user interface, simple enough for inexperienced users. Each repository is designed with its own branding/web interface design. The site functions are fairly simple and intuitive to use. The FAQ provides help with the most common problems, tailored to the user account type. Online help is also available in all the repositories.

Resource Discovery
The present study adopted the parameters used by Smith (2000) to evaluate the search features of the different institutional repositories. The display features of any search interface are pivotal to it. The results showed that the institutional output can be displayed in order of relevancy, title, author, submission date or issue date, in either ascending or descending order. A sort bar that enables users to sort by author, date or title and to change the number of results is also evident in the selected repositories. Full metadata records can be viewed, and recommended items can be sent via e-mail to individuals. The repositories have well-organized browsing facilities: besides the traditional subject, author and collection listings, listings by title, date issued and date submitted can also be generated.

Access
The repositories have mechanisms to control access to their collections. Options permit access to free abstracts without any registration. Most repositories ask the user to register for access to the full-text collection. Some repositories restrict full text to their intranet under an agreement with the publishers or owners of the content.

System Features
The repositories need basic software and hardware, and they support LDAP authentication. Text/document file support in the form of HTML, PDF, PostScript, plain text, Rich Text Format, XML, MS Word, MS Excel, MS PowerPoint, JPEG, PNG, GIF, BMP, etc. is found in all the repositories. Three types of metadata form the structural framework of the selected repositories, viz. descriptive, administrative and structural; some metadata elements are auto-generated. The W3C-standard XHTML 1.0 label is present on the sites. Metadata standards include MARC, Dublin Core, Metadata Object Description Schema (MODS), SRW and the Metadata Encoding and Transmission Standard (METS). The workflow integration supports the use of workflow tools.

Content Management Policy
Almost all the institutional repositories accept postprints and preprints of research publications of in-house researchers, annual reports, theses, institutional publications, etc.
All the repositories under study support web-based document management, auditing and a simple workflow, including research status, publishing rights and the ability to edit incorrect content. All content has to pass through an administrative process before publication. Provision for storage and long-term preservation exists. OpenMED@NIC does not have well-documented collection policies. Most institutions allow both unmediated and mediated submission of documents. The most commonly accepted document formats are MS Word, PDF and LaTeX.

Conclusion
All the repository systems are equipped with rich sets of functionalities to facilitate depositing, accessing and retrieving scholarly materials, and all take advantage of web technology for their cross-server functionality. By introducing their products to scholarly communities all over the world, they have taken successful steps towards making the repositories they have developed an integrated part of the new means of international information dissemination. Repository content management is changing rapidly. The effectiveness and efficiency of the institutional repositories is reflected in the policies they have adopted to work successfully in present web-based information retrieval systems.

References
Bertot, J. C., & McClure, C. R. (1998). Measuring electronic services in public libraries: Issues and recommendations. Public Libraries, 37(3), 176–180.
Bide, M. (2002). Open archives and intellectual property: Incompatible world views? Open Access Forum, Bath. Retrieved from www.oaforum.org/otherfiles/oaf_d42_cser1_bide.pdf
Carpenter, M., Graybill, J., Offord, J., Jr., & Piorun, M. (2011). Envisioning the library's role in scholarly communication in the year 2025. portal: Libraries and the Academy, 11(2), 659–681. doi:10.1353/pla.2011.0014
Cullen, R., & Chawner, B. (2011). Institutional repositories, open access, and scholarly communication: A study of conflicting paradigms. The Journal of Academic Librarianship, 37(6), 460–470.
Dunsire, G. (2008). Collecting metadata from institutional repositories. OCLC Systems & Services: International Digital Library Perspectives, 24(1), 51-58. doi:10.1108/10650750810847251
Hanlon, A., & Ramirez, M. (2011). Asking for permission: A survey of copyright workflows for institutional repositories. portal: Libraries and the Academy, 11(2), 683–702. doi:10.1353/pla.2011.0015
Fernandez, L. (2006). Open access initiatives in India: An evaluation. The Canadian Journal of Library and Information Practice and Research, 1(1). Retrieved from http://www.dlib.org/dlib/january05/foster/01foster.html
Genoni, P. (2004). Content in institutional repositories: A collection management issue. Library Management, 25(6-7), 300-306. doi:10.1108/01435120410547968
Goldsmith, B., & Knudson, F. (2006). Repository librarian and the next crusade: The search for a common standard for digital repository metadata. D-Lib Magazine, 12(9). Retrieved from http://dlib.ukoln.ac.uk/dlib/september06/goldsmith/09goldsmith.html
Lynch, C. A. (2003). Institutional repositories: Essential infrastructure for scholarship in the digital age. portal: Libraries and the Academy, 3(2), 327-336. doi:10.1353/pla.2003.0039
Markey, K., Rieh, S. Y., St. Jean, B., Kim, J., & Yakel, E. (2007).
Census of Institutional Repositories in the United States: MIRACLE Project Research Findings. Washington, D.C.: CLIR. Retrieved from http://www.clir.org/pubs/reports/pub140/pub140.pdf
McDowell, C. S. (2007). Evaluating institutional repository deployment in American academe since early 2005: Repositories by the numbers, Part 2. D-Lib Magazine, 13(9/10). Retrieved from http://www.dlib.org/dlib/september07/mcdowell/09mcdowell.html
Medeiros, N. (2003). E-prints, institutional archives, and metadata: Disseminating scholarly literature to the masses. OCLC Systems & Services, 19(2), 51-53. doi:10.1108/10650750310481757
Nykanen, M. (2011). Institutional repositories at small institutions in America: Some current trends. Journal of Electronic Resources Librarianship, 23(1), 1-19. doi:10.1080/1941126X.2011.551089
Poynder, R. (2006). Clear blue water. Retrieved from http://poynder.blogspot.com/2006/03/institutionalrepositories-and-little.html
Shreeves, S. L., & Cragin, M. H. (2008). Introduction: Institutional repositories: Current state and future. Library Trends, 57(2), 89-97. doi:10.1353/lib.0.0037
Smith, A. G. (2000). Search features of digital libraries. Information Research, 5(3). Retrieved from http://informationr.net/ir/5-3/paper73.html

Open Source Code Doesn't Always Help: Case of File System Development

Wasim Ahmad Bhat
S.M.K. Quadri

Abstract
Purpose: One of the most significant and attractive features of Open Source Software (OSS), other than its cost, is its open source code. OSS is available in both flavours, system and application, and can be customized and ported as per the requirements of the end user. As most system software runs in the kernel mode of the operating system, and system programmers constitute only a small chunk of all programmers, code customization of open source system software is rarely realized in practice. In this paper, the authors present file system development as a case of kernel-mode system software development and argue that customizing the open source code available for file systems is not the preferred route. To support the argument, the authors discuss the various challenges a developer faces in this process. Furthermore, the authors look to user-mode file system development for a possible solution and discuss the architecture, advantages and limitations of the most popular and widely used framework, File system in User-Space (FUSE). Finally, the authors conclude that the user-mode alternative for file system development and/or extension supersedes kernel-mode development.
Design/Methodology/Approach: The broad domain, complexity, irregularity and limitations of the kernel development environment form the base of the argument. Moreover, the existence of rich and capable user-mode file system development frameworks is used to supplement the argument.
Findings: The research highlights the fact that kernel-mode file system development is difficult, bug-prone, time consuming, exhausting and so on, even with the source code at one's disposal. Furthermore, it highlights the existence of a user-mode alternative which is easy, reliable, portable, etc.
Research Implications: The research considers file system development as a case of kernel-mode development. Fortunately, in this case, there is a choice of user-mode alternatives.
However, the argument cannot be generalised to kernel modules for which no user-mode alternative exists. Furthermore, the authors did not take into consideration the benefits of extending file systems in kernel mode.
Originality/Value: The research stresses that having open source code is not enough to make a choice when the code cannot be used in a reliable and productive manner.
Keywords: Open Source Software (OSS); Open Source System Software; Source Code; File System; Kernel Mode; User Mode; File system in User-Space (FUSE)
Paper Type: Argumentative

Ph.D. Scholar, P. G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]
Head, P. G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]

Introduction
Open Source Software is consistently gaining software market share because of its two most notable strengths: low cost and availability of source code. For some, low cost alone is enough to make the choice, while for others the availability of source code is mandatory. With the source code at one's disposal, the product can be customized or optimized as per requirements, and the code can be used to fix unanticipated bugs. The open source ideology is logically simple: one creates an OSS project and uploads it along with its source code and a license to download, customize, distribute, compile and use it. There are many portals that host OSS projects, http://sourceforge.net being the most popular. The OSS paradigm started, unknowingly, in the late 1960s when RFCs for network protocols were created for the ARPANET, and it later received a big boost from Linus Torvalds' Linux OS. The paradigm has spread geographically because of the Internet and has penetrated every aspect of software development, be it application software or system software. This penetration is largely because of the Linux operating system, which is one of the most prominent examples of OSS and provides an excellent platform for developing such software.

OSS has attracted Computer Science researchers all over the globe because the source code is just a couple of clicks away. Specifically, researchers working on the systems side of Computer Science have been using the Linux OS to implement and test their ideas and innovations by customizing the source code and recompiling it. One of the most notable systems-side research areas that specifically depends upon the availability of source code is File System Development (FSD). FSD includes designing and developing a new file system from scratch and/or extending existing ones in order to accommodate and cope with changes in hardware technology and user requirements. Designing and developing a file system from scratch is rarely practiced, for many reasons: a significant innovation is required in a new design; a number of good designs are already available and implemented; and a great deal of knowledge about operating system internals, plus experience with system programming, is required for development. But because hardware technology is becoming both more advanced and more affordable, the rate of digital data proliferation is very high. This has created voluminous amounts of digital data which need to be managed efficiently, reliably and securely.
This change in hardware technology and user requirements calls for optimization, refinement and fine-tuning of existing file systems. Linux, the open source system software, provides a good platform for testing and implementing such refinements. Linux is a pioneering and prominent product of the OSS community, and the Linux kernel comes with more than two dozen file systems along with their source code. These in-kernel file systems are difficult to develop and debug. In this paper, the authors argue that code customisation of in-kernel file systems, to extend their capabilities, is not preferred even with open source code. To support the argument, the authors discuss the various challenges a developer faces in this process. Furthermore, the authors turn to file system development in user space to look for a possible solution, presenting an overview of various user-space frameworks and discussing the architecture, advantages and disadvantages of the most popular and widely used framework, FUSE. The existence of rich and capable user-space frameworks supplements the argument.

Why in-kernel code customization of file systems is not preferred
File systems represent one of the most important aspects of operating-system services. Traditionally, file systems are integrated with the operating system kernel. Earlier, file system syscalls directly invoked file system methods; this architecture made it difficult to add multiple file systems to an OS. In 1986, to address this problem, Kleiman (1986) introduced the virtual node, or vnode, which provides a layer of abstraction that separates the core OS from file systems. This architecture finally matured into the VFS of UNIX-like and UNIX-based OSes. Rosenthal (1992) proposed layering to extend the capabilities of file systems and modified the VFS of SunOS to support it. All these demarcations and modifications remained within the boundary and domain of the kernel.

As mentioned earlier, file systems need to evolve, yet customizing in-kernel file systems is a challenging task for a variety of reasons. First, this approach requires the programmer to understand and deal with complicated kernel code and data structures. A deep understanding of operating system (kernel) internals is required even to make a small change to existing code or to add some new code. The situation is worse than it seems: operating systems vary in their kernel architectures, the same architecture varies in major aspects across flavours, the same flavour varies in crucial implementations across versions, and the same version varies in its degree of cohesion with different underlying hardware. All these factors make understanding the internals of a specific kernel release a time-consuming and exhausting effort, and programmers with this expertise constitute only a small chunk of all programmers.

Second, even when all this is achieved, the code customization can induce more bugs than expected. The kernel development environment lacks facilities that are available to application programmers. For instance, kernel code lacks memory protection, as it runs in the supervisor mode of the operating system; as such, a single wild pointer can bring down the entire system, where in user space it would only have terminated the application.
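The contrast is easy to demonstrate from the user-space side. In the minimal sketch below (an illustration, not from the paper), the memory protection enforced on user processes converts the wild pointer into a SIGSEGV that kills only this one process; the same dereference inside kernel code has no such safety net and can take the whole machine down.

```c
#include <stdio.h>

int main(void)
{
    /* An address this process does not own. The MMU-backed protection of
       user space catches the access: the kernel delivers SIGSEGV and only
       this process dies, while the operating system keeps running. */
    int *wild = (int *)0xdeadbeef;

    printf("dereferencing a wild pointer...\n");
    *wild = 42;                 /* terminates this process with SIGSEGV */
    printf("never reached\n");
    return 0;
}
```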
Also, kernel code requires careful use of synchronization primitives, can only be written in C, and that too without being linked against the standard C library, and so on. All these factors lead to a higher probability of inducing not just a simple bug but a bug capable of bringing down the whole system, and hence they affect the reliability of the operating system. Third, if customization is inevitable, then debugging is not only certain but tedious. Debugging kernel code is much more difficult than debugging user-space code, as kernel development lacks the facilities found in the IDEs available for most programming languages in user-mode development. For instance, the famous "Blue Screen of Death" on the Windows platform has been around since the inception of Windows. Fourth, even a fully functional in-kernel file system still has several disadvantages: porting a file system written for a particular kernel to a different one can require significant changes in design and implementation, though the use of similar file system interfaces (such as the VFS layer) on several Unix-like systems makes the task somewhat easier. Finally, an in-kernel file system can be mounted only with superuser privileges. This can be a hindrance to file system development and usage on centrally administered machines, such as those in universities and corporations.

How can file systems be extended in user space?
In contrast to kernel development, programming in user space minimizes or completely eliminates several of the aforementioned issues. By developing and/or extending file systems in user space, the programmer need not worry about the intricacies and challenges of kernel-level programming and has access to a wide range of familiar programming languages, third-party tools and libraries. Further, a highly dangerous bug can at most terminate the application and hence can never break the reliability of the kernel. Moreover, debugging is comparatively much easier. Of course, user-space file systems may still require some effort to be ported to different operating systems; this depends on the extent to which a file system's implementation is coupled with a particular operating system's internals.

In order to develop and/or extend file systems in user space, a framework is required which traps file system calls in the kernel and passes them to user space to be processed. The framework should also provide a simple and powerful set of APIs in user space that are common to most operating systems. Various projects have aimed to support the development of user-space file systems while exporting an API similar to that of the VFS layer. A brief introduction to some popular frameworks follows. UserFS consists of a kernel module that registers a UserFS file system type with the VFS (Fitzhardinge, n.d); all requests to this file system are then communicated to a user-space library through a file descriptor. The Coda distributed file system contains a Coda kernel module which communicates with the user-space cache manager, Venus, through a character device, /dev/cfs0 (Satyanarayanan, Kistler, Kumar, Okasaki, Siegel & Steere, 1990). UserVFS, which was developed as a replacement for UserFS, uses this Coda character device for communication between the kernel module and the user-space library (Machek, n.d).
Similarly, Arla is an AFS client that consists of a kernel module, xfs, which communicates with the arlad user-space daemon to serve file system requests (Westerlund & Danielsson, 1998). The ptrace() system call can also be used to build an infrastructure for developing file systems in user space (Spillane, Wright, Sivathanu & Zadok, 2007). An advantage of this technique is that all OS entry points, rather than just file system operations, can be intercepted; the downside is that the overhead of using ptrace() is significant, which makes this approach unsuitable for production-level file systems. The number of production-quality systems that provide a standardized API for developers to design a unique file system in user space is still small, but there is one commonly used and well-deployed system called FUSE, part of the Linux kernel since version 2.6.14.

FUSE: A widely used framework for file systems in user space
The fundamental design consideration of microkernel implementations such as Mach and the MIT exokernel is to reduce the complexity of the kernel. Both approaches remove all but the most basic operating system services from the kernel, moving them to programs residing in user space. FUSE (File system in User-Space) is a recent example of this general trend in operating system design (Szeredi, n.d) and is the best-known example of a user-space file system framework. The FUSE design provides a thin layer in the kernel which traps file system calls meant for a mounted FUSE file system and forwards them to user space; in user space, FUSE provides a library interface for implementing the corresponding file system call functionality.

Architecture of FUSE
FUSE is a three-part system (shown as shaded blocks in Fig. 1). The first part is a kernel module, FUSE, which hooks into the VFS code and looks like a file system module. It registers the fusefs file system type with the VFS and also implements a special-purpose device, /dev/fuse. In user space, FUSE implements a library, libfuse, which manages communications with the kernel module: it accepts file system requests from the FUSE device and translates them into a set of function calls which look similar (but not identical) to the kernel's VFS interface. Finally, there is a user-supplied component (userfs in our example in Fig. 1) which actually implements the file system of interest. It fills a structure with pointers to its functions, which implement the required operations in whatever way makes sense.

Fig. 1: Path of a read() call for a file residing in a FUSE file system

Fig. 1 shows the path of a read() call for a file residing in a FUSE file system. In this example, the user-space file system functionality is implemented as a set of callback functions in the userfs program, which is passed the mount point, /fuse. Once the userfs FUSE file system is mounted, all file system calls targeting the mount point, /fuse, are forwarded to the FUSE kernel module. When an application issues a read() system call for the file /fuse/file, the VFS invokes the appropriate handler in fusefs. If the requested data is found in the page cache, it is returned immediately. Otherwise, the system call is forwarded over a character device, /dev/fuse, to the libfuse library, which in turn invokes the callback defined in userfs for the read() operation. The callback may take any action, and returns the desired data in the supplied buffer.
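To make the callback model concrete, the following is a minimal sketch of such a user-supplied component, written against the high-level libfuse API (FUSE 2.x). It exposes a single read-only file, /hello; the file name, its contents and the helper names are illustrative placeholders, not taken from the paper.

```c
/* hellofs.c -- a single read-only file, /hello, served from user space. */
#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>

static const char *hello_str  = "Hello from user space!\n";
static const char *hello_path = "/hello";

/* getattr(): the user-space analogue of a stat() handler */
static int hello_getattr(const char *path, struct stat *stbuf)
{
    memset(stbuf, 0, sizeof(struct stat));
    if (strcmp(path, "/") == 0) {
        stbuf->st_mode  = S_IFDIR | 0755;
        stbuf->st_nlink = 2;
    } else if (strcmp(path, hello_path) == 0) {
        stbuf->st_mode  = S_IFREG | 0444;
        stbuf->st_nlink = 1;
        stbuf->st_size  = strlen(hello_str);
    } else {
        return -ENOENT;
    }
    return 0;
}

/* readdir(): list the root directory's single entry */
static int hello_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                         off_t offset, struct fuse_file_info *fi)
{
    (void)offset; (void)fi;
    if (strcmp(path, "/") != 0)
        return -ENOENT;
    filler(buf, ".", NULL, 0);
    filler(buf, "..", NULL, 0);
    filler(buf, hello_path + 1, NULL, 0);   /* skip the leading '/' */
    return 0;
}

/* open(): allow read-only access to /hello */
static int hello_open(const char *path, struct fuse_file_info *fi)
{
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if ((fi->flags & O_ACCMODE) != O_RDONLY)
        return -EACCES;
    return 0;
}

/* read(): the callback libfuse invokes on the read() path of Fig. 1 */
static int hello_read(const char *path, char *buf, size_t size, off_t offset,
                      struct fuse_file_info *fi)
{
    size_t len = strlen(hello_str);
    (void)fi;
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if ((size_t)offset >= len)
        return 0;                           /* EOF */
    if (offset + size > len)
        size = len - offset;
    memcpy(buf, hello_str + offset, size);  /* fill the supplied buffer */
    return size;                            /* bytes actually read */
}

/* The structure of pointers that the user-supplied component fills in */
static struct fuse_operations hello_oper = {
    .getattr = hello_getattr,
    .readdir = hello_readdir,
    .open    = hello_open,
    .read    = hello_read,
};

int main(int argc, char *argv[])
{
    /* fuse_main() mounts at the mount point given on the command line
       (e.g. /fuse) and enters the event loop, dispatching the callbacks. */
    return fuse_main(argc, argv, &hello_oper, NULL);
}
```

Assuming libfuse is installed, such a program would typically be compiled with "gcc hellofs.c $(pkg-config fuse --cflags --libs) -o hellofs", mounted by an unprivileged user with "./hellofs /fuse", and unmounted with "fusermount -u /fuse"; running "cat /fuse/hello" then exercises exactly the read() path traced above.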
For instance, the callback may do some pre-processing, request the data from the underlying file system (such as Ext4 in our example) and then post-process the read data (Mathur, Cao, Bhattacharya, Dilger, Tomas & Vivier, 2007). Finally, the result is propagated back by libfuse, through the kernel, to the application that issued the read() system call.

Advantages of using the FUSE framework
FUSE's relatively loose policy for implementing file system APIs allows developers to run file systems with only a few functions implemented. Also, FUSE presents an application with a well-known, standardized and native file system interface that accepts regular system calls. This means that applications can use interesting and cutting-edge file systems on FUSE without changing any code inside the application. This easy prototyping and application-friendliness of FUSE's design clearly encourages not only file system developers but also people unfamiliar with kernel programming to challenge themselves by implementing their own file systems. Developers implementing a file system in user space no longer have to recompile the kernel or worry about crashing the operating system during development. FUSE goes a step further by allowing unprivileged users to safely mount their own file systems, even ones they have written themselves, as long as the system administrator loads the FUSE kernel module. More than twenty different language bindings are available for FUSE, allowing file systems to be written in languages other than C. This means that programmers can use languages that are based on different programming paradigms, offer different levels of type safety and type checking, and are generally intended for different usage scenarios.

The main advantage of FUSE over other similar projects is its large and active user community, which has developed several dozen file systems to date, several of which provide significant functionality on the platforms supported by FUSE. Among the more interesting FUSE file systems are Wayback (Cornell, Dinda & Bustamante, 2004), NTFS-3g (NTFS-3g, n.d) and SSHFS (Szeredi, n.d); these provide, respectively, a versioning file system, safe read and write support for NTFS volumes, and a file system based on secure communications over SFTP. Furthermore, the FUSE framework has been ported to almost all platforms, including Windows (Driscoll, Beavers & Tokuda, n.d). Thus, FUSE file systems are not only reliable and easy to develop and debug, but also highly portable.

Performance issues in FUSE file systems
There are certain performance issues related to the FUSE framework's architecture (Rajgarhia & Gehani, 2010). First, when only an in-kernel file system (such as Ext4 alone) is used, there are two user-kernel mode switches per file system operation (i.e. to and from the kernel) and no process context switches. User-kernel mode switches are inexpensive, involving only a switch of the processor from unprivileged user mode to privileged kernel mode, or vice versa. FUSE, however, introduces two process context switches for each file system call: one from the user application that issued the system call to the FUSE user-space library, and another in the opposite direction.
A context switch can have a significant cost, although the cost varies greatly with factors such as the processor type, the workload, and the memory access patterns of the applications between which the context switch is performed. Second, when an in-kernel file system is used alone, data needs to be copied in memory only once: either from the kernel's page cache to the application issuing the system call, or vice versa. FUSE introduces two additional memory copies. While writing data to a FUSE file system, the data is first copied from the application to the page cache, then from the page cache to libfuse via /dev/fuse, and finally from libfuse back to the page cache when the system call is made to the in-kernel file system. For read(), the copying is performed similarly, in the opposite direction. If the FUSE file system is mounted with the DIRECT_IO option, the FUSE kernel module bypasses the page cache and forwards the application-supplied buffer directly to the user-space daemon; in this case, only one additional memory copy is performed. The advantage of DIRECT_IO is that writes are significantly faster due to the reduced memory copying. The downside is that every read() request has to be forwarded to the user-space file system, as the data is never present in the page cache, thus affecting read performance severely.

Finally, when the in-kernel file system is used alone, all data read from or written to the disk is cached in the kernel's page cache. With FUSE, fusefs also caches the data in the page cache, resulting in two copies of the same data being cached. Although FUSE's use of the page cache is very beneficial for read operations, since it avoids unnecessary context switches and memory copies, the fact that the same data is cached twice reduces the efficiency of the page cache. In Linux, one can open files on the native file system using the O_DIRECT flag and thereby eliminate caching by the native file system. However, this is generally not a feasible solution, since O_DIRECT imposes alignment restrictions on the length and address of the write() buffers and on the file offsets.
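To make those alignment restrictions concrete, here is a minimal sketch of an O_DIRECT read. The file name is hypothetical, and the 4096-byte alignment is an assumption that matches common logical block sizes; a robust program would query the device rather than hard-code it.

```c
#define _GNU_SOURCE              /* exposes O_DIRECT on Linux */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const size_t align = 4096;   /* assumed logical block size */
    const size_t size  = 4096;   /* the I/O length must also be aligned */
    void *buf;

    int fd = open("data.bin", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    /* A plain malloc()ed buffer would typically fail with EINVAL:
       O_DIRECT requires the buffer address itself to be aligned. */
    if (posix_memalign(&buf, align, size) != 0) {
        fprintf(stderr, "posix_memalign failed\n");
        close(fd);
        return 1;
    }

    ssize_t n = pread(fd, buf, size, 0);   /* file offset 0 is aligned */
    if (n < 0)
        perror("pread");
    else
        printf("read %zd bytes, bypassing the page cache\n", n);

    free(buf);
    close(fd);
    return 0;
}
```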
Conclusion
In this paper the authors argued that in-kernel code customization of open source system software such as file systems is not practically feasible, as it requires a deep understanding of operating system internals and experience with kernel-level programming, and is time consuming and exhausting. Furthermore, the process is highly prone to even simple bugs, which can crash the operating system and demand studious, exhaustive debugging. All these factors lead to slow progress in file system development, with a higher probability of low operating system reliability and low file system productivity, and all this even with the source code at one's disposal. The research also highlighted the concept of file system development (extension) in user space and explained the basic architecture of the most popular user-space file system development framework, FUSE. Although there are certain performance issues related to FUSE, the gains outweigh them. It can safely be argued that a file system extended using the FUSE framework is very easy to develop and debug, in addition to being highly reliable and portable, compared to one extended by customising and recompiling the kernel source. It is telling that almost all user-space file system frameworks come from the open source community itself, having surfaced to overcome the code customisation problem in one of its pioneering flagship products, the Linux OS.

References
Cornell, B., Dinda, P., & Bustamante, F. (2004). Wayback: A user-level versioning file system for Linux. In Proceedings of the USENIX Annual Technical Conference (ATEC '04), Article 27.
Driscoll, E., Beavers, J., & Tokuda, H. (n.d). FUSE-NT: Userspace file systems for Windows NT. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131.3896
Fitzhardinge, J. (n.d). UserFS. Retrieved from http://www.goop.org/~jeremy/userfs
Kleiman, S. R. (1986). Vnodes: An architecture for multiple file system types in Sun UNIX. In Proceedings of the Summer USENIX Technical Conference, pp. 238-247.
Machek, P. (n.d). UserVFS. Retrieved from http://sourceforge.net/projects/uservfs
Mathur, A., Cao, M., Bhattacharya, S., Dilger, A., Tomas, A., & Vivier, L. (2007). The new ext4 filesystem: Current status and future plans. In Proceedings of the Ottawa Linux Symposium.
NTFS-3g. (n.d). Retrieved from http://www.tuxera.com/community/ntfs-3g-manual/
Rajgarhia, A., & Gehani, A. (2010). Performance and extension of user space file systems. In Proceedings of the 2010 ACM Symposium on Applied Computing (SAC '10), pp. 206-213.
Rosenthal, D. S. H. (1992). Requirements for a "stacking" vnode/VFS interface. Tech. Rep. SD-01-02-N014, UNIX International.
Satyanarayanan, M., Kistler, J. J., Kumar, P., Okasaki, M. E., Siegel, E. H., & Steere, D. C. (1990). Coda: A highly available file system for a distributed workstation environment. IEEE Transactions on Computers, 39(4), 447-459.
Spillane, R. P., Wright, C. P., Sivathanu, G., & Zadok, E. (2007). Rapid file system development using ptrace. In Proceedings of the 2007 Workshop on Experimental Computer Science (ExpCS '07), ACM, Article 22.
Szeredi, M. (n.d). File system in user space. Retrieved from http://fuse.sourceforge.net
Szeredi, M. (n.d). SSH filesystem. Retrieved from http://fuse.sourceforge.net/sshfs.html
Westerlund, A., & Danielsson, J. (1998). Arla: A free AFS client. In Proceedings of the USENIX Annual Technical Conference (ATEC '98), Article 32.

A New Approach of CLOUD: Computing Infrastructure on Demand

* Kamal Srivastava
** Atul Kumar

Abstract
Purpose: The paper presents a current vision of cloud computing and identifies various commercially available cloud services promising to deliver infrastructure as a service (IaaS).
Design/Methodology/Approach: The paper provides architectural detail of cloud computing and surveys different types of clouds. We studied different cloud-based architectures, such as Blue Cloud, built on IBM's massive-scale computing initiatives; Google Cloud, which claims businesses can get started using Google Apps online almost instantly; and Salesforce.com, which offers development as a service, a set of development tools and APIs that enables enterprise developers to easily harness the promise of cloud computing.
Findings: It was found that cloud computing is changing the way we provision hardware and software for on-demand capacity fulfillment, and changing the way we develop web applications and make business decisions.
Keywords: Cloud Computing; Amazon Elastic Compute Cloud; Google App Engine; Microsoft Azure; Salesforce.com
Paper Type: Survey

* Department of Computer Science, Shri Ramswaroop Memorial College of Engg. & Mgmt., Lucknow, U.P., India. email: [email protected]
** Department of Computer Science, Shri Ramswaroop Memorial College of Engg. & Mgmt., Lucknow, U.P., India. email: [email protected]

Introduction
The term "cloud", as used in this paper, appears to have its origins in network diagrams that represented the internet, or various parts of it, as schematic clouds. "Cloud computing" was coined for what happens when applications and services are moved into the internet "cloud." Cloud computing is not something that suddenly appeared overnight; in some form it traces back to a time when computer systems remotely time-shared computing resources and applications. Today, though, cloud computing refers to the many different types of services and applications being delivered in the internet cloud, and to the fact that, in many cases, the devices used to access these services and applications do not require any special software.

Cloud computing refers both to the applications delivered as services over the Internet and to the hardware and systems software in the data centers that provide those services. A cloud computing platform dynamically provisions, configures, reconfigures, and deprovisions servers as needed. Cloud applications are those that are extended to be accessible through the Internet; the datacenter hardware and software is what we will call a cloud. Cloud computing is changing the way we provide hardware and software for on-demand capacity fulfillment, and changing the way we develop web applications and make business decisions. Cloud computing is a computing paradigm in which tasks are assigned to a combination of connections, software and services accessed over a network. This network of servers and connections is collectively known as "the cloud." Computing at the scale of the cloud allows users to access supercomputer-level power, drawing on resources as they need them.

Understanding Cloud Computing
Cloud computing describes how computer programs are hosted and operated over the Internet. The key feature of cloud computing is that both the software and the information held in it live on centrally located servers rather than on an end-user's computer. How does cloud computing work? The concept is fairly simple. First, consider the traditional means of running an application: the application appears to run on a dumb terminal or, these days, your PC, but in practice this is only the front end of the application. Your computer is connected to a server that actually runs the program and returns the output to the personal computer. The server constitutes the back end, and it may or may not be located in the same building as you. With cloud computing, the application program runs somewhere within the cloud; ideally, the user is concerned only with the applications that are available and need not be aware of the underlying technology or the physical location of the application's computer. The user's desktop is connected via the internet to a server farm, a collection of remote servers that runs many, many applications at once.
Which server or servers an application runs on is determined by the application programs already running on the machines; there is an attempt to balance the load so that all of the programs run optimally. A number of companies offer cloud computing services: Amazon offers the Amazon Elastic Compute Cloud (EC2), Google has its own cloud computing offering, Google App Engine, and Microsoft offers Microsoft Azure. When a cloud is made available in a pay-as-you-go manner to the public, we call it a public cloud; the service being sold is utility computing. We use the term private cloud to refer to the internal datacenters of a business or other organization that are not made available to the public. Thus, cloud computing is the sum of SaaS (Software as a Service) and utility computing, but does not normally include private clouds. From a hardware point of view, three aspects are new in cloud computing (Vogels, 2008):
1. The illusion of infinite computing resources available on demand, thereby eliminating the need for cloud computing users to plan far ahead for provisioning.
2. The elimination of an up-front commitment by cloud users, thereby allowing companies to start small and increase hardware resources only when their needs increase.
3. The ability to pay for the use of computing resources on a short-term basis as needed (e.g., processors by the hour and storage by the day) and to release them as needed, thereby rewarding conservation by letting machines and storage go when they are no longer useful.
As a successful example (Armbrust et al., 2009), the Elastic Compute Cloud (EC2) from Amazon Web Services (AWS) sells 1.0-GHz x86 ISA "slices" for 10 cents per hour, and a new "slice", or instance, can be added in 2 to 5 minutes. Amazon's Scalable Storage Service (S3) charges USD 0.12 to USD 0.15 per gigabyte-month, with additional bandwidth charges of USD 0.10 to USD 0.15 per gigabyte to move data into and out of AWS over the Internet.
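As a back-of-the-envelope illustration of this pay-by-use pricing (the rates are those quoted above; the workload figures are invented for the example), renting many machines briefly costs the same as renting one machine for a long time:

```latex
% 1000 EC2 instance-hours at USD 0.10 per hour cost USD 100, whether
% consumed as 1000 machines for 1 hour or 1 machine for 1000 hours:
\[
  1000 \times 1\,\mathrm{h} \times \$0.10/\mathrm{h}
  \;=\; 1 \times 1000\,\mathrm{h} \times \$0.10/\mathrm{h}
  \;=\; \$100
\]
% Storing 50 GB in S3 for one month at USD 0.15 per GB-month, plus moving
% 50 GB in and 50 GB out at USD 0.10 per GB each way:
\[
  50 \times \$0.15 \;+\; (50 + 50) \times \$0.10
  \;=\; \$7.50 + \$10.00 \;=\; \$17.50
\]
```

This property is the "cost associativity" that the parallel batch-processing opportunity discussed below exploits.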
Commercially Available Cloud Services
1) Google: The core of Google's business is all in cloud computing. Services delivered over network connections include search, e-mail, online mapping, office productivity (including documents, spreadsheets, presentations, and databases), collaboration, social networking, and voice, video and data services. Users can subscribe to these services for free or pay for increased levels of service and support.
2) Amazon: As the world's largest online retailer, the core of Amazon's business is e-commerce. While e-commerce itself can be considered cloud computing, Amazon has also been providing capabilities which give IT departments direct access to Amazon's compute power. Key examples include S3 (Simple Storage Service) and EC2. Any internet user can access storage in S3 and access stored objects from anywhere on the Internet. EC2 is the Elastic Compute Cloud, a virtual computing infrastructure able to run diverse applications ranging from web hosts to simulations, or anything in between. All of this is available at a very low cost per user.
3) Microsoft: Traditionally, Microsoft's core business has been in device operating systems and office automation software. Since the early days of the Internet, Microsoft has also provided web hosting, online e-mail and many other cloud services. Microsoft now also provides office automation capabilities via a cloud ("Office Live") in an approach referred to as "Software Plus Services" as opposed to "Software as a Service", allowing synchronous/asynchronous integration of online cloud documents with their traditional offline desktop-resident versions.
4) Salesforce.com: The core mission of Salesforce.com has been the delivery of capabilities centered on customer relationship management. In pursuit of this core, however, Salesforce.com has established itself as a thought leader in the area of Software as a Service and is delivering an extensive suite of capabilities via the Internet. A key capability provided is Force.com, which enables external developers to create add-on applications that integrate into the main Salesforce.com application and are hosted on Salesforce.com's infrastructure.
5) VMware: Provides several technologies of critical importance to enabling cloud computing, and has also started offering its own on-demand cloud computing capability, called vCloud. This type of capability allows enterprises to leverage virtualized clouds inside their own IT infrastructure or hosted with external service providers.

New Application Opportunities
1) Mobile interactive applications: Tim O'Reilly believes that "the future belongs to services that respond in real time to information provided either by their users or by nonhuman sensors" (Li et al., 2009). Such services will be attracted to the cloud not only because they must be highly available, but also because they generally rely on large data sets that are most conveniently hosted in large datacenters. While not all mobile devices enjoy connectivity to the cloud 100% of the time, the challenge of disconnected operation has been addressed successfully in specific application domains, so we do not see this as a significant obstacle to the appeal of mobile applications.
2) Parallel batch processing: Cloud computing presents a unique opportunity for batch-processing and analytics jobs that analyze terabytes of data and can take hours to finish. If there is enough data parallelism in the application, users can take advantage of the cloud's new "cost associativity": using hundreds of computers for a short time costs the same as using a few computers for a long time. For example, programming abstractions such as Google's MapReduce (Dean & Ghemawat, 2004) and its open-source counterpart Hadoop (Bialecki, Cafarella, Cutting & O'Malley, 2005) allow programmers to express such tasks while hiding the operational complexity of choreographing parallel execution across hundreds of cloud computing servers.
3) Analytics: A special case of compute-intensive batch processing is business analytics. While the large database industry was originally dominated by transaction processing, that demand is leveling off. A growing share of computing resources is now spent on understanding customers, supply chains, buying habits, ranking, and so on. Hence, while online transaction volumes will continue to grow slowly, decision support is growing rapidly, shifting the resource balance in database processing from transactions to business analytics.
4) Extension of compute-intensive desktop applications: The latest versions of the mathematics software packages Matlab and Mathematica are capable of using cloud computing to perform expensive evaluations.
Other desktop applications might similarly benefit from seamless extension into the cloud. Again, a reasonable test is comparing the cost of computing in the Cloud plus the cost of moving data in and out of the Cloud to the time savings from using the Cloud (a rough break-even sketch appears at the end of this section).
Cloud Architectures and Infrastructure
Cloud computing architecture comprises two components, hardware and application, and these two components have to work together seamlessly or cloud computing will not be possible. Cloud computing requires an intricate interaction with the hardware, which is essential to ensure uptime of the application. If the application fails, the hardware cannot push data or carry out the required processes; conversely, a hardware failure means a stoppage of operation.
Applications built on Cloud Architectures are such that the underlying computing infrastructure is used only when it is needed (for example, to process a user request); they draw the necessary resources on demand (such as compute servers or storage), perform a specific job, then relinquish the unneeded resources, often disposing of them after the job is done. While in operation, the application scales up or down elastically based on resource needs. Applications built on Cloud Architectures run "in the cloud", where the physical location of the infrastructure is determined by the provider. They take advantage of simple APIs of Internet-accessible services that scale on demand and are industrial-strength, where the complex reliability and scalability logic of the underlying services remains implemented and hidden inside the cloud. The usage of resources in Cloud Architectures is as needed, sometimes ephemeral or seasonal, thereby providing the highest utilization and optimum bang for the buck. Instead of building applications on fixed and rigid infrastructures, Cloud Architectures provide a new way to build applications on on-demand infrastructures.
Cloud Architectures address key difficulties surrounding large-scale data processing. First, in traditional data processing it is difficult to get as many machines as an application needs. Second, it is difficult to get the machines when one needs them. Third, it is difficult to distribute and coordinate a large-scale job on different machines, run processes on them, and provision another machine to recover if one machine fails. Fourth, it is difficult to auto-scale up and down based on dynamic workloads. Fifth, it is difficult to get rid of all those machines when the job is done. Cloud Architectures solve such difficulties.
A. The on-demand, self-service, pay-by-use model
The on-demand, self-service, pay-by-use nature of cloud computing is an extension of established trends. From an enterprise perspective, the on-demand nature of cloud computing helps to support the performance and capacity aspects of service level objectives (Sun Microsystems, 2009). The self-service nature of cloud computing allows organizations to create elastic environments that expand and contract based on the workload and target performance parameters, and the pay-by-use nature of cloud computing may take the form of equipment leases that guarantee a minimum level of service from a cloud provider. Virtualization is a key feature of this model.
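To make the pay-by-use test above concrete, here is a minimal Python sketch; the prices and figures are assumptions for illustration, not quotes from any provider. It simply weighs the cloud bill (compute plus data movement) against the value of the time saved, and the comment notes the cost associativity point from the previous subsection.

```python
# Illustrative break-even test; all rates below are assumptions.
def cloud_worth_it(machine_hours: float, price_per_hour: float,
                   gb_transferred: float, price_per_gb: float,
                   hours_saved: float, value_per_hour: float) -> bool:
    """Cloud compute cost plus data-movement cost versus the value
    of finishing sooner, as described in the text."""
    cloud_cost = machine_hours * price_per_hour + gb_transferred * price_per_gb
    return hours_saved * value_per_hour > cloud_cost

# Cost associativity: 1000 machines for 1 hour cost the same as
# 1 machine for 1000 hours, but finish three orders of magnitude sooner.
# Example: 1000 machine-hours at $0.10, 500 GB moved at $0.10/GB,
# finishing 16 working hours sooner for work valued at $50/hour.
print(cloud_worth_it(1000, 0.10, 500, 0.10, 16, 50.0))  # True: $800 > $150
```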
IT organizations have understood for years that virtualization allows them to quickly and easily create copies of existing environments, sometimes involving multiple virtual machines, to support test, development, and staging activities. The cost of these environments is minimal because they use few resources and can coexist on the same servers as production environments. Likewise, new applications can be developed and deployed in new virtual machines on existing servers, opened up for use on the Internet, and scaled if the application is successful in the marketplace (Sun Microsystems, 2009). The ability to use and pay for only the resources used shifts the risk of how much infrastructure to purchase from the organization developing the application to the cloud provider.
B. Cloud computing infrastructure models
There are many considerations for cloud computing architects when moving from a standard enterprise application deployment model to one based on cloud computing. There are public and private clouds that offer complementary benefits, there are three basic service models to consider, and there is the value of open APIs versus proprietary ones. IT organizations can choose to deploy applications on public, private, or hybrid clouds, each of which has its trade-offs. The terms public, private, and hybrid do not dictate location: while public clouds are typically "out there" on the Internet, private clouds are typically located on an organization's own premises.
1) Public clouds are run by third parties, and applications from different customers are likely to be mixed together on the cloud's servers, storage systems, and networks. Public clouds are most often hosted away from customer premises, and they provide a way to reduce customer risk and cost by providing a flexible, even temporary extension to enterprise infrastructure.
2) Private clouds are built for the exclusive use of one client, providing the utmost control over data, security, and quality of service. The company owns the infrastructure and has control over how applications are deployed on it. Private clouds may be deployed in an enterprise datacenter, and they may also be deployed at a collocation facility.
3) Hybrid clouds combine both public and private cloud models. They can help to provide on-demand, externally provisioned scale. The ability to augment a private cloud with the resources of a public cloud can be used to maintain service levels in the face of rapid workload fluctuations. This is most often seen with the use of storage clouds to support Web 2.0 applications. A hybrid cloud can also be used to handle planned workload spikes: sometimes called "surge computing", this approach uses a public cloud to perform periodic tasks that can be deployed there easily (a simple placement sketch follows below). Hybrid clouds introduce the complexity of determining how to distribute applications across both a public and a private cloud.
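As a concrete reading of the "surge computing" idea, here is a minimal Python sketch with a hypothetical capacity figure; it is not a real scheduler, only an illustration of keeping the baseline workload on the private cloud and overflowing the excess to a public cloud.

```python
# Hypothetical capacity; a real deployment would query monitoring data.
PRIVATE_CAPACITY = 100  # units of work the private cloud can absorb

def place_workload(total_units: int) -> dict:
    """Surge-computing placement: fill the private cloud first,
    then send only the overflow to the public cloud."""
    private = min(total_units, PRIVATE_CAPACITY)
    return {"private": private, "public": total_units - private}

print(place_workload(80))   # {'private': 80, 'public': 0}
print(place_workload(250))  # {'private': 100, 'public': 150}
```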
C. Architectural layers of cloud computing
Cloud computing can describe services being provided at any of the traditional layers from hardware to applications. In practice, cloud service providers tend to offer services that can be grouped into three categories: software as a service, platform as a service, and infrastructure as a service.
1) Software as a service (SaaS) features a complete application offered as a service on demand. A single instance of the software runs on the cloud and services multiple end users or client organizations. The most widely known example of SaaS is salesforce.com, though many other examples have come to market, including the Google Apps offering of basic business services such as email and word processing (Sun Microsystems, 2009).
2) Platform as a service (PaaS) encapsulates a layer of software and provides it as a service that can be used to build higher-level services. There are at least two perspectives on PaaS, depending on whether one is the producer or the consumer of the services: someone producing PaaS creates middleware, application software, and even a development environment that is then provided to a customer as a service, while someone using PaaS sees an encapsulated service presented to them through an API. The customer interacts with the platform through the API, and the platform does what is necessary to manage and scale itself to provide a given level of service (Sun Microsystems, 2009).
3) Infrastructure as a service (IaaS) delivers basic storage and compute capabilities as standardized services over the network. Servers, storage systems, switches, routers, and other systems are pooled and made available to handle workloads that range from application components to high-performance computing applications.
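One way to see the difference between the three service models is in terms of which layers of the stack the provider manages and which the customer manages. The Python sketch below is only a schematic summary of the paragraphs above; the layer names and the exact split are illustrative assumptions, not definitions from the paper.

```python
# Schematic only: responsibility split per service model.
STACK = ["application", "middleware/runtime", "operating system",
         "virtualization", "servers", "storage", "networking"]

PROVIDER_MANAGES = {
    "IaaS": set(STACK[3:]),   # provider stops at virtualized infrastructure
    "PaaS": set(STACK[1:]),   # provider also runs the platform layers
    "SaaS": set(STACK),       # provider runs the complete application
}

for model, provided in PROVIDER_MANAGES.items():
    customer = [layer for layer in STACK if layer not in provided]
    print(f"{model}: customer manages {', '.join(customer) or 'nothing'}")
```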
Conclusion and Future Work
Cloud computing promises significant benefits, but today there are security, privacy, and other barriers that prevent widespread enterprise adoption of an external cloud. In addition, the cost benefits for large enterprises have not yet been clearly demonstrated. The usage of resources in Cloud Architectures is as needed, sometimes ephemeral or seasonal, thereby providing the highest utilization and optimum bang for the buck. For the broader vision of Cloud interoperability to work, ranging from VM mobility to storage federation to multicast and media-streaming interoperability to identity and presence and everything in between, analogous core network extension (or replacement) technologies need to be invented. Finally, we need improvements in bandwidth and costs for both datacenter switches and WAN routers. While we are optimistic about the future of Cloud Computing, cloud platforms are not yet at the center of most people's attention. The attractions of cloud-based computing, including scalability and lower costs, are very real. If you work in application development, whether for a software vendor or an end user, expect the cloud to play an increasing role in your future. The next generation of application platforms is here: the cloud, a computing infrastructure on demand.
References
Armbrust, M., et al. (2009). Above the clouds: A Berkeley view of cloud computing (Technical Report No. UCB/EECS-2009-28). Berkeley, CA: Electrical Engineering and Computer Sciences, University of California at Berkeley. Retrieved from http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf
Bialecki, A., Cafarella, M., Cutting, D., & O'Malley, O. (2005). Hadoop: A framework for running applications on large clusters built of commodity hardware. Retrieved from http://lucene.apache.org/hadoop/
Dean, J., & Ghemawat, S. (2004). MapReduce: Simplified data processing on large clusters. In Proceedings of the 6th Symposium on Operating Systems Design & Implementation (Vol. 6, p. 10). Berkeley, CA, USA: USENIX Association. Retrieved from http://dl.acm.org/citation.cfm?id=1251254.1251264
Li, H., et al. (2009). Developing an enterprise cloud computing strategy. Intel Information Technology. Retrieved from http://www.intel.com/en_US/Assets/PDF/whitepaper/icb_cloud_computing_strategy.pdf
Sun Microsystems. (2009). Introduction to cloud computing architecture: White paper (1st ed.). Retrieved from http://eresearch.wiki.otago.ac.nz/images/7/75/Cloudcomputing.pdf
Vogels, W. (2008). A head in the clouds: The power of infrastructure as a service. In First Workshop on Cloud Computing and its Applications (CCA'08).

Einstein's Image Compression Algorithm: Version 1.00
Yasser Arafat*, Mohammed Mustaq, Mohammed Mothi
* Student, Department of Electronic Sciences, Sathyabama University, Chennai-119, India. email: [email protected]
Abstract
Purpose: The Einstein's compression technique is a new method of compression and decompression of images based on matrix addition and the possible sequences summing to a given total. The main purpose of implementing a new algorithm is to reduce the complexity of the algorithms used for image compression today. The major advantages of this technique are that the compression is highly secure and achieves a high compression ratio. The method does not build on earlier compression techniques; it is a raster compression. It can be used for astronomical and medical images because the compression is considered lossless.
Design/Methodology/Approach: The idea uses the previous literature as a base to explore the use of the image compression technique.
Findings: This type of compression can be used to reduce the size of a database holding infrequently used but important data. The technique will in future be extended to the compression of colour images and will also be researched for file compression.
Social Implications: This idea of image compression is expected to create a new technique of image compression and to prompt more researchers to work on this type of compression.
Originality/Value: The idea intends to create a new technique of compression in image compression research.
Keywords: Image Compression; Einstein's Image Compression; New Compression Technique; Matrix Addition Based Compression.
Paper Type: Technical
Procedure
The image is taken as input, preferably black and white (grayscale). The value of each pixel ranges from 0 to 255, where 0 is completely black and 255 is fully white. The image, preferably a .jpg or .bmp, is read into the system and converted into a table of rows and columns of pixel values. The input image will be in the form of Fig. 1 and the converted values will look like Fig. 2 (both courtesy of www.mathworks.com). The image may have any number of rows and columns, but all the rows must have the same number of columns and vice versa.
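As a minimal sketch of this input step, assuming the Pillow imaging library (the paper does not prescribe a toolkit, and the file name here is hypothetical), the image can be read into such a table of pixel values as follows.

```python
# Sketch of the input step; assumes Pillow (pip install Pillow)
# and a hypothetical input file name.
from PIL import Image

img = Image.open("input.bmp").convert("L")  # "L" = 8-bit grayscale, 0-255
rho, chi = img.height, img.width            # rows and columns, as in the paper
matrix = [[img.getpixel((x, y)) for x in range(chi)] for y in range(rho)]
# Every row now has the same number of columns, as the method requires.
```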
In our example we take a matrix of 255 × 255.
Compression
Calculation of Rows and Columns
A counter is assigned to calculate the number of rows, stored in a variable ρ. Another counter is assigned to calculate the number of columns, stored in a variable χ. Here ρ is the number of cells in each column and χ is the number of cells in each row. In our example, ρ = χ.
The Database
A database of all the possible sums is created; this is a one-time creation. As our example image is 255 × 255, when processed it becomes a [1 × (255 × 255)] matrix, and since the image is black and white the maximum pixel value is 255, so the maximum possible sum of the matrix is 255 × 255 × 255 = 16,581,375. The database is therefore created for sums ranging from 0 to 16,581,375. For every possible sum there is a number of possible row matrices: by the stars-and-bars counting argument, for a sum σ spread over µ columns of the row matrix we get ν combinations, where
ν = C(σ + µ − 1, µ − 1)
For example, if there are 4 columns and the sum of the matrix is 10, we get C(13, 3) = 286 combinations. An extra column is added to the table for the generated sequence number Λ. The table is stored in ascending order, reading each row matrix as a string of digits (Table 1).
Table 1: Look-up table for 4 columns and sum σ = 2
Matrix values    Λ
0 0 0 2          1
0 0 1 1          2
0 0 2 0          3
0 1 0 1          4
0 1 1 0          5
0 2 0 0          6
1 0 0 1          7
1 0 1 0          8
1 1 0 0          9
2 0 0 0          10
Table 1 forms a look-up table: for example, if the sum of the matrix is 2 and the matrix is [0 2 0 0], then the sequence number of the matrix is 6, i.e. Λ = 6. Similarly, a database of all the possible values is generated. The database is so important that it is also required for decompression.
The Second Step
The second step involves the conversion of the [ρ × χ] image into [1 × (χρ)]. The actual image will be in the form of Fig. 3. In the first stage of conversion, the image is cut into its row matrices, giving ρ matrices of size [1 × χ] (Fig. 4). The row matrices so formed are then lined up one after the other to form a [1 × (χρ)] row matrix (Fig. 5).
Adding for σ and Generation of Sequence Λ
Once the row matrix has been generated in the previous step, the values of its cells are added and the total is stored in σ; this forms a new cell in the compressed image. The next cell in the compressed image holds the sequence number Λ. This number is generated by a search algorithm that refers to the table created as the database and to the original image.
Extra Cells
Some extra cells are added: τ, which records the type of image compressed (the extension of the uncompressed image, stored as ASCII values); two cells containing the counter values ρ and χ; and an extra cell α for the number of colours or layers present, where 1 denotes a black and white image and 3 denotes RGB. The output image for a black and white image will be in the form of Fig. 6, where σ is the sum of the matrix cells, Λ the sequence number generated for that sum, τ the type of image, ρ the number of rows in the original matrix, χ the number of columns in the original matrix, and α the number of colours in the image.
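The counting and look-up just described can be made concrete with a small Python sketch. This is an illustration, not the authors' implementation: it counts the row matrices for a given sum with the stars-and-bars formula, computes the sequence number Λ directly instead of storing the full database, and inverts the mapping, as needed for the decompression step described below.

```python
from math import comb

def count_compositions(sigma: int, mu: int) -> int:
    # nu = C(sigma + mu - 1, mu - 1): ordered mu-tuples of
    # non-negative integers summing to sigma (stars and bars).
    return comb(sigma + mu - 1, mu - 1)

def sequence_number(cells: list[int]) -> int:
    # 1-based rank (Lambda) of `cells` among tuples of the same
    # length and sum, in the digit-ascending order of Table 1.
    sigma, rank = sum(cells), 1
    for i, v in enumerate(cells[:-1]):      # the last digit is forced
        remaining = len(cells) - i - 1
        for d in range(v):                  # tuples with a smaller digit here
            rank += count_compositions(sigma - d, remaining)
        sigma -= v
    return rank

def cells_from_sequence(sigma: int, mu: int, rank: int) -> list[int]:
    # Inverse look-up used in decompression; assumes a valid triple.
    cells, rank = [], rank - 1
    for i in range(mu - 1):
        remaining, d = mu - i - 1, 0
        while rank >= count_compositions(sigma - d, remaining):
            rank -= count_compositions(sigma - d, remaining)
            d += 1
        cells.append(d)
        sigma -= d
    cells.append(sigma)                     # last digit carries the rest
    return cells

print(count_compositions(2, 4))         # 10, the rows of Table 1
print(sequence_number([0, 2, 0, 0]))    # 6, the paper's example
print(cells_from_sequence(2, 4, 6))     # [0, 2, 0, 0]
```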
Decompression
An image compressed by the above technique can be decompressed as follows. A compressed black and white image is received (Fig. 7). From the value stored in the last cell, the program determines whether the image is colour or black and white: if the value is 1, the image is black and white and a single table is allotted for pixel storage; otherwise three tables are allotted, one each for red, green and blue.
For Black and White Images
The table or matrix allotted has the form of [1 × (χρ)] cells (Fig. 8). The program then searches the database for the sum σ, goes to the entry Λ, and fills the table with the values stored there. The [1 × (χρ)] matrix is then cut at every χ cells to make new rows (Fig. 9), and the rows are joined to form the complete image (Fig. 10). Finally, the value of τ is converted into the file extension and the image is stored with that extension following the dot, e.g. .jpeg.
Other Notes
This type of compression is calculated to achieve high compression and forms a lossless image. The software is expected to be heavy, as the database is large. Thumbnails of the compressed image are not possible because the image is stored as a table. The compression can also be made secure if the value of Λ is sent separately. Multiple compressions of the same image are not possible, as a single compression already compresses to the maximum. The technique does not use any previous method of compression.
Conclusion
The simplicity of matrix addition is the major advantage of the Einstein's image compression algorithm. The compressed images can be stored in a database using less space. The technique is based purely on a new idea and does not contain any previous type of compression. The next version of the technique will investigate the compression of colour images.
Acknowledgement
First of all I would like to thank Dr. Jeppiaar, Chancellor of Sathyabama University; Dr. C.D. Suriyakala, Head, Department of ETCE, Sathyabama University; and Ms. Ulagamudhalvi for providing me the opportunity to write this paper. This algorithm could not have been possible without Dr. N.M. Nanditha, Professor, Sathyabama University, who not only served as my supervisor but also encouraged and challenged me throughout the completion of the paper. She and the other faculty members, Mr. Selvakumar and Mr. Vedhanarayanan, guided me through the completion process, never accepting less than my best efforts. I thank them all.
Additional Readings
Carpentieri, B., Weinberger, M. J., & Seroussi, G. (2000). Lossless compression of continuous-tone images. Proceedings of the IEEE, 88(11), 1797-1807.
Xu, D., et al. (2005). Parallel image matrix compression for face recognition. In MMM '05: Proceedings of the 11th International Multimedia Modelling Conference. Washington, DC, USA: IEEE Computer Society.
Open Source Software (OSS): Realistic Implementation of OSS in School Education
Gunjan Kotwani*, Pawan Kalyani**
* Department of Computer Science and Information Technology, Management and Commerce Institute of Global Synergy, Ajmer, Rajasthan, India. email: [email protected]
** Department of Computer Science and Information Technology, Management and Commerce Institute of Global Synergy, Ajmer, Rajasthan, India. email: [email protected]
Abstract
Purpose: Freedom to think, for the generation of new ideas, and to act, to conceptualize them, are concepts which are revolutionizing today's world. The software world has not been left untouched. Open Source Software (OSS) has brought the idea of sharing ideas for the betterment of Computer Science to the forefront. With the passage of time, open source software has not only gained prominence in the server software segment but is also penetrating the desktop segment. Open source software is attracting attention all over the world; in particular, governments of developing nations are working on the promotion and spread of OSS. The advantages of localization, the freedom to modify the software, and easy availability are factors attracting people towards OSS. The impact of OSS is felt in many arenas. Education is one of them; in India itself, Kerala and Goa have pioneered the use of OSS in school education.
Design/Methodology/Approach: In this research paper, the authors focus on OSS in education and its realistic implementation in school education. The authors conducted an empirical study on school students to study the effect of OSS on their learning curve.
Findings: The authors propose a curriculum for schools that is based on OSS.
Research Implications: The apt usage of information and communication technologies (ICTs) has the potential to improve the quality of education. However, educational institutions face many constraints: financial, equipped staff, resources, etc. The high cost of software, along with the hardware, poses a major challenge. OSS, with its unique features, proves to be of great help by lowering the cost of the software. OSS not only provides financial benefits; it has many other advantages which prove to be a boon for the education sector.
Value: This research paper will aid policy-makers and decision-makers in understanding the potential use of OSS in education: how and where it can be used, why it should be used, and what issues are involved in its implementation. In particular, officials in ministries of education, school and university administrators and academic staff should find this research useful.
Keywords: Open Source Software; Education; School Education; Information and Communication Technology (ICT); Realistic Implementation of OSS.
Paper Type: Empirical
Introduction
OSS is software that gives users the freedom to use, study, and modify the software based on local needs and preferences. This freedom is vital for the growth and development of the Computer Sciences.
Certain distinctive advantages of OSS are:
- Lower costs
- Reliability, performance and security
- Building of long-term capacity
- An open philosophy
- Encouragement of innovation
- An alternative to illegal copying
- The possibility of localization
- Learning from source code
Previous studies show that an OSS-based educational infrastructure, in comparison to proprietary software used to facilitate the process of teaching and learning, has proved more beneficial in stimulating cross-boundary learning and in adapting the technologies to the desires of the users (Pearson & Koppi, 2002). Many more studies propagate the use of OSS in education. The next step, then, is to design an age-appropriate syllabus based on local needs and environment that could be implemented in schools. This also requires the development of course material as an aid to teachers. Through this research paper we propose an OSS-based curriculum built on the recommendations of the National Curriculum Framework (NCF) 2005 proposed by the National Council of Educational Research and Training (NCERT), India. We have also developed study material which can be instrumental in the realistic implementation of OSS in schools of India. The paper investigates the need for OSS in education and its merits for students, educational institutions and nations, especially developing ones. It further presents an empirical study of the effects of OSS inside the classroom environment. The paper also gives an overview of the proposed comprehensive integrated curriculum plan based on the recommendations of the NCF 2005. Appendix A gives an introduction to the proposed software included in the curriculum, with a sample of the course material developed. Appendix B shows samples of the work done by students using OSS.
Need of OSS in Education
As Computer Science educators, we constantly seek new channels, methods, and technologies to reach and intrigue our students. We hope first to capture their interest, then to develop their understanding, work towards retention of concepts, and finally encourage their own independent creative work. Throughout this process, we try to teach them skills that they can apply in the real world. The breadth of our field and the variety of pedagogical approaches make this process very difficult. We believe that OSS can serve as a channel, method, and technology to teach and learn Computer Science. OSS has the potential to expand group work beyond the classroom to include much larger projects and more distributed teams. OSS can also be used to introduce our students to the larger Computer Science community and to the practice of peer review. Finally, OSS can provide us with free or lower-cost technology in the classroom, permitting us to use technology that we might otherwise be unable to afford.
Merits for the Students
Students who use open source in school substantially shorten their learning curve when they go to work for software companies. Students who are encouraged to build projects on top of OSS bases can build more interesting and exciting systems than they might have developed from scratch. The foothold of OSS is increasing in the industrial sector. Today's learner will be tomorrow's professional; learners not equipped with the desired skills will find it difficult to adapt to tomorrow's job market. Teaching OSS from the elementary years of education prepares the child for future market and job requirements.
Students who take up Computer Science as a subject in higher secondary school, and who take up professional computing courses in under-graduate and post-graduate programs, remain largely removed from the actual coding taking place in the software industry. Use of OSS will help them work with and see actual software code, learn how to modify it, and become part of the larger online community working on OSS.
Merits for the Educational Institutions
Free and open source software can save a school money, in a context where schools, even affluent ones, are short of money. Teaching students a way of life is the aim of education; schools should promote "open source software just as they promote recycling", which will benefit society as a whole. OSS does not demand high-end hardware configurations, which results in "lowered carbon footprints". OSS opens the code to students, permitting them to learn how software works, thus helping to build good future coders; proprietary software rejects their thirst for knowledge by keeping knowledge secret and "learning forbidden". Schools teach students to be good citizens, to cooperate and share with others who need their help: this is the philosophy of open source. Training in the use of free software, and encouragement to participate in the free software community, generates a sense of the importance of sharing and collaborative development amongst the students.
Merits for the Nation
- Sovereignty and security
- Promotion of the growth of a local software industry
- Economic development tapping local talent and human resources
- Encouragement of the use of local software at national level
- Reduced costs and dependency on imported technology and skills
- Affordable software for individuals, enterprises and government
- Access to government data without the barrier of proprietary software and formats
- The ability to customise software to local languages and cultures
- Lowered barriers to entry for software businesses
Research Undertaken: Effects of OSS inside the Classroom (Subject: Mathematics)
We, along with a mathematics teacher, prepared a research plan for students of Class III, Sections A and B. The strength of each section was 36 students.
Methodology: Research Plan 2010
Action 1: Collect data on the understanding the students of Class III already have of the topics Multiplication and Money, and identify the student groups who are struggling with the concepts.
 Timescale: 2nd week of September.
 Resources / sources of support and challenge: Worksheets and photocopies of students' class-work.
 Success criteria: The worksheets will be completed individually.
 Comments / amendments to plan: The worksheet assessment and the oral assessment gave different outputs for certain students who were good in oral work but poor in comprehension.
Action 2: Explain the concepts of Multiplication and Money using multimedia modules.
 Timescale: Mid September.
 Resources: The computer and the modules available on the topics.
 Success criteria: All the children will have access to a computer and the module.
 Comments: Learners were keen to watch the multimedia modules. The idea of taking a mathematics class in the computer lab was enough to excite the students.
Action 3: Provide students with opportunities to use their concept knowledge to play computer games and to improve their skills by trying to improve their scores. Software used: Tux Math and GCompris.
 Timescale: 3rd week of September.
 Resources: Free and Open Source Software; the computer teacher will also act as a resource person. The challenge will be to adjust the timetable so that the computer lab is available to this group of students.
 Success criteria: All the students will be able to play the games at increasing difficulty levels.
 Comments: The game play of Tux Math provided ample opportunities for oral and mental mathematics calculations. The results were saved and the game play could be continued in the next lesson, which gave the learners something to look forward to in the upcoming mathematics class.
Action 4: Assessment to gauge the students' level of learning. Software used: Tux Paint (with a grid).
 Timescale: Last week of September.
 Resources: Assessment sheets, classroom observation, interviews with students.
 Success criteria: To see that students have achieved the expected learning outcome.
 Comments: Using the capabilities of the free and open source software Tux Paint, a grid was designed and included as a stamp in the software. The teacher gave questions that had to be solved using the grid, with answers noted in the grid. This was later used by the teacher for assessments.
Action 5: Feedback.
 Timescale: 1st week of October.
 Resources: Feedback form.
 Success criteria: To get the learner's point of view.
 Comments: The learners gave positive responses about the whole exercise.
The above methodology was adopted in Section 'A' of Class III. In Section 'B', with the same teacher, the approach was kept conventional. To gauge the performance of the students, periodic assessments were conducted; in this study, we conducted four (4) assessments. The results of the assessments of both sections were compiled and tabulated, and a comparative study was then conducted.
Results
The study clearly showed that the number of students who grasped the concepts in less time and with better quality was higher in Section 'A', where certain open source software had been adopted in conformance with the syllabus of the class (Fig. 1: Comparative analysis of students of Class III A and III B).
Discussion
After the completion of the study, feedback was taken from the students as well as from the concerned subject (Mathematics) teacher (Fig. 2: Sample of student feedback forms; see Appendix A).
Review of the Mathematics Teacher
Before starting my lesson on multiplication using computer-aided technology, I assessed the previous knowledge base and the level of understanding of my Class III students through a worksheet. I found that the majority of students understood that multiplication was grouping of objects but were not clear about multiplication as repeated addition. I also talked to my colleagues teaching Class III, and all of them unanimously agreed that the students of Class III (A) were very restless, with a short attention span, and that they were finding it difficult to keep them engaged for longer periods. At this point I would like to mention that I follow the activity-based method of teaching, and I teach every topic through some activity to make it interesting to students. Yet we were all facing the challenge of keeping Class III-A engaged. I also observed the computer lesson of this class and was surprised to see the level of engagement of the same students. This made me decide that using the computer as a tool for teaching mathematics would not only help in improving student performance but would also increase student engagement.
I had discussions with our computer teacher, Ms. Gunjan Kotwani, who has been working with OSS (Open Source Software) for the past few years and is also working on an integrated learning approach for students of Classes I to V. She went through the Mathematics syllabus of Class III and gave me valuable inputs on which topics could be taught using certain software. We both took Mathematics lessons in the computer laboratory and shared tips on how to help students when they were facing difficulty in carrying out their Mathematics assignments on computers. I started my lesson on multiplication using multimedia modules and explained the concept of repeated addition using this software. We then took the class to the computer laboratory, where the students would have access to individual computers and could apply whatever they had grasped from their previous lesson in the given assignment. We noticed that the level of student engagement was very high; in fact they did not want to return to their classroom at the end of the lesson. After three lessons in the computer lab we assessed the student learning and were surprised at the result, as we found that there was no significant improvement in their learning. After discussions with other mathematics teachers, we realized that what the students also needed in Mathematics was daily practice and drilling, including pen and paper exercises. We made some basic changes in our plan and interspersed Mathematics lessons with assignments on the computer as well as exercises in notebooks, worksheets and home assignments. As we progressed we noticed that the students were responding better.
Curriculum Planning
This study aims to provide a realistic implementation of OSS in schools. The major problem faced by schools willing to adopt OSS in the Computer Science curriculum is the lack of study material; the teachers are not equipped to handle OSS in their classrooms. A series of training sessions with adequate support in the form of study material and services can play a defining role in the implementation of OSS. This curriculum has been designed keeping in mind the recommendations of the National Curriculum Framework (NCF 2005); samples of student work are given in Appendix B. The suggested software by class and age group is:
Class I (ages 6-7): TuxType, TuxPaint, GCompris
Class II (ages 7-8): TuxPaint, Tux Math, Introduction to OpenOffice.org Word Processor
Class III (ages 8-9): OpenOffice.org Word Processor, Introduction to Logo programming using KTurtle
Class IV (ages 9-10): Introduction to OpenOffice.org Presentation, Basics of Logo programming using KTurtle
Class V (ages 10-11): Advanced OpenOffice.org Word Processor and Presentation, Logo programming basics using KTurtle
Class VI (ages 11-12): Internet browser (Firefox), raster graphics editor (GIMP), OpenOffice.org Calc (spreadsheet package)
Class VII (ages 12-13): Vector graphics editor (Inkscape), Introduction to databases using MySQL, HTML programming using Bluefish
Class VIII (ages 13-14): Database concepts using MySQL, Introduction to programming using Java NetBeans, database connectivity between Java and MySQL, page-layout program Scribus
Class IX (ages 14-15): Advanced Java programming using NetBeans
Class X (ages 15-16): Introduction to programming in C++ using the GCC compiler
Class XI (ages 16-17): Based on recommendations of CBSE
Class XII (ages 17-18): Based on recommendations of CBSE
Conclusion
The study lays stress on the need for the use of OSS in a developing nation like India.
The use of OSS will promote free thinking, innovation and the development of new software models, and the field of Computer Science can thereby reach great heights. Students need to be exposed to these software packages at an early stage of their mental development. Use of OSS teaches the usage of tools rather than attachment to particular software: for example, a document can be created in any word processor, and the student should be comfortable adapting to the various word processors available. Ultimately the tools of a word processor will be similar; only their placement and arrangement might differ. Since Computer Science is a rapidly evolving field in which new software and technologies keep emerging, this kind of flexibility with software is essential, and acceptance of and adaptability to changing software is necessary for students. This research aims to provide a practical, feasible and working model of OSS in education. For this, the development of study tools like course material, resource CDs, etc. is essential to support the teaching community and to help remove the hesitation to adopt OSS in education. Still, there are many challenges in the implementation of OSS in school curricula, the major one being reluctance to change: the teaching fraternity first needs to be convinced of the benefits OSS can give to their students. The unavailability of teaching resource material for OSS is another hitch, and lastly, teacher training and OSS maintenance are challenges which need to be overcome for effective implementation of OSS in school education.
References
National Curriculum Framework (NCF). (2005). Retrieved from http://www.ncert.nic.in/html/pdf/schoolcurriculum/framework05/prelims.pdf
Pearson, E. J., & Koppi, A. J. (2002). A WebCT course on making accessible online courses. WebCT Asia Pacific Conference, Melbourne, Australia, March 2002.

Measurement of Processes in Open Source Software Development
Parminder Kaur*, Hardeep Singh**
* Department of Computer Science and Engineering, Guru Nanak Dev University, Amritsar-143005, India. email: [email protected]
** Department of Computer Science and Engineering, Guru Nanak Dev University, Amritsar-143005, India. email: [email protected]
Abstract
Purpose: This paper attempts to present a set of basic metrics which can be used to measure basic development processes in an OSS environment.
Design/Methodology/Approach: A review of the earlier literature helped in exploring the metrics for measuring development processes in an OSS environment.
Results: OSSD is different from traditional software development because of its open development environment. The development processes are different, and the measures required to assess them have to be different.
Keywords: Open Source Software (OSS); Free Software; Version Control; Open Source Software Metrics; Open Source Software Development
Paper Type: Conceptual
Introduction
Free software (FS), a term given by Richard Stallman and introduced in 1984, can be obtained at zero cost; that is, it is software which gives the user certain freedoms. Open Source Software (OSS), a term coined by Eric Raymond in 1998, is software for which the source code is freely and publicly available, though the specific licensing agreements vary as to what one is allowed to do with that code. In the case of FS, only the executable file is made available to the end user, through the public domain; the end user is free to use that executable software in any way, but not to modify it.
The alternative term Free/Libre and Open Source Software (FLOSS) refers to software whose licenses give users four essential "freedoms":
- To run the program for any purpose;
- To study the workings of the program, and modify the program to suit specific needs;
- To redistribute copies of the program at no charge or for a fee; and
- To improve the program, and release the improved, modified version (Perens, 1999; 2004).
The free software movement is working toward the goal of making all software free of the intellectual property restrictions which hamper technical improvement. OSS users do not pay royalties, as no copyright exists, in contrast to proprietary software applications, which are strictly protected through patents and Intellectual Property Rights (IPRs) (Asiri, 2003; Wheeler, 2003). OSS is software for which the source code is publicly available, though the specific licensing agreements vary as to what one is allowed to do with that code (Stallman, 2007).
Open Source Software Development
Open Source Software Development (OSSD) produces reliable, high-quality software in less time and at less cost than traditional methods. Adelstein (2003) is even more evangelical, claiming that OSSD is the "most efficient" way to build applications. Schweik and Semenov (2003) add that OSSD can potentially "change, perhaps dramatically, the way humans work together to solve complex problems in computer programming". Even allowing for a degree of exaggeration, OSS can be used as an alternative to traditional software development. Raymond (1998) compares OSSD to a "bazaar": a loosely centralized, cooperative community where collaboration and sharing enjoy religion status. Conversely, traditional software engineering is referred to as a "cathedral", where hierarchical structures exist and little collaboration takes place.
Problems with Traditional Development
Traditional software development projects suffer from various issues, such as time and cost overruns, largely unmaintainable systems, and questionable quality and reliability. The 1999 Standish Group report revealed that 75% of software projects fail on one or more of these measures, with a third of projects cancelled due to failure. In addition, systems often fail to satisfy the needs of the customer for whom they are developed (Sommerville, 1995). These failures are ascribed to:
- Inadequate understanding of the size and complexity of IS development projects, coupled with inflexible, unrealistic timeframes and poor cost estimates (Hughes & Cotterell, 1999; McConnell, 1996);
- Lack of user involvement (Addison & Vallabh, 2002; Frenzel, 1996; Hughes & Cotterell, 1999; McConnell, 1996);
- Shortfalls in skilled personnel (Addison & Vallabh, 2002; Boehm, 1991; Frenzel, 1996; Hughes & Cotterell, 1999; Satzinger, Jackson & Burd, 2004).
Project costs are further increased by the price of license fees for the software and tools required for application development, as well as add-on costs for exchange controls.
Benefits of Open Source Software
The benefits of OSS (Feller & Fitzgerald, 2001; FLOSS Project Report, 2002) are as follows:
- Collaborative, parallel development involving source code sharing and reuse;
- A collaborative approach to problem solving through constant feedback and peer review;
- A large pool of globally dispersed, highly talented, motivated professionals;
- Extremely rapid release times;
- Increased user involvement, as users are viewed as co-developers;
- Quality software;
- Access to existing code.
Despite these benefits, perceived disadvantages of OSS are:
- In the rapid development environment, the result could be slower, given the absence of formal management structures (Bezroukov, 1999; Levesque, 2004; Valloppillil, 1998).
- Strong user involvement and participation throughout a project can become problematic, as users tend to create bureaucracies which hamper development (Bezroukov, 1999).
- OSS is premised on rapid releases and typically has many more iterations than commercial software. This creates a management problem, as each new release needs to be implemented in order for an organization to receive the full benefit (Farber, 2004).
- The user interfaces of open source products are not very intuitive (Levesque, 2004; Valloppillil, 1998; Wheatley, 2004).
- As there is no single source of information and no help desk, no "definitive" answers to problems can be found (Bezroukov, 1999; Levesque, 2004).
- System deployment and training are often more expensive with OSS, as it is less intuitive and does not have the usability advantages of proprietary software.
Open Source Software Development Models
There are several basic differences between OSSD and traditional methods. The System Development Life Cycle (SDLC) of traditional methods has generic phases into which all project activities can be organized, such as planning, analysis, design, implementation and support (Satzinger, Jackson & Burd, 2004). The open source life cycle of the OSSD paradigm, by contrast, demonstrates several common attributes: parallel development and peer review, prompt feedback on user and developer contributions, highly talented developers, parallel debugging, user involvement, and rapid release times.
Vixie (1999) holds that an open source project can include all the elements of a traditional SDLC. Classic OSS projects such as BSD, BIND and SendMail are evidence that open source projects utilize the standard software engineering processes of analysis, design, implementation and support. Mockus, Fielding & Herbsleb (2000) describe a life cycle that combines a decision-making framework with task-related project phases; their model comprises six phases: roles and responsibilities, identifying work to be done, assigning and performing development work, pre-release testing, inspections, and managing releases. Jorgensen (2001) provides a more detailed description of specific product-related activities that support the OSSD process. The model (Fig. 1: Jorgensen life-cycle, 2001) explains the life cycle for changes that occurred within the FreeBSD project. Jorgensen's model is widely accepted (Feller & Fitzgerald, 2001; FLOSS Project Report, 2002) as a framework for the OSSD process, on both the macro (project) and micro (component or code segment) levels. However, flaws remain.
When applied to an OSS project, the model does not adequately explain where or how the processes of planning, analysis and design take place. Schweik and Semenov (2003) propose an OSSD project life cycle comprising three phases: project initiation; going "open"; and project growth, stability or decline. Each phase is characterized by a distinct set of activities. Wynn (2004) proposes a similar open source life cycle but introduces a maturity phase, in which a project reaches the critical mass, in terms of the numbers of users and developers it can support, set by administrative constraints and the size of the project itself. Roets, et al. (2007) expand the Jorgensen life-cycle model and incorporate aspects of previous models, particularly that of Schweik and Semenov (2003). In addition, this model attempts to encapsulate the phases of the traditional SDLC (Fig. 2: Roets, et al. life-cycle model of OSSD projects, 2007). The model facilitates OSS development in terms of improved programming skills, availability of expertise and model code, as well as software cost reduction.
Comparison of the Traditional Life-Cycle with the OSSD Life-Cycle
Fig. 3 compares the phases of the traditional software development life-cycle with the OSSD life-cycle of Fig. 2. The initiation phase of the OSSD life-cycle combines three phases of the traditional life-cycle, namely planning, analysis and design, since it may be more important to get the design right prior to actual programming, so that all developers are working towards a clearly defined common purpose. The implementation phase combines aspects like review, contribution, pre-commit testing and production release. As multiple users as well as skilled personnel are involved in OSSD, parallel debugging and the different versions of one piece of code can be grouped together with the support phase of the traditional software development life-cycle.
Proposed Metrics for the Selected Model
Keeping in view the OSSD life-cycle model proposed by Roets, et al. (2007), the following set of metrics is proposed to keep a check on the generation of multiple processes in OSSD.
Total Number of Contributors
Under the considered model, a large number of users in an open environment contribute towards the development of the project. This metric assesses the total number of contributors for a project. Some contributors may be unique to the project while others may be associated with multiple projects; the metric counts the contributors to a given project irrespective of their affiliations to other projects.
Average Domain Experience of Contributors
A particular project is developed in a specific domain, and usually the contributors having some expertise and experience in that domain contribute to the project. This metric evaluates the average experience of all the contributors taken together. It can be represented as the cumulative experience of contributors,
E = ∑ ei, i = 1 … N
(where ei is the experience of an individual contributor in that domain), and the average experience of contributors,
Eavg = E / N
(where N is the total number of contributors). This metric measures the extent of support to the development of a project by its contributors. It is safe to assume that the greater the average experience of the contributors, the more robust the development of the project will be.
Average Time for Completion of a Version of the Project
Quick versioning is the essence of OSS development. However, different versions are completed at different rates, depending upon factors like the number of contributors, their experience, and the nature and complexity of the project. The average time for completion of a version of the project can be calculated as
Tavg = Ttotal / Nversion
(where Ttotal is the total time taken to develop all the versions and Nversion is the total number of versions generated). A greater Tavg would indicate slower development processes, resulting from factors like a low number of contributors, their lack of experience, or the complexity of the project.
Bugs Tracked per Version
The quality of OSS is always in question. However, with a proper bug tracking mechanism and tools in place, bug tracking can be made very effective and the quality of OSS can be enhanced. The number of bugs tracked per version is an indication of the quality and reliability of the product. Hence, this measure can be put to effective use for enhancing the quality of the final product.
Patch Accept Ratio
Every contributor sends patches for the enhancement of the product. However, not every patch sent by a contributor is accepted for updating the product. The patch accept ratio can be defined as
Pratio = Total no. of patches accepted / Total no. of patches submitted
A high patch accept ratio effectively argues for a high competence of contributors, and the reverse is true for low patch ratios.
Number of Effective Reviews Received
In addition to developing patches, some contributors send reviews of the products in the making. A large number of effective reviews indicates that some functionality was not taken care of by either the developing contributor or the mentors: the greater the number of effective reviews, the more gaps there were in the development process. Tracking the number of effective reviews can therefore feed back into a more effective development methodology.
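A minimal Python sketch of these measures follows; the function names and example figures are illustrative assumptions, not data or code from the paper.

```python
# Illustrative computation of the proposed metrics; example values are made up.
def average_experience(experience_years: list[float]) -> float:
    """Eavg = E / N, where E is the cumulative domain experience."""
    return sum(experience_years) / len(experience_years)

def average_version_time(total_days: float, n_versions: int) -> float:
    """Tavg = Ttotal / Nversion."""
    return total_days / n_versions

def patch_accept_ratio(accepted: int, submitted: int) -> float:
    """Pratio = patches accepted / patches submitted."""
    return accepted / submitted

print(average_experience([2.0, 5.5, 8.0, 1.5]))  # Eavg = 4.25 years
print(average_version_time(540.0, 9))            # Tavg = 60.0 days per version
print(patch_accept_ratio(42, 60))                # Pratio = 0.7
```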
Conclusion
OSSD is different from traditional software development because of its open development environment. The development processes are different, and the measures required to assess them have to be different as well. This paper has attempted to present a set of basic metrics which can be used to measure basic development processes in an OSS environment. However, these need to be validated and established by using them on a large number of systems.
References
Addison, T., & Vallabh, S. (2002). Controlling software project risks: An empirical study of methods used by experienced project managers. In Proceedings of the 2002 annual research conference of the South African Institute of Computer Scientists and Information Technologists on enablement through technology (SAICSIT '02) (pp. 128-140). Republic of South Africa: South African Institute for Computer Scientists and Information Technologists. Retrieved from http://dl.acm.org/citation.cfm?id=581506.581525
Adelstein, T. (2003). How to misunderstand open source software development. Retrieved from http://www.consultingtimes.com/ossedev.html
Asiri, S. (2003). Open source software. SIGCAS Computers and Society, 33(1), 2.
doi: 10.1145/966498.966501
Bezroukov, N. (1999). Open source software development as a special type of academic research: Critique of vulgar Raymondism. First Monday, 4(10). Retrieved from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/696/606
Boehm, B. (1991). Software risk management: Principles and practices. IEEE Software, 8(1), 32-41.
Farber, D. (2004). Six barriers to open source adoption. Retrieved from http://www.zdnetasia.com/six-barriers-to-open-source-adoption-39173298.htm
Feller, J., & Fitzgerald, B. (2001). Understanding open source software development. London: Addison-Wesley.
FLOSS Project Report. (2002). FLOSS Project Report: Free/Libre and open source software (FLOSS): Survey and study. Retrieved from http://www.infonomic.nl/floss/report/
Frenzel, C. (1996). Management of Information Technology (2nd ed.). Cambridge, MA: CTI.
Hughes, B., & Cotterell, M. (1999). Software project management (2nd ed.). Berkshire, United Kingdom: McGraw-Hill.
Jorgensen, N. (2001). Putting it all in the trunk: Incremental software development in the FreeBSD open source project. Information Systems Journal, 11(4), 321-336. doi:10.1046/j.1365-2575.2001.00113.x
Levesque, M. (2004). Fundamental issues with open source software development. First Monday, 9(4). Retrieved from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1137/1057
McConnell, S. (1996). Avoiding classic mistakes. IEEE Software, 13(5), 111-112. doi: 10.1109/52.536469
Mockus, A., Fielding, R. T., & Herbsleb, J. D. (2000). Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11(3), 309-346. doi:10.1145/567793.567795
Perens, B. (1999). The open source definition. In M. Stone, S. Ockman & C. Dibona (Eds.), Open sources: Voices from the open source revolution. Sebastopol, California: O'Reilly & Associates.
Perens, B. (2004). The open source definition. Retrieved from http://opensource.org/docs/def_print.php
Raymond, E. (1998). The cathedral and the bazaar. First Monday, 3(3). Retrieved from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1488/1403
Roets, Minnaar, et al. (2007). Towards successful systems development projects in developing countries. In Proceedings of the 9th International Conference on Social Implications of Computers in Developing Countries, São Paulo, Brazil, May 2007.
Satzinger, J. W., Jackson, R. B., & Burd, S. D. (2004). Systems analysis and design in a changing world (3rd ed.). Boston: Course Technology.
Schweik, C. M., & Semenov, A. (2003). The institutional design of open source programming: Implications for addressing complex public policy and management problems. Retrieved from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1019/2426
Sommerville, I. (1995). Software engineering (5th ed.). Harlow: Addison-Wesley Longman Limited.
Stallman, R. (2007). Why "Free Software" is better than "Open Source". Retrieved from http://www.gnu.org/philosophy/free-software-for-freedom.html
Valloppillil, V. (1998). Open Source Initiative (OSI) Halloween I: A (new?) software development methodology. Retrieved from http://www.opensource.org/halloween/halloween1.php#comment28
Vixie, P. (1999). Software engineering. In M. Stone, S. Ockman & C.
Open Source Systems and Engineering: Strengths, Weaknesses and Prospects

Javaid Iqbal (Assistant Professor, P.G. Department of Computer Science, University of Kashmir, North Campus, India; email: [email protected])
S. M. K. Quadri (Head, P.G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India; email: [email protected])
Tariq Rasool (Lecturer, P.G. Department of Computer Science, Islamic University of Science and Technology, India; email: [email protected])

Abstract
Purpose: This paper reviews open source software systems (OSSS) and open source software engineering with reference to their strengths, weaknesses and prospects. Though it is not possible to single out the better of the two software engineering processes, the paper outlines the areas where the open source methodology holds an edge over conventional closed source software engineering. The weaknesses, which tilt the balance the other way, are also highlighted.
Design/Methodology/Approach: The study is based on earlier works by scholars regarding the potentialities and shortcomings of OSSS.
Findings: A mix of strengths and weaknesses makes it hard to pronounce open source the panacea. However, open source does have a very promising prospect: owing to its radical approach to established software engineering principles, it has spectacularly managed to carve out a "mainstream" role, and that in just over a few decades.
Keywords: Open Source Software (OSS); Open Source Development Paradigm; Software Engineering; Open Source Software Engineering.

Introduction
Open source traces back to the early 1960s, yet the term "open source" was coined, and the Open Source Initiative founded, only in 1998 (Open Source, n.d). The history of open source is closely tied to that of UNIX. The rise of the open source paradigm marks the end of the dominance of the proprietary, closed source software setup that ruled the arena for many decades. A new ideology, one that promises a lot in terms of economics, development environment and unrestricted user involvement, has been evolving in a big way, thrust into the big picture by loosely-centralized, cooperative, and gratis contributions from individual developer-users, startling the purists in the field of software engineering. The open source software phenomenon has systematically metamorphosed from a "fringe activity" into a more mainstream and commercially viable form. The open source initiative has succeeded spectacularly well.
Defining Open Source Software
The term open source software (OSS) refers to software equipped with licenses that provide existing and future users the right to use, inspect, modify, and distribute (modified and unmodified) versions of the software to others. It is not only the concept of providing "free" access to the software and its source code that makes OSS the phenomenon it is, but also the development culture (Raymond, 1999). Kogut and Metiu (2001) likewise describe open source as a right offered to users to change the source code without making any payment. Nakakoji, Yamamoto, Nishinaka, Kishida and Ye (2002) refer to OSS as software systems that are free to use and whose source code is fully accessible to anyone who is interested in them.

Open Source Software Engineering
The open source development (OSD) model fundamentally changes the approach and economics of traditional software development, marking a paradigm shift in software engineering. Open source is a software development methodology that makes source code available to a large internet-based community. Typically, open source software is developed by an internet-based community of programmers. Participation is voluntary and participants do not receive direct compensation for their work. In addition, the full source code is made available to the public. Developers also devolve most property rights to the public, including the rights to use, redistribute and modify the software free of charge. This is a direct challenge to the established assumptions about software markets and threatens the position of commercial software vendors (Hars & Ou, 2001). Torvalds et al. (2001) acknowledge that OSS is not architected but grows by directed evolution. An open source software system must have its source code freely available for use, custom-tailoring, or evolution in general by anyone whosoever is interested. Thus, from the point of view of a purist in traditional software engineering, open source is a break-away paradigm in its defiance of conventional software engineering and its non-adherence to the standardized norms and practices of the maturing software engineering process that we have carried, with so much devotion, all the way through our legacy systems. The open source development model breaks away from the normal in-house commercial development process. The self-involved, self-styled open source developer-user uses the software and contributes to its development as well, giving birth to a user-centered participatory design process.

What Leads to the Success of OSS?
Many important factors have catapulted the OSS development paradigm to the forefront of the software industry; these include cost, time, manpower, resources, quality, credit acknowledgement, and the spirit of shared enterprise:

Cost: OSS products are usually freely available for public download.

Time: OSS development is a massively parallel development and debugging environment wherein the parallel but collaborative efforts of globally distributed developers allow OSS products to be developed much more quickly than conventional software systems, considerably narrowing the gestation period.
Manpower: With the development environment spread across the globe, the best-skilled professionals work within a global development environment. This means more people are involved in the process.

Resources: Again, more skilled professionals offer their resources for the development of OSS products.

Quality: OSS products are recognized for their high standards of reliability, efficiency, and robustness. Raymond (2001) suggests that the high quality of OSS can be attributed to the high degree of peer review and user involvement in bug/defect detection.

Credit Acknowledgement: People across the globe who work on OSS get a chance to collaborate with their peers, gaining immediate credit acknowledgement for their contributions.

Informal Development Environment: The informal development environment, unlike organizational settings, liberates developers from formal ways and conduct; more students see a chance to work on real projects from their own places.

Spirit of Shared Enterprise: Organizations that deploy OSS products freely offer advice to one another, sharing insights regarding quality improvements and lessons learnt.

Open Source Software Development Process versus Conventional Software Development Process
Open source development is attracting considerable attention in the current climate of outsourcing and off-shoring (globally distributed software development). Organizations are seeking to emulate open source success on traditional development projects through initiatives variously labeled inner source, corporate source, or community source (Dinkelacker & Garg, 2001; Gurbani, Garvert & Herbsleb, 2005). The conventional software development process encompasses the four phases comprising the software development life cycle: planning, analysis, design, and implementation. In the open source software development process, these phases are accomplished in a way that is somewhat blurred, as the first three phases of planning, analysis, and design are largely blended and performed by a single developer or a small core group. Given the ideal that a large number of globally distributed developers of different levels of ability and domain expertise should be able to contribute subsequently, the requirement analysis phase is largely superseded. Requirements are taken as generally understood and not in need of interaction between developers and end-users. Design decisions also tend to be made in advance, before the larger pool of developers starts to contribute. Modularization of the system is the basis for distributing the development process. Systems are highly modularized to allow distribution of work and thereby reduce the learning effort required of new developers (they can focus on particular subsystems without needing to consider the system in its totality). However, over-modularization can have the reverse effect by increasing the risk of common coupling, an insidious problem in which modules unnecessarily refer to variables and structures in other modules (illustrated in the sketch below). Thus, there has to be a balanced approach vis-à-vis modularization.
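As a minimal illustration of the common coupling hazard just described: in the Python sketch below, two hypothetical modules communicate through a shared global structure rather than an explicit interface, so a change made by one silently alters the behavior of the other. All function and variable names are invented for the example.

# Common coupling, sketched: both "modules" read and write one shared
# global, so neither can be understood or changed in isolation.
SHARED_CONFIG = {"buffer_size": 1024}

def network_send(data: bytes) -> int:
    # Tunes the shared state as a hidden side effect.
    SHARED_CONFIG["buffer_size"] = max(len(data), 512)
    return len(data)

def storage_write(data: bytes) -> None:
    # Depends on whatever value the network module last left behind.
    chunk = SHARED_CONFIG["buffer_size"]
    for i in range(0, len(data), chunk):
        _persist(data[i:i + chunk])

def _persist(chunk: bytes) -> None:
    pass  # stand-in for real I/O

network_send(b"x" * 2000)    # side effect: buffer_size becomes 2000
storage_write(b"y" * 4096)   # now implicitly chunked by 2000

A decoupled design would pass buffer_size as an explicit parameter; preserving such module boundaries is exactly what a balanced approach to modularization is meant to achieve.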
In proprietary software, quality testing is limited to a controlled environment and specific scenarios (Lerner & Tirole, 2002). OSS development, however, involves much more elaborate testing, as OSS solutions are tested in various environments, by programmers of various skills and experience, and in various geographic locations around the world (Lakhani & Hippel, 2003; Lerner & Tirole, 2002; Mockus, Fielding & Herbsleb, 2002; West, 2003). In the OSS development life cycle, the implementation phase consists of several sub-phases (Feller & Fitzgerald, 2002); a short sketch of this pipeline follows the stage classification below:

Code: Writing code and submitting it to the OSS community for review.
Review: A strength of OSS is its independent and prompt peer review.
Pre-commit Test: Contributions are tested carefully before being committed.
Development Release: Code contributions may be included in the development release within a short time of having been submitted; this rapid incorporation is a significant motivator for developers.
Parallel Debugging: The so-called Linus's Law, "given enough eyeballs, all bugs are shallow", applies: the large number of potential debuggers on different platforms and system configurations ensures that bugs are found and fixed quickly.
Production Release: A relatively stable, debugged production version of the system is released.

A common classification of the stages of open source software is: planning (only an idea, no code written), pre-alpha (first release; the code written may not compile or run), alpha (released code works and takes shape), beta (feature-complete code released, but of low reliability, with faults present), stable (code is usefully reliable; minor changes) and mature (final stage; no changes).
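The sub-phases above can be read as a simple pipeline. The Python sketch below models them as a sequence of gates that a contribution passes through; the phase names follow the list above, while the pass/fail flags are invented stand-ins for real reviewer approval, test results and bug reports.

# The OSS implementation sub-phases as a pipeline (sketch).
PHASES = ["code", "review", "pre_commit_test",
          "development_release", "parallel_debugging", "production_release"]

def run_pipeline(outcomes: dict) -> str:
    """Advance a contribution phase by phase; stop at the first failure."""
    for phase in PHASES:
        if not outcomes.get(phase, False):
            return f"halted at {phase}: returned to the contributor"
    return "shipped in a production release"

# A hypothetical patch that fails its pre-commit tests:
print(run_pipeline({"code": True, "review": True, "pre_commit_test": False}))
# A patch that clears every gate:
print(run_pipeline({phase: True for phase in PHASES}))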
Strengths of Open Source Software
According to Feller and Fitzgerald (2000), OSS is characterized by an active developer community living within a global virtual boundary. OSS has emerged to address common problems of traditional software development, including software exceeding its budget in terms of both time and money, and it makes the production of quick, inexpensive, high-quality and reliable software possible. The advantages and unique strengths of open source software systems include release frequency, a solution to the software crisis, scalability, learnability, customer input and so on.

Release Frequency
One of the basic tenets of open source is "release early, release often" (Raymond, 1999). It is this tenet which channels significant feedback, at a global level, into shaping the open source product. With exceptional, globally distributed test-users who report their fault findings back, the frequent-release policy is very feasible. However, high release frequencies are infeasible for production environments; for these types of use, stable releases are provided, leaving the choice about tracking new releases and updates in the hands of the users.

Solution to the Software Crisis
The recurring problems of exceeded budgets, missed development deadlines, and general dissatisfaction when the product is eventually delivered, especially in highly complex systems, have always demanded an alternative that circumvents these problems so that the so-called "software crisis" is dealt with. The open source software model does promise a solution in this regard. The source of its advantage lies in the concurrence of development and debugging (Kogut & Metiu, 2001); in fact, OSS is massively parallel development and debugging.

Scalability
According to Brooks's Law, "adding people to a late project makes it later". The logic underlying this law is that, as a rule, software development productivity does not scale up as the number of developers increases. However, the law may not hold when it comes to software debugging and quality assurance activities. Unlike software development productivity, quality assurance productivity does scale up as the number of developers helping to debug the software increases. Quality assurance activities scale better since they do not require as much interpersonal communication as software development activities (particularly design activities) often do. In an OSS project there is a handful of core developers (who need not be centralized but may be spread across the globe) responsible for ensuring the architectural integrity of the software system. Then there is a multitude of user-developers who form a user community across the globe. This community conducts the testing and debugging activities on the software released periodically by the core team. There is an obvious dynamism in the roles of the developer at the core and the user in the community, in the sense that their roles may change in the context of the above discussion.

Learnability
A very good thing about open source software development is that it is an inherent learning process for anyone involved in it. A member contributes to the software's development but at the same time learns from the community. Thus, open source is a global campaign for skill-set development. According to Edwards (2000), "open source software development is a learning process where the involved parties contribute to, and learn from the community".

Customer Input
The informal organizational structure of core and community introduces no delay between the reporting of a bug by a user and its fixing by the core. Moreover, the use of impressive internet-enabled configuration management tools [e.g. GNU CVS (Concurrent Versions System)] allows the community to synchronize quickly with updates issued by the core. This mechanism of immediate reward, by way of rapid bug-fixing, in the open-source user community helps uphold the quality assurance activity. There are no restrictions on bug-fixing by users when the source code is open, or they can design a test case for the core to use. Such a positive influence of the user community supplements the debugging process in its entirety, leading to a visible improvement in software quality. This discussion should not create the notion that OSSS are a panacea; such systems have their weaknesses too, in fact plenty of them.

Weaknesses of Open Source Software
Open source is by no means a panacea and does have its own weaknesses. As expected, most of the weaknesses are the result of the lack of formal organization or clear responsibilities.

Diversity in Communication
The globally distributed development environment brings in developers from different cultural backgrounds and differing time zones who have never met in person. Moreover, even if they cross these barriers, they hit a stumbling block when skillful community members find it hard to communicate in English. As a result, misunderstandings do crop up and the communicated content may be misconstrued. This can create a sense that cooperation, good manners, and useful information are lacking in the community.
Uncoordinated Redundant Efforts
With little coordination within an open source team, independent parties sometimes carry out tasks in parallel without knowing about each other. This consumes additional resources, though it may prove a blessing in disguise, since there may then be several solutions to choose from; the choice among the alternatives, however, makes selection difficult.

Absence of Organizational Formalism
This surfaces as a multi-pronged weakness. The absence of laid-down formal rules and conventions makes it hard for the community to work along systematic lines. It may manifest as a lack of organizational commitment to a time-schedule and as a diverging organizational focus. Without a time-schedule and without a concerted, spearheading focus, the distributed nature of the work unsettles priorities, which may be either nonexistent or severely skewed towards the personal biases of influential contributors. In this un-organizational setting, where no one is boss and no one is bossed, forcing the prioritization of certain policy matters is not possible.

Non-Orientation of Newcomers
Newcomers do not undergo skill-setting and behavior-shaping orientation training; they have to pick up the nitty-gritty involved very subtly on their own. The tightly-knit community can do well, sharing its cultural background, but newcomers are a problem. In fact, everyone competes for attention and talent, and these barriers to entry are very damaging to a project.

Dependency on Key Persons
The bulk of the work is done by a few dedicated members or a core team, what Brooks calls a "surgical team" (Jones, 2000). Many projects thus critically depend on a few key persons who have the level of intimate knowledge required to understand all parts of a large software system. It is usually the core contributors who are the key persons. This dependency becomes a liability if these key persons are unable to continue work on the project for some reason. It may be impossible to reconstruct the implicit knowledge of these persons from their artifacts (source code, documentation, notes, and emails) alone, and this often leads to project failure.

Leadership Traits
Open source leaders, who lead by persuasion alone, are judged on the basis of their technical skills, vision and communication skills. Raymond (1999) points out that the success of the Linux project was to a large degree due to the excellent leadership skills demonstrated by its founder, Linus Torvalds. The scarcity of good leaders is one of the growth-inhibiting factors in open source enterprises.

Prospects
The open source phenomenon raises many interesting questions. Its proponents regard it as a paradigmatic change in which the economics of private goods, built on the scarcity of resources, are replaced by the economics of public goods, where scarcity is not an issue. Critics argue that open source software will always be relegated to niche areas and that it cannot compete with its commercial opponents in terms of product stability and reliability (Lewis, 1999). Moreover, they also argue that open source projects lack the capability to innovate. The OSSS prospect sounds encouraging given that the absence of direct pay (compensation), monetary rewards and property-right claims has never been a bottleneck to its pervasiveness. It has direct implications for social welfare. Open source may hold the key to the so-called "software crisis".
The flourishing of this model to the extent of a significant market share, in the absence of any marketing or advertising, makes the prospect even sounder. OSS, long known for operating systems and development tools, has already stepped into the arena of entertainment application development. The actively growing interaction between academic institutions and the IT industry has contributed significantly to the research and development of such systems, and the progress continues apace. Open source is internet-based and hence, together with ICT (Information and Communication Technology), has a lot of scope in terms of development and economics.

Conclusion
As an emerging approach, the open source paradigm provides an effective way to create a globally distributed development environment wherein the community around a specific open source software project interacts constantly, providing feedback through activities such as defect identification, bug fixing, new feature requests, and support requests for further improvement. This activity is rewarded by peer recognition and immediate credit acknowledgement, creating a promotional influence on effective development practices across the community.

Open source has its strengths and weaknesses. The strengths come from its innovative development in and across a global development community of users-turned-developers. The weaknesses stem from its daring defiance of established and matured conventional software engineering principles and practice. However, though a good mix of strengths and weaknesses holds open source in balance, the prospects of this paradigm are promising. Fostering innovation to improve productivity seems to be the mission statement of open source.

References
Dinkelacker, J., & Garg, P. (2001). Applying open source concepts to a corporate environment. In Proceedings of the 1st Workshop on Open Source Software Engineering, Toronto, May 15, 2001. Retrieved from http://opensource.ucc.ie/icse2001
Edwards, K. (2000). Epistemic communities, situated learning and open source software development. Department of Manufacturing Engineering and Management, Technical University of Denmark.
Feller, J., & Fitzgerald, B. (2002). Understanding open source software development. London: Addison-Wesley.
Feller, J., & Fitzgerald, B. (2000). A framework analysis of the open source software development paradigm. In Proceedings of the 21st Annual International Conference on Information Systems (pp. 58-69), Brisbane, Australia.
Gurbani, V. K., Garvert, A., & Herbsleb, J. D. (2005). A case study of open source tools and practices in a commercial setting. In Proceedings of the 5th Workshop on Open Source Software Engineering (pp. 24-29), St. Louis, MO, May 17, 2005.
Open Source. (n.d). The open source definition. Open Source Initiative. Retrieved from http://www.opensource.org
Hars, A., & Ou, S. (2001). Working for free? Motivations of participating in open source projects. In Proceedings of the 34th Hawaii International Conference on System Sciences.
Jones, P. (2000). Brooks' Law and open source: The more the merrier? IBM DeveloperWorks, May 2000.
Kogut, B., & Metiu, A. (2001). Open source software development and distributed innovation. April 2001.
Lakhani, K. R., & Hippel, E. von. (2003). How open source software works: "free" user-to-user assistance. Research Policy, 32(6), 923-943.
Lewis, T. (1999). The open source acid test.
Computer, 32(2), 125-128. doi:10.1109/2.745728
Lerner, J., & Tirole, J. (2002). Some simple economics of open source. Journal of Industrial Economics, 50(2), 197-234.
Mockus, A., Fielding, R. T., & Herbsleb, J. D. (2002). Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11(3), 309-346.
Nakakoji, K., Yamamoto, Y., Nishinaka, Y., Kishida, K., & Ye, Y. (2002). Evolution patterns of open-source software systems and communities. In Proceedings of the International Workshop on Principles of Software Evolution (IWPSE 2002) (pp. 76-85), Orlando, FL.
Raymond, E. S. (1999). The cathedral & the bazaar: Musings on Linux and open source by an accidental revolutionary. Beijing: O'Reilly.
Raymond, E. S. (2001). The cathedral and the bazaar: Musings on Linux and open source by an accidental revolutionary. Beijing: O'Reilly.
Torvalds, L., et al. (2001). Software development as directed evolution. Linux Kernel Mailing List, December 2001.
West, J. (2003). How open is open enough? Melding proprietary and open source platform strategies. Research Policy, 32(7), 1259-1285.

Appraisal and Dissemination of Open Source Operating Systems and Other Utilities

Satish S. Kumbhar (Department of Computer Engineering and Information Technology, College of Engineering Pune, India; email: [email protected])
Santosh N. Ghotkar (Department of Computer Engineering and Information Technology, College of Engineering Pune, India; email: [email protected])
Ashwin K. Tumma (Department of Computer Engineering and Information Technology, College of Engineering Pune, India; email: tummaak08.comp)

Abstract
Purpose: In recent years there has been substantial development in the arena of open source software (OSS). Both academia and industry are focusing on developing their software in the open source genre. This paper presents a survey of the open source operating system GNU/Linux and discusses its intricacies at length. It also throws light on some extremely popular open source utilities used in diverse sub-domains of Computer Science and Engineering.
Methodology: A thorough survey and analysis of open source software was undertaken to build this compilation.
Findings: The appraisal and dissemination find a marked increase in the usage of OSS in academia as well as in industry, and throw light on the near future of OSS usage.
Research Implications: Any appraisal or survey conducted today is bound to be outdated tomorrow. Nevertheless, OSS will continue to remain in the market, with newer trends coming in. The paper can be a motivation for further contributions to OSS.
Originality/Value: The paper brings together OSS from diverse domains in use in the market and highlights their usage, with emphasis on their other features as well.
Keywords: Open Source; GNU/Linux; Utilities; Debian; Mozilla; Apache Web Server; MySQL.
Paper Type: Survey

Introduction
There has been a gigantic rise in the use of computers and, consequently, in the software used on them. Computers have made their presence felt in almost all domains of human activity: name any field, and we are bound to find the influence of computers in it. As the use of computers grew, the software required for them also started growing at an exponential pace, to the point that software development today seems without end. With the rapid rise of the computer industry, novel products keep creeping into the market, adding complexity for diligent customers or
end users. Now the end user has an array of options available at his service which can be used for his needs and/or business purposes. Engineers and developers have assiduously been on a quest to push the boundaries of engineering and develop high-quality software. This development has mainly revolved around two broad categories, viz. open source and closed source software. A recent trend in the arena of software development is the open source genre. OSS are software which are publicly available in the form of source code and are distributed under software licenses that allow users to study the software, make changes to it as per their requirements and convenience, improve it in terms of quality or to cater to their necessities, and even distribute it, with due diligence to the owner and in conformance with the license of the software. The rationale behind opening the source code is that users require access to un-obfuscated source code, because it is exceedingly implausible to evolve programs without modifying them. Since the main motive behind software development is to make evolution easy, it is essential that modification be made easy. As such, numerous kinds of software are being developed using the terminologies of open source. Open source development has not left any aspect of software untouched. From operating systems to benign utilities, open source holds a prevalent share in every field. It would not be an exaggeration to say that many of the open source utilities available are more efficient than the proprietary ones. The prime reason behind this is that, in most cases, there is scarcely any monotony across open source development, quality and usage. One of the path-breaking developments was the GNU/Linux operating system, an open source operating system. Its underlying source code can be used freely, modified and redistributed, both commercially and non-commercially, by anyone, under licenses such as the GNU General Public License. GNU/Linux, which falls within the UNIX-like operating system family, continuously evolves, with various releases in play along with support for multilingual environments. This paper provides a survey of this operating system and the intricacies involved in its development, distribution, usage and market share. Open source has also made its presence felt in other domains of software technology, like web browsers, database management systems, web servers, web application development, data mining, artificial intelligence, virtualization, network-related tools, proxy servers, office suites, web cache daemons, bug trackers, etc.
Some of the widely known software in the above-mentioned technologies are: the Mozilla Firefox web browser, the MySQL database management system, the Apache Web Server, the LAMP package or suite for web development, Oracle VirtualBox as a virtualization suite, the Weka data mining tool built in Java, the Squid proxy server, open office suites, and fields like dynamic, lightweight web application development using AJAX. The paper presents an introduction to some of the above-listed technologies in the form of a survey and also explicates their involutions.

Open Source Technologies

GNU/Linux
GNU/Linux falls within the UNIX family of operating systems. It is the most popular open source software, whose underlying source code can be used, freely modified as per the user's requirements, and redistributed, in both commercial and non-commercial domains. The license on which it is built is the GNU General Public License (About GNU, 2009). Linux was first conceived by a Finnish software engineer and hacker, Linus Torvalds, in the year 1991; the name Linux comes from the Linux kernel written by him. Later, the primary user-space system tools and libraries were taken from the GNU Project. Even after Linux development has come a long way, the naming issue remains controversial. The Free Software Foundation holds that Linux distributions that use GNU software should be referred to as GNU/Linux or as Linux-based GNU systems, but the media and most of the population refer to it simply as Linux. The authors hold no biased opinion for either side, and from here onwards GNU/Linux and Linux refer to the same system. Typically, Linux is distributed in a packaged format called a Linux distribution. A Linux distribution comprises the monolithic Linux kernel, which handles process control, networking and file system access, along with its supporting utilities and libraries. Linux has undoubtedly made a spectacular presence on a wide variety of computer hardware, ranging from handheld mobile devices, phones and tablet machines to mainframes and high-end servers, including supercomputers. It is typically available in two variants, one for desktop machines and the other for high-end servers. Henry (2010) lists Linux as the leading server operating system, flawlessly running the 10 fastest supercomputers in the world without any compromise or degradation in performance. As regards the development taking place in Linux, Linus Torvalds still continues to direct the development of the Linux kernel. The Linux kernel (Torvalds, 1992) has undergone numerous versions, the current stable version being 3.1.1. For years the Linux kernel carried 2.6.x version numbers, with x being a numeric value representing the release; the version 3 kernel has brought in a significant shift in kernel versioning (Linux Kernel, 2011). Richard Stallman, initiator of the GNU Project, heads the Free Software Foundation, which supports the GNU components in Linux distributions. Countless programmers worldwide develop third-party components that are integrated into the distributions. With respect to user interface amenities, too, Linux stands at the apex.
It provides a powerful command-line interface as well as graphical user interfaces with many outstanding features, built on the KDE desktop, GNOME, etc., the X window system being a popular foundation. Today, Linux distributions hold a major share in most domains. Linux has been successful in securing a place in server installations, in homes, and in academia. Various local and national governments have also started supporting and promoting Linux; in India, the state of Kerala has mandated that all high schools and other academic organizations run Linux. The market share of Linux is shooting up at a high pace, with a gigantic increase in revenue from servers, desktops and packaged software. Linux holds an overall 12.7% market share, with more than 60% of webservers running Linux as against other leading operating systems (IDC Report, 2009). In many surveys conducted worldwide, seniors in this domain recommend Debian Linux distributions for servers because of their sturdiness and power. Analysts and proponents of Linux attribute this success to its security, reliability, low cost and freedom.

Debian: The Universal Linux Distribution
As mentioned earlier, Debian is one of the most abundantly available Linux distributions today. It is a distribution composed of software packages released under the GNU General Public License and other free software licenses. The Debian OS is very well known for its conformity to UNIX and free software terminology, as well as for its collaborative software development and testing processes. Debian was first released in August 1993, and since then it has earned wide popularity because of its ease of operation. The aesthetic beauty of the graphical user interfaces has been the charm of Debian and its variants. The most promising feature Debian offers is that it is currently available in more than 65 languages, along with support for many Indian vernacular languages; this has helped end the tyranny of linguistic burdens on the masses. Debian also uses the monolithic Linux kernel; its current release is 6.0.3, named "Squeeze". Debian managers strictly track bugs and perform rigorous testing, and do not release their product unless it is bug-free from their perspective (Distro Watch, 2008). Software in Debian is available in the form of .deb packages and can be downloaded and installed easily; even a novice user of a Linux box can cope with it. A major reason for the wide popularity of this OS is its package manager, "dpkg", the simplest of all the package managers available. Debian maintains repositories at various geographical locations around the world, and users can download the software they require with a single command or click (for example, dpkg -i <package>.deb installs an already-downloaded package, while a repository-aware front end can fetch and install one with apt-get install <package>), a classic example of simplicity at its best. One of the major variants of Debian is the Ubuntu operating system. Ubuntu is primarily designed for desktop, notebook and server usage. It follows the Debian philosophy and inherits its style. Ubuntu is one of the most popular and favored operating systems among the student community because of its ease of use, free availability and simplicity of software development (Ubuntu, 2011). Web statistics portray that Ubuntu holds more than 50% of the Linux desktop usage share in the world.
Ubuntu is also gaining ground in its server editions (Stat Owl, 2010). Recently, Ubuntu has also stepped into the world of cloud computing, which is at its peak today. It allows users to build their own cloud infrastructure, be it public or private. Its sophisticated orchestration tools assist users in deploying, managing and scaling their cloud-related services within seconds, thereby reducing the total downtime of an enterprise, with long-run benefits for the enterprise's capital costs (Debian FAQ, 2008).

The Mozilla Project
As its tag line, "We are building a better Internet", suggests, the Mozilla project is focused on the development of internet-based software, its primary and most popular product being the open source web browser Mozilla Firefox. The current version of Mozilla Firefox is 8.0, and within a matter of days of a release its downloads have exceeded a hundred million, a path-breaking record of its kind. Statistics point to the fact that no other browser has ever achieved such high acclaim in such a short span of time; Firefox enjoys worldwide popularity of more than 25% usage share (Synder, 2011). Mozilla Firefox has also been the first web browser to roll out rapid releases/versions to its users, the aim of this faster process being to get new functions to users sooner. The primary factor behind this browser's high popularity is that, being open source, users and programmers can customize it as per their requirements; one of its most vivid features is the availability of add-ons. Programmers from all around the world write add-on features which can be freely downloaded from the World Wide Web and integrated with the current browser. The aesthetic beauty of the browser is far ahead of comparison with others in this domain. The charm of open source development can be clearly seen in the case of this browser. The Mozilla Foundation sets the policies that govern development, operates key infrastructure and controls the project's trademarks and other intellectual property. The Mozilla Foundation was founded by the Netscape-affiliated Mozilla Organization in the year 2003. Since then its growth has been magnificent, owing to its ideas of open source releases and user satisfaction. The most significant contribution the Mozilla Foundation has made to the world is its dedication to preserving and promoting a healthy online space through the versions of Firefox it develops. The Mozilla Foundation has also partnered with Knoxville Zoo in an effort to raise awareness about endangered red pandas (Knoxville Zoo, 2011).

Apache: The Open Source Web Server
The Apache HTTP Server Project designs and implements an open source HTTP server for modern operating systems, including all the UNIX families as well as the Windows operating systems (Apache, 2011). The main aim of the project is to provide a secure, efficient and extensible server offering HTTP services in synchrony with current HTTP standards. Its initial release was back in 1995, and it has been shaping the web ever since. Its current stable release is 2.2.17. The Apache web server is written in the C language and is a cross-platform server.
The license under which it is distributed is the Apache Software Foundation's very own Apache License 2.0. It is a web server that has made significant contributions to the tremendous growth of the World Wide Web. In 2009, Apache was regarded as the only web server software to surpass the 100-million-website milestone (NetCraft, 2009). Studies have revealed that Apache is undoubtedly the server with the maximum market share, with more than 60% of the servers in the world running Apache on UNIX machines (Servers, 2011). The combination of Apache and UNIX has long proved the most efficient for deploying web servers. The main reason for this widespread acceptance and usage is the ease and simplicity of deploying the web server software: the configuration steps are such that even a novice user can set up a web server on his or her desktop machine and serve web pages. The Apache web server also provides strong security; no severe attacks related to its use have been reported to date. In a nutshell, statistics show that the Apache web server has so far been the most promising and reliable web server. A recent study (Web Server, 2011) reveals that Apache serves over 59.13% of all websites and more than 66.62% of the million busiest ones.

MySQL Database Management System
In the domain of database management systems, MySQL stands at the apex. MySQL, also pronounced "My Sequel", is a relational database management system that runs as a server providing multi-user access to a number of databases. It was developed by MySQL AB, now a subsidiary of Oracle; it was first released in the year 1995 and its current stable release is 5.5.18. MySQL has since captured the open source database management system market. It is primarily written in C and C++ and is also cross-platform software. The license under which it is distributed is the GNU General Public License. MySQL stands as the world's most popular open source database (MySQL, 2011), with a very high download rate compared to others in this domain. MySQL also offers high performance and scalability in all aspects of relational database management, as do many other enterprise databases. Recently, the MySQL Query Analyzer has also been introduced, built for the performance optimization of Java and .NET applications. MySQL also offers many profiling tools that generate reports or profiles of the back-end databases. MySQL has been so successful in its domain because of its ease and simplicity of usage and administration. MySQL is offered with both character-based and graphical user interfaces; the graphical interfaces are built on top of MySQL servers and are used to manipulate the data in the back-end database servers. The current releases of MySQL claim 1500% faster performance on the Windows operating system (MySQL Stats, 2011) and 37.0% faster performance on Linux operating systems. Moreover, the scalability, performance schema and partitioning options have been enhanced to the point of being way ahead of much other such software. MySQL also offers superior protection of the database, using strong authentication facilities and strong internal security algorithms; this is the reason that, to date, no severe database-related attacks have been reported with the usage of MySQL.
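As a small illustration of the client/server, multi-user model described above, the sketch below connects to a MySQL server and runs one query from Python. It assumes the mysql-connector-python package and uses invented host, credential, database and table names; it is a minimal sketch, not a production setup.

# Minimal sketch: querying a MySQL server from Python.
# Assumes: pip install mysql-connector-python, a running server, and a
# hypothetical database "shop" containing a table "orders".
import mysql.connector

conn = mysql.connector.connect(
    host="localhost",     # the server may equally be remote: MySQL is client/server
    user="app_user",      # hypothetical account
    password="app_pass",  # hypothetical password
    database="shop",
)
try:
    cur = conn.cursor()
    # Parameterized query: the connector escapes the value for us.
    cur.execute("SELECT id, total FROM orders WHERE total > %s", (100,))
    for order_id, total in cur.fetchall():
        print(order_id, total)
finally:
    conn.close()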
MySQL very capably powers the web, e-commerce and online transaction processing applications that are most often required. It is a fully integrated, transaction-safe, ACID-compliant database management system. It delivers the ease of use, scalability and performance that have made it the world's most popular open source database; some of the world's most trafficked websites run MySQL for their business and other critical applications.

Future Enhancements and Conclusion
Since the inception of the idea of developing software in an open source way, the concept has come a long way, yet awareness of open source terminologies and technologies is not what it should be. One lacuna of open source development is that, with the source code in the hands of multiple personnel, there are many bolts for the same nut: everyone comes up with their own approach to software development, and this can at times result in chaos. No doubt there are versioning systems and profiling systems available, but more management is still needed in this domain so that open source's influence can grow beyond what it is today. There is a need for enhancement with due respect and due diligence to the current, perfectly working community of OSS systems. Open source still has a very long distance to travel, and it will eventually diminish the software monopolies of some proprietary giants in the software world. Open source technologies are now being made mandatory in most academic as well as government organizations, but their use is still not up to the mark. The ideas and terminologies of developing software in the open source way need to be inculcated among the masses from a basic level. If this is done, end-users will surely get more effective and convenient software. The main beauty of open source is that users are able to edit and play with the source code of the system as they wish. When a user is given the privilege of editing the source code as per their convenience and requirements, there is obviously a high probability of acceptance of the software system on a large scale. Understanding the users' perspective and needs is the key factor that crowns open source development.

References
About GNU. (2009). GNU operating system: Initial announcement. Retrieved from http://www.gnu.org/gnu/initial-announcement.html
Apache. (2011). Apache Web Server Project. Retrieved from http://httpd.apache.org/
Debian FAQ. (2008). Debian GNU/Linux FAQ. Retrieved from http://www.debian.org/doc/manuals/debian-faq/
Distro Watch. (2008). Linux distributions: Facts and figures. DistroWatch.com. Retrieved from http://distrowatch.com/stats.php?section=popularity
Henry Burkhart, KSR. (2010). TOP 500 supercomputer sites. Retrieved from http://www.top500.org/lists/2010/06
IDC Report. (2009). IDC Q1 report: Linux for devices.
Knoxville Zoo. (2011). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Knoxville_Zoo
Linux Kernel. (2011). Linux kernel archives download.
Synder, R. (2011). Glow 1.0: Firefox 4 download stats. Mozilla Website Archive.
Retrieved from http://blog.mozilla.com/website-archive/2011/06/14/glow-1-0/
MySQL Stats. (2010). MySQL statistics. Retrieved from http://www.mysql.com/
MySQL. (2011). MySQL official website. Retrieved from http://www.mysql.com/
NetCraft. (2009). February 2009 web server survey. Retrieved from http://news.netcraft.com/archives/2009/
Servers. (2011). Welcome to the world of web server usage statistics. Retrieved from http://www.greatstatistics.com/serverstats.php
Stat Owl. (2010). Operating system version usage: Market share of operating system versions (OS analysis). StatOwl. Retrieved from http://www.statowl.com/operating_system_market_share_by_os_version.php
Torvalds, L. (1992). Release notes for Linux volume 0.12. Linux Kernel Archives. Retrieved from http://www.kernel.org/pub/linux/kernel/Historic/old-versions/RELNOTES-0.12
Ubuntu. (2011). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Ubuntu_(operating_system)
Web Server. (2011). January 2011 web server survey. Retrieved from http://news.netcraft.com/archives/2011/january-2011-web-server-survey-4.html

Morphological Analysis from the Raw Kashmiri Corpus Using the Open Source Extract Tool

Manzoor Ahmad Chachoo (Faculty, P.G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India; email: [email protected])
S. M. K. Quadri (Head, P.G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India; email: [email protected])

Abstract
Purpose: Morphological information is a key consideration in the design of any machine translation engine, information retrieval system or natural language processing application. Since manual development can be a highly time-consuming task, it is important to investigate how lexicon development can be automated while maintaining the quality that makes the lexicon usable by applications. The paper describes how we can simply provide extraction rules along with raw texts to guide the computerized extraction of morphological information with the help of an extract tool like Extract v2.0.
Design/Methodology/Approach: We used Extract v2.0, an open source tool for extracting linguistic information, in particular inflectional information on words, from raw text based on the word forms appearing in it. The input to Extract is a file containing an un-annotated Kashmiri corpus and a file containing the Extract rules for the language. The tool's output is a list of analyses; each analysis consists of a sequence of words annotated with an identifier that describes some linguistic information about the word.
Findings: The study presents the fundamental extraction rules which can guide Extract v2.0 in extracting inflectional information and in developing a full lexicon usable by different natural language applications. The major contributions of the study are an orthography component (a Unicode infrastructure to accommodate the Perso-Arabic script of Kashmiri) and a morphology component (a type system that covers the language abstraction, and an inflection engine that covers word-and-paradigm morphological rules for all word classes).
Research Implications: The study does not include all the rules, but can be taken as a prototype for extending the functionality of the lexicon. An attempt has been made to automate the extraction of morphological information using the Extract tool.
Originality/Value: Kashmiri is the most widely spoken language in the state of Jammu and Kashmir, yet the language has very scarce software tools and applications.
The study provides a framework for the development of a full-sized lexicon for the Kashmiri language from raw text, and is an attempt to provide lexicon support for applications which make use of the Kashmiri language. The study can be extended to develop a spoken lexicon of Kashmiri for use in spoken dialogue systems.
Keywords: Natural Language Processing; Morphology; Lexicon; Kashmiri Morphology; Extract Tool; Logic
Paper Type: Design

Introduction
Morphological information is a key consideration in the design of any machine translation engine, information retrieval system or natural language processing application. It is important to investigate how lexicon development can be automated while maintaining the quality which makes it usable by applications, since manual development can be a highly time-consuming task. Attempts have been made to use unsupervised learning to automate the process (Forsberg & Ranta, 2004; Creutz & Lagus, 2005), but under the supervision of humans, who simply have to provide knowledge about the rules along with raw texts, the computerized extraction of morphological information can be guided with the help of the Extract tool. Extract v2.0 is an open source tool for extracting linguistic information from raw text, in particular inflectional information on words based on the word forms appearing in the text. The input to Extract is a file containing an un-annotated Kashmiri corpus and a file containing the Extract rules for the language. The tool's output is a list of analyses; each analysis consists of a sequence of words annotated with an identifier that describes some linguistic information about the word.
A morphological lexicon with wide coverage, especially of new words as used in newspapers, texts and online sources, is a key requirement of information retrieval systems, machine translation and other natural language applications. It would be a time-consuming task to extract morphological information manually, so it is natural to investigate how lexicon development can be automated. Since large collections of raw language data in the form of technical texts, newspapers and online material are available, either free or cheap, it is an intelligent idea to exploit the raw data to obtain a high-quality morphological lexicon (Forsberg & Ranta, 2004). Clearly, attempts to fully automate the process using supervised learning techniques do not return the expected quality (Creutz & Lagus, 2005; Sharma, Kalita & Das, 2002). However, instead of using some form of machine learning for lexicon extraction, language experts can use a suitable open source tool like Extract v2.0, wherein their role is to write intelligent extraction rules. The Extract tool starts with a large corpus and a description of the word forms in the paradigms, with the varying parts, referred to as technical stems, represented by variables.
In the tool's syntax, we could describe the first declension noun of Kashmiri with the following definition:

paradigm decl1 = x+"r" { x+"i" & x+"iv" & x+"I" & x+"in" } ;

All the forms are given in the curly braces, called the constraint. For some prefix x, the tool outputs the head x+"r" tagged with the name of the paradigm; for example, Ka:r can have other forms like Kar:iv, Ka:ri and Kar:in. Given that we have the lemma and the paradigm class label, it is a relatively simple task to generate all word forms. The paradigm definition has a major drawback, namely that very few lemmas appear in all word forms; the tool offers a solution by supporting propositional logic in the constraint.

Related Work
The most important work dealing with the very same problem, i.e. extracting a morphological lexicon given a morphological description, is the study of the acquisition of French verbs and adjectives by Clément, Sagot & Lang (2004). Likewise, they start from an existing inflection engine and exploit the fact that a new lemma can be inferred with high probability if it occurs in raw text in predictable morphological form(s). Their algorithm ranks hypothetical lemmas based on the frequency of occurrence of their (hypothetical) forms as well as part-of-speech information signaled by surrounding closed-class words. They do not make use of human-written rules but reserve an unclear, yet crucial, role for the human, who hand-validates parts of the output and then lets the algorithm re-iterate. Given the many differences, the results cannot be compared directly to ours but rather illustrate a complementary technique. Testing on Russian and Croatian, Oliver (2004) and Oliver and Tadic (2004a) describe a lexicon extraction strategy very similar to ours. In contrast to human-made rules, they have rules extracted from (part of) an existing morphological lexicon and use the number of inflected forms found to choose heuristically between multiple lemma-generating rules (additionally also querying the Internet for the existence of forms). The resulting rules appear not nearly as sharp as hand-made rules with built-in human knowledge of the paradigms involved and their respective frequency (the latter being crucial for recall). Also, in comparison, our search engine is much more powerful and allows for greater flexibility and user convenience. For the low-density language Assamese, Sharma, Kalita & Das (2002) report an experiment to induce both the morphology, i.e. the set of paradigms, and a morphological lexicon at the same time. Their method is based on segmentation and alignment using string counts only, involving no human annotation or intervention inside the algorithm. It is difficult to assess the strength of their acquired lexicon as it is intertwined with the induction of the morphology itself. We feel that inducing a morphology and extracting a morphological lexicon should be performed and evaluated separately. Many other attempts to induce morphology from raw corpus data, usually with some human tweaking (Goldsmith, 2001), do not aim at lexicon extraction in their current form. There is a body of work on inducing verb subcategorization information from raw or tagged text (Faure & Nedellec, 1998; Gamallo, Agustini & Lopes, 2003; Kermanidis, Nikos & Kokkinakis, 2004). However, the parallel between a subcategorization frame and a morphological class is only loose.
The latter is a simple mapping from word forms to paradigm membership, whereas in verb subcategorization one also has the onus of discerning which parts of a sentence are relevant to a certain verb. Moreover, it is far from clear that verb subcategorization comes in well-defined paradigms; instead, the goal may be to reduce the number of parse trees in a parser that uses the extracted subcategorization constraints.

Methodology
Kashmiri mixes the agglutinating and inflectional language types. An agglutinating language builds polymorphemic words in which each morpheme corresponds to a single lexical meaning or grammatical function, while in an inflectional language lexical meanings and grammatical functions are at times fused together. Morphemic processes across most lexical categories, such as nouns, verbs, adjectives and adverbs, were studied and converted into rules which are input to the Extract tool. For example, nouns in Kashmiri are not marked for definiteness; there is an optional indefinite marker -a:h. Animate nouns follow the natural gender system, and the gender of a large number of inanimate nouns is predictable from their endings. The suffixes -da:r, -dar, -vo:l, -ul and -ur are added to nouns to derive masculine forms:

paradigm decl2 = x+"r" { x+"da:r" & x+"dar" & x+"vo:l" & x+"ul" & x+"ur" } ;

The suffixes -en, -in, -e:n, -ba:y, -ir and -va:jen are added to nouns to derive feminine forms:

paradigm decl3 = x+"r" { x+"en" & x+"in" & x+"e:n" & x+"ba:y" & x+"ir" & x+"va:jen" } ;

Morphology
Morphology is the study of morphemes. Morphemes are words, word stems and affixes, basically the units of language one level up from phonemes. They are often understood as units of meaning, and also as part of a language's syntax or grammar. It is in their morphology that we most clearly see the differences between languages that are isolating (such as Chinese, Indonesian, Krewol...), ones that are agglutinating (such as Turkish, Finnish, Tamil...), and ones that are inflectional (such as Kashmiri, Russian, Latin, Arabic...). Isolating languages use grammatical morphemes that are separate words. Agglutinating languages use grammatical morphemes in the form of attached syllables called affixes. Inflectional languages change the word at the phonemic level to express grammatical morphemes.

All languages are really mixed systems; it is all a matter of proportions. English, for example, uses all three methods: to make the future tense of a verb, we use the particle will (I will see you); to make the past tense, we usually use the affix -ed (I changed it); but in many words, we change the word for the past (I see it becomes I saw it). Looking at nouns, sometimes we make the plural with a particle (three head of cattle), sometimes with an affix (three cats), and sometimes by changing the word (three men). But, because we still use a lot of non-syllable affixes (such as -ed, usually pronounced as d or t, and -s, usually pronounced as s or z, depending on context), English is still considered an inflectional language by most linguists.

Paradigm File Format
A paradigm file consists of two kinds of definitions: regexp and paradigm. A regexp definition associates a name (Name) with a regular expression (Reg).
A paradigm definition consists of a name (Name), a set of variable-regular expression associations (VarDef), a set of output constituents (Head) and a constraint (Logic). The basic unit in Head and Logic is a pattern that describes a word form. A pattern consists of a sequence of variables and string literals glued together with the '+' operator. An example of a pattern given previously was x+"r".

Propositional Logic
Propositional logic appears in the constraint to enable a more fine-grained description of what word forms the tool should look for. The basic unit is a pattern, corresponding to a word form, which is combined with the operators & (and), | (or), and ~ (not). The syntax for propositional logic is given in Fig. 1, where kPattern refers to one word form; one alternative of the grammar is given per line, since the object-language operator | would otherwise be confused with the grammar's own alternation.

Fig. 1: Propositional logic grammar
kLog ::= kLog & kLog
       | kLog | kLog
       | ~ kLog
       | kPattern
       | ( kLog )

The additional operators allow the paradigm decl1 given earlier to be rewritten with a disjunction, reflecting that it is sufficient to find one singular and one plural word form. The middle vowel /o/ of such nouns changes to a central vowel and the final consonant is palatalized.

paradigm decl1 = x+"r" { (x+"I" | x+"ur") } ;

Regular Expressions
The variable part of a paradigm description enables the user to associate every variable with a regular expression. The association dictates which (sub-)strings a variable can match. An unannotated variable can match any string, i.e. its regular expression is Kleene star over any symbol. As a simple example, consider German, where nouns always start with an uppercase letter. This can be expressed as follows:

regexp UpperWord = upper letter*;
paradigm n [x:UpperWord] = ... ;

The syntax of the tool's regular expressions is given in Fig. 2, with the normal connectives: union, concatenation, set minus, Kleene star, Kleene plus and optionality. eps refers to the empty string, digit to 0-9, letter to an alphabetic Unicode character, lower and upper to a lowercase and an uppercase letter respectively, and char to any character. A regular expression can also contain a double-quoted string, which is interpreted as the concatenation of the characters in the string. Again, one alternative of the grammar is given per line.

Fig. 2: Regular expression grammar
kReg ::= kReg | kReg
       | kReg - kReg
       | kReg kReg
       | kReg *
       | kReg +
       | kReg ?
       | eps
       | kChar
       | digit
       | letter
       | upper
       | lower
       | char
       | kString
       | ( kReg )

Multiple Variables
The Extract tool allows multiple variables, i.e. a pattern may contain more than one variable. The use of multiple variables may reduce the time performance of the tool, since every possible variable binding is considered. Multiple variables should therefore be used in moderation, and the variables should be restricted as much as possible by their regular expression associations to reduce the search space. A variable does not need to occur in every pattern, but the tool only performs an initial match with patterns containing all variables. The reason for this is efficiency: the tool considers one word at a time, and if the word matches one of these patterns, it searches for all other patterns with the variables instantiated by the initial match. For obvious reasons, an initial match is never performed under a negation, since this would imply that the tool searches for something it does not want to find.
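Before turning to multiple arguments and the tool's algorithm below, the matching-and-constraint mechanism can be illustrated in a few lines of Python. This is a minimal sketch of the idea under simplifying assumptions, a single variable, suffix-only patterns and a purely conjunctive constraint; it is not the Extract tool's implementation, and the function name and test data are ours.

# Minimal sketch of paradigm-based extraction with one variable x and a
# conjunctive constraint; illustrative only, not the Extract tool's code.
def extract(corpus_words, head_suffix, constraint_suffixes, paradigm_name):
    words = set(corpus_words)               # the word types W of the corpus
    lexicon = []                            # the extracted lexicon L
    for w in sorted(words):
        if not w.endswith(head_suffix):
            continue                        # the initial match fails
        x = w[: len(w) - len(head_suffix)]  # bind the variable x
        # The instantiated constraint: every required form must occur in W.
        if all(x + suffix in words for suffix in constraint_suffixes):
            lexicon.append((w, paradigm_name))
    return lexicon

corpus = ["Ka:r", "Ka:i", "Ka:iv", "Ka:I", "Ka:in", "gagar"]
print(extract(corpus, "r", ["i", "iv", "I", "in"], "decl1"))
# [('Ka:r', 'decl1')]  (gagar is rejected: its other forms are absent)

A disjunctive constraint such as (x+"I" | x+"ur") would simply replace all(...) with any(...) over the alternatives, and negation corresponds to requiring that a form does not occur in the corpus.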
Repeated variables, i.e. non-linear patterns, are also allowed; these are equivalent to backreferences in the programming language Perl. An example where a sequence of bits is reduplicated is given below. This language is known to be non-context-free (Hopcroft & Ullman, 2001).

regexp ABs = (0|1)*;
paradigm reduplication [x:ABs] = x+x { x+x } ;

Multiple Arguments
The head of a paradigm definition may have multiple arguments to support more abstract paradigms. An example is Swedish nouns, where many nouns can be correctly classified by just detecting the word forms in nominative singular and nominative plural. An example is given in Fig. 3, where the first and second declensions are handled with the same paradigm function, whose head consists of two output forms. The constraints are omitted.

Fig. 3
paradigm regNoun = gag+"ar" gag+"ir" {...} ;
paradigm regNoun = kot+"ur" ko+":tar" {...} ;

The Algorithm
Fig. 4 presents the algorithm of the tool in pseudo-code notation.

Fig. 4
let L be the empty lexicon
let P be the set of extraction paradigms
let W be all word types in the corpus
for each w : W
  for each p : P
    for each constraint C with which w matches p
      if W satisfies C with the result H, add H to L endif
    end
  end
end

The algorithm is initialized by reading the word types of the corpus into an array W. A word w matches a paradigm p if it matches any of the patterns in the paradigm's constraint that contain all variables occurring in the constraint. The result of a successful match is an instantiated constraint C, i.e. a logical formula with words as atomic propositions. The corpus W satisfies a constraint C if the formula is true, where the truth of an atomic proposition "a" means that the word "a" occurs in W.

Conclusion
The paper describes the open source Extract tool as a means to build a morphological lexicon with relatively little human work. Given a morphological description, typically an inflection engine and a description of the closed word classes such as pronouns and prepositions, and access to raw text data, a human with knowledge of the language can use a simple but versatile tool that exploits word forms alone. It remains to be seen to what extent syntactic information, e.g. part-of-speech information, can further enhance the performance. A more open question is whether the suggested approach can be generalized to collect linguistic information of kinds other than morphology, such as verb subcategorization frames.

References
Forsberg, M., & Ranta, A. (2004). Functional morphology. In Proceedings of the Ninth ACM SIGPLAN International Conference on Functional Programming (ICFP '04) (pp. 213-223). Snow Bird, UT, USA. New York: ACM. doi: 10.1145/1016850.1016879
Creutz, M., & Lagus, K. (2005). Inducing the morphological lexicon of a natural language from unannotated text. In Proceedings of the International and Interdisciplinary Conference on Adaptive Knowledge Representation and Reasoning (AKRR '05), 15-17 June, Espoo, Finland (pp. 106-113). Retrieved from http://research.ics.tkk.fi/events/AKRR05/papers/akrr05creutz.pdf
Sharma, U., Kalita, J., & Das, R. (2002).
Unsupervised learning of morphology for building lexicon for a highly inflectional language. In Proceedings of the ACL-02 Workshop on Morphological and Phonological Learning (MPL '02) (pp. 1-10). Stroudsburg, PA, USA: Association for Computational Linguistics. doi: 10.3115/1118647.1118648
Hopcroft, J. E., & Ullman, J. D. (2001). Introduction to automata theory, languages, and computation (2nd ed.). Reading, MA: Addison-Wesley.
Clément, L., Sagot, B., & Lang, B. (2004). Morphology based automatic acquisition of large-coverage lexica. Retrieved from http://hal.archives-ouvertes.fr/docs/00/41/31/89/PDF/LREC04.pdf
Oliver, A. (2004). Adquisició d'informació lèxica i morfosintàctica a partir de corpus sense anotar: aplicació al rus i al croat. (PhD thesis). Universitat de Barcelona.
Oliver, A., & Tadic, M. (2004a). Enlarging the Croatian morphological lexicon by automatic lexical acquisition from raw corpora. In Proceedings of LREC '04, Lisbon, Portugal (pp. 1259-1262). Retrieved from http://www.hnk.ffzg.hr/txts/aomt4lrec2004.pdf
Goldsmith, J. (2001). Unsupervised learning of the morphology of a natural language. Computational Linguistics, 27(2), 153-198. doi: 10.1162/089120101750300490
Kermanidis, K. L., Nikos, F., & Kokkinakis, G. (2004). Automatic acquisition of verb subcategorization information by exploiting minimal linguistic resources. International Journal of Corpus Linguistics, 9(1), 1-28. doi: 10.1075/ijcl.9.1.01ker
Faure, D., & Nedellec, C. (1998). Asium: learning subcategorization frames and restrictions of selection. In Y. Kodratoff (Ed.), 10th European Conference on Machine Learning (ECML 98), Workshop on Text Mining, Chemnitz, Germany, April 1998. Berlin: Springer-Verlag.
Gamallo, P., Agustini, A., & Lopes, G. P. (2003). Learning subcategorisation information to model a grammar with "co-restrictions". Traitement Automatique des Langues, 44(1), 93-177.

Open Access Research Output of the University of Kashmir
Asmat Ali
Tariq Ahmad Shah
Iram Zehra Mirza

Abstract
Purpose: Open Access (OA) promises to make scholarly content available to everyone free of cost. It has widened the information exchange market and is becoming a worldwide effort to provide free online access to scientific and scholarly research literature in diverse formats, including open access journals. The present study attempts to provide an overview of open access publishing in the University of Kashmir.
Design/Methodology/Approach: The study is based on data extracted from Scopus, the leading science, technology and medicine (STM) citation database of the world-leading publisher Elsevier.
Findings: The study reveals that OA publishing is gaining popularity across the university, with a substantial amount of research output already available through OA journals.
Keywords: Open Access (OA); Open Access Publishing; University of Kashmir
Paper Type: Survey

Introduction
The Association of Research Libraries (ARL) refers to open access as any dissemination model created with no expectation of direct monetary return which makes works available online at no cost to readers (ARL, 2008). In India, poor access to international journals and the low visibility of papers are the main problems faced by researchers. OA is viewed as a solution to these problems.
OA signifies the democratization of knowledge and supports a socially responsible way to distribute knowledge. OA makes the same knowledge and information available to scholars in wealthy, first world nations, in developing ex-communist, second world nations, and in under-developed third world nations (Yiotis, 2005). Open access has thus proved a blessing to scholars in one way or the other; whether the scholar is an author or a user of scholarly content, it has democratized them in a real sense. OA to scholarly articles can be achieved in two main ways: by being published in an open access journal, or by being deposited in an open access repository (OAR) or open access archive (OAA) (Fernandez, 2006; Chan & Costa, 2005).

Librarian, Nawa Kadal Degree College, Srinagar, Jammu and Kashmir, India. email: [email protected]
Research Scholar, Department of Library and Information Science, University of Kashmir, 190 006, India. email: [email protected]
Faculty, Department of Library and Information Science, University of Kashmir, 190 006, India. email: [email protected]

Open access journals make their quality-controlled content freely available to all corners of the world, using funding models that do not charge readers or their institutions for access. There are several operational models in place, the simplest being a journal set up and run by a university department, published electronically using only the institution's server space, and edited and administered, including peer review, by interested scholars. A modification of this is where the journal receives some funding, either by grants or sponsorship, to support some of the editorial or management costs (Correia & Teixeira, 2005). Scholars all over the globe are actively involved in the open access publishing process. Because of the innumerable benefits attached to open access, scholarly networks all over the globe are adding their scholarly content to it. Whether the scholars are from the developed or the developing world, they have a broadly common story to tell: they have contributed to open access journals because these provide a better and healthier platform for their work. The present study attempts to ascertain the trends in open access publishing at the University of Kashmir.

Objectives
The objectives of the study are:
To assess the OA research output of the University of Kashmir
To assess the growth and trends of OA publications
To gauge the geographical scattering of OA articles

Problem
The scholars associated with the University of Kashmir have been active right from its inception, and they have also contributed towards the OA movement. The present study tries to explore the level of OA contribution from the University of Kashmir.

Methodology
Elsevier's Scopus database was used to identify the research contribution from the University of Kashmir. Scopus claims to be the world's largest abstract and citation database of peer-reviewed literature and quality web resources. After ascertaining the journals in which the authors have published, the journal titles were further checked against the OA journal lists maintained by the Directory of Open Access Journals (DOAJ) and Open J-Gate.

Scope
The study was confined to the publications published during the 11 years from 2000 to 2010.
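The title cross-checking step described above is straightforward to mechanize. The following is a minimal Python sketch under stated assumptions: a CSV export of the Scopus records with 'Year' and 'Source title' columns, and a plain-text file of OA journal titles compiled from DOAJ and Open J-Gate. The file names and column labels are hypothetical, not artefacts of the study.

# Hedged sketch of the cross-checking step; scopus_export.csv and
# oa_journal_titles.txt are hypothetical local files, not the study's data.
import csv

def load_oa_titles(path):
    # One journal title per line; normalize case for matching.
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def count_oa_by_year(scopus_csv, oa_titles):
    counts = {}  # year -> [total publications, open access publications]
    with open(scopus_csv, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            year = row["Year"]
            pair = counts.setdefault(year, [0, 0])
            pair[0] += 1
            if row["Source title"].strip().lower() in oa_titles:
                pair[1] += 1
    return counts

oa_titles = load_oa_titles("oa_journal_titles.txt")
for year, (total, oa) in sorted(count_oa_by_year("scopus_export.csv", oa_titles).items()):
    print(year, total, oa, round(100 * oa / total, 2))

Counting per year in this way yields exactly the kind of breakdown reported in Table 1 below.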
Literature Review
Arunachalam (2008) stresses the need for OA mandates by various research organizations in India, covering their own research output and that of the projects they fund. Herb and Muller (2008) discovered that scientists, after becoming familiar with open access services, use them to a greater extent. Haider (2007) considers open access a way to connect the developing world to the system of science. McCulloch (2006) observes that the open access initiative is dramatically transforming the process of scholarly communication, bringing great benefits to the academic world. Prosser (2004) believes that OA journals and institutional repositories hold out the promise of a fairer, more equitable and more efficient system of scholarly communication that can better serve the international research community. Chan and Costa (2005) argue that OA enriches the global knowledge base by incorporating the missing research from the less developed world and improves south-north and south-south knowledge flows. Falk (2004) observes that open access is gaining momentum with very broad support from library and professional groups, university faculties and even journal publishers. Lawrence (2001) demonstrates that open access can substantially increase the impact of articles and, implicitly, the impact factor of the source journals.

Results and Discussions
Open Access Publication Status
During the study period, a total of 448 articles were published, of which only 137 (30.58 per cent of the total) are of an open access nature. Contribution to open access publications at the University of Kashmir is gaining momentum, as the number of OA articles increases with every passing year. The share of OA output was highest in 2010, when 44.04 per cent of the total output was OA, followed by 34.24 per cent in 2008 (Table 1).

Table 1: Open Access Publication Status
Year     Total Publications    Open Access Publications
2010     84                    37 (44.04)
2009     87                    21 (24.13)
2008     73                    25 (34.24)
2007     60                    17 (28.33)
2006     43                    14 (32.5)
2005     29                    6 (20.6)
2004     32                    8 (25.00)
2003     20                    7 (35.00)
2002     5                     1 (20.00)
2001     5                     0 (0.00)
2000     10                    1 (10.00)
Total    448                   137 (30.58)
Figures in parentheses indicate the percentage of that year's output

Preferred Open Access Journals
The authors have made use of 56 different journals to make their scholarly content freely available on the public web. Among these, a maximum of 22 articles were published in the Indian Journal of Pure and Applied Physics, followed by 5 articles each in Current Science, the Pakistan Journal of Biological Sciences and the Journal of Inequalities in Pure and Applied Mathematics. Table 2 shows the top 12 OA journals in which the authors have made the maximum contribution.

Table 2: Preferred Open Access Journals
Journal Title                                               No. of Papers
Indian Journal of Pure and Applied Physics                  22
Current Science                                             5
Pakistan Journal of Biological Sciences                     5
Journal of Inequalities in Pure and Applied Mathematics     5
Indian Journal of Animal Sciences                           4
Indian Journal of Medical Microbiology                      4
Pakistan Journal of Nutrition                               4
Asian Journal of Plant Sciences                             4
International Journal of Botany                             4
Library Philosophy and Practice                             4
Pharmacology online                                         4
Tropical Ecology                                            4

Geographical Pattern of Publications
The authors' OA scholarly work is available in journals published from 23 different nations.
From Table 3 it is clear that a maximum of 49 articles were published in 11 Indian journals, followed by 26 publications in 9 Pakistani journals. At the other extreme, one article each was published in journals from Chile, China, Germany, Hungary, Romania, South Korea and the United Kingdom.

Table 3: Geographic pattern of publications
Country                  No. of Journals    No. of Articles
India                    11                 49
Pakistan                 9                  26
United States            6                  14
Thailand                 3                  3
Iran                     3                  3
Serbia                   2                  5
Nigeria                  2                  4
Turkey                   2                  3
Brazil                   2                  3
Poland                   2                  2
Croatia                  2                  2
Australia                1                  5
Italy                    1                  4
United Arab Emirates     1                  3
Taiwan                   1                  2
Netherlands              1                  2
United Kingdom           1                  1
South Korea              1                  1
Romania                  1                  1
Hungary                  1                  1
Germany                  1                  1
China                    1                  1
Chile                    1                  1

Conclusion
Open access promises to make scholarly content available to everyone free of cost. It has widened the information exchange market and is becoming a worldwide effort to provide free online access to scientific and scholarly research literature in diverse formats, including open access journals. Open access is found to be quite popular at the University of Kashmir. It is hoped that, with the benefits of OA becoming clearer day by day, more publications from the University of Kashmir will become available through open access channels. Different stakeholders, such as library professionals and open access advocates, also have a key role in bringing the benefits of open access to the notice of researchers through extension and awareness programs.

References
Arunachalam, S. (2008). Open access to scientific knowledge. DESIDOC Journal of Library & Information Technology, 28(1), 7-14.
ARL. (2008). Association of Research Libraries. Retrieved from http://www.arl.org/OSC/models/oa.html
Chan, L., & Costa, S. (2005). Participation in the global knowledge commons: challenges and opportunities for research dissemination in developing countries. New Library World, 106(3/4), 141-163. doi: 10.1108/03074800510587354
Correia, A. M. R., & Teixeira, J. C. (2005). Reforming scholarly publishing and knowledge communication: from the advent of the scholarly journal to the challenges of open access. Online Information Review, 29(4), 349-364. doi: 10.1108/14684520510617802
Falk, H. (2004). Open access gains momentum. The Electronic Library, 22(6), 527-530. doi: 10.1108/02640470410570848
Fernandez, L. (2006). Open access initiatives in India: an evaluation. Partnership: The Canadian Journal of Library and Information Practice and Research, 1(1). Retrieved from http://journal.lib.uoguelph.ca/index.php/perj/article/viewArticle/110/171
Haider, J. (2007). Of the rich and the poor and other curious minds: on open access and development. Aslib Proceedings, 59(4/5), 449-461. doi: 10.1108/00012530710817636
Herb, U., & Muller, M. (2008). The long and winding road: institutional and disciplinary repository at Saarland University and State Library. OCLC Systems & Services, 24(1), 22-29. doi: 10.1108/10650750810847215
Lawrence, S. (2001). Free online availability substantially increases a paper's impact. Nature. Retrieved from http://www.nature.com/nature/debates/eaccess/articles/lawrence.html
McCulloch, E. (2006). Taking stock of open access: progress and issues. Library Review, 55(6), 337-343. doi: 10.1108/00242530610674749
Prosser, D. (2004). The next information revolution: how open access repositories and journals will transform scholarly communications. LIBER Quarterly: The Journal of European Research Libraries, 14(1).
Retrieved from http://liber.library.uu.nl/
Yiotis, K. (2005). The open access initiative: a new paradigm for scholarly communications. Information Technology and Libraries, 24, 157-162. Retrieved from www.find.galegroup.com/itex/start.do?prodId=ITOF&usergroupnave=bcdelhi

BOOK REVIEW EDITOR
Prof. M.P. Satija
G.N.D University (Punjab), India
[email protected]

Book Review
Theimer, Kate. (2010). Web 2.0 tools and strategies for archives and local history collections. London: Facet Publishing. xvii + 246 p. ISBN: 978-1-85604-687-9.

To connect with and successfully serve the growing generation of native Web 2.0 users, archivists, librarians and other professionals responsible for historic collections must learn how to accommodate their changing information needs and expectations. In this clearly written, jargon-free guide, Kate Theimer demystifies essential Web 2.0 concepts, tools and buzzwords, and provides a thorough introduction to how the transition from Web 1.0 to 2.0, including new ways to interact with traditional audiences and attract new ones, has provided greater visibility and increased opportunity for resource discovery.

The author briefly explains a variety of Web 2.0 technologies and their functionalities in Chapter 1, and also addresses popular fears about using them. Chapter 2 discusses the evaluation of Web 2.0 tools, how they can fit into an overall outreach plan, and helps readers assess their current web presence. Chapters 3-9 each focus on one important and widely used Web 2.0 tool or service, namely blogs, podcasts, Flickr, YouTube, Twitter, wikis and Facebook. The chapters follow a common structure, discussing the functionalities of the Web 2.0 tools in archives and their implementation requirements. In most cases, screenshots, checklists and interviews with practitioners who have successfully utilized Web 2.0 tools are also included. Chapter 10 draws the reader's attention to the lesser-used mashups, widgets, online chat and Second Life. The experiences of archivists from institutions in the US, UK and Australia are also highlighted. Chapter 11 raises the issue of measuring the success of Web 2.0 implementations in archives; it provides a useful distinction between measuring outputs and outcomes, and some practical tips are also given. Chapter 12 reviews the range of management and policy concerns for a successful web project and the factors to consider when planning implementation. The book also includes suggested readings, incorporated in an appendix corresponding to the Web 2.0 tools covered, highlighting additional resources for further consultation.

The book is thus a good read for anyone working with historical and cultural collections (archivists, local history librarians and information professionals) who wants to take advantage of Web 2.0 technologies. The author provides a detailed look at the latest technologies, with real-world examples of archives and libraries using them to enhance their online presence, showcase services and increase patronage. Professionals will find this guide valuable for promoting their services in a digital age and attracting even the most tech-savvy of patrons.

Iram Zehra Mirza
Faculty
Department of Library & Information Science
University of Kashmir
Hazratbal, Srinagar

NEWS SCAN EDITOR
Dr. Sumeer Gul
Assistant Professor
Department of Library and Information Science
University of Kashmir
[email protected]

Department of Library and Information Science, University of Kashmir to Host Seminar on Emerging Frontiers of Digital Libraries: Perspectives, Empowerment and Advocacy (EFDL-I)
The Department of Library and Information Science, University of Kashmir, in collaboration with the University Grants Commission, is organizing a two-day seminar on Emerging Frontiers of Digital Libraries: Perspectives, Empowerment and Advocacy. The aim of the seminar is to explore different perspectives on digital libraries playing an increasingly active role in learning and sharing in an open environment, to empower stakeholders to understand the attendant opportunities and challenges, and to advocate and promote the prospects likely to open up in the world of scholarship. For details log on to http://lis.uok.edu.in/

Authors Sue Universities over Digital Libraries: Alleged Copyright Infringement over Grey Area 'Orphan' Books
The Authors Guild, the Australian Society of Authors, the Union Des Ecrivaines et des Ecrivains Quebecois (UNEQ) and eight individual authors have filed lawsuits against the University of Michigan, the University of California, the University of Wisconsin, Indiana University and Cornell University for copyright infringement, according to the Associated Press. The issue revolves around the use of 'orphan' works: texts that are out of print, with no known whereabouts for the author, leaving the books in a kind of grey area on the outskirts of copyright law. These texts were uploaded to the University of Michigan's HathiTrust online library, to which other universities subsequently signed up. The books were scanned from the University's physical library by Google, with five million done so far and several million left to go, but the authors and author societies claim that the scanning was unauthorised and illegal.
Source: The Inquirer. September 13, 2011. Available at: http://www.theinquirer.net/inquirer/news/2108766/authors-sue-universities-digital-libraries

UK Higher Education Funding Bodies Choose Elsevier's SciVerse Scopus as Data Provider for 2014 Research Excellence Framework
The four UK Higher Education Funding Bodies (representing England, Northern Ireland, Scotland and Wales) will use Elsevier's SciVerse Scopus database as the sole bibliometric provider for the 2014 Research Excellence Framework (REF). The Framework was developed to assess the quality of research in UK higher education institutions.
Source: The Wall Street Journal. Market Watch. September 19, 2011. Available at: http://www.marketwatch.com/story/uk-higher-education-funding-bodies-choose-elseviers-sciverse-scopus-as-data-provider-for-2014-research-excellence-framework-2011-09-19

Robot to Bring Virtual Rolling Tours of Campus Library
Baylor Libraries has committed to purchasing VGo, a remotely controlled robot that will be used for virtual tours of the Armstrong Browning Library. Baylor hopes the tool will help enhance learning for students in grades K-12.
Source: The Baylor Lariat. November 18, 2011.
Available at: http://baylorlariat.com/2011/11/18/robot-to-bring-virtual-rolling-tours-of-campus-library/

Libraries in Gujarat to Digitally Bridge the Knowledge Gap
Libraries in Gujarat are taking steps to digitally unify and share their resources. The boundaries of libraries are expanding beyond their four walls, and library professionals are gearing up to the challenge of using IT to disseminate authentic, up-to-date and relevant information to the right users.
Source: Daily News and Analysis (DNA). Friday, November 18, 2011. Available at: http://www.dnaindia.com/india/report_libraries-in-gujarat-to-digitally-bridge-the-knowledge-gap_1614404

MIT Launches Open Source Software Program
The Massachusetts Institute of Technology (MIT) announced the launch of a new program called MITx. Under this initiative, students will be able to take online classes through an open source software program provided by the university. While school officials believe the software will be of great use to its on-campus students, they eventually hope it will also foster a virtual community of learners from around the world. The classes will be freely available to anyone who has Internet access. However, individuals who want to demonstrate their mastery of the material and earn credentials for their work must pay a small fee. For more information log on to: http://www.usnewsuniversitydirectory.com/articles/mit-launches-open-source-software-program_12019.aspx

Archives Department Takes Up Digitisation of Padmanabhaswamy Temple Records
The State Archives Department is digitising the Mathilakom records (old palm leaf manuscripts of the Padmanabhaswamy temple in Thiruvananthapuram) as part of the second phase of digitisation of old records. The records throw light on the history of the temple, and digitisation might help in researching the records and finding missing links. There is renewed interest in the records because of the discovery of a large quantum of wealth in the temple vaults. Assistant Archivist Ashok Kumar told The Hindu that the State Archives had the largest collection of palm leaf records in the whole of Asia. The Department had plans to digitise all of them so that the information could be preserved (the cadjan manuscripts were susceptible to climatic conditions). The process involved cleaning and scanning of the records and conversion into portable document format.
Source: The Hindu. December 25, 2011. Available at: http://www.thehindu.com/news/states/kerala/article2746895.ece

Islamic Manuscripts Conference at Cambridge University
The Islamic Manuscript Association, the Thesaurus Islamicus Foundation and the Al-Waleed bin Talal Islamic Research Center will jointly hold an Islamic manuscript conference at Cambridge University from July 9 to 11, 2012. According to the Miras Maktoub (written heritage) Research Center, the association invites all interested researchers to submit their articles to the secretariat of the conference on Islamic manuscripts, codicology and conservation, as well as the management of manuscript collections.
Source: The Islamic Manuscript Association. http://www.islamicmanuscript.org/

NOTE FOR CONTRIBUTORS
The journal accepts papers of original research in the field of Library and Information Science which are not under consideration by any other journal, conference proceedings, etc. All papers are subject to blind peer review.
Papers may be submitted, preferably online, to the email id [email protected], or can be sent to the editor on CD. The paper should include an abstract highlighting the problem, methodology and findings, along with three to five keywords. The softcopy should be submitted in MS-Word, and figures, if any, should be submitted as separate graphic files (GIF, JPEG, or PNG format). Authors should follow APA (American Psychological Association) style, 6th edition, for citations and references.

A. For Citation
Busha and Harter (1980) or (Busha & Harter, 1980)

B. For Reference
Book
Busha, C.H., & Harter, S.P. (1980). Research methods in librarianship: techniques and interpretation. New York: Academic Press.

Journal Article
Wei, J., Stankosky, M., Calabrese, F., & Lu, L. (2008). A framework for studying the impact of national culture on knowledge sharing motivation in virtual teams. VINE, 38(2), 221-231. doi: 10.1108/03055720810889851

Essays or Chapters in Edited Books
Mangla, P.G. (1985). Library and Information Science in India. In Gupta, B.M., Guha, B., Rajan, T.N., & Satyanarayana, R. (Eds.), Handbook of libraries, archives & information centers in India (pp. 229-256). New Delhi: Information Industry.

An Internet Document
Applegate, Lynda. M. (2009). Building businesses in turbulent times. Working knowledge: A first look at faculty research. Retrieved March 29, 2011 from http://hbswk.hbs.edu/item/6159.html

The Editor
TRIM, Department of Library and Information Science
University of Kashmir, Hazratbal, Srinagar
India 190 006