Trends in Information Management (TRIM) is a biannual journal of the Department of Library and Information Science, University of Kashmir, India, which aims to publish original papers on various facets of information and knowledge management. The journal welcomes original articles, reviews of books, professional news, etc., relevant to its focus.

TRIM | ISSN: 0973-4163 | July - Dec 2011 | Vol. 7 No. 2
© 2011, Department of Library and Information Science
Frequency: Biannual (June & December)
Subscription (Annual): India: INR 500.00; Foreign: USD 20.00

EDITOR
Professor S.M. Shafi, Department of Library and Information Science, University of Kashmir, Srinagar, J&K, India 190 006

ASSOCIATE EDITOR
Dr. Sumeer Gul, Assistant Professor, Department of Library and Information Science, University of Kashmir, Srinagar, J&K, India 190 006

Online access to TRIM is available through EBSCO HOST's database Business Source Complete (www.ebscohost.com). TRIM is indexed in ulrichsweb.com(TM) -- The Global Source for Periodicals, and in the Cabell Directory: Educational Technology & Library Science database.

Editorial Board
Professor P.B. Mangla, Former Head, Department of Library and Information Science, University of Delhi, Delhi, India
Professor Shabahat Husain, Department of Library and Information Science, Aligarh Muslim University, Aligarh, U.P., India
Professor N. Laxman Rao, Department of Library and Information Science, Osmania University, Hyderabad, Andhra Pradesh, India
Dr. Vittal S. Anantatmula, Associate Professor and Director, Graduate Programs in Project Management, College of Business, Western Carolina University, North Carolina, United States
Dr. Deva E. Reddy, Associate Professor, Texas A&M University, University Libraries, College Station, Texas, USA
Dr. Muhammed Salih, Medical Reference Librarian, National Medical Library, Faculty of Medicine and Health Sciences, United Arab Emirates University, Al-Ain, United Arab Emirates
Dr. Jagdish Arora, Director, Information and Library Network (INFLIBNET), Ahmedabad, Gujarat, India
Professor M.P. Satija, Department of Library and Information Science, GND University, Amritsar, Punjab, India
Professor Jagtar Singh, Dean, Faculty of Education and Information Science, Punjabi University, Patiala, Punjab, India
Professor I.V. Malhan, Head, Department of Library and Information Science, Central University of Himachal Pradesh, H.P., India

Editorial

'Open access' and 'collaboration' may help societies to grow and prosper through the free exchange of ideas and opinions, resulting in the establishment and evolution of an informed society. Open Source Software (OSS) systems have derived much support from this premise and become a global phenomenon, fuelling development and research in wide areas of application across academic, professional and social initiatives. These are emerging and maturing parallel to commercial endeavors, with varied levels of success and failure. Among many fields, Library and Information Studies has a natural synergy with the Open Source Movement, and consequently libraries are frequent users of OSS, though staff may often be unaware of the utilities derived from such options. Hence the Department of Library and Information Science (University of Kashmir), in collaboration with the Department of Computer Sciences (University of Kashmir), decided to organize a National Seminar on the theme, to foster wider understanding and demonstration of its far-reaching implications in the field and its applications to the mainstream library landscape.
We received more than 100 papers and posters for presentation at the three-day National Seminar on Open Source Software Systems: Challenges and Opportunities (OSSS 2011) (20-22 June, 2012), which were deliberated upon in different technical sessions. Later, it was decided to publish select papers in the present special issue of TRIM. These papers can be broadly categorized into three groups, throwing light on philosophy, technical developments and applications. Another cluster is devoted to focused developments around open journals and repositories - a blossom of the convergence of the Internet and the open digital library - in an open access mode.

It will not be out of place to lay down here a brief background on the development of OSS systems and applications, especially in the Digital Library (DL) landscape, to help the reader navigate this issue purposefully. The Open Source Initiative, which holds the trademark on the term OSS, defines it not only in terms of the availability of source code but also of free distribution and of future use with modification and derived works, without any discrimination, under an approved license. Hence, Stallman (2011) has rightly identified four kinds of freedom for open source applications, supported by licensing: the freedom to run the program for any purpose, to adapt it to serve one's needs, to redistribute copies, and to improve the program and release improvements to the community. This extensive emphasis on freedom has led to a recent shift to the acronym FLOSS (Free/Libre and Open Source Software) (Ghosh, 2002), but OSS still seems the more familiar acronym among many initiatives and users.

OSS licenses, which make software available to others, are of particular concern, and one needs to understand them before committing to use the software for any project. The most common license is the GNU General Public License (2007) (GPL), which is based on the concept of "copyleft", an attempt to negate copyright for the purpose of collaborative software development. 'Creative Commons' (n.d.) licensing is similar in spirit to the GPL but is meant for creative works like research papers, and is often utilized within software projects. The others include (a) the GNU Lesser General Public License (2007) (LGPL), (b) the Berkeley Software Distribution (BSD) License (Open Source Initiative, 2010), (c) the Mozilla Public License (MPL) (2011), (d) the Netscape Public License (NPL) (2011), and (e) the OCLC Research Public License (2002).

Besides, the Internet has become so ubiquitous that the greatest participation in open software development occurs over it. Some of the pillars of Internet computing, such as the Sendmail mail server and BIND, the software that runs the Internet's Domain Name System (DNS), are OSS applications. Apache, the most popular web server in the world, is both maintained and enhanced through the open source model. Of all OSS systems available, it is Linux that is most recognizable, identified as the poster child of OSS (Rhyno, 2004). In addition, libraries have a natural synergy with the open source movement, for they serve a wide community of users on a non-profit, publicly funded basis and, like most organizations, are frequent users of open source software. In the present era, the emergence of digital libraries has become a main bridge connecting open source with the shared intellectual property that is the main collection of libraries. It poses tribulations but empowers one to be part of the knowledge society.
Hence, much work and research has emerged, ranging from sorting out content for digital libraries to authoring tools, protocols for exchange purposes, long-term preservation, and more. The content of digital libraries varies greatly, particularly in media type, but one format that cuts across media is XML, which has turned out to be a key enabler, along with metadata, for realizing the value of digital libraries, and has paved the way for the development of semantic web tools like the Resource Description Framework (RDF) and Topic Maps. Important protocols, and the OSS options for using them, help digital libraries communicate with many external systems. It is the Hypertext Transfer Protocol (HTTP) that powers the web to exchange files (text, graphic images, sound, video and other multimedia files). Other distinguished ones include OAI-PMH (the Open Archives Initiative Protocol for Metadata Harvesting), which serves mainly as a transport mechanism between digital libraries, as well as Z39.50, SOAP, RSS, Atom, Shibboleth, etc.

Authoring tools have developed in different dimensions for creating digital versions of objects - mainly image tools and editors that address graphics realistically, with colour depth and compression. Among commercial packages the flagship image tool is Adobe Photoshop, a full-featured and powerful image editor, but a reasonable alternative is GIMP (the GNU Image Manipulation Program). Other OSS tools in this space include ImageMagick, GNU Paint, SANE, Sweep, SoX, Blender, etc. Most interesting and challenging from an OSS perspective is OpenOffice, especially for archiving and preserving the contents of documents, even those produced in other word-processing packages. Open source has also made much headway in other areas, such as relational databases, where MySQL has been the most popular and long-lasting application and runs on all major platforms; others include PostgreSQL, Berkeley DB, etc. It has served well in programming languages too, like Perl, PHP and Python, besides underpinning many public systems such as digital libraries through popular and well-known packages like DSpace, Greenstone, Fedora and EPrints. Many more open systems, like OJS, OCS, OMP and OHS, are offered through PKP (Public Knowledge Project, 2011), breaking ground for managing and freely disseminating scholarly publications at different levels, eventually turning into a boon for academic and research endeavors.
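Of these protocols, OAI-PMH is the simplest to see in action: a repository or an OJS journal exposes its metadata at a base URL, and a harvester issues plain HTTP GET requests against it. The following is a minimal sketch in TypeScript (Node 18+ or Deno); the endpoint is a placeholder rather than a real repository, and a production harvester would parse the XML properly and follow resumption tokens:

    // Minimal OAI-PMH harvest sketch. The base URL is a placeholder;
    // any OAI-PMH-compliant repository or OJS journal exposes one.
    const BASE = "https://repository.example.org/oai";

    async function listRecords(metadataPrefix = "oai_dc"): Promise<void> {
      // Every OAI-PMH request is an ordinary HTTP GET with query parameters.
      const url = `${BASE}?verb=ListRecords&metadataPrefix=${metadataPrefix}`;
      const xml = await (await fetch(url)).text();

      // A regex is enough to sketch the idea; real harvesters XML-parse
      // the response and follow <resumptionToken> elements.
      const titles = [...xml.matchAll(/<dc:title>([^<]*)<\/dc:title>/g)]
        .map((m) => m[1]);
      console.log(`Harvested ${titles.length} Dublin Core titles`);
      titles.slice(0, 5).forEach((t) => console.log(" -", t));
    }

    listRecords().catch(console.error);

The same verb-and-prefix pattern underlies the harvesting that populates aggregators and union catalogues built over digital libraries.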
I hope that more seminars and symposia will concentrate on these challenges and explore opportunities to address key issues in the emerging society, making services more transparent, responsive and user friendly. I am grateful to all my colleagues, especially Dr. Sumeer Gul, Mr. Nadim Akhtar Khan and Mr. Tariq Ahmad Shah, without whom it would not have been possible to organize the seminar and bring out this issue. They burnt the midnight oil coordinating activities and later selecting and editing papers for the present issue. My sincere gratitude goes to the University of Kashmir authorities, the J&K Government (IT Department) and J&K Bank Ltd. for sponsoring the event, which also proved supportive in bringing out the present publication.

S. M. SHAFI

References
Creative Commons. (n.d.). About the licenses. Creative Commons. Retrieved from http://creativecommons.org/licenses/
Ghosh, R. A. (2002). Workshop report on advancing the research agenda on free / open source software. University of Maastricht, The Netherlands: International Institute of Infonomics. Retrieved from http://flossproject.org/report/workshopreport.htm
GNU Lesser General Public License. (2007). GNU Operating System. Retrieved from www.gnu.org/copyleft/lesser.html
General Public License (GNU). (2007). GNU Operating System. Retrieved from http://www.gnu.org/licenses/gpl.html
Mozilla Public License (MPL). (2011). Mozilla. Retrieved from http://www.mozilla.org/MPL/
Netscape Public License. (2011). Amendments. Retrieved from http://www.mozilla.org/MPL/NPL/1.1/
OCLC Research Public License. (2002). OCLC. Retrieved from http://www.oclc.org/research/activities/software/license/v2final.pdf
Open Source Initiative. (2010). The BSD 2-Clause License. Open Source Initiative. Retrieved from http://www.opensource.org/licenses/bsd-license.php
Public Knowledge Project. (2011). Retrieved from http://pkp.sfu.ca/
Rhyno, A. (2004). Using open source systems for digital libraries. Westport, Conn.: Libraries Unlimited.
Stallman, R. (2011). Richard Stallman's personal home page. Retrieved from http://stallman.org/

CONTENTS
Editorial (i-v) - S. M. Shafi
Changing Designer-Function in Open Source - Gagan Deep Kaur (74)
Open Access Journals in Library and Information Science: The Story so Far - Reyaz Rufai, Sumeer Gul and Tariq Ahmad Shah (87)
Graph Based Framework for Time Series Prediction - Vivek Yadav and Durga Toshniwal (98)
Quality Practices in Open Source Software Development Affecting Quality Dimensions - Sheikh Umar Farooq and S. M. K. Quadri (108)
Open Source Tools for Varied Professions - Nadim Akhtar Khan (127)
Analysis of Operating Systems and Browsers: A Usage Metrics - Mohammad Ishaq Lone and Zahid Ashraf Wani (139)
Institutional Repositories: An Evaluative Study - Tabasum Hashim and Tariq Rashid Jan (152)
Open Source Code Doesn't Always Help: Case of File System Development - Wasim Ahmad Bhat and S.M.K. Quadri (160)
A New Approach of CLOUD: Computing Infrastructure on Demand - Kamal Srivastava and Atul Kumar (170)
Einstein's Image Compression Algorithm: Version 1.00 - Yasser Arafat, Mohammed Mustaq and Mohammed Mothi (179)
Open Source Software (OSS): Realistic Implementation of OSS in School Education - Gunjan Kotwani and Pawan Kalyani (186)
Measurement of Processes in Open Source Software Development - Parminder Kaur and Hardeep Singh (196)
Open Source Systems and Engineering: Strengths, Weaknesses and Prospects - Javaid Iqbal, S.M.K. Quadri and Tariq Rasool (206)
Appraisal and Dissemination of Open Source Operating Systems and Other Utilities - Satish S. Kumbhar, Santosh N. Ghotkar and Ashwin K. Tumma (216)
Morphological Analysis from the Raw Kashmiri Corpus Using Open Source Extract Tool - Manzoor Ahmad Chachoo and S. M. K. Quadri (225)
Open Access Research Output of the University of Kashmir - Asmat Ali, Tariq Ahmad Shah and Iram Zehra Mirza (237)
Book Review - Iram Zehra Mirza (244)
News Scan - Sumeer Gul (246)

Changing Designer-Function in Open Source
Gagan Deep Kaur *

Abstract
Purpose: Open source, since its arrival in the 1990s, has been instrumental in challenging the copyright regime of traditional texts and narratives. The dissolution of the author-function heralded by post-modern philosophers like Roland Barthes has been fully materialized by open source technology.
However, along with textual narratives, a similar dissolution can be seen in pictorial representations as well, such as the design of a website. Open source makes available technologies which place not only the content but also the presentation of a website, for instance, in an equal stage of dissolution. This paper explores this trend for the presentation part and argues that, like the author-function, the designer-function is in jeopardy.
Design/Methodology/Approach: For this purpose, two open-source Mozilla Firefox extensions, AddArt and ShiftSpace, have been used to bring home their impact in altering the presentation of various websites.
Findings: The tools have been found to be significantly effective in modifying the way content is presented on a website without the owner's permission. The modifications effected by these tools are presented at the end of the paper in the form of various snapshots.
Research Implications: The implications of these tools can range from the purely web-ethical to the political. The tools not only raise ethical concerns of privacy and of ownership of the way information is to be presented by its owners, but politically they also present a trespassing avenue through which capitalist ideas of ownership of information through copyright can be subverted.
Originality/Value: The paper attempts to bring to the forefront the ethico-political concerns associated with innocuous-looking open-source browser extensions like AddArt and ShiftSpace. Considering these concerns can provide further criteria for evaluating such extensions, apart from the purely technical ones which are the norm at present.
Keywords: Designer-Function, AddArt, ShiftSpace, Web Aesthetics, Open Source
Paper Type: Research

* Research Scholar. Indian Institute of Technology, Bombay, India. e-mail: [email protected]

Section 1: Salvaging the Author-Function
Disruptive technologies like the internet led to the subversion of power structures on the one hand, and of the mechanisms of knowledge production and access on the other. Stallman's Free Software movement, in the 1980s, opened up a new horizon where knowledge access, generation and distribution became a collective concern. With the Open Source (OS) Initiative opening its wings in the early 1990s, users marveled at and appreciated the easy access to knowledge sources, which came to be seen as societal property instead of one individual's or corporation's intellectual property (Vainio & Vaden, 2007). The benefits of the freedom to view source code, to modify and redistribute derivative works, to integrate and experiment with different products, and to do away with huge licensing hassles captured the collective imagination, and collaborative sharing came up as a new model of production. It quickly seeped into non-technical sharing as artistic works like texts, sound recordings, songs, movies and paintings came to be shared, modified and distributed online. Creative Commons (CC) supported the free access and distribution of original and derivative works in these spheres through a range of licenses. At the same time, however, we see apprehensions growing in non-OS [1] quarters regarding the disappearance of the 'author' from the scene, as derivative works could easily be made from the original, at low cost, without prior permission, and commercially distributed as well.

[1] 'Author' is read here as writer, coder or programmer, as related to the particular kind of work.
The post-modern philosophy of authorship enunciated by Barthes and Foucault was a major precursor of free software initiatives, and consequently of such apprehensions. Prior to the emergence of cyberspace, Barthes (1977) persuasively argued for the death of the author in narratives, as a text is continually re-interpreted by its readers. The author was reduced to an author-function following Foucault's (1979) proposal, since an author draws heavily on the background cultural context in creating anything. These ideas acted as catalysts for doing away with the copyright regime in the debate that ensued between the advocates of free software and of proprietary software. The emergence of open source materialized these apprehensions. Since open source is a platform for sharing and distributing intellectual material freely, without proprietary rights attached to authors as strictly as in copyright regimes, the apprehension of the death of the original author, who by default holds proprietary rights over his intellectual property, comes out as inevitable. In the case of non-textual, aesthetic works on the web, such as paintings or website designs, open source renders the designer of the original work (painting or webpage) virtually non-existent. In analogy to the author-function, it is the designer-function that is at a loss in web aesthetics.

It is argued in this context that the apprehension of the author's disappearance from the scene is by and large a myth regarding open source, as the author is very much at the centre of this model of knowledge production. Though open source does allow the freedom of accessing and modifying text-based works, narratives or software programs, it does not do so at the cost of deleting the original author from the scene. This is proved by the fact that only those materials are left for free accessing and modifying which already bear the stamp of the author, i.e. which are already protected under copyright, as shown by Liang (2007). Even when released under a Creative Commons license, derivative works and fresh manuscripts put the author at the centre and grant a series of protective filters, which can be granted as rights to users regarding the use of these products. These filters grant users various rights, ranging from the right of free access, to copying, to modifying, to all of them. This means that if a work has been released under a license allowing users only to access the work freely, any modification and distribution done by them would be considered illegal. How far the author is comfortable with users' use of the work depends entirely on the author himself. A cursory glance at the various licenses shows this:

Creative Commons Attribution License (CCAL) - Authors retain ownership of the work, but freedom to access, modify and distribute the work is allowed without prior permission. CC itself attaches a range of freedoms to its licenses, from mere access to full use of the work by the user. The range includes:
- Attribution (cc by) - Full rights to the user, including distribution, remixing, tweaking and commercial use, provided credit is given to the original author.
- Attribution no derivatives (cc by-nd) - Allows redistribution, commercial or non-commercial, as long as the work is passed along unchanged and credit is given to the original author.
- Attribution non-commercial (cc by-nc) - Allows access, distribution and building upon the work, but only for non-commercial usage.
Derivative works made from a user's modified work need not carry the non-commercial condition.
- Attribution non-commercial share alike (cc by-nc-sa) - All the above freedoms, with the condition of non-commercial usage; derivative works must be licensed on the same terms and cannot be used commercially either.
- Attribution non-commercial no derivatives (cc by-nc-nd) - The strictest of all. Allows redistribution with credit to the author, but no derivative works or commercial use is allowed (Licenses, 2010).

GNU GPLv3 - Freedom to access, modify and redistribute the code, without commercial gain (GNU General Public License, 2007). This is generally applied to software code, programs, etc.

Free Art License - Allows the user to access, modify and redistribute creative works of art as long as proper credit is given to the original creator/designer/author. It is in line with the Berne Convention for the Protection of Literary and Artistic Works. (See Akinci, 2007 for a detailed discussion of these licenses.)

Thus, the original author is preserved in the scene by way of appropriate credits, and the credits of the authors of derivative works are preserved similarly. The author-function is not at a loss even if the work is released commercially via open source. Rather, the author is now more secure, as it is the author who decides what range of freedom he wishes users to have regarding the appropriation of his work. The broad range of authors' rights for creative works provided in the Creative Commons licenses testifies to this security.
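Viewed this way, the protective filters amount to a small permission matrix. A minimal illustrative sketch in TypeScript, under deliberately simplified terms (the legal code of each license, not this table, is authoritative):

    // Illustrative only: the CC "protective filters" as a permission matrix.
    // Terms are simplified; attribution is required by every license listed.
    type Use = "share" | "derive" | "commercial";

    const ccPermissions: Record<string, Set<Use>> = {
      "cc-by":       new Set<Use>(["share", "derive", "commercial"]),
      "cc-by-nd":    new Set<Use>(["share", "commercial"]), // no derivatives
      "cc-by-nc":    new Set<Use>(["share", "derive"]),     // non-commercial only
      "cc-by-nc-sa": new Set<Use>(["share", "derive"]),     // share-alike, non-commercial
      "cc-by-nc-nd": new Set<Use>(["share"]),               // strictest
    };

    function allows(license: string, use: Use): boolean {
      // Anything not expressly granted stays with the author.
      return ccPermissions[license]?.has(use) ?? false;
    }

    console.log(allows("cc-by", "commercial"));    // true
    console.log(allows("cc-by-nc", "commercial")); // false

The point of the encoding is simply that every row keeps the author in the scene: the user's rights are whatever the author has switched on, nothing more.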
Section 2: Designer-Function in Cyberspace
The designer-function, pertaining to non-textual and especially visual arts, is represented by webpage designing, graphics and image generation, and is analogous to the author-function for textual content. Contemporary culture is highly visual-symbol centric (Thorlacius, 2007a), and visual symbols are fast replacing information presented via textual narratives. A simple case of news websites illustrates this, where it is almost "impossible to read history from an internet archive [for instance] without constant intrusions by the latest news and banners… the event is grasped visually and there is nothing to comprehend or interpret in it". Banners, thumbnails, creeping lines with breaking news and photo galleries are dominant elements of visual aesthetics on a news portal. Similar stories can be seen in mass-serving portals like shopping, entertainment and consumer-product sites. Web aesthetics pays crucial attention to these visual symbols in the design of such portals and is a fast-emerging and important aspect of Human-Computer Interaction (Tractinsky, 2005). Indeed, on any website visual aesthetics are considered on a par with the site's functionality, user-friendliness, etc.

However, rapidly increasing user control over visually presented content like images and advertisements is replacing the designer with a designer-function. This is because not only the content but the form of the site as well is coming under the user's hold. Earlier, it was the owner of the site who decided what information was to be displayed and how. The opening of the code empowered the widely distributed network of users to access, modify and redistribute the what-part, or content, of the software. The how-part relates to the way information is presented on a single webpage in terms of its layout, design, etc., which was earlier by and large in the control of the web designer at the back. Modifications to this form were delegated to the designer. A bit of customizing ability would be given to users, mostly related to changing the default colors or background themes, as exemplified in email accounts or social networking profiles. Still, what kind of themes or colors a user could use on a site was by and large decided by the designer-cum-owner of the site at the back. Moreover, this customization was facilitated for users' personal accounts only, not the general form of the website. For example, users are not permitted to change how the Gmail webpage looks! A steady form over the years builds an image of the product in the minds of users, as compared to sites that change their look every two months. A broad blue horizontal strip with a wide patch of world map dotted with orange busts on the left instantly connects users to Facebook. Since the form is designed by the site-owner-cum-site-designer, the user has to agree to whatever form he is presented with. Even if a large body of users does not like a particular form, they have no choice but to yield. They cannot decide, for example, what images should appear on a webpage, how the page logo should appear, what color pattern should be adopted, and so on. Ideologically, the complete ownership of form by the designer presents a dictatorial regime where traffic is unidirectional - from one designer to many users, from a closed top to an expanding bottom in a pyramid.

ShiftSpace, an open source Firefox extension, inverts this style by allowing users to be participatory designers in this power structure. By allowing users to alter the way a web page appears to them, it challenges the monopoly of the owners as far as the form of the work is concerned. In the process, it replaces the traditional notion of the designer (the webpage designer at the back) with a designer-function, as any lay user is given the opportunity as well as the resources to design the form for himself and share it with others. It empowers the lay user to be a designer of form as well as of content not owned by him. The ideology behind this trend is to allow users more freedom in regulating their web experience. Piggybacking on a Greasemonkey script, ShiftSpace is a JavaScript program that, once invoked, modifies the web page's code and installs its own functions, which enable the user to perform various actions on the web page such as highlighting text, creating sticky notes and swapping images. These functions are called Spaces, and the content generated by them is called Shifts, which are publicly available. That means other users with the same script installed can see your Shifts. By overturning the balance in favor of the user, ShiftSpace makes the web experience literally interactive compared with earlier times, when web pages were given to users by designers and users reacted to them passively, i.e. users had no authority over how they wished to see a particular web page; that was decided by the owners at the back. Co-founder Mushon Zer-Aviv explains ShiftSpace's ideology thus: "While the internet's design is widely understood to be open and distributed, control over how users interact online has given us largely centralized and closed systems. ShiftSpace attempts to subvert this trend by providing a new public space on the web" (Zer-Aviv, 2007). ShiftSpace is thus supposed to be a utilitarian add-on with the explicit aim of giving the user an edge over the webpage form.
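What follows is not ShiftSpace's actual code but a minimal TypeScript sketch of the kind of client-side DOM rewriting a Greasemonkey-style Image Swap performs. It runs entirely in the visitor's browser after the page loads, needing no cooperation from the server - which is precisely why the site owner's design offers no defense against it:

    // Sketch of a Greasemonkey-style "Image Swap" (not ShiftSpace's code).
    // Alt+click once to grab an image, Alt+click again to swap it in.
    let grabbed: string | null = null;

    function attachSwapHandles(): void {
      document.querySelectorAll<HTMLImageElement>("img").forEach((img) => {
        img.addEventListener("click", (ev) => {
          if (!ev.altKey) return;
          if (grabbed === null) {
            grabbed = img.src;   // "Grab": remember this image's source
          } else {
            img.src = grabbed;   // "Swap": overwrite the clicked image
            grabbed = null;
          }
          ev.preventDefault();
        });
      });
    }

    attachSwapHandles();

Persisting such swaps to a shared server, as ShiftSpace's Shifts do, is only a small additional step once the page itself can be rewritten at will.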
However, its intrusion into web space looks more like the overthrow of privately-owned capitalist space by forcible open-source socialism. It allows the user to create spaces barging into the content provided by the webpage owners, altering its form. A user may add personal comments in these spaces and see others' personal comments. It is its Image Swap function that alters the form of the page, by allowing visual elements like images to be swapped with any other images on the web. Once the Image Swap function is invoked, the top left corner of every image on the page, be it a simple image, a logo or an advertisement, gives the user the options Swap and Grab. With Grab you pick an image, and with Swap you paste it down over any other image. The end may see the user viewing the web page in a completely different way. The user is more in control of the aesthetic elements of the page than before.

The aesthetic experience is undoubtedly enriched; however, Image Swapping poses potential harm to various stakeholders. First, most obviously, it is an intrusion into the traditional designer's zone, which was earlier inaccessible. It is more like trespassing into a forbidden area, whatever the ideology behind it may be. As discussed earlier, open source is not against copyrighted material per se, nor about making it open without the consent of its legal owners. It is against the proprietorship of knowledge - the restrictions posed by the copyright regime which restrain the free distribution of knowledge. If form and content both make a work complete, sanctity should be preserved in form as well. Whereas the author-function does not trespass into private arenas, courtesy of open source licenses, the designer-function does, as the trespassing is into a protected zone without permission, and the derivatives made are shared with other users under the Shift creator's ID - without any credit to the original designer, of course!

Secondly, where the images relate to advertising, Image Swapping comes into direct conflict with advertisers. AddArt, another open source Firefox extension, for example, replaces the advertising images on a web page with artistic images provided routinely by its developers. With Adblock Plus in its entourage, the advertising images are first blocked and subsequently replaced by AddArt. The aim is to empower the user to regulate what sort of material he would like to see on a webpage. Once the extension is installed, every ad-image is replaced with an artistic image by AddArt. Together, ShiftSpace and AddArt, with their image swapping, lead to the dissolution of the designer-function. It is argued here that AddArt does its so-called utilitarian duty without prior permission either from the owners of the website or from the advertisers. Consequently, it turns out to be more an intrusive than a utilitarian tool. Being an open source product, it creates a conflict between the ideology of OS as a movement and proprietary regimes, and makes websites a battle zone of their conflicts. Not only advertisers but also hosting page-owners stand in opposition. Who would like to advertise on pages where the ads are to be replaced by AddArt, or swapped by ShiftSpace? OS aims at the free distribution of works and allows commercial distribution of derivatives, provided permission is obtained from the original authors and appropriate credit is given.
It opposes the proprietorship of software, but does not forcibly prohibit its commercial sale. AddArt does precisely that, albeit indirectly. It forcibly inhibits the advertisements appearing on others' webpages in the name of providing a greater aesthetic experience to the user. Replacing ad-images with their works prevents them from reaching viewers, and thus the potential sale of the advertised products. This stands in contradiction to OS as a movement. Apprehending flak from the advertisers' side regarding the potential harm to the advertising industry, the developers of AddArt defend their right of intrusion by saying:

"Oh stop, you flatter us! Seriously, here's some things to consider: You downloaded the page, and you own it. It's yours and you can do whatever you want to it. Just like if you get a free newspaper, you can read it, or cut it up, or burn it. It's your life and you have no legal obligation to look at every ad presented to you. People that use Ad Blocking software are not people that click on ads or even respond favorably to them. There is no loss in the market when these users block ads. If we extend the logic of Ad Blockers destroying the free iternet then online ad blocking pales in comparison to the number of people destroying the television industry by going to the bathroom during commercial breaks, thereby stealing that content from the television companies. Don't waste your time with us and go complain to them. Add-Art just replaces blocked ads with art. We didn't write the code that blocks ads, we just piggyback onto it …and enthusiastically support it. If you want to complain about Ad Blockers, talk to the people at AdBlock Plus. But know that they've heard it all before and after hearing all those areguments they still don't agree, so you might just save your energy and do something else." [Spelling mistakes in original] (AddArt, n.d.)

Their newspaper argument is unwarranted because, even if we own the newspaper copy, we are not entitled to paste our own news items or pieces onto it under its banner and re-circulate them. What we are entitled to is ownership of the "physical medium", as the German philosopher Gottlieb Fichte (1791) says, and not ownership of the "content".
The choice of colors, the layout, the positioning is all specifically designed according to the kind of advertisement and its placement on the site alongwith the site-owner’s public profile. The principles for this are neatly governed by web aesthetics. Further, the aesthetic effects should be, “adapted to the target audience. A presentation site targeting a young audience must be designed in accordance with the contemporary trends in visual aesthetics and should TRIM 7 (2) July - Dec 2011 81 Changing Designer-Function in Open Source Kaur differ from a presentation site that targets the general adult population.” (Thorlacius, 2007b, p.67). AddArt replacements outrageously disregard all these aspects of web aesthetics, namely image-content harmony, the owner’s image, genre and target audience. With apparent incompatibility between original work of art (in terms of shape, coloring, layout of the original image) that AddArt seeks to replace and the ad-space wherein that to be replaced, the result is almost nauseated as is shown in Fig. 2. Fig. 2: Aaj Tak website with replaced images by AddArt (Date: 07-09-2010) Note: The incompatibility between by replaced images and the Ad-space. Consequently, the page is rendered more hideous than what it originally was with advertisement. Advertisements had one stakeholder having gains at least, AddArt leaves no-one, at least not the users! Unlike AddArt, paradoxically, it is advertisements that serve a utilitarian function by giving information about a product even if somebody chooses not to buy. The often disgusting images chosen by AddArt have what? Though majority of the websites are created with the parameters of functionality and user-ease, yet significance of aesthetics, esp. visual aesthetics, can’t be downplayed in the information they aim to convey. This is because of the increasing role the visual symbols play in contemporary culture as discussed above. Moreover, the sites whose primary objective is not functionality, but aesthetics itself, can be TRIM 7 (2) July - Dec 2011 82 Changing Designer-Function in Open Source Kaur seriously damaged by tools like these. The sites in this genre include, cultural & heritage sites, art galleries, art museums sites etc whose main purpose is not to give the factual information to the user as news or shopping portals do, but visually present the cultural repertoire of a specific region or ethnic group. AddArt degrades the user’s experience horribly in those sites (Fig. 3). Fig. 3: Royal Academy of Arts Museum, UK website with replaced images by AddArt. (Dated 10-09-2010) As regards the site-owners, it is a clear intrusion into their rights to ownership of form. Without their permission, contents removed and page distorted in the process. This subversion explicitly goes against the open source ideology. Open source is about creating platform for free sharing of sources, creating spaces and not forcibly trespassing into the others’ private zones. Whereas ShiftSpace creates spaces forcibly into the form not owned by the users, AddArt distorts that form. Image Swap done by both of them is an intrusion into the designer’s right to the form created by him. With designer dissolving into designer-function, the latter too comes to naught with ShiftSpace and AddArt. This dissolution is beyond salvage unlike author-function because the replacements, content-generation and swaps can be observed globally and shared with other users via ShiftSpace Server. 
That is, the Shifts created by one user on the BBC's website, for example, whether notes or swaps, can be interactively shared by other users of ShiftSpace, thereby providing an altogether new experience of the site (Fig. 4 & Fig. 5). Once logged in, one can see all the Shifts created on a particular site by all the users of this tool. Unlike AddArt, ShiftSpace empowers users to customize image replacement and also enables them to share their adventure with other users by saving their Shifts on the ShiftSpace server.

Fig. 4: Shifts created by the author and other users on the BBC site, along with an Image Swap by ShiftSpace (Date: 09-09-2010)

Fig. 5: Original BBC page before invoking ShiftSpace (Date: 09-09-2010)

Further, there is an ideological conflict between the aims of AddArt and its workings. Its philosophy is to spare the user unwanted advertisements by replacing them with art images. Since the user is not in control of the kind of art he is going to see, an equal rejoinder can be made to AddArt that, because of it, users are made to see the art the developers want to show, as the artistic images are not selected by the users themselves. If their ideology is to relieve the user of forcible and annoying advertisements, their artistic images fall equally into the same trap, for these too are forcible and annoying most of the time. The conflict is not just between the ideologies of these tools and the OS movement as such, but among the tools as well. The two tools can come into mutual conflict when used one after the other. Of what use is the entire effort if AddArt replaces an ad with an artistic image, and ShiftSpace then changes that with its Image Swap - say, with the same image picked from another website! This cycle seems to be of nobody's use in the end (Fig. 6).

Fig. 6: AddArt artistic image swapped by ShiftSpace Image Swap - a double replacement. First AddArt replaces the ad-image with its art image, then ShiftSpace replaces that art image with another art image (Date: 07-09-2010)

Conclusion
Though as an OS tool ShiftSpace contributes to creating space over the web, thus serving a utilitarian purpose, the same cannot be said about its Image Swap function, which is more an intrusion into others' privately owned space. This intrusion is gross and complete with AddArt, another OS tool. Together, both of these lead to the dissolution of the designer-function in cyberspace, a practice which serves nothing of utility to either the designer or the end user. On the one hand they distort the form of the webpage completely; on the other, they ruin the aesthetic experience of the user as well.

References
AddArt. (n.d.). F.A.Q's. Retrieved from http://add-art.org/f-a-q-s
Akinci, I. O. (2007). Politics of copyleft: How do recent movements altering copyright in software and in art differ from each other. Artciencia.com, II (6), 1-19. Retrieved from http://www.artciencia.com/index.php/artciencia/issue/view/11
Barthes, R. (1977). Image, music, text. Steven Heath (Trans.). UK: Fontana Press.
Fichte, G. (1791). Proof of the illegality of reprinting: A rationale and a parable. Martha Woodmansee (Trans.). Retrieved from www.case.edu/affil/sce/authorship/Fichte,_Proof.doc
Foucault, M. (1979). What is an author? In Josue V. Harari (Ed.), Textual strategies: Perspectives in post-structuralist criticism (pp. 141-160). USA: Cornell University Press. (Originally published in 1970)
GNU General Public License, Version 3. (2007). Retrieved from http://www.opensource.org/licenses/gpl-3.0.html
Liang, L. (2007). Free/open source software open content (e-primer). UNDP Asia-Pacific Development Information Programme. Retrieved from http://www.apdip.net/publications/fosseprimers/fossopencontent-nocover.pdf
Licenses. (2010). Retrieved from http://creativecommons.org/about/licenses/
Tractinsky, N. (2005). Does aesthetics matter in human computer interaction? Mensch & Computer 2005, pp. 29-42. Retrieved from http://mc.informatik.uni-hamburg.de/konferenzbaende/mc2005/konferenzband/muc2005_02_tractinsky.pdf
Thorlacius, L. (2007a). The role of aesthetics in web design. Nordicom Review, 28, pp. 63-76.
Thorlacius, L. (2007b). The role of aesthetics in web design. Nordicom Review, 28, p. 67.
Vainio, N., & Vaden, T. (2007). Free software philosophy and open source. In K. S. Amant & B. Still (Eds.), Handbook of open source software: Technological, economic and social perspectives (pp. 1-11). USA: Information Science Reference.
Zer-Aviv, M. (2007). ShiftSpace: Thesis paper. Retrieved from http://itp.nyu.edu/projects_documents/1178942414_ShiftSpace_thesis_paper.pdf

Open Access Journals in Library and Information Science: The Story so Far
Reyaz Rufai *, Sumeer Gul **, Tariq Ahmad Shah ***

Abstract
Purpose: The Internet has triggered the growth of scholarly publications, and every discipline is witnessing unremitting growth in the scholarly market. Open access, a product of the Internet, has also captured disciplines globally. Library and Information Science is likewise witnessing dramatic growth in the open access field. The study explores the status of open access titles in the field of Library and Information Science (LIS) and features various characteristics of such titles.
Design/Methodology/Approach: A systematic method for characterizing open access titles in the field of Library and Information Science was carried out by extracting data from the Directory of Open Access Journals (DOAJ), Open J-Gate, and Ulrichsweb.com.
Findings: The results clearly reveal an expanding growth of open access titles in the field of Library and Information Science. Commercial publishers have also joined hands as open access market players. The indexing policies of OA titles in LIS need to be restructured, and low-income nations have yet to emerge in the OA bazaar.
Research Implications: The study will be helpful to researchers in exploring open access titles in the field of LIS. Furthermore, it can act as an eye opener for the scholarly world regarding the real status of open access titles in the field.
Future Research: Future research can be carried out to trace innovative trends in LIS open access journals.
Keywords: Open Access; Library and Information Science; Open Access Journals; Open Access-Growth-Development
Paper Type: Research

Introduction
Scientific publishing is undergoing significant changes due to the growth of online publications and increases in the number of open access journals (Voronin, Myrzahmetov & Bernstein, 2011). The concept of open access (OA), which opened new dimensions in the information communication cycle, has been widely accepted all over the world.
Open access, which provides free access to information content, is widely expanding its domain because of the enormous benefits accrued from it.

* Librarian. Allama Iqbal Library, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]
** Assistant Professor. Dept. of Library and Information Science, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]
*** Research Scholar. Dept. of Library and Information Science, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]

It is a blessing for everyone involved in the information communication process, and the growth and development of OA journals has been one of the World Wide Web's success stories. With only five journals offering open access to their contents in 1992 and 1,200 in 2004 (Falk, 2004), the number had reached more than 7,000 as of December 1, 2010 (Directory of Open Access Journals, 2010). Different authorities have highlighted this budding concept in different ways. One of the most lucid definitions of open access is provided by the Budapest Open Access Initiative, which states that open access is the free availability of articles on the public internet, permitting any user to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself (Budapest Open Access Initiative, 2002). The Association of Research Libraries (ARL, 2007) defines open access as any dissemination model created with no expectation of direct monetary return which makes works available online at no cost to readers. Suber (2003), an important and well-renowned authority on open access, defines it as the free online availability of scholarly literature. Lynch (2006) describes open access as an increased elimination of barriers to the use of the scholarly literature by anyone interested in making such use. McCulloch (2006) sees the open access movement as an attempt to reassert control over publicly funded research in order to achieve "the best value" and make such research output transparent and freely accessible. Nicholas, Huntington and Rowlands (2005) elaborate on the value of such activity by stressing that it makes it possible to "read, download, copy, distribute and print articles and other materials freely".

The free availability of research is tempting researchers to embrace the open access revolution warmly. A number of advantages, ranging from wider visibility to higher citation, have made open access so popular among researchers that the pace of open access publishing is accelerating day by day. Highly ranked publications like Nature, the Wall Street Journal and The Scientist all ranked open access among their top stories in 2003 (Willinsky, 2006). Initially, strong resentment was seen from the publishing industry, which regarded open scholarship as a great threat to its business. But with the passage of time, leading publishers also joined the open access bandwagon because of the innumerable potentialities attached to it. Leading publishers like Elsevier, Oxford, Taylor and Francis, Sage, Springer and many more made some of their content freely available to readers. Projects like HINARI, AGORA and OARE, which made scholarly content freely available to developing economies, also helped to propagate the cause of open access, i.e. information for all.
Projects like HINARI, AGORA, and OARE etc that made the scholarly content freely available to developing economies also helped to propagate the cause of TRIM 7 (2) July - Dec 2011 88 Open Access Journals in LIS… Rufai, Gul & Shah open access, i.e. information for all. Scholarly and scientific journals are now enjoying flavours of open access and are growing at an escalating rate day by day. Open access journals have in this relatively shorter span of time won the hearts of the elements associated with the rim of open access. With leading publishers and reputed universities their count is growing at a very fast rate. The serial crisis that was the outcome of spurting economy has also been solved by open access platform. However, open access is gaining popularity day by day and every subject has been positively affected by it. Social Sciences, which deal with the various facets of society in relation to man, are also embracing this concept with open arms. Scholars in the various fields of Social Sciences, including Library and Information Science are contributing to open access journal revolution because of innumerable benefits adhered to it. Review of Literature A number of studies have been carried that highlight various facets of open access. Falk (2004) studied that 1200 open access journals were available on the Web as compared to a total of only five in 1992. Deals between publishers can be one of the catalytic forces in the increase of open access journals. Development of open access journal publishing has also been researched by Laakso, Welling, Bukvova, Nyman, Björk & Hedlund, 2011). A steady rate of increase of the open access journals has also been witnessed by number of authorities. Many carry on studies were also conducted to trace the growth and development of open access journals (Wells, 1999; Crawford, 2002; Gustaffson 2002 (as cited in Laakso, Welling, Bukvova, Nyman, Björk & Hedlund, 2011; Morris, 2006; Dramatic Growth of Open, 2007; Gul, Wani & Majeed, 2008; Ware & Mabe, 2009) A study by McVeigh (2004) documents that the number of open access journals in the citation indexes provided by ISI Thomson™ is growing, both in terms of creating new titles and conversion of established titles. Open access journal publishing in different fields is also studied by Borgman (2007). The open access platform provided by publishers has also been studied by Dallmeier-Tiessen, et al, (2010). Recent studies have explored a dramatic growth of open access journals (Happy, 2012…, 2011; Provençal, 2011; The challenges of success…, 2011; Illustrations of the global…, 2012). Problem Millions of scholarly articles are appearing on the Web but due to number of restrictions, access to them can’t be availed every time. Out of them, a large number of articles are useful for LIS research and TRIM 7 (2) July - Dec 2011 89 Open Access Journals in LIS… Rufai, Gul & Shah development that appear in different journals from time to time. Open access journals that provide free access to the research have made their debut to provide ease in access to the research. Day by day, these journals increase at a very fast rate on the Web. The study will encompass the development of open access journals in the field of LIS. Objectives The main objective was to study how open access journals in the field of LIS are experimenting with features like publishing origin, publishing models, language usage, visibility, article processing, and status concerns. 
Scope
The study was undertaken to visualize the position of the LIS field in this epoch of open access, which has revolutionized the entire world in a short duration of time since it started with a meeting convened by the Open Society Institute at Budapest, Hungary, in December 2001 (Budapest Open Access Initiative, 2002).

Methodology
In order to ascertain the number of OA journals published in the field of Library and Information Science (LIS), three authoritative and authentic databases were consulted: Lund University's Directory of Open Access Journals (DOAJ), Serials Solution's Ulrichsweb.com, and Informatics India Private Limited's Open J-Gate. As on June 10, 2011, DOAJ indexed 117 titles in the field of LIS, Ulrichsweb.com 93, and Open J-Gate 66 peer-reviewed journals. The titles from the three databases were clubbed together and repeated titles were removed, in order to avoid the risk of duplication and to achieve an accurate and realistic number. Each title was further checked manually on its respective website, and a number of discrepancies were found in the lists of Open J-Gate and Ulrichsweb.com, such as:

Wrong Classification
Journals that belong to the fields of Computers and Education were tagged by Open J-Gate under the field of LIS:
- International Journal of Peer to Peer Networks (original subject: Computers)
- Transformations: Liberal Arts in the Digital Age (original subject: Computers)
- International Journal of Educational Technology (original subjects: Computers & Education)
- Journal of Research on Technology in Education (original subjects: Computers & Education)
- Current Issues in Education (original subject: Education)
- Turkish Online Journal of Distance Education (original subject: Education)

Trade Journal instead of Scholarly
By open access we mean scholarly, peer-reviewed publications, not trade journals. Open J-Gate tagged one journal - Idaho Librarian (ISSN: 2151-7738) - as OA when its contents were of a trade rather than scholarly nature.

Embargo Period / Access to Select Issues Only
An embargo period, which denotes a time lag between the most current issue or volume published and the content freely available on the public web, is against the very spirit of the open access movement. OA journals provide free access not only to the current issue or volume but also to back issues. However, the Journal of the University Librarians Association of Sri Lanka (ISSN: 1391-4081) provides free access to back issues only; the current issue is available up to abstract level. Tushu Zixun Xuekan (parallel title: Bulletin of Library and Information Science, ISSN: 1023-2125), which is tagged by Ulrichsweb.com as an open access journal, also provides free access to back issues only. Besides, Law Library Journal (ISSN: 1024-6444) does not provide free access to all issues; users are supposed to subscribe to access its archive.

When all these doubtful titles were removed, a total of 144 OA journals in the field of Library and Information Science was obtained. Among these, 32 journals are indexed by all three databases, while 29 titles are indexed only by DOAJ, 11 only by Ulrichsweb.com, and 16 only by Open J-Gate (Fig. 1).

Fig 1: Comparative strength of LIS titles across the three databases
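The merge-and-deduplicate step can be pictured as keying every record on a normalized ISSN and recording which databases index it; the overlap counts in Fig. 1 then fall out directly. A sketch in TypeScript, with hypothetical record fields rather than the databases' actual export schemas:

    // Sketch of the deduplication step; field names are hypothetical.
    interface JournalRecord {
      issn: string;
      title: string;
      source: "DOAJ" | "Ulrichsweb" | "OpenJGate";
    }

    function mergeTitles(records: JournalRecord[]): Map<string, Set<string>> {
      // Map each journal (by normalized ISSN) to the databases indexing it.
      const merged = new Map<string, Set<string>>();
      for (const r of records) {
        const key = r.issn.replace(/[^0-9Xx]/g, "").toUpperCase();
        if (!merged.has(key)) merged.set(key, new Set());
        merged.get(key)!.add(r.source);
      }
      return merged;
    }

    // e.g. journals indexed by all three databases (32 in this study):
    function indexedByAllThree(merged: Map<string, Set<string>>): number {
      return [...merged.values()].filter((s) => s.size === 3).length;
    }

Titles lacking an ISSN would need a normalized-title key instead, which is presumably why each candidate was also verified manually.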
Results & Discussion

Country of Publication
The 144 OA LIS journals are published from 37 countries. A maximum of 45 titles are published in the United States (31.25%), followed by 12 in Brazil (8.33%) and 10 in Spain (6.95%). At the other extreme, five countries publish two journals each, while 20 countries, including India, publish a single journal each. If the countries are classified according to the four economic zones of The World Bank, i.e. High-income, Upper-Middle-income, Lower-Middle-income and Low-income (Country and Lending Groups, 2011), 20 of the publishing countries fall under the High-income zone, 12 under the Upper-Middle-income zone and 5 under the Lower-Middle-income zone, while countries in the Low-income zone have yet to publish any OA journal in the field of Library and Information Science.

Publisher Account
129 publishers take an active part in the publication of OA LIS journals. The Informing Science Institute, USA publishes a maximum of 7 titles, followed by the American Library Association (USA), which publishes 5 titles, while 2 titles each are published by National Taiwan University (Taiwan), Universidad Complutense de Madrid (Spain), the Australian Library and Information Association (Australia), the Chartered Institute of Library and Information Professionals (UK), and the International Consortium for the Advancement of Academic Publication (Canada). The remaining 122 publishers publish one title each. When it comes to the nature of the publishing body, universities are the leading publishers, with 55 titles (38.19 per cent of the total), followed by library associations with 32 titles (22.22%) and research centers and institutes with 22 (15.28%). Commercial publishers offer 9 (6.25%) journals, while 5 (3.47%) titles are the result of individual efforts. The remaining 21 (14.58%) titles are an endeavour of societies, consortia and others.

Lingual Assessment
When it comes to content language(s), 72.92 per cent of the journals (105) are unilingual and 19.44 per cent (28) bilingual, while 4.17 per cent (6) publish in three languages, 2.78 per cent (4) in four languages, and a single title (0.69%) in a maximum of five languages. Overall, OA LIS journals are represented in 22 different languages. English is the content language preferred by the majority of journals (114, 79.17%), followed by Spanish (23, 15.97%) and Portuguese (15, 10.42%). At the other end, 2 journals each are published in Catalan, Danish, Romanian and Swedish, and one journal each in Arabic, Bulgarian, Croatian, Czech, Indonesian, Lithuanian, Polish, Norwegian, Slovak and Slovene (Table 1).

Table 1: Lingual assessment of OA LIS journals (top languages)
Rank | Language | No. of Journals | Percentage
1 | English | 114 | 79.17
2 | Spanish | 23 | 15.97
3 | Portuguese | 15 | 10.42
4 | French | 11 | 7.64
5 | German | 7 | 4.86
6 | Italian | 6 | 4.17
7 | Turkish | 3 | 2.08
7 | Chinese | 3 | 2.08
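Note that the percentages in Table 1 sum to well over 100 because a journal published in several languages is counted once under each. A short sketch of this multi-label tally in TypeScript, with made-up sample records standing in for the 144 titles:

    // Multi-label language tally; sample records are made up.
    const journals: { title: string; languages: string[] }[] = [
      { title: "Journal A", languages: ["English"] },
      { title: "Journal B", languages: ["English", "Spanish"] },
      { title: "Journal C", languages: ["Spanish", "Portuguese"] },
    ];

    const counts = new Map<string, number>();
    for (const j of journals) {
      for (const lang of j.languages) {
        counts.set(lang, (counts.get(lang) ?? 0) + 1);
      }
    }

    for (const [lang, n] of counts) {
      // Percentages are taken over journals, not language slots, so
      // multilingual titles make the percentage column exceed 100 in total.
      const pct = ((n / journals.length) * 100).toFixed(2);
      console.log(`${lang}: ${n} (${pct}%)`);
    }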
Article Processing Charges / Handling Fee
By OA we mean that the journal is freely available to the user on the public web; the publisher, however, may require authors to pay article processing or handling charges. Managing a journal is a costly affair, and studies have shown that the peer review process alone costs on average USD 400 per article (Rowland, 2002). Of the 144 journals, only 6 charge their authors article processing charges or a handling fee. Authors have to pay USD 1900 to publish in the Journal of Medical Internet Research (ISSN: 1438-8871), USD 550 in the International Journal of Library and Information Science (ISSN: 2141-2537) and USD 50 in the South African Journal of Information Management (ISSN: 1560-683X). However, the fee charged by Anales de Documentación (ISSN: 1575-2437), Hipertext.net (ISSN: 1695-5498), and Infodiversidad (ISSN: 1514-514X) could not be traced.

Status
Managing a journal is not an easy task. Like other ventures, it requires the active participation of experts (human expertise), material (research contributions) and money (finance). 134 journals (93%) have sustained their existence and are regularly being published. The remaining 10 titles have ceased publication, and among these, four titles are continued under other journal names (Table 2).

Table 2: Ceased titles and their continuations
Ceased Title — Continued by
Journal of Southern Academic and Special Librarianship — Electronic Journal of Academic and Special Librarianship
Medizin-Bibliothek-Information — GMS Medizin-Bibliothek-Information
Journal of Library Science — Journal of Library and Information Studies
Bulletin of the Medical Library Association — Journal of the Medical Library Association

Conclusion and Discussion
The sustainability of open access journals in the field of LIS is evident from the study. Countries falling in the low-income economic zone have yet to come onto the open access canvas. Use of Open Journal Systems (OJS) can be one of the best solutions in times of economic crisis, especially for those nations which are endemically short of the financial resources needed to cope with changing technologies (Gul & Shah, 2011). Though commercial publishers have joined hands in the open access market, much effort is still needed on their side to remove the economic barrier that has always hindered researchers from quality research in the LIS field. Universities should not be the only pioneers in highlighting LIS research; research institutes and centers, societies and other bodies associated with research should also take an active part in the research output. Journals offering hybrid or fee-based modes should try to cut author processing charges so that article publication becomes affordable. Assigning the job of article processing on a volunteer basis, together with reduced costs, can help in the elevation of OA articles, which in turn can benefit readers to a greater extent. Content availability in more languages, with English as one of them, can help remove the language barrier between the two ends of the information communication process. Indexing the journals in more sources can help increase the content visibility of OA journals in the field, and a proper archiving policy in indexing sources can help in the long-term preservation of open digital content.
To achieve long-term sustainability, the elements associated with scholarly publication need to work in a more coordinated manner, as researched by Legace (as cited in Gul & Shah, 2010). Marketing the scholarly content in a more organized and coordinated manner can also help in the long-term sustainability of the journals. Application of Web 2.0 tools for content promotion, and inclusion in different subject forums and boards, can also help sustain the journals in the present dynamic and ever-changing digital environment.

References
Association of Research Libraries (ARL). (2007). Retrieved from www.arl.org/osc/models/oa.html
Borgman, C. L. (2007). Scholarship in the digital age: Information, infrastructure, and the Internet (p. 186). Cambridge, MA: MIT Press.
Budapest Open Access Initiative. (2002). Read the Budapest Open Access Initiative. Retrieved from http://www.soros.org/openaccess/read
Country and Lending Groups. (2011). The World Bank. Retrieved from http://data.worldbank.org/about/country-classifications/country-and-lending-groups
Dallmeier-Tiessen, S., et al. (2010). Open access publishing - models and attributes (p. 62). Max Planck Digital Library/Informationsversorgung.
Directory of Open Access Journals. (2010). Retrieved from http://www.doaj.org/
Dramatic growth of open access series. (2007). The Imaginary Journal of Poetic Economics. Retrieved from http://poeticeconomics.blogspot.com/2006/08/dramatic-growth-of-open-access-series.html
Falk, H. (2004). Open access gains momentum. The Electronic Library, 22(6), 527-530. doi:10.1108/02640470410570848
Gul, S., Wani, Z. A., & Majeed, I. (2008). Open access journals: A global perspective. Trends in Information Management, 4(1), 1-19.
Gul, S., & Shah, T. A. (2011). Managing knowledge repository in Kashmir: Leap towards a knowledge based society. Trends in Information Management, 7(1), 41-55.
Happy 2012 open access movement! December 31, 2011 dramatic growth of open access. (2011). The Imaginary Journal of Poetic Economics. Retrieved from http://poeticeconomics.blogspot.com/2011/12/happy-2012-open-access-movement.html
Illustrations of the global reach of the open access movement. (2012). The Imaginary Journal of Poetic Economics. Retrieved from http://poeticeconomics.blogspot.com/2012/01/illustrations-of-global-reach-of-open.html
Laakso, M., Welling, P., Bukvova, H., Nyman, L., Björk, B.-C., & Hedlund, T. (2011). The development of open access journal publishing from 1993 to 2009. PLoS ONE, 6(6), e20961. doi:10.1371/journal.pone.0020961
Lynch, C. (2006). Improving access to research results: Six points. ARL Bimonthly Report, 248, October, pp. 5-7. Retrieved from http://www.arl.org/bm~doc/arlbr248sixpoints.pdf
McCulloch, E. (2006). Taking stock of open access: Progress and issues. Library Review, 55(6), 337-343. doi:10.1108/00242530610674749
McVeigh, M. E. (2004). Open access journals and the ISI citation database: Analysis of impact factors and citation patterns. Thomson Scientific Whitepaper. Retrieved November 12, 2010 from www.thomsonisi.com/media/presentrep/essayspdf/openaccesscitations2.pdf
Morris, S. (2006). Personal view: When is a journal not a journal - a closer look at the DOAJ. Learned Publishing, 19. doi:10.1087/095315106775122565
Nicholas, D., Huntington, P., & Rowlands, I. (2005). Open access journal publishing: The views of some of the world's senior authors. Journal of Documentation, 61(4), 497-519. doi:10.1108/00220410510607499
Provençal, J. (2011). Scholarly journal publishing in Canada: Annual industry report 2010-2011. Canada: Canadian Association of Learned Journals. Retrieved from http://www.caljacrs.ca/docs/CALJ_%20IndustryReport_2011.pdf
Rowland, F. (2002). The peer-review process. Learned Publishing, 15(4), 247-258. doi:10.1087/095315102760319206
Seadle, M. (2011). Archiving in the networked world: Open access journals. Library Hi Tech, 29(2), 394-404. doi:10.1108/07378831111138251
Suber, P. (2003). How should we define open access? SPARC Open Access Newsletter, 64. Retrieved from http://www.earlham.edu/~peters/fos/newsletter/08-04-03.htm
The challenges of success: Dramatic growth of open access early year-end edition. (2011). The Imaginary Journal of Poetic Economics. Retrieved from http://poeticeconomics.blogspot.com/2011/12/challenges-of-success-dramatic-growth.html
Voronin, Y., Myrzahmetov, A., & Bernstein, A. (2011). Access to scientific publications: The scientist's perspective. PLoS ONE, 6(11), e27868. doi:10.1371/journal.pone.0027868
Ware, M., & Mabe, M. (2009). The STM report - An overview of scientific and scholarly journals publishing (p. 68). International Association of Scientific, Technical and Medical Publishers.
Willinsky, J. (2006). The access principle - The case for open access to research and scholarship. Cambridge, MA: The MIT Press.

Graph Based Framework for Time Series Prediction
* Vivek Yadav
** Durga Toshniwal

* Department of Electronics & Computer Engineering, IIT Roorkee. email: [email protected]
** Assistant Professor, Department of Electronics & Computer Engineering, IIT Roorkee. email: [email protected]

Abstract
Purpose: A time series comprises a sequence of observations ordered in time. A major task of data mining with regard to time series data is predicting future values. In time series analysis there is a general notion that some aspect of the past pattern will continue in the future. Existing time series techniques fail to capture the knowledge present in databases to make good estimates of future values.
Design/Methodology/Approach: The paper applies a graph matching technique to time series data.
Findings: The study found that the use of graph matching techniques on time-series data can be useful for finding hidden patterns in a time series database.
Research Implications: The study motivates mapping time series data to graphs, using existing graph mining techniques to discover patterns from the data, and employing the derived patterns for making predictions.
Originality/Value: The study maps time-series data to graphs and uses graph mining techniques to discover knowledge from time series data.
Keywords: Data Mining; Time Series Prediction; Graph Mining; Graph Matching
Paper Type: Conceptual

Introduction
Data mining is the process of extracting meaningful and potentially useful patterns from large datasets. Nowadays, data mining is becoming an increasingly important tool for modern business processes, transforming data into business intelligence and giving businesses an informational advantage, so that strategic decisions can be based on past observed patterns rather than on intuition or belief (Clifton, 2011).
The graph based framework for time series prediction is a step towards an efficient new approach in which predictions are based on patterns observed in the past. Time series data consists of sequences of values or events obtained over repeated instances of time. Mostly these values or events are collected at equally spaced, discrete time intervals (e.g., hourly, daily, weekly, monthly, yearly). When observations over time are made on only one variable, the series is called a univariate time series. Data mining on time-series data is popular in many applications, such as stock market analysis, economic and sales forecasting, budgetary analysis, utility studies, inventory studies, yield projections, workload projections, process and quality control, observation of natural phenomena (such as atmosphere, temperature, wind, earthquakes), scientific and engineering experiments, and medical treatments (Han & Kamber, 2006).

A time series dataset consists of values {Y1, Y2, Y3, …, Yt}, where each Yi represents the value of the variable under study at time i. One of the major goals of data mining on time series is forecasting, i.e., predicting the future value Yt+1. Successive observations in a time series are statistically dependent on time, and time series modeling is concerned with techniques for the analysis of such dependencies. In time series analysis, a basic assumption is that some aspect of the past pattern will continue in the future. Under this assumption, time series prediction is based on the past values of the main variable Y. The prediction can be useful in planning and in measuring the performance of the predicted value against the actual observed value of Y. Time series modeling is advantageous because it can be used easily for forecasting: the historical sequences of observations on the main variable are readily available, being recorded as past observations, and can be purchased or gathered from published secondary sources. In time series modeling, the prediction of values for future periods is based on the pattern of past values of the variable under study, but the model does not generally account for explanatory variables which may have affected the system. There are two reasons for resorting to such time models. First, the system may not be understood, and even if it is, it may be extremely difficult to measure the cause-and-effect relationships of the parameters affecting the time series. Second, the main concern may be only to predict the next value and not to explain why it was observed (Box, Jenkins & Reinsel, 1976).

Time series analysis identifies four major components that characterize time-series data (Madsen, 2008). First, the Trend component indicates the general direction in which a time series is moving over a long interval of time, denoted by T. Second, the Cyclic component refers to the cycles, that is, the long-term oscillations about a trend line or curve, which may or may not be periodic, denoted by C. Third, the Seasonal component captures systematic or calendar-related variation, denoted by S.
Fourth, the Random component characterizes the sporadic motion of the time series due to random or chance events, denoted by R. Time-series modeling is also referred to as the decomposition of a time series into these four basic components. The time-series variable Y at time t can be modeled as the product of the four components at time t, i.e., Yt = Tt × Ct × St × Rt, using the multiplicative model proposed by Box, Jenkins and Reinsel (1970), where Tt is the trend component at time t, Ct the cyclic component, St the seasonal component and Rt the random component. As an alternative, an additive model (Balestra & Nerlove, 1966; Bollerslev, 1987) can also be used, in which Yt = Tt + Ct + St + Rt, with Yt, Tt, Ct, St and Rt having the same meanings as above. Since the multiplicative model is the most popular, we use it for the time series decomposition. An example of time series data is the airline passenger data set (Fig. 1), in which the main variable Y, the number of passengers (in thousands) of an airline, is recorded monthly from January 1949 to December 1960. Clearly, the time series is affected by an increasing trend as well as seasonal and cyclic variations.

Fig. 1: Time series data of the airline passenger data set from 1949 to 1960, represented on a monthly basis.

Review of Literature
In time series analysis there is an important notion of de-seasonalizing the time series (Box & Pierce, 1970). It makes the assumption that if the time series exhibits a seasonal pattern of L periods, then by taking the moving average Mt over L periods we obtain the mean value for the year, which is free of seasonality and contains little randomness (owing to averaging). Thus Mt = Tt × Ct (Box, Jenkins & Reinsel, 1976). To determine the seasonal component, one simply divides the original series by the moving average, i.e., Yt/Mt = (Tt × Ct × St × Rt)/(Tt × Ct) = St × Rt. Averaging over months eliminates randomness and yields the seasonality component St. The de-seasonalized time series can then be computed as Yt/St. The approach described in Box, et al. (1976) for predicting the time series uses regression to fit a curve to the de-seasonalized series by the least squares method; to predict values, the model projects the de-seasonalized series into the future using regression and multiplies it by the seasonal component. The least squares method is explained in Johnson and Wichern (2002). Exponential smoothing, proposed in Shumway and Stoffer (1982) as an extension of the above method for more accurate predictions, weights the most recent observation Yt by α and the most recent forecast Ft by (1 − α), where 0 ≤ α ≤ 1; the forecast is thus given by Ft+1 = α·Yt + (1 − α)·Ft. The optimal α is chosen as the one yielding the smallest MSE (mean square error) during training. ARIMA (Auto-Regressive Integrated Moving Average) models have also been proposed (Box, et al., 1970, 1976; Hamilton, 1989). An ARIMA model is categorized as ARIMA(p, d, q), where p denotes the order of autoregression, d the order of differencing and q the order of the moving average. The model tries to find the values of p, d and q that best fit the data; Zhang (2003) proposed a hybrid ARIMA and neural network model in which a neural network helps to determine them. A minimal sketch of the classical moving average and exponential smoothing baselines reviewed above follows.
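To make the two classical baselines concrete, the sketch below computes a 12-term moving average, derives monthly seasonal indices, de-seasonalizes the series, and applies simple exponential smoothing. It is a minimal sketch, not the authors' code; the sample values are the first two years of the airline passenger series, and α = 0.3 is an arbitrary choice:

public class ClassicalBaselines {

    // 12-term moving average (a simple stand-in for a properly centred MA).
    // Returns NaN where the window does not fit.
    static double[] movingAverage(double[] y, int L) {
        double[] m = new double[y.length];
        java.util.Arrays.fill(m, Double.NaN);
        for (int t = L / 2; t < y.length - L / 2; t++) {
            double sum = 0;
            for (int j = t - L / 2; j < t + L / 2; j++) sum += y[j];
            m[t] = sum / L;                       // Mt ≈ Tt × Ct
        }
        return m;
    }

    // Seasonal index per month: average of Yt / Mt over all valid years.
    static double[] seasonalIndices(double[] y, double[] m, int L) {
        double[] s = new double[L];
        int[] count = new int[L];
        for (int t = 0; t < y.length; t++) {
            if (!Double.isNaN(m[t])) {
                s[t % L] += y[t] / m[t];          // St × Rt; averaging removes Rt
                count[t % L]++;
            }
        }
        for (int i = 0; i < L; i++) s[i] /= count[i];
        return s;
    }

    // Simple exponential smoothing: F(t+1) = a*Y(t) + (1-a)*F(t).
    static double smoothForecast(double[] y, double alpha) {
        double f = y[0];                           // initialize the forecast
        for (int t = 0; t < y.length; t++)
            f = alpha * y[t] + (1 - alpha) * f;
        return f;                                  // forecast for the next period
    }

    public static void main(String[] args) {
        // First two years (1949-1950) of the airline passenger data set.
        double[] y = {112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
                      115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140};
        double[] m = movingAverage(y, 12);
        double[] s = seasonalIndices(y, m, 12);
        double[] deseasonalized = new double[y.length];
        for (int t = 0; t < y.length; t++) deseasonalized[t] = y[t] / s[t % 12];
        System.out.println("Next-period forecast (deseasonalized): "
                + smoothForecast(deseasonalized, 0.3));
    }
}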
Proposed Work: Graph Based Framework for Time Series Prediction
In this paper we propose a graph based framework for time series prediction. The motivation for using graphs is to capture the tacit historical pattern present in the dataset. The idea behind creating a graph over a time series is to exploit two facts. First, some aspect of the time series pattern will continue in the future, and a graph is a data structure well suited to modeling a pattern. Second, similarity can be calculated between graphs to identify similar patterns and their order of occurrence. Thus, a graph is created with the motivation of storing a pattern over the time series and making predictions based on the similarity of the observed pattern to historical data, as an alternative to regression and curve fitting. The major shortcoming of regression and curve fitting is that they require expert knowledge of the curve equation and the number of parameters in it: with too many parameters the model overfits, and with too few it underfits (Han & Kamber, 2006). The complete pattern in a time series is not known initially, and it is affected by the random component, which makes regression harder; hence deciding on the curve equation and its number of parameters is a major issue.

To further explore the concept of a pattern, let there be a monthly time series of N years whose first observation falls in the first month of year m: Data = {Y1(k) Y2(k) … Y12(k), Y1(k+1) Y2(k+1) … Y12(k+1), …, Y1(k+N) Y2(k+N) … Y12(k+N)}, where Y1(k) denotes the value of the variable under study in the first month of year k and Y12(k+N) its value in the twelfth month of year k+N; note that m ≤ k ≤ (m+N). In general, let d be the time interval that makes up a pattern: if a pattern is to be stored yearly and data is available monthly, d = 12; if data is available quarterly, d = 4; and so on. The successor of an observation Yij (month i, year j), with 1 ≤ i ≤ 12 and k ≤ j ≤ (k+N), is Yi'j', where i' = i+1 and j' = j if i < 12, and i' = 1 and j' = j+1 otherwise. A graph over each successive window of d observations is created to store the pattern; this is called the 'last-pattern-observed-graph'. To make predictions, we also store in a graph the knowledge of how the last observed pattern affects the next observation; this is called the 'knowledge-graph'.

Example
Given the data {Y1(k) Y2(k) … Y12(k), Y1(k+1) Y2(k+1) … Y12(k+1), …, Y1(k+N) Y2(k+N) … Y12(k+N)}, the last-pattern-observed-graph for January of year (k+1) is generated from {Y1(k) Y2(k) … Y12(k)}, and the knowledge-graph of January for year (k+1) is generated from {Y1(k) Y2(k) … Y12(k), Y1(k+1)}. The knowledge-graph is created with the intuition of capturing how the variable under study changed over the last d observations and how this affected the (d+1)th observation. In time series data, the graph is created with the motivation of modeling each observation as a vertex and representing the effect of variation between observations over time in the form of edges. The number of vertices in a graph equals the time interval over which a pattern is to be stored, and the edges take into account the effect of each observation on the others. Since past values affect future values but not vice versa, edges are created from each vertex to all subsequent observations, each edge measuring the change in angle with the horizontal. The generated graphs can be represented in computer memory using either an adjacency matrix or an adjacency list (Cormen, 2001). We have used the adjacency list representation to save memory: each graph has n(n−1)/2 edges, so the space required with an adjacency list is n(n−1)/2, as compared to n² with an adjacency matrix. A minimal sketch of this construction follows.
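The construction just described can be sketched as below. The class name, the field layout and the use of atan2 for the angle with the horizontal (one time step = one horizontal unit) are illustrative assumptions, not the authors' implementation:

import java.util.ArrayList;
import java.util.List;

// One pattern over a window of consecutive observations, stored as an
// adjacency list of directed edges {from, to, angle}.
class PatternGraph {
    final double[] values;                          // vertex i holds observation i
    final List<double[]> edges = new ArrayList<>();

    PatternGraph(double[] window) {
        this.values = window.clone();
        int n = window.length;
        // Past values influence future ones, never the reverse, so edges run
        // only from each vertex to all subsequent vertices: n(n-1)/2 edges.
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double angle = Math.atan2(window[j] - window[i], j - i);
                edges.add(new double[]{i, j, angle});
            }
        }
    }
}

A knowledge-graph is then simply the same construction applied to a window of d+1 observations, so that the edges into the last vertex record how the observed pattern led to the next value.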
The dataset of N tuples is partitioned into two sets: the first, of m tuples, for training, and the second, of (N−m) tuples, for validation of the model. During the training phase, a knowledge-graph is generated over each successive window of d+1 training observations Yi(k) Y(i+1)(k) … Y(i+12)(k) Y(i+13)(k), where 1 ≤ i ≤ 12 and the index wraps into the next year when it exceeds 12, for all m tuples in the training dataset; thus m−12 knowledge-graphs are generated. These graphs are partitioned into d sets (d = 12), each graph being stored under the interval whose knowledge it captures (i.e., the graphs for all Januaries are stored together, those for all Februaries together, etc.). To implement this we have used an array of size d of linked lists of graphs, each list storing the knowledge-graphs of its interval. The graphs are partitioned to ease the search: when making a prediction, the model queries for all patterns observed w.r.t. a particular month, and since the graphs are already stored in partitioned form, this query takes O(1) time.

To predict the next value in the time series, the model takes the last d known observations preceding the month for which the prediction is to be made and computes the last-pattern-observed-graph. It then searches the partition for that month for the knowledge-graph most similar to the last-pattern-observed-graph, considering only as many vertices of the knowledge-graph as the last-pattern-observed-graph has. To compute the similarity between two graphs, the graph edit distance technique is used (Brown, 2004; Bunke & Riesen, 2008). The key idea of the graph edit distance approach is to model structural variation by edit operations reflecting modifications in structure and labeling; a standard set of edit operations comprises insertions, deletions, and substitutions of both nodes and edges. For time-series graphs g1 (source) and g2 (destination), only substitutions of edges (changes in angle) in g2 are required to make it similar to g1, and the costs incurred by the edit operations are summed. The graph with the least edit cost is the most similar and is selected as the basis of the prediction. To make the prediction, the model takes into account the structural difference between the two graphs in a vertex-ordered, weighted-average manner: every vertex in g1 (the last-pattern-observed-graph) predicts the angle between itself and the value to be predicted using the knowledge of g2 (the most similar knowledge-graph), taking into account the difference between its edges and those of its corresponding vertex in g2, with edge differences at vertices closer to the value being predicted given more weight (a technique that carries exponential smoothing over into the graph based approach). The predicted value is the average of the values predicted by the individual vertices. Once the actual value becomes known, a knowledge-graph capturing the pattern of this latest observation is generated, and in this way the model learns iteratively. A minimal sketch of the matching and prediction steps is given below.
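Using the illustrative PatternGraph class sketched earlier, the matching and prediction steps might look as follows. Here the edit cost reduces to the summed absolute angle differences over corresponding edges, and the linear weights in predictNext are one plausible reading of the weighted-average step, not the authors' exact scheme:

import java.util.List;

class GraphMatcher {
    // Edit cost between two graphs over the same number of vertices: only
    // edge substitutions are needed, so the cost is the summed absolute
    // difference of the corresponding edge angles.
    static double editDistance(PatternGraph g1, PatternGraph g2) {
        double cost = 0;
        for (int e = 0; e < g1.edges.size(); e++)
            cost += Math.abs(g1.edges.get(e)[2] - g2.edges.get(e)[2]);
        return cost;
    }

    // Pick the stored knowledge-graph whose first d vertices form the
    // pattern most similar to the last-pattern-observed-graph.
    static PatternGraph mostSimilar(PatternGraph last, List<PatternGraph> stored) {
        int d = last.values.length;
        PatternGraph best = null;
        double bestCost = Double.MAX_VALUE;
        for (PatternGraph kg : stored) {
            PatternGraph prefix =
                new PatternGraph(java.util.Arrays.copyOf(kg.values, d));
            double c = editDistance(last, prefix);
            if (c < bestCost) { bestCost = c; best = kg; }
        }
        return best;
    }

    // Each vertex of the last-pattern graph casts a prediction by reusing the
    // matched knowledge-graph's angle towards its (d+1)th vertex; predictions
    // are combined in a weighted average, vertices nearer the predicted value
    // weighing more (the linear weights are an assumption).
    static double predictNext(PatternGraph last, PatternGraph kg) {
        int d = last.values.length;               // kg has d+1 vertices
        double sum = 0, weightSum = 0;
        for (int i = 0; i < d; i++) {
            double angle = Math.atan2(kg.values[d] - kg.values[i], d - i);
            double estimate = last.values[i] + Math.tan(angle) * (d - i);
            double w = i + 1;
            sum += w * estimate;
            weightSum += w;
        }
        return sum / weightSum;
    }
}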
Experimental Results
The code implementing the graph based time series prediction approach discussed above is written in Java. The approach was applied to the airline passenger data set, first used by Brown (1962) and later in Box, et al. (1976), which records the number of airline passengers (in thousands) observed monthly between January 1949 and December 1960. Two years of data (1949 and 1950) were used for training, and the remaining data were estimated on a monthly basis, the model learning iteratively as each observation was recorded. Fig. 2 shows the actual and predicted numbers of passengers when the framework is applied to the raw time series, and Fig. 3 the corresponding monthly percentage error; the average percentage error on the raw series is 7.05. Fig. 4 shows the actual and predicted numbers when the framework is applied to the de-seasonalized time series (using the moving average concept), and Fig. 5 the corresponding monthly percentage error; the average percentage error on the de-seasonalized series is 5.81.

Fig. 2: Actual and predicted number of passengers using the graph based framework applied to the time series of the airline passenger data set (APTS).
Fig. 3: Percentage error between actual and predicted values using the graph based framework applied to the time series of the airline passenger data set (APTS).
Fig. 4: Actual and predicted number of passengers using the graph based framework applied to the de-seasonalized time series of the airline passenger data set (APTS).
Fig. 5: Percentage error between actual and predicted values using the graph based framework applied to the de-seasonalized time series of the airline passenger data set (APTS).

Conclusion & Discussion
A new graph based approach for time series prediction has been proposed and implemented. The results show 94.19 per cent accuracy when the framework is applied to the de-seasonalized time series of the airline passenger data (computed using the moving average concept) and 92.95 per cent accuracy when it is applied to the raw time series.
The accuracy on the de-seasonalized series is better because that series retains only two components, the cyclic and trend factors, which leads to a lower error rate than direct application of the proposed approach to the raw time series, which contains all four components (cyclic, trend, seasonal and random) and is therefore harder to predict. Thus the graph based framework, applied in conjunction with the moving average, offers good accuracy. The approach incorporates the concepts of exponential smoothing, moving average and graph mining to enhance its accuracy, and it is a good alternative to regression: no domain expert knowledge is needed to specify the curve equation or its number of parameters. The results validate that the new approach has a good accuracy rate.

References
Balestra, P., & Nerlove, M. (1966). Pooling cross section and time series data in the estimation of a dynamic model: The demand for natural gas. Econometrica, 34(3), 585-612.
Bollerslev, T. (1987). A conditionally heteroskedastic time series model for speculative prices and rates of return. The Review of Economics and Statistics, 69(3), 542-547.
Box, G. E. P., Jenkins, G. M., & Reinsel, G. C. (1970). Time series analysis. Oakland, CA: Holden-Day.
Box, G. E. P., Jenkins, G. M., & Reinsel, G. C. (1976). Time series analysis: Forecasting and control (Vol. 16). San Francisco, CA: Holden-Day.
Box, G. E. P., & Pierce, D. A. (1970). Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. Journal of the American Statistical Association, 65(332), 1509-1526.
Brown, R. G. (2004). Smoothing, forecasting and prediction of discrete time series. Mineola, NY: Dover Publications.
Brown, R. G. (1962). Smoothing, forecasting and prediction of discrete time series. Englewood Cliffs, NJ: Prentice Hall.
Bunke, H., & Riesen, K. (2008). Graph classification based on dissimilarity space embedding. In N. da Vitoria Lobo, T. Kasparis, F. Roli, J. Kwok, M. Georgiopoulos, G. Anagnostopoulos & M. Loog (Eds.), Structural, Syntactic, and Statistical Pattern Recognition (Vol. 5342, pp. 996-1007). Berlin/Heidelberg: Springer.
Clifton, C. (2011). Data mining. In Encyclopaedia Britannica. Retrieved from http://www.britannica.com/EBchecked/topic/1056150/data-mining
Cormen, T. H. (2001). Introduction to algorithms. Cambridge, MA: The MIT Press.
Hamilton, J. D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica, 57(2), 357-384.
Han, J., & Kamber, M. (2006). Data mining: Concepts and techniques. Morgan Kaufmann.
Johnson, R. A., & Wichern, D. W. (2002). Applied multivariate statistical analysis (Vol. 5). Upper Saddle River, NJ: Prentice Hall.
Madsen, H. (2008). Time series analysis. Boca Raton: Chapman and Hall/CRC Press.
Shumway, R. H., & Stoffer, D. S. (1982). An approach to time series smoothing and forecasting using the EM algorithm. Journal of Time Series Analysis, 3(4), 253-264.
Zhang, G. P. (2003). Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50, 159-175.
doi:10.1016/S0925-2312(01)00702-0

Quality Practices in Open Source Software Development Affecting Quality Dimensions
Sheikh Umar Farooq
S. M. K. Quadri

Research Scholar, P. G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]
Head, P. G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]

Abstract
Purpose: The quality of open source software has been a matter of debate for a long time, since there is little concrete evidence to justify it. The main concern is that many quality attributes such as reliability, efficiency, maintainability and security need to be carefully checked, and that fixing software defects pertaining to such quality attributes under the OSDM (Open Source Development Model) can never be fully guaranteed. In order to diminish such concerns, we need to look at the practices which affect these quality characteristics in OSS (Open Source Software) negatively. This paper presents an exploratory study of the quality dimensions, quality practices and problems in the OSDM. Insight into these problems can serve as a starting point for improvements in the quality assurance of open source software.
Design/Methodology/Approach: A survey was administered based on existing literature. On the basis of this survey, those practices in the OSDM which affect quality attributes of OSS negatively are described.
Findings: The quality characteristics which should be taken into consideration to select or evaluate OSS are presented. Furthermore, quality practices in the OSDM which affect the quality of OSS negatively are highlighted.
Research Implications: Further research is suggested to identify other quality problems not covered in this paper and to evaluate the impact of different practices on project quality.
Originality/Value: As a first step in the development of practices and processes to assure and further improve quality in open source projects, existing quality practices and quality problems, in addition to quality attributes, have to be clearly identified. This paper can serve as a starting point for improvements in the quality assurance of open source software.
Keywords: Open Source Software; Software Quality; Quality Practices; Quality Problems
Paper Type: Survey Paper

Introduction
There are more than a hundred thousand open source software projects of varying quality. The OSS model has not only led to the creation of significant software; many of these systems show levels of quality comparable to or exceeding that of software developed in a closed, proprietary manner (Halloran & Scherlis, 2002; Schmidt & Porter, 2001). However, open source software also faces certain challenges that are unique to this model. For example, due to the voluntary nature of open source projects, it is impossible to fully rely on project participants (Michlmayr & Hill, 2003). This issue is further complicated by their distributed nature, which makes it difficult to identify volunteers who are neglecting their duties and to decide where more resources are needed (Michlmayr, 2004).
While most research on open source has focused on, and hyped, popular and successful projects such as Apache (Mockus, Fielding & Herbsleb, 2002) and GNOME (Koch & Schneider, 2002), there is an increasing awareness that not all open source projects are of high quality. SourceForge, currently the most popular hosting site for free software and open source projects with over 95,000 projects, is not only a good resource for finding well-maintained free software applications; it also hosts a large number of abandoned projects and software of low quality (Howison & Crowston, 2004). Some of these low-quality and abandoned projects may be explained in terms of a selection process, given that more interesting projects with higher potential will probably attract a larger number of volunteers, but it has also been suggested that project failures might be related to a lack of project management skills (Senyard & Michlmayr, 2004). Nevertheless, large and successful projects also face important quality-related problems (Michlmayr & Hill, 2003; Michlmayr, 2004; Villa, 2003). To ensure that open source remains a feasible model for the creation of mature, high-quality software suitable for corporate and mission-critical use, open source quality assurance has to take these challenges and other quality problems into account and find solutions to them. As a first step in the development of practices and processes to assure and further improve quality in open source projects, existing quality practices and quality problems have to be clearly identified. To date, however, only a few surveys of quality-related activities in open source projects (and mostly in successful ones at that) have been conducted (Zhao & Elbaum, 2000; Zhao, 2003). This paper presents an exploratory study of the quality dimensions, quality practices and problems in open source software based on existing literature.

I. Software Quality and its Characteristics
Software quality is imperative for the success of a software project. Boehm (1984) defines software quality as "achieving high levels of user satisfaction, portability, maintainability, robustness and fitness for use". Jones (1985) refers to quality as "the absence of defects that would make software either stop completely or produce unacceptable results". These definitions of software quality cannot be applied directly to OSS: unlike CSS (closed source software), user requirements are not formally available in OSS. We can, however, evaluate a project and its program on a number of important attributes, including functionality, reliability, usability, efficiency, maintainability, and portability. The benefits, drawbacks, and risks of using a program can be determined by examining these attributes. The attributes are the same as for proprietary software, of course, but the way we should evaluate them for OSS is often different. In particular, because the project and code are completely exposed to the world, we can (and should!) take advantage of this information during evaluation. We can divide OSS into two major categories. Type-1: projects developed to replicate and replace existing CSS; and Type-2: projects initiated to create new software that has no existing CSS equivalent. Linux is an example of Type-1 software, originally developed as a replacement for UNIX.
Protégé, an ontology development tool, is an example of Type-2 software. Existing quality models provide a list of quality-carrying characteristics that are responsible for the high (or otherwise) quality of software. Software quality is an abstract concept that is perceived and interpreted differently based on one's personal views and interests. To dissolve this ambiguity, ISO/IEC 9126 provides a framework for the evaluation of software quality. ISO/IEC 9126 is the standard quality model for evaluating a single piece of software (Software Engineering-Product Quality-Part 1, 2001; Software Engineering-Product Quality-Part 2, 2001). It defines six software quality attributes, often referred to as quality characteristics, along with various sub-characteristics, as shown in Fig. 1.

Fig. 1: ISO 9126 Software Quality Model

Functionality
Functionality refers to the capability of the software product to provide functions which meet stated and implied needs when the software is used under specified conditions. Functionality means that the functions available in the software must fulfil the minimum usage criteria of the user (Raja & Barry, 2005). The ISO 9126 model describes the functionality attribute as "a set of attributes that bear on the existence of a set of functions and their specified properties. The functions are those that satisfy stated or implied needs". This set of attributes characterizes what the software does to fulfil needs, whereas the other sets mainly characterize when and how it does so (International Organization for Standardization, 1991). It is a fundamental characteristic of software development and is close to the property of correctness (Fenton, 1993). The specific functions we need obviously depend on the kind of program and our specific needs. However, there are also general functional issues that apply to all programs. In particular, we should consider how well a program integrates and is compatible with the components we already have. If there are relevant standards, does the program support them? If we exchange data with others using those standards, how well does it do so? For example, MOXIE (Microsoft Office - Linux Interoperability Experiment) downloaded a set of representative files in Microsoft Office format and compared how well different programs handle them (Venkatesh et al, 2011). For Type-1 OSS there are no formal functionality requirements, yet there will be a certain level of expectation in terms of functionality compared to the existing CSS: Type-1 OSS will be considered of high quality, and new users will adopt it, if it provides the basic functionality of its CSS equivalent. In the case of Type-2 OSS, there is no existing software from which to derive functional requirements, so new users will define such requirements according to their own needs. The sub-characteristics of the functionality attribute specified by Punter, Solingen & Trienekens (1997) are:

Accuracy
This refers to the correctness of the functions, i.e., providing the right or agreed results or effects with the needed degree of precision; e.g., an ATM may provide a cash-dispensing function, but is the amount correct?

Compliance
Where appropriate, certain industry (or government) laws and guidelines need to be complied with, e.g., SOX. This sub-characteristic addresses the compliance capability of the software.
Interoperability
A given software component or system does not typically function in isolation. This sub-characteristic concerns the ability of a software component to interact with other components or systems.

Security
This sub-characteristic relates to unauthorized access to the software functions (programs) and data.

Suitability
This refers to the appropriateness (to specification) of the functions of the software.

Reliability
Reliability refers to the capability of the software product to maintain its level of performance under stated conditions for a stated period of time. The reliability factor is concerned with the behavior of the software: the extent to which it performs its intended functions with the required precision. The software should behave as expected in all possible states of the environment. Although OSS is available free of cost, such software still needs a minimum operational reliability to be useful for any application. Many open source projects do not have the resources to dedicate to thorough testing or inspection, so the reliability of their products must rely on the community's reports of failures. These reports, stored in so-called bug tracking systems, are uploaded by the community and moderated by internal members of the open source project. Reports are archived with various pieces of information, including the date of upload and a description of the failure. What information can be collected from these repositories, and how to mine them for reliability analysis, is still an open issue (Li, Herbsleb & Shaw, 2005; Godfrey & Whitehead, 2009). Problem reports are not necessarily a sign of poor reliability: people often complain about highly reliable programs, because high reliability leads both customers and engineers to extremely high expectations. Indeed, the best way to measure reliability is to try the software on a "real" workload. Reliability has a significant effect on software quality, since the user acceptability of a product depends upon its ability to function correctly and reliably (Samoladas & Stamelos, n.d.). ISO 9126 defines reliability as "a set of attributes that bear on the capability of software to maintain its performance level under stated conditions for a stated period of time" (International Organization for Standardization, 1991). The sub-characteristics of the reliability attribute stated by Punter, Solingen & Trienekens (1997) are:

Fault Tolerance
The ability of software to withstand failure and maintain a specified level of performance in case of software faults.

Maturity
The capability of the software product to avoid failures resulting from faults in the software. It is refined into the attribute Mean Time To Failure (MTTF).

Recoverability
The ability to bring a failed system back to full operation, including data and network connections.

Efficiency
Efficiency refers to the capability of the software product to provide appropriate performance, relative to the amount of resources used, under stated conditions. According to the ISO model, efficiency is "a set of attributes that bear on the relationship between the software's performance and the amount of resources used under stated conditions" (International Organization for Standardization, 1991).
Efficiency implies that the software should respond quickly to any input. The sub-characteristics of the efficiency attribute are (Punter, Solingen & Trienekens, 1997):

Resource Behavior
The amount and type of resources used, and the duration of such use, in performing the software's function. It involves a complexity attribute computed by a metric involving size (space for the resources used and time spent using them).

Time Behavior
The capability of the software product to provide appropriate response times, processing times and throughput rates when performing its function under stated conditions. It is an attribute that can be measured for each function of the system.

Usability
Usability refers to the capability of the software product to be understood, learned, used and found attractive by the user when used under specified conditions (the effort needed for use). ISO 9126 describes the usability attribute as "a set of attributes that bear on the effort needed for use and on the individual assessment of such use by a stated or implied set of users" (International Organization for Standardization, 1991). The usability of open source software is often regarded as one reason for its limited distribution. The usability problem in most OSS arises for the following reasons:
- Developers are not users, so they usually do not take user perception into consideration.
- Usability experts do not get involved in OSS projects.
- The incentives in OSS work better for improvement of functionality than of usability.
- Usability problems are harder to specify and distribute than functionality problems.
- Design for usability really ought to take place in advance of any coding.
- Open source projects lack the resources to undertake high-quality usability work.
- OSS development is inclined to promote power over simplicity.

It is important to note that, to improve usability, many OSS programs are intentionally designed in at least two parts: an "engine" that does the work and a GUI that lets users control the engine through a familiar point-and-click interface. This division into two parts is considered an excellent design approach; it generally improves reliability and makes it easier to enhance each part. Sometimes these parts are even divided into separate projects: the engine's creators may provide a simple command line interface, while most users are expected to use one of the GUIs available from another project. Thus, it can be misleading to look only at an OSS project that creates the engine; be sure to include the project that manages the GUI, if that happens to be a separate sister project. In many cases an OSS user interface is implemented through a web browser. This has a number of advantages: the user can usually use nearly any operating system or web browser, users do not need to spend time installing the application, and users will already be familiar with how their web browser works (simplifying training). However, web interfaces can be good or bad, so it is still necessary to evaluate the interface's usability. The sub-characteristics of the usability attribute are (Punter, Solingen & Trienekens, 1997):

Learnability
The learning effort for different users, i.e., novice, expert, casual, etc.

Operability
The ability of the software to be easily operated by a given user in a given environment.
Understandability
Determines the ease with which the system's functions can be understood; relates to user mental models in Human-Computer Interaction methods.

Portability
Portability refers to the capability of the software product to be transferred from one environment to another. The environment may include the organizational, hardware or software environment. The ISO 9126 model defines the portability attribute as "a set of attributes that bear on the ability of software to be transferred from one environment to another (including the organizational, hardware, or software environment)" (International Organization for Standardization, 1991). Portability is a major issue today, and with respect to it, open source software can run and give good results on different platforms (Ioannis & Stamelos, 2011). From its early days, portability has been a central issue in OSS development; various OSS systems have as their first priority the ability of their software to be used on platforms with different architectures. Here we have to stress an important fact which originates from the nature of OSS and helps portability, namely the availability of the source code of the destination software. If the source code is available, a potential developer can port an existing OSS application to a platform different from the one for which it was originally designed. Perhaps the most famous OSS, the Linux kernel, has been ported to various CPU architectures other than its original one, the x86. In the end, evaluating portability requires hands-on testing. The sub-characteristics of the portability attribute are (Punter, Solingen & Trienekens, 1997):

Adaptability
Characterizes the ability of the system to change to new specifications or operating environments.

Installability
Characterizes the effort required to install the software in a specified environment.

Replaceability
The capability of the software product to be used in place of another specified software product for the same purpose in the same environment.

Maintainability
Maintainability refers to the capability of the software product to be modified. Modifications may include corrections, improvements or adaptations of the software to changes in the environment and in the requirements and functional specifications (the effort needed for modification). Maintainability in general refers to the ability to maintain the system over a period of time, including the ease of detecting, isolating and removing defects. Additionally, factors such as the ease of adding new functionality, interfacing with new components, the programmers' ability to understand existing code, and the test team's ability to test the system (because of facilities like test instructions and test points) enhance the maintainability of a system. ISO 9126 defines it as "a set of attributes that bear on the effort needed to make specified modifications (which may include corrections, improvements, or adoptions of software to environmental changes and changes in the requirements and functional specifications)" (International Organization for Standardization, 1991). Maintainability of OSS projects was one of the first factors to be investigated in the OSS literature, mainly because OSS development emphasizes the maintainability of the software released.
Making software source code available over the Internet allows developers from all over the world to contribute code, adding new functionality (parallel development), improving existing functionality, and submitting bug fixes to the current release (parallel debugging). A part of these contributions is incorporated into the next release, and the loop of release, code submission/bug fixing, and incorporation of the submitted code into the current and new releases continues. This circular manner of OSS development essentially implies a series of frequent maintenance efforts for debugging existing functionality and adding new functionality to the system; these two forms of maintenance are known as corrective and perfective maintenance respectively. Maintenance is a huge cost driver in software projects. OSS is downloaded and used by a global community of users, and there are no face-to-face interactions among the maintainers of the software; they have to rely upon the documentation within the source code and on communication through message boards. Therefore OSS is required to be highly maintainable. Lack of proper interface definitions, structural complexity and insufficient documentation in an existing version of OSS can discourage new contributions. Since participation is voluntary, low maintainability will generate minimal participation by active users and hence have a negative effect on quality. The sub-characteristics of maintainability are (Punter, Solingen & Trienekens, 1997):

Changeability
The capability of the software product to enable a specified modification to be implemented. It also characterizes the amount of effort needed to change a system.

Stability
The capability of the software product to avoid unexpected effects from modifications of the software (the risk of unexpected effects of modifications).

Testability
Characterizes the effort needed to verify (test) a system change.

Analyzability
Characterizes the ability to identify the root cause of a failure within the software.

Different users have different expectations of the same software, and users' expectations of software evolve with time. For instance, some users may view performance and reliability as the key features of a piece of software, while others may consider ease of installation and maintenance its key features. Therefore, software applications today must do more than just meet technical specifications; they must be flexible enough to meet the varying needs of a diverse user base and provide reasonable expectations of future enhancements. The last five characteristics are not related to the task performed by the software and are therefore regarded as non-functional attributes. In many cases, though, software requirements and testing methodologies are mostly focused on functionality and pay little if any attention to non-functional requirements. Since non-functional requirements affect the perceived quality of software (quality in use), failure to meet them often leads to late changes and increased costs in the development process. For example, reliability is a non-functional requirement that needs to be addressed in every software project; badly written software may be functional, but not reliable.
II. Quality Problems under the Open Source Model
Although many high-profile cases of successful OSSD projects exist (e.g., Apache, OpenOffice, PHP), the harsh reality is that the majority of OSS projects are of low quality. While open source practices have been a remarkable success, as can be seen in some successful OSS, we believe there are several areas with opportunities for improvement. A commonly cited reason for the failure of OSS projects to reach maturity lies in the coordination of developers and project management, leading to duplication of effort by multiple developers, inefficient allocation of time and resources, and lack of attention to software attributes such as ease of use, documentation, and support, all of which affect conformance to specifications. Only a few projects have explicit documentation describing ways of contributing to and joining the project. A further critical problem arising from the voluntary nature of open source is that reliance on project participants can never be guaranteed (Michlmayr & Hill, 2003). Owing to its distributed nature, issues such as identifying who gets what done, or deciding where more resources are needed to break a bottleneck, have to be examined (Michlmayr, 2004). The following issues usually lead to low-quality software under the OSDM:

Missing or Incomplete Documentation
Documentation is necessary for every project. Programmers and users have always criticized projects which lack documentation on development practices (Michlmayr, Hunt & Probert, 2005). A study of QA practices reveals that over 84% of respondents prepare a "TODO" list of pending features and open bugs, 62% write installation and building guidelines, 32% of projects have design documents, and 20% have documents planning releases, including dates and content (Zhao, 2003). Most open source projects have little or no documentation, although some projects with a large number of contributors have good documentation about coding styles and code commits (Michlmayr, Hunt & Probert, 2005). Lack of documentation reduces the motivation of new users and programmers, because they confront the difficulty of understanding the project, whether in order to use it or to improve it. New developers who would like to participate in a project first have to understand a part of it well enough (Ankolekar, Herbsleb & Sycara, 2003). Volunteers may wish to contribute in an area but may not know how or where to start without proper documentation. The lack of developer documentation also means there is no assurance that everyone follows the same techniques and procedures. At the very beginning of the Mozilla project, the community had trouble attracting new developers, which slowed the project down; after more well-formed documentation and tutorials were provided, the number of participants rose significantly (Mockus, Fielding & Herbsleb, 2002). Given the nature of open source, low attractiveness to users and developers may lead to a low-quality product or even abandonment of the project (Zhao, 2003). A survey exploring QA activities in open source concluded that OS projects regularly start without planning (Zhao & Elbaum, 2000). When there is no specific definition of the program, the program varies continually during the development process; worse, those changes are usually poorly recorded in documentation.
Undocumented planning and program changes make measurement and validation of the end product impossible.

Problems in Collaboration
Software development is an interactive activity, often with tight integration and interdependencies between modules, and therefore requires a substantial amount of coordination and communication between developers if they are to collaborate on features (Ankolekar, Herbsleb & Sycara, 2003). Strong user involvement and participation throughout a project is a central tenet of OSSD. In some projects there are problems with coordination and communication, which can have a negative impact on project quality. It is more difficult to achieve coordination and agreement on goals in OSS development than in closed source software development. Sometimes it is not clear who is responsible for a particular area, and therefore things cannot be communicated properly. There may also be duplication of effort and a lack of coordination related to the removal of critical bugs. Some features may, for example, be duplicated under open-source development because there is some chance that developers with the same needs will not meet, or will not agree on their objectives and methods when they meet, and will end up developing the same types of features independently (forking). In a traditional development team, developers can work together effectively as long as the team members understand each other; thanks to convenient communication, such teams tend to advance efficiently (Thayer & McGettrick, 1993). Since team members may cooperate on a module or on a single feature, being aware of the activities of cooperating members is important (Ankolekar, Herbsleb & Sycara, 2003). Individuals and small teams enjoy the advantages of convenient communication and simpler decision-making. In any case, the potential of collaborative and group maintenance for successfully resolving serious quality assurance issues is obvious, and its importance and prominence in successful projects, in one form or another, seems highly likely (Michlmayr & Hill, 2003).

Lack of Global View of System Constraints
Large-scale open-source projects often have a large number of contributors from the user community (i.e., the periphery). When these users encounter problems, they may examine the source code, propose or apply fixes locally, and then submit the results back to the core team for possible integration into the source base. Often these users in the periphery have much less knowledge of the entire architecture of an open-source software system than the core developers. As a result, they may lack a global view of broader system constraints that can be affected by any given change, so their suggested fixes may be inappropriate.

Dependence on Participants
No participant in OSS can be held responsible; the strong reliance on individual developers therefore becomes a quality assurance concern. There is an inherent conflict in a project expecting predictability and reliability from participants who bear no formal responsibility for it (Raymond, 1999). A large user group is usually the foundation of an open source project (Zhao, 2003). Without new volunteers a project can hardly proceed, because from the moment a project begins it also starts losing participants. No member is obligated to contribute until the end of the project (Raymond, 1999); developers are free to decide whether to stay with the project or simply leave.
For an open source project, a regular inflow of new developers keeps it proceeding steadily. A problem some projects face, especially those that are not very popular, is attracting volunteers. A study has confirmed that unlike big and mature projects, small projects may not receive much feedback from developers and co-users (Mockus, Fielding & Herbsleb, 2002). There are usually many ways of contributing to a project, such as coding, testing or triaging bugs. However, many projects only find prospective members who are interested in developing new source code. As a result, developers have to spend a large portion of their time on tasks other people could easily handle. Few contributors are interested in helping with testing, documentation and other activities. These are vital activities, particularly as projects mature and need to be maintained and updated by new cohorts of developers. Good documentation, tutorials, development tools, and a reward and recognition culture facilitate the creation of a sustainable community.

Unsupported Code
One of the unsolved problems is how to handle code that was contributed in the past but is now unmaintained. A contributor might submit source code to implement a specific feature or a port to an obscure hardware architecture. As changes are made by other developers, this particular feature or port has to be updated so that it will continue to work. Unfortunately, some of the original contributors may disappear, and the code is left unmaintained and unsupported. Lead developers face the difficult decision of how to handle this situation.

Release Problems
Release management is one of the most important controls for ensuring the quality of open source software, yet release management guidelines have remained remarkably informal since the beginning of open source development (Erenkrantz, 2003). Carefully defined criteria are needed to regulate release management. Often, release managers are appointed in the decentralized open source model to cope with rapidly growing project dimensions (Zhao, 2003). Under open source, it is recommended to release early and release often (Raymond, 1999). The argument behind this principle is that users will take on the responsibility of finding bugs, and it has been confirmed that a good part of the debugging effort is indeed shifted to users (Zhao, 2003). But when new versions are frequently released after poor testing by the core team, users shoulder most of the debugging work, and as such ad hoc testing activity increases, the quality of the program gets worse (Hendrickson, 2001). Though software quality investments can reduce overall software cycle costs by minimizing rework later on, many software manufacturers sacrifice quality in favour of other objectives such as shorter development cycles and meeting time constraints. As one manager put it, "I would rather have it wrong than have it late" (Paulk, Weber, Curtis & Chrissis, 1994). In contrast, the traditional conception of software quality is centred on a product-centric, conformance view of quality (Prahalad & Krishnan, 1999). The absence of static testing on the developer side lets through far more bugs than the user base can usually catch, and it often turns out to be impossible for developers to keep up with the resulting mass of bug reports. Releases may be performed frequently provided that every version claimed to be stable fulfils the settled release qualifications; otherwise, it must be labelled an unstable version.
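Such settled release qualifications can be made concrete as an automated gate. The following is a minimal Python sketch, assuming three hypothetical criteria (regression-suite pass rate, open release-blocking bugs, and up-to-date release notes); neither the thresholds nor the inputs come from this paper or from any particular project.

```python
# Hypothetical release-qualification gate: a candidate build is
# labelled "stable" only if every settled criterion is met.
def qualify_release(pass_rate: float, open_critical_bugs: int,
                    release_notes_ready: bool) -> str:
    """Return 'stable' or 'unstable' for a candidate release."""
    criteria = [
        pass_rate >= 0.95,        # regression-suite pass rate (assumed threshold)
        open_critical_bugs == 0,  # no known release-blocking bugs
        release_notes_ready,      # documentation for the release is in place
    ]
    return "stable" if all(criteria) else "unstable"

if __name__ == "__main__":
    print(qualify_release(0.97, 0, True))   # stable
    print(qualify_release(0.97, 2, True))   # unstable
```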
It can be hard, however, to ensure consistent quality of open-source software due to the short feedback loops between users and core developers, which typically result in frequent "beta" releases, e.g., several times a month. Although this schedule satisfies end-users who want quick patches for bugs they found in earlier betas, it can be frustrating to other end-users who want more stable, less frequent software releases. In addition to our own experiences, Gamma describes how the length of the release cycles in the Eclipse framework affected user participation and eventually the quality of the software (Gamma, 2005).

Version Authorization
The many different commercial versions of Linux already pose a substantial problem for software providers developing for the Linux platform, as they have to write and test applications for these various versions. The availability of source code often encourages an increase in the number of options for configuring and subsetting the software at compile time and runtime. Although this flexibility enhances the software's applicability for a broad range of use cases, it can also exacerbate QA costs due to a combinatorial increase in the QA space. Moreover, since open-source projects often run on a limited QA budget due to their minimal or non-existent licensing fees, it can be hard for core developers to validate and support large numbers of versions and variants simultaneously, particularly when regression tests and benchmarks are written and run manually. Smith reports an exchange with an IT manager in a large Silicon Valley firm who lamented, "Right now, developing Linux software is a nightmare, because of testing and QA – how can you test for 30 different versions of Linux?" (Feller, et al, 2005).

Testing and Bug Reporting
A study of 200 OSS projects discovered that fewer than 20 percent of OSS developers use test plans; only 40 percent of projects use testing tools, although this increases when testing tool support is widely available for a language, such as Java; and less than 50 percent of OSS systems use code coverage concepts or tools. Larger projects do not spend more time on testing than smaller projects. OSS development clearly does not follow structured testing methods. The methodology an OSS project adopts will depend largely on the available expertise, resources, and sponsorship. Formal testing techniques and test automation are expensive and require sponsorship. Some high-profile open source projects can achieve this, but most do not, so the user base is often the only choice (Aberdour, 2007). As more users with few technical skills use free software, developers see an increase in useless or incomplete bug reports. In many cases, users do not include enough information in a bug report, or they file duplicate bug reports. Such reports take unnecessary time away from actual development work. Some projects have tried to write better documentation about reporting bugs, but they found that users often do not read the instructions before reporting a bug. Many popular open-source projects (such as GNU GCC, CPAN, Mozilla, the Visualization Toolkit, and ACE+TAO) distribute regression test suites that end users can run to evaluate the success of an installation on the user's platform. Users can – but frequently do not – return the test results to project developers.
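One low-cost remedy is for the distributed test suite to bundle the tested configuration with the results, so that any report a user does return is self-describing. The Python smoke test and JSON report format below are hypothetical illustrations of that idea, not the mechanism used by any of the projects named above.

```python
# Sketch of a user-runnable installation test whose report records the
# configuration it ran on. The test and report format are illustrative.
import json
import platform
import unittest

class InstallationSmokeTest(unittest.TestCase):
    def test_standard_library_available(self):
        # Stand-in check; a real suite would exercise the installed package.
        self.assertTrue(hasattr(json, "loads"))

def run_and_report() -> str:
    suite = unittest.TestLoader().loadTestsFromTestCase(InstallationSmokeTest)
    outcome = unittest.TextTestRunner(verbosity=0).run(suite)
    report = {
        "os": platform.platform(),
        "python": platform.python_version(),
        "tests_run": outcome.testsRun,
        "failures": len(outcome.failures),
        "errors": len(outcome.errors),
    }
    # A real project would let the user email or upload this report.
    return json.dumps(report, indent=2)

if __name__ == "__main__":
    print(run_and_report())
```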
Even when results are returned to core developers, however, the testing process is often undocumented and unsystematic; e.g., core developers have no record of what configurations were tested, how they were tested, or what the results were, which loses crucial QA-related information. Moreover, many QA configurations are executed redundantly by thousands of users (e.g., on popular versions of Linux or Windows), whereas others are never executed at all (e.g., on less widely used operating systems).

Configuration Management
Many free software and open source projects offer a high level of customization. While this gives users much flexibility, it also creates testing problems. It is very difficult or impossible for the lead developer to test all combinations, so only the most popular configurations tend to be tested. It is quite common that, when a new release is made, users report that the new version broke their configuration. Well-written open-source software (e.g., based on GNU autoconf) can be ported easily to a variety of OS and compiler platforms. In addition, since the source is available, end-users can modify and adapt their source base readily to fix bugs quickly or to respond to new market opportunities with greater agility. Support for platform independence, however, can yield the daunting task of keeping an open-source software base operational despite continuous changes to the underlying platforms. In particular, since developers in the core may only have access to a limited number of OS/compiler configurations, they may release code that has not been tested thoroughly on all platform configurations on which users want to run the software. The sketch below illustrates how quickly this configuration space grows.
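The following small Python sketch makes the combinatorial growth concrete; the option names and the "popular configurations" list are invented for illustration and do not describe any particular project.

```python
# Illustration of the combinatorial configuration space: even four
# small option sets yield 72 combinations. Option values are invented.
from itertools import product

options = {
    "os": ["linux", "windows", "macos", "freebsd"],
    "compiler": ["gcc", "clang", "msvc"],
    "database": ["postgres", "mysql", "sqlite"],
    "ssl": ["on", "off"],
}

full_space = list(product(*options.values()))
print(f"Full configuration space: {len(full_space)} combinations")  # 4*3*3*2 = 72

# In practice only the most popular configurations get tested:
tested = [
    ("linux", "gcc", "postgres", "on"),
    ("windows", "msvc", "sqlite", "on"),
    ("macos", "clang", "sqlite", "off"),
]
print(f"Tested: {len(tested)} of {len(full_space)} "
      f"({len(tested) / len(full_space):.1%} of the space)")
```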
Although in some cases OSS seems to do better than closed source software, there are many things that need to be improved and further expanded so that we avoid the typical problems arising from practices usually employed in OSS. To achieve maturity and produce high-quality open source software, one should also employ, in a beneficial manner, the proven practices and methods usually employed in closed source software development. Aberdour (2007) compares quality management practices in open source and closed source software development, as shown in Table 1. We should strive to employ these proven practices in all types of projects, whether small or large, to achieve high-quality, mature Open Source Software.

Table 1: Quality Management in Open Source and Closed Source

Closed Source | Open Source
Well-defined development methodology | Development methodology often not defined or documented
Extensive project documentation | Little project documentation
Formal, structured testing and quality assurance methodology | Unstructured and informal testing and quality assurance methodology
Analysts define requirements | Programmers define requirements
Formal risk assessment process, monitored and managed throughout the project | No formal risk assessment process
Measurable goals used throughout the project | Few measurable goals
Defect discovery from black-box testing as early as possible | Defect discovery from black-box testing late in the process
Empirical evidence regarding quality routinely used to aid decision making | Empirical evidence regarding quality isn't collected
Team members are assigned work | Team members choose work
Formal design phase carried out and signed off before programming starts | Projects often go straight to programming
Much effort put into project planning and scheduling | Little project planning or scheduling

Conclusion and Future Work
OSS quality is an open issue, and the community should continue striving for even better quality levels if OSS is to outperform traditional, closed source development and target corporate and safety-critical systems. The quality of selected software and the standards for evaluating the quality of OSS are often wrongly defined. Therefore, this paper has also presented the quality characteristics that should be taken into consideration when selecting or evaluating OSS. The paper also presents insights into quality practices of open source software projects that affect the quality of OSS negatively. Avoiding such practices and using proven quality management practices can result in high-quality OSS. Further research is suggested to identify other quality problems not covered in this paper and to evaluate the impact of different practices on project quality.

References
Aberdour, M. (2007). Achieving quality in open source software. IEEE Software, 24 (1), 58-64. doi: 10.1109/MS.2007.2
Ankolekar, A., Herbsleb, J. D., & Sycara, K. (2003). Addressing challenges to open source collaboration with the semantic web. Retrieved from http://www.cs.cmu.edu/~anupriya/papers/icse2003.pdf
Boehm, B. W. (1984). Software engineering economics. IEEE Transactions on Software Engineering, 10 (1), 4-21. doi: 10.1109/TSE.1984.5010193
Erenkrantz, J. R. (2003). Release management within open source projects. Retrieved from http://www.erenkrantz.com/Geeks/Research/Publications/ReleaseManagement.pdf
Feller, J., et al. (Eds.) (2005). Perspectives on free and open source software. Cambridge, Mass: MIT Press.
Fenton, N. E. (1993). Software metrics: A rigorous approach. London: Chapman and Hall.
Gamma, E. (2005). Agile, open source, distributed, and on-time: Inside the Eclipse development process. Retrieved from http://www.inf.fu-berlin.de/inst/ag-se/teaching/SBSE/034_Eclipse-process.pdf
Godfrey, M. W., & Whitehead, J. (2009). Proceedings of the 2009 6th IEEE International Working Conference on Mining Software Repositories, Vancouver, Canada, May 16-17.
Halloran, T. J., & Scherlis, W. L. (2002). High quality and open source software practices. Retrieved from http://flosshub.org/system/files/HalloranScherlis.pdf
Hendrickson, E. (2001).
Better testing – worse quality? In International Conference on Software Management & Applications of Software Measurement, February 12-16, 2001, San Diego, CA, USA.
Howison, J., & Crowston, K. (2004). The perils and pitfalls of mining SourceForge. Retrieved from http://msr.uwaterloo.ca/papers/Howison.pdf
International Organization for Standardization. (1991). Information technology - Software product evaluation: Quality characteristics and guidelines for their use. Berlin: Beuth-Verlag: ISO/IEC.
Jones, C. L. (1985). A process-integrated approach to defect prevention. IBM Systems Journal, 24 (2), 150-167. doi: 10.1147/sj.242.0150
Koch, S., & Schneider, G. (2002). Effort, cooperation and coordination in an open source software project: GNOME. Information Systems Journal, 12 (1), 27-42. doi: 10.1046/j.1365-2575.2002.00110.x
Li, P. L., Herbsleb, J., & Shaw, M. (2005). Forecasting field defect rates using a combined time-based and metrics-based approach: A case study of OpenBSD. In 16th IEEE International Symposium on Software Reliability Engineering (ISSRE) (pp. 193-202). Washington, DC, USA: IEEE Computer Society. doi: 10.1109/ISSRE.2005.19
Michlmayr, M. (2004). Managing volunteer activity in free software projects. In Proceedings of the 2004 USENIX Annual Technical Conference (pp. 39-33), FREENIX Track, Boston, MA: USENIX Association. Retrieved from http://dl.acm.org/citation.cfm?id=1247415.1247454
Michlmayr, M., & Hill, B. M. (2003). Quality and the reliance on individuals in free software projects. In Proceedings of the 3rd Workshop on Open Source Software Engineering (pp. 105-109). Portland, OR, USA: ICSE.
Michlmayr, M., Hunt, F., & Probert, D. (2005). Quality practices and problems in free software projects. In Proceedings of the First International Conference on Open Source Systems, Geneva, 11-15 July (pp. 24-28).
Mockus, A., Fielding, R. T., & Herbsleb, J. D. (2002). Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11 (3), 309-346. doi: 10.1145/567793.567795
Paulk, M. C., Weber, C. V., Curtis, B., & Chrissis, M. B. (1994). The Capability Maturity Model: Guidelines for improving the software process. Reading, Mass: Addison-Wesley.
Prahalad, C. K., & Krishnan, M. S. (1999, September). The new meaning of quality in the information age. Harvard Business Review, 77 (5), 109-118. Retrieved from http://hbr.org/1999/09/the-new-meaning-of-quality-in-the-information-age/ar/1
Punter, T., Solingen, R. V., & Trienekens, J. (1997). Software product evaluation. In 4th Conference on Evaluation of Information Technology (30-31 Oct. 1997), Eindhoven, Netherlands.
Raja, U., & Barry, E. (2005). Investigating quality in large-scale open source software. USA: Texas A&M University.
Raymond, E. S. (1999). The cathedral and the bazaar. Sebastopol, CA: O'Reilly & Associates.
Samoladas, I., & Stamelos, I. (n.d.). Assessing free/open source software quality. Retrieved from http://ifipwg213.org/system/files/samoladasstamelos.pdf
Schmidt, D. C., & Porter, A. (2001). Leveraging open-source communities to improve the quality & performance of open-source software. In Proceedings of the 1st Workshop on Open Source Software Engineering. Toronto, Canada: ICSE.
Senyard, A., & Michlmayr, M. (2004). How to have a successful free software project. In Proceedings of the 11th Asia-Pacific Software Engineering Conference (pp. 84-91). Busan, Korea: IEEE Computer Society.
Software Engineering - Product Quality - Part 1: Quality Model. (2001, June). ISO/IEC 9126-1.
Software Engineering - Product Quality - Part 2: External Metrics. (2001, June). ISO/IEC 9126-2.
Thayer, R. H., & McGettrick, A. D. (Eds.). (1993). Software engineering: A European perspective. Los Alamitos, CA: IEEE Computer Society Press.
Venkatesh, C., et al. (2011). Quality prediction of open source software for e-governance project. Retrieved from www.csi-sigegove.org/emerging_pdf/16_142-151.pdf
Villa, L. (2003). Large free software projects and Bugzilla. In Proceedings of the Linux Symposium (July 23-26, 2003), Ottawa, Canada, pp. 447-456.
Zhao, L. (2003). Quality assurance under the open source development model. Journal of Systems and Software, 66 (1), 65-75. doi: 10.1016/S0164-1212(02)00064-X
Zhao, L., & Elbaum, S. (2000). A survey on quality related activities in open source. SIGSOFT Software Engineering Notes, 25 (3), 54-57. doi: 10.1145/505863.505878

Open Source Tools for Varied Professions
Nadim Akhtar Khan *

Abstract
Purpose: The popularity of open source software in the contemporary world, with the emergence of a globally distributed base of developers, contributors and users, has given a new identity to the software development industry, with growing use of freely available software tools, along with their source code, by non-profit organisations, universities and commercial establishments to suit their varied requirements. The success of such software tools is evident from the growing number of downloads and users from diverse professions. The present paper attempts to explore some of the most prominent open source tools used for highly specialised professional tasks in the fields of Business Management and Health.
Design/Methodology/Approach: The six most prominent open source tools available in each of the two categories have been identified and discussed, based on popularity (number of downloads) and on the prominent features that attract users to such software.
Implication: These tools not only provide means for managing resources in a more sophisticated manner but also provide ample opportunity for non-profit organisations and commercial establishments to attain their goals by taking advantage of the prominent features and utilities of such software.
Research Limitations: The paper only highlights prominent open source software tools available in two fields, based on their specialised utilities best suited to professional requirements and operations. The scope can be further extended to reveal user satisfaction by analysing experiences of working with such software in different setups.
Keywords: Open Source Software, Business Management, Health.
Paper Type: Article

Introduction
The open source software movement has gained momentum over time and has revolutionized software development approaches throughout the world, especially with its distributed developer base and frequent updates.

* Assistant Professor, Department of Library and Information Science, University of Kashmir, Hazratbal, Srinagar, 190006, Jammu and Kashmir, India.
email: [email protected]

The availability of source code to tailor and customize the software to suit the needs and requirements of users in different setups has given new dimensions to software development approaches and, as such, has captured the attention of software developers and of information and computer professionals throughout the world. A large number of open source applications are already in the market, deriving support and adoption from world bodies like UNESCO and WHO, besides many open source organizations and forums coming together to conduct research for enhancing the features and functionalities of Open Source Software systems. Multinational corporations, non-profit research institutes, university libraries, and individual organizations are all using open source software to gather, organize and provide access to information. Open Source software has brought powerful information management tools within reach of organizations that could never have afforded to purchase comparable commercial products (Dunlap, 2006). Open Source advocates argue that OSS is primarily a development methodology grounded in the philosophy of making source code open and free to all who want it. Users and developers co-exist in a community where software grows and expands based on personal needs. These enhancements make the project more globally desirable as it fits more and more requirements. Linus Torvalds, the epitome of the open source developer, says:
- Release early and often
- Delegate everything you can
- Be open
(Raymond, 2001, as cited in Grodzinsky, Miller & Wolf, 2003)

Open source software (OSS) products have rapidly acquired notable importance among consumers and firms all over the world. They are mostly developed and distributed through online social networks. However, their innovation and development must contend with free-riders, who can benefit from the knowledge developed in the online social network; identifying the factors that moderate such opportunistic behaviour in OSS development and distribution is therefore important for facilitating OSS innovation (Casaló, Flavián & Guinalíu, 2008). Open source software has the seemingly useful feature that, at any point, anyone with appropriate technical skills can modify the code and take the project in a direction that diverges from the direction others are taking it (called 'code forking'). Grodzinsky, Miller and Wolf (2003) stress that open source project leaders and developers must show a great willingness to take in new ideas, evaluate them thoughtfully, and respond constructively in order to nurture both the idea and the developer of the idea. User participation is indeed both direct and indirect in the OSS development context. Some users actively take part in the development work by commenting on existing solutions, which has been identified as a typical form of user participation in OSS development; others have acquired a consultative role in the development work (Iivari, 2009). The European Commission's (2001) (as cited in Spinello, 2003) open source study declared that this software "permits a greater rate of innovation, with greater efficiency."

Objective
The study is undertaken to identify and describe the most popular open source tools in Business Management and Health that best suit the professional demands of the two fields.
Scope
Open Source Software has found success in almost all disciplines and specializations. However, the present study is confined to Open Source Software tools in the fields of Business Management and Health.

Open Source Tools in Business Management

Magnolia CMS (http://www.magnolia-cms.com/)
Magnolia CMS is an open-source Web Content Management System that focuses on providing an intuitive user experience in an enterprise-scale system. Its combination of ease of use and a flexible, standards-based Java architecture has attracted enterprise customers throughout the globe, and it is widely used by both government and private enterprises in more than 100 countries. Magnolia CMS is distributed as two web applications, one acting as the authoring instance and the other as the public environment. This allows for better security, with one application inside your firewall and one outside; it also enables clustering configurations. The author instance is where all authors work. It typically resides in a secure location, such as behind a corporate firewall, inaccessible from the Internet. The author instance publishes content to public instances. A public instance receives the published content and exposes it to visitors on the Web. It resides in a public, reachable location. You can have more than one public instance, serving the same or different content. Public instances that receive the activated content are known as subscribers, and any number of subscribers can subscribe to a single author instance. Subscribers are key to building high-availability, load-balanced environments. Magnolia CMS stores all content (web pages, images, documents, configuration, data) in a content repository.

SugarCRM (http://www.sugarcrm.com/crm/)
SugarCRM is open source Customer Relationship Management software for companies of all sizes. It can easily be customized and integrated with other software to allow companies to build and maintain flexible systems. Its core functionality includes sales force automation, marketing campaigns, support cases, project management, leads, opportunities and accounts. Ideal for small and medium-sized companies, large enterprises and government organizations, Sugar can run in the cloud or on-site. It comes in different editions: Sugar Ultimate, Sugar Enterprise, Sugar Corporate, Sugar Professional and Sugar Community Edition. With over five million downloads and more than 500,000 users, SugarCRM has been recognized for its success and innovation by CRM Magazine, InfoWorld, Customer Interaction Solutions and Intelligent Enterprise. SugarCRM comes with complete marketing and sales force automation features. It helps to share data across individuals and teams while monitoring business performance, and it provides a central hub to manage and share all customer service issues so that customer cases are handled efficiently and effectively. The open-source CRM platform lets users quickly and easily customize the system to streamline business processes to match specific requirements.

Tomato CMS (http://www.tomatocms.com/)
TomatoCMS is an impressive open source Content Management System powered by the Zend Framework, jQuery and the 960 Grid System. It allows themes to be customized rapidly and easily without any knowledge of HTML.
Its flexible module system helps in choosing the best and most needed components. A drag-and-drop widget system helps construct a website in minutes; like the module system, TomatoCMS provides a suite of widgets that can be flexibly customized. It is designed to operate under many server structures: shared hosting, dedicated servers and, above all, cluster servers. It provides many solutions to optimise a website: opcode caching (eAccelerator, XCache, APC), database caching (memcached, file cache), database balancing (replication, sharding), web balancing (LVS, BIG-IP), and dedicated resource and static servers.

CiviCRM: A Free and Open Source eCRM Solution (http://civicrm.org/)
CiviCRM is a free, libre and open source constituent relationship management solution. It is web-based, internationalized, and designed specifically to meet the needs of advocacy, non-profit and non-governmental groups. It allows an organization to record and manage information and to execute transactions, conversations, events or any type of correspondence with each constituent, storing it all in one easily accessible and manageable source. It is designed for the civic sector and integrates directly into the popular open source content management systems Drupal and Joomla. Registration and visitor interactions are logged directly into the system, including end-user maintenance of their own addresses and custom fields. It can store data in many localized formats and has been translated into a number of languages, including French, Spanish, German, Dutch, and Portuguese. It is affordable and cost-effective.

Opentaps (http://opentaps.org/)
Opentaps Open Source ERP + CRM is a fully integrated application suite that helps manage a business more effectively. It spans e-commerce, Customer Relationship Management, Warehouse and Inventory Management, Supply Chain Management and Financial Management through to Business Intelligence and mobility integration. It supports physical products, digital and downloadable products, variant products and configurable products. It provides integration with major payment gateways and a browser-based email server, and it supports customer service and case management. It manages marketing campaigns, including outbound emails and call management, with tracking-code reporting and management facilities. It integrates with the Asterisk open source Voice over IP (VoIP) system and, via a module, with GetResponse email marketing. It also supports Value Added Tax (VAT) through a VAT module.

Joomla (http://www.joomla.org/)
Joomla is an award-winning content management system (CMS) which enables one to build web sites and powerful online applications. Many aspects, including its ease of use and extensibility, have made it the most popular web site software available. Best of all, it is an open source solution that is freely available to everyone. It is the most popular open source CMS currently available, as evidenced by a vibrant and growing community of friendly users and talented developers. Its roots go back to 2000, and it now has over 200,000 community users and contributors. Its powerful application framework makes it easy for developers to create sophisticated add-ons that extend its power in virtually unlimited directions.
The core Joomla framework enables developers to quickly and easily build inventory control systems, data reporting tools, application bridges, custom product catalogues, integrated e-commerce systems, complex business directories, reservation systems and communication tools.

Open Source Tools in Health

OpenEMR (http://www.oemr.org/)
OpenEMR is a certified electronic health records and medical practice management application with fully integrated electronic health records, practice management, scheduling, electronic billing and interoperability. OpenEMR is licensed under the GNU General Public License (GPL). It is a free open source replacement for medical applications such as Medical Manager, Health Pro, and Misys, and it supports EDI billing using ANSI X12. Its main features include multi-language support; free upgrades and online support; electronic billing (including Medicare); document management; integrated practice management; ePrescribing; insurance tracking (3 insurances); easy customization; easy installation; voice recognition readiness (MS Windows operating systems); web-based operation (secure access with SSL certificates); integration with the external general accounting program SQL-Ledger; a built-in scheduler; multi-facility capability; and prescriptions by printed script, fax or email.

Hospital OS Software (http://www.hospital-os.com/en/)
Hospital OS is a Hospital Information System for managing hospital operations. It is client-server software in which the server works as a central unit that stores all of the information, and the clients are the units that feed information into the server. The Hospital OS server uses the Linux operating system and PostgreSQL as the database; both Linux and PostgreSQL are open source programs available for download on the Internet. The client software is developed in Java and can be used with Windows XP, Windows 7, MacOS, Ubuntu and other operating systems that have the Java Virtual Machine installed. Hospital OS is designed to support registration, medical records, a patient screening counter, X-ray, laboratory, pharmacy, medical statistics, an IPD cashier, a one-stop service point and system administration.

OpenMRS (http://openmrs.org/)
Open Medical Record System (OpenMRS) was created in 2004 as an open source medical record system platform for developing countries. It is a software platform and a reference application which enables the design of a customized medical records system with no programming knowledge (although medical and systems-analysis knowledge is required). It is a common platform upon which medical informatics efforts can be built. The system is based on a conceptual database structure which does not depend on the actual types of medical information to be collected or on particular data collection forms, and so can be customized for different uses. Its main features include a central concept dictionary, security, privilege-based access, a patient repository, multiple identifiers per patient, data entry, data export, standards support, a modular architecture, patient workflows, cohort management, relationships, patient merging, localization/internationalization, support for complex data, reporting tools, and person attributes.

Connect (http://www.connectopensource.org/)
CONNECT is an open source software solution that supports health information exchange both locally and at the national level.
CONNECT uses Nationwide Health Information Network standards and governance to make sure that health information exchanges are compatible with other exchanges being set up throughout the country. This software solution was initially developed by federal agencies to support their health-related missions, but it is now available to all organizations and can be used to help set up health information exchanges and share data using nationally recognized interoperability standards.

PHYAURA EHR (https://www.phyaura.com/)
PHYAURA EHR community edition is free and open source software which allows healthcare practitioners in the United States to document clinical notes, schedule office visits, and bill for medical services, all without any vendor lock-in. The PHYAURA community and open source software were built to create a collaborative platform for healthcare practitioners, developers, vendors and staff members, aimed at improving healthcare technology experiences and ultimately patient care. The PHYAURA community is a quick and easy way to read answers to commonly asked questions and to post questions that have not already been addressed. Its core practice management software and electronic medical records software is written in open source code conforming to the GNU General Public License. This is one of the most efficient ways to collaborate on and provide a community-based EMR.

OsiriX radiologist workstation (http://www.osirix-viewer.com/)
Another example of open-source software success is the OsiriX radiologist workstation. This full-featured radiology viewing and interpretation system integrates 3D and web-access features that are rarely included in commercial workstations costing tens of thousands of dollars each. OsiriX has been specifically designed for navigation and visualization of multimodality and multidimensional images: 2D Viewer, 3D Viewer, 4D Viewer (3D series with a temporal dimension, for example Cardiac-CT) and 5D Viewer (3D series with temporal and functional dimensions, for example Cardiac-PET-CT). The 3D Viewer offers all modern rendering modes: Multiplanar Reconstruction (MPR), Surface Rendering, Volume Rendering and Maximum Intensity Projection (MIP). All these modes support 4D data and can produce image fusion between two different series (with PET-CT and SPECT-CT display support). The OsiriX open-source approach encourages doctors to write their own extensions for image analysis and workflow automation. Because radiology workstations are regulated as medical devices by the FDA, a number of commercial vendors now offer FDA-registered versions of the free open-source OsiriX for a fraction of what proprietary workstations cost.

Conclusion
Open Source has been growing in popularity owing to its lower cost of development and its ease of downloading and installation with no licensing issues. Google, Facebook, Sun Microsystems, and Red Hat are just a few very successful companies using the collaboration of open source software in their products and services (Open Source Technology, n.d.). Open-source software offers incredible benefits in all fields of human progress, including ethical advantages, access, innovation, cost, interoperability, integration, standardization, support and safety.
Business Management and Health are no exception to this scenario. Huge amounts are spent on implementing electronic health record systems, owing to EMR licence prices and the maintenance of commercial content management systems, and these costs tend to recur, making the financial advantage of Open Source Software obvious. Secondly, OSS, being generally supported by worldwide users, enables companies to reach a broader user base. With more reputed organizations like WHO, UNESCO and other companies adhering to OSS for carrying out their vital business and professional operations, the success of OSS is becoming ever more evident.

References
Casaló, L. V., Flavián, C., & Guinalíu, M. (2008). Towards loyalty development in the e-banking business. Journal of Systems and Information Technology, 10 (2), 120-134. doi: 10.1108/13287260810897756
Dunlap, I. H. (2006). Open source database driven web development: A guide for information professionals (pp. 11-24). Oxford: Chandos Pub.
Grodzinsky, F. S., Miller, K., & Wolf, M. J. (2003). Ethical issues in open source software. Journal of Information, Communication and Ethics in Society, 1 (4), 193-205. doi: 10.1108/14779960380000235
Iivari, N. (2009). Constructing the users in open source software development: An interpretive case study of user participation. Information Technology & People, 22 (2), 132-156. doi: 10.1108/09593840910962203
Open Source Technology. (n.d.). PHYAURA. Retrieved from https://www.phyaura.com/resources-2/open_source/
Spinello, R. A. (2003). The future of open source software: Let the market decide. Journal of Information, Communication and Ethics in Society, 1 (4), 217-233. doi: 10.1108/14779960380000237

Analysis of Operating Systems and Browsers: A Usage Metrics
Mohammad Ishaq Lone
Dr. Zahid Ashraf Wani

Abstract
Purpose: The purpose of this paper is to examine the growth of FOSS and proprietary operating systems and browsing software used in computers and various types of mobile phone devices around the world.
Design/Methodology/Approach: The data is gathered from StatCounter (http://gs.statcounter.com), one of the biggest web analytics services. The collected data is analysed keeping the objectives of the study in view.
Findings: The study offers a thorough insight into the yearly and cumulative growth of the software industry. As far as the OS market is concerned, Mac OSX and Linux have increased their share; Linux has increased from 0.69% in 2009 to 0.78% in 2010. Year-wise growth of mobile operating systems shows iOS losing market share, dipping to 25.48% in 2010 from 34.01% in 2009, whereas BlackBerry and Android have increased their share by 8.34% and 6.41% respectively. The browser Internet Explorer (IE) shows a declining trend, with a 52.77% share in May 2010 against 44.52% in April 2011, whereas Firefox maintained a steady trend over the same period, with a 31.64% share in May 2010 and a slight depreciation (29.67%) in May 2011. However, in the mobile browser arena all browsers show a declining trend in 2010 compared to 2009 except Android, BlackBerry, Samsung and NetFront. BlackBerry has increased by 8.15%, and Android, an open source mobile browser, has increased its market share by 6.63%, which augurs well for the FOSS movement.
Originality/Value: The paper explores the market share of FOSS in OSs and browsers.
It details FOSS growth and increasing market share and can help stakeholders decide on a future course of action in this arena.
Keywords: FOSS, Proprietary Software, Operating Systems, Web Browsers - Mobile, Web Browsers - Computer
Paper Type: Research paper

Mohammad Ishaq Lone: PhD Scholar, Department of Library & Information Science, University of Kashmir, Jammu & Kashmir. email: [email protected]
Dr. Zahid Ashraf Wani: Assistant Professor, Department of Library & Information Science, University of Kashmir, Jammu & Kashmir. email: [email protected]; [email protected]

Introduction
Open Source software can be analysed as a process innovation: a new and revolutionary process of producing software based on unconstrained access to source code, as opposed to the traditional closed and property-based approach of the commercial world. The production of Open Source software is a form of intellectual gratification with an intrinsic utility similar to that of a scientific discovery, involving elements other than financial remuneration (Perkins, 1999). Emerging as it does from the university and research environment, the movement adopts the motivations of scientific research, transferring them into the production of technologies that have a potential commercial value. The sharing of results enables researchers both to improve their results through feedback from other members of the scientific community and to gain recognition, and hence prestige, for their work. The same thing happens when source code is shared: other members of the group provide feedback that helps to perfect it, while the fact that the results are clearly visible to everyone confers a degree of prestige which expands in proportion to the size of the community. In the new paradigm of development, programmers frequently rediscover the pleasure of creativity, which is being progressively lost in the commercial world, where the nightmare of delivery deadlines is transforming production into an assembly line. Proprietary software is primarily perceived as not being very reliable. Produced by a restricted group of programmers in obedience to market laws, it stands in diametric opposition to the law expressed by Raymond (1999): "Given enough eyeballs, all bugs are shallow". So it can safely be concluded that intellectual gratification, aesthetic sense, and an informal work style are all recurrent features of the set of different motivations underlying the invention of Open Source. Over the years, the use of free and open source software has increased considerably in every sphere of human activity, such as education, industry, business, medicine and agriculture. It has given strong competition to proprietary software and is also encouraged at government level in different developing and emerging economies of the world due to its umpteen benefits. FOSS advocacy groups around the globe promote its use and motivate programmers to develop new applications for the human good. SourceForge.net is one such platform, which has united programmers from different countries to develop and improve FOSS in a range of areas. Even after umpteen efforts by volunteers, the proprietary software industry occupies the lion's share of the marketplace and is expected to remain a dominant player; nevertheless, the endeavours of FOSS advocacy groups could be very important for weaker economies and underdeveloped societies and should therefore be encouraged.
Problem
The study is an endeavour to understand and appraise the use of different open source and proprietary browsers and operating systems used in computer and mobile phone devices around the globe.

Scope
The scope of the study is confined to assessing the growth and use of proprietary and FOSS operating systems and browsers in computer and mobile phone devices around the globe. The study covered the years 2009 and 2010 to gauge cumulative growth, and April 2010 to May 2011 for analysing the latest trend.

Objectives
- To understand the use and growth of proprietary and FOSS computer operating systems and browsers.
- To assess the use and growth of proprietary and FOSS mobile operating systems and browsers.
- To measure the cumulative growth of these software systems during 2009 and 2010.

Methodology
The data were gathered, in the form of .csv files, from StatCounter (http://gs.statcounter.com), one of the biggest web analytics services. The collected data were then analysed and summarised keeping the objectives of the study in view; an illustrative sketch of this analysis step is given below.

Limitations of the study
The StatCounter tracking code is installed on more than 3 million sites globally, and every month more than 15 billion hits to these sites are recorded; even so, the data collected do not claim to be the sole representation of the whole Internet user community.
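As a minimal sketch of the kind of analysis described in the Methodology, the Python snippet below reads a StatCounter-style .csv export and computes yearly average shares. The file name and column layout (a "Date" column plus one share column per product) are assumptions about the export format made for illustration, not documented fields.

```python
# Sketch: yearly average market share from a StatCounter-style CSV.
# Assumes a "Date" column (e.g. "2010-05") and one share column per
# product; these layout details are assumptions for illustration.
import csv
from collections import defaultdict

def yearly_average_share(csv_path: str) -> dict:
    sums = defaultdict(lambda: defaultdict(float))
    month_counts = defaultdict(int)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            year = row["Date"][:4]
            month_counts[year] += 1
            for column, value in row.items():
                if column != "Date":
                    sums[year][column] += float(value)
    return {year: {name: total / month_counts[year]
                   for name, total in shares.items()}
            for year, shares in sums.items()}

if __name__ == "__main__":
    for year, shares in sorted(yearly_average_share("os_share.csv").items()):
        print(year, {name: round(share, 2) for name, share in shares.items()})
```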
Related work
Lehman et al. have built the largest and best-known body of research on the evolution of large, long-lived software systems (Lehman & Belady, 1985; Lehman et al., 1997; Lehman, Perry & Ramil, 1998; Turski, 1996). Lehman's laws of software evolution, which are based on his case studies of several large software systems, suggest that open source systems are growing in size. Turski's (1996) statistical analysis of these case studies suggests that system growth (measured in terms of the number of source modules and the number of modules changed) is usually sub-linear, slowing down as the system gets larger and more complex. Kemerer and Slaughter (1999) have presented an excellent survey of research on software development; they also note that there has been relatively little empirical research on software evolution. Parnas (1994) has used the metaphor of decay to describe how and why software becomes increasingly brittle over time. Eick et al. (2001) extend the ideas suggested by Parnas by characterizing software "decay" in ways that can be detected and measured, using a large telephone switching system as a case study. They suggest, for example, that if defect fixes commonly require changes to large numbers of source files, then the software system is probably poorly designed. Their metrics are predicated on the availability of detailed defect-tracking logs that allow, for example, a user to determine how many defects have resulted in modifications to a particular module. We note that no such detailed change logs were available for our study of Linux. Perry (1994) presented evidence that the evolution of a software system depends not only on its size and age but also on factors such as the nature of the system itself (i.e., its application domain), previous experience with the system, and the processes, technologies, and organizational frameworks employed.

Pfaffman (2008) observes that though many educators are unaware or dismissive of Free/Open Source Software, the number of FOSS tools continues to grow. He notes that Netscape released the source code of its Netscape Communicator package; Netscape's decision resulted in Mozilla, a full-featured suite of software, and subsequently the Firefox web browser. These Open Source programs continue to benefit Netscape's commercial products. Similarly, Google's servers run the FOSS Linux operating system; when Google's programmers find problems and their solutions, those solutions are given back to the community so that all may benefit from them. Pearson (2000) concludes that the Linux operating system has now reached the stage where it is being adopted commercially by the big computer manufacturers, as a competitor to the proprietary Microsoft Windows operating system in the server market. Other important open source software includes Mozilla; Apache, which runs a majority of Internet servers; the Sendmail Internet e-mail software; and Perl, the standard Internet scripting language. One variant of UNIX, the Berkeley BSD Unix, has been open source for many years. The market share of Windows NT increased from 25.6% in 1996 to 41.9% in 2003, while the market share of Linux also increased, from a mere 6.5% in 1996 to 38.0% in 2003. Indeed, the record of open source software speaks for itself in the busy world of information technology: the Apache web server has over 60% market share, while its nearest rival, Microsoft's IIS server, has only a 25% share (Bitzer, 2004). With a reputation for speed, reliability, and efficiency, GNU/Linux now has more than 12 million users worldwide and an estimated growth rate of 40% per year (www.linux.org). With more than one-half of Fortune 500 companies now using GNU/Linux instead of Microsoft's proprietary software, the market threat of F/OSS to Microsoft is evident. With the recent surge in the use of GNU/Linux by individuals and companies, is it possible that users of F/OSS could eventually surpass those using Microsoft's proprietary software? (Elliott & Scacchi, 2008)

Analysis and discussion

Operating Systems Growth - Global Scenario

Operating Systems - Monthly Use
With the launch of the new version of the Windows OS, i.e. Windows 7, the share of Windows XP declined from 58.02% in May 2010 to 46.57% in April 2011, while that of Windows 7 increased from 14.84% in May 2010 to 31.91% in April 2011. The share of Windows Vista declined, while that of Mac OS remained stable over the same period. Linux, an open source operating system, has shown a slightly declining trend.

Operating Systems - Yearly Use
Use of Windows XP decreased from 69.57% in 2009 to 56.11% in 2010. Similarly, use of Windows Vista also decreased, while Windows 7, obviously, increased its usage. Mac OSX and Linux increased their share, with Linux growing from 0.69% in 2009 to 0.78% in 2010.

Operating Systems - Cumulative Use (2009-2010)
The overall use of operating systems shows that Windows XP is the dominant OS in the market, with a 59.84% market share, followed by Windows Vista with 19.33%. The newly arrived Windows 7 is in 3rd place (13.49%) within a few months of arrival, and Mac OSX is in 4th place.
Linux, with a meagre 0.75% share for this period, is little used throughout the world but is nevertheless increasing its share. The dominance of Windows-based OSs is quite vivid and may persist for a long time to come, owing to user-friendly features and, partly, strong promotional marketing.

Mobile Operating Systems - Global Scenario

Mobile Operating Systems - Monthly Use
Monthly use of mobile operating systems shows that the Symbian operating system (OS) is maintaining steady growth. It is followed by iOS, which shows a declining trend, having had a 29.01% share in May 2010 against a 23.34% share in April 2011. BlackBerry OS shows fluctuation: it increased from 14.15% in May 2010 to 19.25% in November 2010 and then decreased to 13.54%. The open source operating system Android has shown tremendous growth, jumping from a 3.94% share in May 2010 to a 16.05% share in April 2011. Android surpassed BlackBerry in February 2011 and is the only open source operating system among those ranked at the top. A vivid picture is provided in Fig. 2.1.

Mobile Operating Systems - Yearly Use
Symbian OS, the most used mobile OS, lost market share from 35.49% to 32.29% due to the entry of new OSs into the market. Year-wise growth of mobile operating systems shows that iOS is losing market share, dipping to 25.48% in 2010 from 34.01% in 2009. BlackBerry and Android increased their shares by 8.34% and 6.41% respectively. The growth of the other mobile OSs can be seen in Fig. 8. The increased market share of Android is quite encouraging and is expected to go further up in the near future, given that more mobile companies are keen to adopt Android for their upcoming smartphones.

Mobile Operating Systems - Cumulative Use (2009-2010)
Symbian OS has the highest market share among all mobile OSs, with a total share of 32.65%. It is followed by iOS and BlackBerry, with 26.46% and 15.54% respectively. Android, the open source OS, retained 4th place with a total share of 8.08% for 2009 and 2010. Sony Ericsson and Samsung hold the 6th and 7th spots respectively. The other open source mobile OS on the list is Linux, but it has a negligible share of 0.01%. With the presence of two variants of open source OSs, the growth of FOSS mobile operating systems can be expected to improve.

Web Browsers - Global Scenario

Web Browsers - Monthly Use
The global scenario of web browser use from May 2010 to April 2011 shows that Internet Explorer (IE) has a declining trend, with a 52.77% share in May 2010 against 44.52% in April 2011, whereas Firefox maintained a steady trend over the same period, with a 31.64% share in May 2010 and a slight depreciation (29.67%) in May 2011. The open source browser Firefox is followed by Chrome, which shows a huge increase in usage, growing from a meagre 8.61% to 18.29%. Safari and Opera hold the 4th and 5th positions with steady growth. Since all browsers are free of cost, heavy use of proprietary browsers backed by strong marketing is not an unusual phenomenon, while the open source browser Firefox occupying about one-third of the market is a good sign for the promoters of the open source movement.
Web Browsers – Yearly Use
IE was used more in 2009 (59.71%) than in 2010 (51.45%), whereas Firefox was used more in 2010, with 31.27% of users worldwide against 30.48% in 2009, indicating steady growth. Likewise, Chrome increased its users from 3.27% (2009) to 10.25% (2010). The figure below shows the usage of the other browsers in 2009 and 2010 as well.

Web Browsers – Cumulative Use (2009–2010)
IE (53.74%) was the most used browser over 2009 and 2010. Firefox (31.24%) holds the 2nd spot, followed by Chrome (8.32%) and Safari (3.97%), while Opera has a 2.14% share. Among the other open source browsers, SeaMonkey (0.03%), Flock (0.02%), Camino (0.01%), Konqueror (0.01%) and Minefield (0.01%) also show their presence in the list, but with meagre use. Firefox's strong presence as an open source browser augurs well, and it is expected to increase its share owing to its fast accessibility and regular updates.

Mobile Browsers – Global Scenario

Mobile Browsers – Monthly Use
The share of the Opera browser, the leading mobile web browser, declined from 26.68% in May 2010 to 21.9% in April 2011. iPhone and Nokia maintained a steady trend, but the growth of BlackBerry is declining after touching its peak in November 2010. Interestingly, Android, the only influential open source mobile browser, increased its share from a meagre 6.3% in May 2010 to 15.49%, beating BlackBerry and inching toward Nokia and iPhone.

Mobile Browsers – Yearly Use
Use of the Opera browser declined in 2010 (23.9%) compared to 2009 (25.33%), but it still retains the top spot. Almost all the other browsers show a declining trend in 2010 compared to 2009, except Android, BlackBerry, Samsung and NetFront. BlackBerry increased by 8.15%, while Android, the open source mobile browser, increased by 6.63%, showing good promise for the near future.

Mobile Browsers – Cumulative Use (2009–2010)
For the last two years, Opera has maintained the first position with an 18.63% share, while iPhone, Nokia and BlackBerry hold the 2nd, 3rd and 4th places. Android is the lone open source browser among the top seven mobile browsers, as shown in Fig. 4.3.

Conclusion
The growth of FOSS operating systems and browsers in the global market augurs well, though there is plenty of scope for open source software to expand its reach across the length and breadth of the software industry, given that Microsoft holds the dominant market share among operating systems and does not appear to face tough competition in the near future. The same cannot be said about browsers, however, as the open source Firefox already occupies approximately one-third of the market, giving proprietary software companies a tough time. Proprietary software remains dominant among mobile operating systems, with only a marginal presence of Android; but given the hype and success Android has gained in a short period, and experts' predictions that Android will dominate the future mobile operating system market, the outlook is promising. The same cannot yet be said about mobile browsers, where the free mobile browser Opera reigns supreme, with promising growth from BlackBerry and the open source Android.

References
Bitzer, Jürgen (2004). Commercial versus open source software: The role of product heterogeneity in competition.
Economic Systems, 28, 369–381. Retrieved April 5, 2011 from http://www.sciencedirect.com/science/article/pii/S0939362505000026
Dalle, J., & Jullien, N. (1999). NT vs. Linux, or some explanations into the economics of free software. Paper presented at "Applied Evolutionary Economics", Grenoble, June 7-9.
Eick, S. G., et al. (2001). Does code decay? Assessing the evidence from change management data. IEEE Trans. on Software Engineering, 27(1). Retrieved April 25, 2011 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.37.9674&rep=rep1&type=pdf
Elliott, Margaret S., & Scacchi, Walt (2008). Mobilization of software developers: The free software movement. Information Technology & People, 21(1), 4-33. Retrieved April 5, 2011 from www.emeraldinsight.com/0959-3845.htm
Pearson, Hilary E. (2000). Open source — the death of proprietary systems? Computer Law & Security Report, 16(3), 151-156. Retrieved April 5, 2011 from http://www.sciencedirect.com/science/article/pii/S0267364900889062
Pfaffman, Jay (2008). Transforming high school classrooms with Free/Open Source Software: It's time for an open source software revolution. The High School Journal, 91(3), 25-31. Retrieved April 5, 2011 from http://muse.jhu.edu/journals/high_school_journal/v091/91.3pfaffman.html
Kemerer, C. F., & Slaughter, S. (1999). An empirical approach to studying software evolution. IEEE Trans. on Software Engineering, 25(4). Retrieved April 12, 2011 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.61.1953&rep=rep1&type=pdf
Lehman, M., et al. (1997). Metrics and laws of software evolution—the nineties view. In Proc. of the Fourth Intl. Software Metrics Symposium (Metrics'97), Albuquerque, NM. Retrieved April 11, 2011 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.26.1214&rep=rep1&type=pdf
Lehman, M. M., & Belady, L. A. (1985). Program Evolution: Processes of Software Change. Academic Press.
Lehman, M., Perry, D. E., & Ramil, J. F. (1998). Implications of evolution metrics on software maintenance. In Proceedings of the 1998 International Conference on Software Maintenance (ICSM'98), Bethesda, Maryland, November. Retrieved April 15, 2011 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.38.3720&rep=rep1&type=pdf
Parnas, D. L. (1994). Software aging. In Proc. of the 16th Intl. Conf. on Software Engineering (ICSE-16), Sorrento, Italy, May. Retrieved April 14, 2011 from http://libresoft.es/grex/seminarios_files/parnas-sw-agingromera.pdf
Perkins, G. (1999). Culture clash and the road of word dominance. IEEE Software, 16(1), 80-84. Retrieved April 25, 2011.
Perry, D. E. (1994). Dimensions of software evolution. In Proc. of the 1994 Intl. Conf. on Software Maintenance (ICSM'94). Retrieved April 11, 2011 from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.25.9725&rep=rep1&type=pdf
Raymond, E. (1999). The cathedral and the bazaar. Retrieved April 5, 2011 from http://www.redhat.com/redhat/cathedral-bazar/
Turski, W. M. (1996). Reference model for smooth growth of software systems. IEEE Trans. on Software Engineering, 22(8).
Retrieved April 7, 2011 from http://www.computer.org/portal/web/csdl/doi/10.1109/TSE.1996.10007

Institutional Repositories: An Evaluative Study

* Tabasum Hashim
** Tariq Rashid Jan

Abstract
Purpose: The paper examines five web-based open access repositories for the purpose of identifying their strengths and limitations, using pre-defined standard parameters.
Design/Methodology/Approach: The study used the Directory of Open Access Repositories (OpenDOAR) as a base for the collection of data.
Findings: The analysis found that the repositories are credible and are equipped with rich sets of functionalities to facilitate depositing, accessing and retrieving scholarly materials.
Originality/Value: The paper highlights the credibility-related issues of institutional repositories in the present web-based information retrieval environment.
Keywords: Institutional Repositories (IR); Evaluation; Open Access Repositories
Paper Type: Research

Introduction
The availability of open-source IR systems has encouraged a proliferation of institutional repositories (IRs) worldwide, particularly among academic and research institutions. Judging by the number of institutional repositories established over the past few years, the IR service appears to be quite attractive and compelling to institutions. IRs are beneficial for access to knowledge and the development of science: they provide a permanent record of the research output of an institution and maximize the visibility, usage and impact of its research through global access. Institutional repositories have become a platform for researchers and other academicians worldwide and have helped researchers break the chains of time and space. Exposure of research and long-term preservation have tempted institutions to accept repository technology with open arms.

* Senior Professional Assistant, P.G. Department of Chemistry, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]
** Associate Professor, P.G. Department of Statistics, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]

With the first academic institutional repository projects, the EPrints archive at Southampton (founded in 2001, and now internationally renowned as e-Prints Soton) and the DSpace initiative at MIT (2002), which began in parallel with the Open Access Initiative (Cullen & Chawner, 2011), the growth of institutional repositories has been ceaseless, as is evident from sources like OpenDOAR (http://www.opendoar.org/) and ROAR (http://roar.eprints.org/). Institutional repositories have been introduced successfully because of the innumerable benefits associated with them; indeed, they provide a solution to concerns about the system of scholarly publishing (Cullen & Chawner, 2011). They have fostered progress for institutions in general and the research community in particular.
The development of institutional repositories emerged as a new strategy that allows universities to apply serious, systematic leverage to accelerate changes taking place in scholarship and scholarly communication, both moving beyond their historic, relatively passive role of supporting established publishers in modernizing scholarly publishing through the licensing of digital content, and also scaling up beyond ad-hoc alliances, partnerships, and support arrangements with a few select faculty pioneers exploring more transformative new uses of the digital medium (Lynch, 2003). All repositories share a similar mission: to disseminate the research output of the scholarly community. The success of a repository depends on the quality of its content and the services it provides. It is therefore important to evaluate features such as acquisition, access to various materials, and the associated policies and issues.

Review of Literature
A sizable literature is available on the evaluation of repositories. A study by Fernandez (2006) reflects the status of open access repositories across India. Bertot and McClure (1998) evaluated nine open access repositories in the field of Computer Science and Information Technology; the repositories were evaluated using content, preservation policies, rights management, promotion and advertisement, services, feedback and access status as the important parameters. Lynch (2003) has discussed the infrastructure of institutional repositories, also visualizing future developments. Carpenter, Graybill, Offord, Jr., and Piorun (2011) have likewise envisioned new features in the institutional repository world. Workflow patterns in institutional repositories have been researched by Hanlon and Ramirez (2011). The shifting landscape of institutional repositories is well knit together by various authors (Shreeves & Cragin, 2008; Nykanen, 2011). Repository management has been well researched by a number of authorities (Bide, 2002; Genoni, 2004; Medeiros, 2003; Poynder, 2006; Markey, Rieh, St. Jean, Kim, & Yakel, 2007; McDowell, 2007). Metadata issues in institutional repositories have been researched by Dunsire (2008) and Goldsmith and Knudson (2006).

Scope
The scope of the study is limited to five web-based open access repositories.

Objective(s)
The main objective of the study is to evaluate the various features of the institutional repositories using standard parameters identified for this purpose.

Methodology
After reviewing the existing literature, useful and relevant information about the evaluation activities of institutional repositories was studied. The literature proved extremely useful in identifying the main elements and issues. Five repositories were randomly selected using Tippett's (1927) Random Number Table. These were:
1. The Sydney eScholarship Repository, University of Sydney
2. University of Melbourne Digital Repository
3. Digital Repository of the University of Wolverhampton
4. National Aerospace Laboratories Institutional Repository (NAL)
5. OpenMED@NIC
Questionnaires were sent via e-mail to repository administrators to ascertain the content management policies of the repositories.
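Incidentally, the random-selection step can be mimicked programmatically. The sketch below, in C, draws five distinct indices from a hypothetical pool of N candidate repositories; it is an illustration only, since the study itself used a printed random number table, and N is a placeholder, not a figure from the paper.

```c
/* Sketch: drawing K distinct random indices from a pool of N candidates,
   a programmatic stand-in for a printed random number table. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 1900   /* hypothetical size of the candidate pool */
#define K 5      /* repositories to select */

int main(void)
{
    int chosen[K];
    int count = 0;

    srand((unsigned)time(NULL));          /* seed the generator */
    while (count < K) {
        int r = rand() % N;               /* candidate index 0 .. N-1 */
        int seen = 0;
        for (int i = 0; i < count; i++)   /* reject duplicates */
            if (chosen[i] == r) { seen = 1; break; }
        if (!seen)
            chosen[count++] = r;
    }
    for (int i = 0; i < K; i++)
        printf("selected repository index: %d\n", chosen[i]);
    return 0;
}
```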
Results & Discussion
The collective discussion about the evaluation of the repositories under study is given under the different headings chosen for the purpose.

Overview
Of the five repositories selected for evaluation, the software used includes DSpace, DigiTool, Open Repository and EPrints. The repositories are generally maintained by their information management/web teams; OpenMED@NIC is maintained by the Bibliographic Information Division, National Informatics Centre (NIC). Most of the institutional repositories hold records in the thousands, but one repository holds records in the hundreds. The contents of the selected repositories comprise mainly journal and magazine articles, books, book chapters, conference papers, research datasets, patents, preprints, presentations, research reports, technical reports, theses, multimedia files, creative works, patent documents and digitized versions of library collections. The repositories provide monthly usage statistics covering uploads, downloads, etc.

Visual Interface
The repositories have a clear user interface, simple enough for inexperienced users. Each repository is designed with its own branding/web interface design. The site functions are fairly simple and intuitive to use. The FAQ provides help with the most common problems, tailored to the user account type. Online help is also available in all the repositories.

Resource Discovery
The present study adopted the parameters used by Smith (2000) to evaluate the search features of the different institutional repositories. The display features of any search interface are pivotal to it. The results showed that the institutional output can be displayed in order of relevancy, title, author, submission date or issue date, in either ascending or descending order. A sort bar that enables users to sort by author, date or title and to change the number of results is also evident in the selected repositories. Full metadata records can be viewed, and recommended items can be sent via e-mail to individuals. The repositories have well-organized browsing facilities: besides the traditional subject, author and collection listings, listings by title, date issued and date submitted can also be generated.

Access
The repositories have mechanisms to control access to their collections. Options permit access to free abstracts without any registration. Most repositories ask the user to register for access to the full-text collection. Some repositories restrict full text to their intranet under an agreement with the publishers or owners of the content.

System Features
The repositories need basic software and hardware, and they support LDAP authentication. Text/document file support in the form of HTML, PDF, PostScript, plain text, Rich Text Format, XML, MS Word, MS Excel, MS PowerPoint, JPEG, PNG, GIF, BMP, etc. is found in all the repositories. Three types of metadata form the structural framework of the selected repositories, viz. descriptive, administrative and structural; some metadata elements are auto-generated. The W3C-standard XHTML 1.0 label is present on the sites. Metadata standards include MARC, Dublin Core, Metadata Object Description Schema (MODS), SRW and the Metadata Encoding and Transmission Standard (METS). The workflow integration supports the use of workflow tools.

Content Management Policy
Almost all the institutional repositories accept postprints and preprints of research publications of in-house researchers, annual reports, theses, institutional publications, etc.
All the repositories under study support web-based document management, auditing and a simple workflow, including research status, publishing rights and the ability to edit incorrect content. All content has to pass through an administrative process before publication. Provision for storage and long-term preservation exists. OpenMED@NIC does not have well-documented collection policies. Most institutions allow both unmediated and mediated submission of documents. The most commonly accepted document formats are MS Word, PDF and LaTeX.

Conclusion
All the repository systems are equipped with rich sets of functionalities to facilitate depositing, accessing and retrieving scholarly materials, and all take advantage of web technology for their cross-server functionality. By introducing their products to scholarly communities all over the world, they have taken successful steps towards making the repositories they have developed an integrated part of the new means of international information dissemination. Repository content management is changing rapidly. The effectiveness and efficiency of the institutional repositories is reflected in the policies they have adopted to work successfully in present web-based information retrieval systems.

References
Bertot, J. C., & McClure, C. R. (1998). Measuring electronic services in public libraries: Issues and recommendations. Public Libraries, 37(3), 176–180.
Bide, M. (2002). Open archives and intellectual property: Incompatible world views? Open Access Forum, Bath. Retrieved from www.oaforum.org/otherfiles/oaf_d42_cser1_bide.pdf
Carpenter, M., Graybill, J., Offord, J., Jr., & Piorun, M. (2011). Envisioning the library's role in scholarly communication in the year 2025. portal: Libraries and the Academy, 11(2), 659–681. doi:10.1353/pla.2011.0014
Cullen, R., & Chawner, B. (2011). Institutional repositories, open access, and scholarly communication: A study of conflicting paradigms. The Journal of Academic Librarianship, 37(6), 460–470.
Dunsire, G. (2008). Collecting metadata from institutional repositories. OCLC Systems & Services: International Digital Library Perspectives, 24(1), 51-58. doi:10.1108/10650750810847251
Hanlon, A., & Ramirez, M. (2011). Asking for permission: A survey of copyright workflows for institutional repositories. portal: Libraries and the Academy, 11(2), 683–702. doi:10.1353/pla.2011.0015
Fernandez, L. (2006). Open access initiatives in India: An evaluation. The Canadian Journal of Library and Information Practice and Research, 1(1). Retrieved from http://www.dlib.org/dlib/january05/foster/01foster.html
Genoni, P. (2004). Content in institutional repositories: A collection management issue. Library Management, 25(6-7), 300-306. doi:10.1108/01435120410547968
Goldsmith, B., & Knudson, F. (2006). Repository librarian and the next crusade: The search for a common standard for digital repository metadata. D-Lib Magazine, 12(9). Retrieved from http://dlib.ukoln.ac.uk/dlib/september06/goldsmith/09goldsmith.html
Lynch, C. A. (2003). Institutional repositories: Essential infrastructure for scholarship in the digital age. portal: Libraries and the Academy, 3(2), 327-336. doi:10.1353/pla.2003.0039
Markey, K., Rieh, S. Y., St. Jean, B., Kim, J., & Yakel, E. (2007).
Census of Institutional Repositories in the United States: MIRACLE Project Research Findings. Washington, D.C.: CLIR. Retrieved from http://www.clir.org/pubs/reports/pub140/pub140.pdf
McDowell, C. S. (2007). Evaluating institutional repository deployment in American academe since early 2005: Repositories by the numbers, Part 2. D-Lib Magazine, 13(9/10). Retrieved from http://www.dlib.org/dlib/september07/mcdowell/09mcdowell.html
Medeiros, N. (2003). E-prints, institutional archives, and metadata: Disseminating scholarly literature to the masses. OCLC Systems & Services, 19(2), 51-53. doi:10.1108/10650750310481757
Nykanen, M. (2011). Institutional repositories at small institutions in America: Some current trends. Journal of Electronic Resources Librarianship, 23(1), 1-19. doi:10.1080/1941126X.2011.551089
Poynder, R. (2006). Clear blue water. Retrieved from http://poynder.blogspot.com/2006/03/institutionalrepositories-and-little.html
Shreeves, S. L., & Cragin, M. H. (2008). Introduction: Institutional repositories: Current state and future. Library Trends, 57(2), 89-97. doi:10.1353/lib.0.0037
Smith, A. G. (2000). Search features of digital libraries. Information Research, 5(3). Retrieved from http://informationr.net/ir/5-3/paper73.html

Open Source Code Doesn't Always Help: Case of File System Development

Wasim Ahmad Bhat
S.M.K. Quadri

Abstract
Purpose: One of the most significant and attractive features of Open Source Software (OSS), other than its cost, is its open source code. OSS is available in both flavours, system and application, and can be customized and ported as per the requirements of the end user. As most system software runs in the kernel mode of the operating system, and system programmers constitute only a small chunk of all programmers, code customization of open source system software is rarely realized in practice. In this paper, the authors present file system development as a case of kernel-mode system software development and argue that customizing the open source code available for file systems is not the preferred route. To support the argument, the authors discuss the various challenges a developer faces in this process. Furthermore, the authors look to user-mode file system development for a possible solution and discuss the architecture, advantages and limitations of the most popular and widely used framework, File system in User-Space (FUSE). Finally, the authors conclude that the user-mode alternative for file system development and/or extension supersedes kernel-mode development.
Design/Methodology/Approach: The broad domain, complexity, irregularity and limitations of the kernel development environment form the base of the argument. Moreover, the existence of rich and capable user-mode file system development frameworks is used to supplement the argument.
Findings: The research highlights the fact that kernel-mode file system development is difficult, bug-prone, time consuming, exhausting and so on, even with the source code at one's disposal. Furthermore, it highlights the existence of a user-mode alternative which is easy, reliable, portable, etc.
Research Implications: The research considers file system development as a case of kernel-mode development. Fortunately, in this case, there is a choice of user-mode alternatives.
However, the argument cannot be generalised to kernel modules for which no user-mode alternative exists. Furthermore, the authors did not take into consideration the benefits of extending file systems in kernel mode.
Originality/Value: The research stresses that having open source code is not enough to make a choice when the code cannot be used in a reliable and productive manner.
Keywords: Open Source Software (OSS); Open Source System Software; Source Code; File System; Kernel Mode; User Mode; File system in User-Space (FUSE)
Paper Type: Argumentative

Ph.D. Scholar, P. G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]
Head, P. G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India. email: [email protected]

Introduction
Open Source Software is consistently gaining software market share because of its two most notable strengths: low cost and availability of source code. For some, low cost alone is enough to make the choice, while for others the availability of source code is mandatory. With the source code at one's disposal, the product can be customized or optimized as per requirements, and the code can be used to fix unanticipated bugs. The open source ideology is logically simple: one creates an OSS project and uploads it along with its source code and a license to download, customize, distribute, compile and use it. There are many portals that host OSS projects, http://sourceforge.net being the most popular. The OSS paradigm started, unknowingly, in the late 1960s when RFCs for network protocols were created for the ARPANET, and it later received a big boost from Linus Torvalds' Linux OS. The paradigm has spread geographically because of the Internet and has penetrated every aspect of software development, be it application software or system software. This penetration is largely because of the Linux operating system, which is one of the most prominent examples of OSS and provides an excellent platform for developing such software.

OSS has attracted Computer Science researchers all over the globe because the source code is just a couple of clicks away. Specifically, researchers working on the systems side of Computer Science have been using the Linux OS to implement and test their ideas and innovations by customizing the source code and recompiling it. One of the most notable systems-side research areas that specifically depends upon the availability of source code is File System Development (FSD). FSD includes designing and developing a new file system from scratch and/or extending existing ones in order to accommodate and cope with changes in hardware technology and user requirements. Designing and developing a file system from scratch is rarely practiced, for many reasons: a significant innovation is required in a new design; a number of good designs are already available and implemented; and a great deal of knowledge about operating system internals, plus experience with system programming, is required for development. But because hardware technology is becoming both more advanced and more affordable, the rate of digital data proliferation is very high. This has created voluminous amounts of digital data which need to be managed efficiently, reliably and securely.
This change in hardware technology and user requirements calls for optimization, refinement and fine-tuning of existing file systems. Linux, the open source system software, provides a good platform for testing and implementing such refinements. Linux is a pioneering and prominent product of the OSS community, and the Linux kernel comes with more than two dozen file systems along with their source code. These in-kernel file systems are difficult to develop and debug. In this paper, the authors argue that code customisation of in-kernel file systems, to extend their capabilities, is not preferred even with open source code. To support the argument, the authors discuss the various challenges a developer faces in this process. Furthermore, the authors turn to file system development in user space to look for a possible solution, presenting an overview of various user-space frameworks and discussing the architecture, advantages and disadvantages of the most popular and widely used framework, FUSE. The existence of rich and capable user-space frameworks supplements the argument.

Why in-kernel code customization of file systems is not preferred
File systems represent one of the most important aspects of operating-system services. Traditionally, file systems are integrated with the operating system kernel. Earlier, file system syscalls directly invoked file system methods; this architecture made it difficult to add multiple file systems to an OS. In 1986, to address this problem, Kleiman (1986) introduced the virtual node, or vnode, which provides a layer of abstraction that separates the core OS from file systems. This architecture finally matured into the VFS of UNIX-like and UNIX-based OSes. Rosenthal (1992) proposed layering to extend the capabilities of file systems and modified the VFS of SunOS to support it. All these demarcations and modifications remained within the boundary and domain of the kernel.

As mentioned earlier, file systems need to evolve, yet customizing in-kernel file systems is a challenging task for a variety of reasons. First, this approach requires the programmer to understand and deal with complicated kernel code and data structures. A deep understanding of operating system (kernel) internals is required even to make a small change to existing code or to add some new code. The situation is worse than it seems: operating systems vary in their kernel architectures, the same architecture varies in major aspects across flavours, the same flavour varies in crucial implementations across versions, and the same version varies in its degree of cohesion with different underlying hardware. All these factors make understanding the internals of a specific kernel release a time-consuming and exhausting effort, and programmers with this expertise constitute only a small chunk of all programmers.

Second, even when all this is achieved, the code customization can induce more bugs than expected. The kernel development environment lacks facilities that are available to application programmers. For instance, kernel code lacks memory protection, as it runs in the supervisor mode of the operating system; as such, a single wild pointer can bring down the entire system, where in user space it would only have terminated the application.
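The contrast is easy to demonstrate from the user-space side. In the minimal sketch below (an illustration, not from the paper), the memory protection enforced on user processes converts the wild pointer into a SIGSEGV that kills only this one process; the same dereference inside kernel code has no such safety net and can take the whole machine down.

```c
#include <stdio.h>

int main(void)
{
    /* An address this process does not own. The MMU-backed protection of
       user space catches the access: the kernel delivers SIGSEGV and only
       this process dies, while the operating system keeps running. */
    int *wild = (int *)0xdeadbeef;

    printf("dereferencing a wild pointer...\n");
    *wild = 42;                 /* terminates this process with SIGSEGV */
    printf("never reached\n");
    return 0;
}
```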
Also, kernel code requires careful use of synchronization primitives, can only be written in C, and that too without being linked against the standard C library, and so on. All these factors lead to a higher probability of inducing not just a simple bug but a bug capable of bringing down the whole system, and hence they affect the reliability of the operating system. Third, if customization is inevitable, then debugging is not only certain but tedious. Debugging kernel code is much more difficult than debugging user-space code, as kernel development lacks the facilities found in the IDEs available for most programming languages in user-mode development. For instance, the famous "Blue Screen of Death" on the Windows platform has been around since the inception of Windows. Fourth, even a fully functional in-kernel file system still has several disadvantages: porting a file system written for a particular kernel to a different one can require significant changes in design and implementation, though the use of similar file system interfaces (such as the VFS layer) on several Unix-like systems makes the task somewhat easier. Finally, an in-kernel file system can be mounted only with superuser privileges. This can be a hindrance to file system development and usage on centrally administered machines, such as those in universities and corporations.

How can file systems be extended in user space?
In contrast to kernel development, programming in user space minimizes or completely eliminates several of the aforementioned issues. By developing and/or extending file systems in user space, the programmer need not worry about the intricacies and challenges of kernel-level programming and has access to a wide range of familiar programming languages, third-party tools and libraries. Further, a highly dangerous bug can at most terminate the application and hence can never break the reliability of the kernel. Moreover, debugging is comparatively much easier. Of course, user-space file systems may still require some effort to be ported to different operating systems; this depends on the extent to which a file system's implementation is coupled with a particular operating system's internals.

In order to develop and/or extend file systems in user space, a framework is required which traps file system calls in the kernel and passes them to user space to be processed. The framework should also provide a simple and powerful set of APIs in user space that are common to most operating systems. Various projects have aimed to support the development of user-space file systems while exporting an API similar to that of the VFS layer. A brief introduction to some popular frameworks follows. UserFS consists of a kernel module that registers a UserFS file system type with the VFS (Fitzhardinge, n.d); all requests to this file system are then communicated to a user-space library through a file descriptor. The Coda distributed file system contains a Coda kernel module which communicates with the user-space cache manager, Venus, through a character device, /dev/cfs0 (Satyanarayanan, Kistler, Kumar, Okasaki, Siegel & Steere, 1990). UserVFS, which was developed as a replacement for UserFS, uses this Coda character device for communication between the kernel module and the user-space library (Machek, n.d).
Similarly, Arla is an AFS client that consists of a kernel module, xfs, which communicates with the arlad user-space daemon to serve file system requests (Westerlund & Danielsson, 1998). The ptrace() system call can also be used to build an infrastructure for developing file systems in user space (Spillane, Wright, Sivathanu & Zadok, 2007). An advantage of this technique is that all OS entry points, rather than just file system operations, can be intercepted; the downside is that the overhead of using ptrace() is significant, which makes this approach unsuitable for production-level file systems. The number of production-quality systems that provide a standardized API for developers to design a unique file system in user space is still small, but there is one commonly used and well-deployed system called FUSE, part of the Linux kernel since version 2.6.14.

FUSE: A widely used framework for file systems in user space
The fundamental design consideration of microkernel implementations such as Mach and the MIT exokernel is to reduce the complexity of the kernel. Both approaches remove all but the most basic operating system services from the kernel, moving them to programs residing in user space. FUSE (File system in User-Space) is a recent example of this general trend in operating system design (Szeredi, n.d) and is the best-known example of a user-space file system framework. The FUSE design provides a thin layer in the kernel which traps file system calls meant for a mounted FUSE file system and forwards them to user space; in user space, FUSE provides a library interface for implementing the corresponding file system call functionality.

Architecture of FUSE
FUSE is a three-part system (shown as shaded blocks in Fig. 1). The first part is a kernel module, FUSE, which hooks into the VFS code and looks like a file system module. It registers the fusefs file system type with the VFS and also implements a special-purpose device, /dev/fuse. In user space, FUSE implements a library, libfuse, which manages communications with the kernel module: it accepts file system requests from the FUSE device and translates them into a set of function calls which look similar (but not identical) to the kernel's VFS interface. Finally, there is a user-supplied component (userfs in our example in Fig. 1) which actually implements the file system of interest. It fills a structure with pointers to its functions, which implement the required operations in whatever way makes sense.

Fig. 1: Path of a read() call for a file residing in a FUSE file system

Fig. 1 shows the path of a read() call for a file residing in a FUSE file system. In this example, the user-space file system functionality is implemented as a set of callback functions in the userfs program, which is passed the mount point, /fuse. Once the userfs FUSE file system is mounted, all file system calls targeting the mount point, /fuse, are forwarded to the FUSE kernel module. When an application issues a read() system call for the file /fuse/file, the VFS invokes the appropriate handler in fusefs. If the requested data is found in the page cache, it is returned immediately. Otherwise, the system call is forwarded over a character device, /dev/fuse, to the libfuse library, which in turn invokes the callback defined in userfs for the read() operation. The callback may take any action, and returns the desired data in the supplied buffer.
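To make the callback model concrete, the following is a minimal sketch of such a user-supplied component, written against the high-level libfuse API (FUSE 2.x). It exposes a single read-only file, /hello; the file name, its contents and the helper names are illustrative placeholders, not taken from the paper.

```c
/* hellofs.c -- a single read-only file, /hello, served from user space. */
#define FUSE_USE_VERSION 26
#include <fuse.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>

static const char *hello_str  = "Hello from user space!\n";
static const char *hello_path = "/hello";

/* getattr(): the user-space analogue of a stat() handler */
static int hello_getattr(const char *path, struct stat *stbuf)
{
    memset(stbuf, 0, sizeof(struct stat));
    if (strcmp(path, "/") == 0) {
        stbuf->st_mode  = S_IFDIR | 0755;
        stbuf->st_nlink = 2;
    } else if (strcmp(path, hello_path) == 0) {
        stbuf->st_mode  = S_IFREG | 0444;
        stbuf->st_nlink = 1;
        stbuf->st_size  = strlen(hello_str);
    } else {
        return -ENOENT;
    }
    return 0;
}

/* readdir(): list the root directory's single entry */
static int hello_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                         off_t offset, struct fuse_file_info *fi)
{
    (void)offset; (void)fi;
    if (strcmp(path, "/") != 0)
        return -ENOENT;
    filler(buf, ".", NULL, 0);
    filler(buf, "..", NULL, 0);
    filler(buf, hello_path + 1, NULL, 0);   /* skip the leading '/' */
    return 0;
}

/* open(): allow read-only access to /hello */
static int hello_open(const char *path, struct fuse_file_info *fi)
{
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if ((fi->flags & O_ACCMODE) != O_RDONLY)
        return -EACCES;
    return 0;
}

/* read(): the callback libfuse invokes on the read() path of Fig. 1 */
static int hello_read(const char *path, char *buf, size_t size, off_t offset,
                      struct fuse_file_info *fi)
{
    size_t len = strlen(hello_str);
    (void)fi;
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;
    if ((size_t)offset >= len)
        return 0;                           /* EOF */
    if (offset + size > len)
        size = len - offset;
    memcpy(buf, hello_str + offset, size);  /* fill the supplied buffer */
    return size;                            /* bytes actually read */
}

/* The structure of pointers that the user-supplied component fills in */
static struct fuse_operations hello_oper = {
    .getattr = hello_getattr,
    .readdir = hello_readdir,
    .open    = hello_open,
    .read    = hello_read,
};

int main(int argc, char *argv[])
{
    /* fuse_main() mounts at the mount point given on the command line
       (e.g. /fuse) and enters the event loop, dispatching the callbacks. */
    return fuse_main(argc, argv, &hello_oper, NULL);
}
```

Assuming libfuse is installed, such a program would typically be compiled with "gcc hellofs.c $(pkg-config fuse --cflags --libs) -o hellofs", mounted by an unprivileged user with "./hellofs /fuse", and unmounted with "fusermount -u /fuse"; running "cat /fuse/hello" then exercises exactly the read() path traced above.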
For instance, the callback may do some pre-processing, request the data from the underlying file system (such as Ext4 in our example) and then post-process the read data (Mathur, Cao, Bhattacharya, Dilger, Tomas & Vivier, 2007). Finally, the result is propagated back by libfuse, through the kernel, to the application that issued the read() system call.

Advantages of using the FUSE framework
FUSE's relatively loose policy for implementing file system APIs allows developers to run file systems with only a few functions implemented. Also, FUSE presents an application with a well-known, standardized and native file system interface that accepts regular system calls. This means that applications can use interesting and cutting-edge file systems on FUSE without changing any code inside the application. This easy prototyping and application-friendliness of FUSE's design clearly encourages not only file system developers but also people unfamiliar with kernel programming to challenge themselves by implementing their own file systems. Developers implementing a file system in user space no longer have to recompile the kernel or worry about crashing the operating system during development. FUSE goes a step further by allowing unprivileged users to safely mount their own file systems, even ones they have written themselves, as long as the system administrator loads the FUSE kernel module. More than twenty different language bindings are available for FUSE, allowing file systems to be written in languages other than C. This means that programmers can use languages that are based on different programming paradigms, offer different levels of type safety and type checking, and are generally intended for different usage scenarios.

The main advantage of FUSE over other similar projects is its large and active user community, which has developed several dozen file systems to date, several of which provide significant functionality on the platforms supported by FUSE. Among the more interesting FUSE file systems are Wayback (Cornell, Dinda & Bustamante, 2004), NTFS-3g (NTFS-3g, n.d) and SSHFS (Szeredi, n.d); these provide, respectively, a versioning file system, safe read and write support for NTFS volumes, and a file system based on secure communications over SFTP. Furthermore, the FUSE framework has been ported to almost all platforms, including Windows (Driscoll, Beavers & Tokuda, n.d). Thus, FUSE file systems are not only reliable and easy to develop and debug, but also highly portable.

Performance issues in FUSE file systems
There are certain performance issues related to the FUSE framework's architecture (Rajgarhia & Gehani, 2010). First, when only an in-kernel file system (such as Ext4 alone) is used, there are two user-kernel mode switches per file system operation (i.e. to and from the kernel) and no process context switches. User-kernel mode switches are inexpensive, involving only a switch of the processor from unprivileged user mode to privileged kernel mode, or vice versa. FUSE, however, introduces two process context switches for each file system call: one from the user application that issued the system call to the FUSE user-space library, and another in the opposite direction.
A context switch can have a significant cost, although the cost varies greatly with factors such as the processor type, the workload, and the memory access patterns of the applications between which the context switch is performed. Second, when an in-kernel file system is used alone, data needs to be copied in memory only once: either from the kernel's page cache to the application issuing the system call, or vice versa. FUSE introduces two additional memory copies. While writing data to a FUSE file system, the data is first copied from the application to the page cache, then from the page cache to libfuse via /dev/fuse, and finally from libfuse back to the page cache when the system call is made to the in-kernel file system. For read(), the copying is performed similarly, in the opposite direction. If the FUSE file system is mounted with the DIRECT_IO option, the FUSE kernel module bypasses the page cache and forwards the application-supplied buffer directly to the user-space daemon; in this case, only one additional memory copy is performed. The advantage of DIRECT_IO is that writes are significantly faster due to the reduced memory copying. The downside is that every read() request has to be forwarded to the user-space file system, as the data is never present in the page cache, thus affecting read performance severely.

Finally, when the in-kernel file system is used alone, all data read from or written to the disk is cached in the kernel's page cache. With FUSE, fusefs also caches the data in the page cache, resulting in two copies of the same data being cached. Although FUSE's use of the page cache is very beneficial for read operations, since it avoids unnecessary context switches and memory copies, the fact that the same data is cached twice reduces the efficiency of the page cache. In Linux, one can open files on the native file system using the O_DIRECT flag and thereby eliminate caching by the native file system. However, this is generally not a feasible solution, since O_DIRECT imposes alignment restrictions on the length and address of the write() buffers and on the file offsets.
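To make those alignment restrictions concrete, here is a minimal sketch of an O_DIRECT read. The file name is hypothetical, and the 4096-byte alignment is an assumption that matches common logical block sizes; a robust program would query the device rather than hard-code it.

```c
#define _GNU_SOURCE              /* exposes O_DIRECT on Linux */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const size_t align = 4096;   /* assumed logical block size */
    const size_t size  = 4096;   /* the I/O length must also be aligned */
    void *buf;

    int fd = open("data.bin", O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    /* A plain malloc()ed buffer would typically fail with EINVAL:
       O_DIRECT requires the buffer address itself to be aligned. */
    if (posix_memalign(&buf, align, size) != 0) {
        fprintf(stderr, "posix_memalign failed\n");
        close(fd);
        return 1;
    }

    ssize_t n = pread(fd, buf, size, 0);   /* file offset 0 is aligned */
    if (n < 0)
        perror("pread");
    else
        printf("read %zd bytes, bypassing the page cache\n", n);

    free(buf);
    close(fd);
    return 0;
}
```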
Conclusion
In this paper the authors argued that in-kernel code customization of open source system software such as file systems is not practically feasible, as it requires a deep understanding of operating system internals and experience with kernel-level programming, and is time consuming and exhausting. Furthermore, the process is highly prone to even simple bugs, which can crash the operating system and demand studious, exhaustive debugging. All these factors lead to slow progress in file system development, with a higher probability of low operating system reliability and low file system productivity, and all this even with the source code at one's disposal. The research also highlighted the concept of file system development (extension) in user space and explained the basic architecture of the most popular user-space file system development framework, FUSE. Although there are certain performance issues related to FUSE, the gains outweigh them. It can safely be argued that a file system extended using the FUSE framework is very easy to develop and debug, in addition to being highly reliable and portable, compared to one extended by customising and recompiling the kernel source. It is telling that almost all user-space file system frameworks come from the open source community itself, having surfaced to overcome the code customisation problem in one of its pioneering flagship products, the Linux OS.

References
Cornell, B., Dinda, P., & Bustamante, F. (2004). Wayback: A user-level versioning file system for Linux. In Proceedings of the USENIX Annual Technical Conference (ATEC '04), Article 27.
Driscoll, E., Beavers, J., & Tokuda, H. (n.d). FUSE-NT: Userspace file systems for Windows NT. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.131.3896
Fitzhardinge, J. (n.d). UserFS. Retrieved from http://www.goop.org/~jeremy/userfs
Kleiman, S. R. (1986). Vnodes: An architecture for multiple file system types in Sun UNIX. In Proceedings of the Summer USENIX Technical Conference, pp. 238-247.
Machek, P. (n.d). UserVFS. Retrieved from http://sourceforge.net/projects/uservfs
Mathur, A., Cao, M., Bhattacharya, S., Dilger, A., Tomas, A., & Vivier, L. (2007). The new ext4 filesystem: Current status and future plans. In Proceedings of the Ottawa Linux Symposium.
NTFS-3g. (n.d). Retrieved from http://www.tuxera.com/community/ntfs-3g-manual/
Rajgarhia, A., & Gehani, A. (2010). Performance and extension of user space file systems. In Proceedings of the 2010 ACM Symposium on Applied Computing (SAC '10), pp. 206-213.
Rosenthal, D. S. H. (1992). Requirements for a "stacking" vnode/VFS interface. Tech. Rep. SD-01-02-N014, UNIX International.
Satyanarayanan, M., Kistler, J. J., Kumar, P., Okasaki, M. E., Siegel, E. H., & Steere, D. C. (1990). Coda: A highly available file system for a distributed workstation environment. IEEE Transactions on Computers, 39(4), 447-459.
Spillane, R. P., Wright, C. P., Sivathanu, G., & Zadok, E. (2007). Rapid file system development using ptrace. In Proceedings of the 2007 Workshop on Experimental Computer Science (ExpCS '07), ACM, Article 22.
Szeredi, M. (n.d). File system in user space. Retrieved from http://fuse.sourceforge.net
Szeredi, M. (n.d). SSH filesystem. Retrieved from http://fuse.sourceforge.net/sshfs.html
Westerlund, A., & Danielsson, J. (1998). Arla: A free AFS client. In Proceedings of the USENIX Annual Technical Conference (ATEC '98), Article 32.

A New Approach of CLOUD: Computing Infrastructure on Demand

* Kamal Srivastava
** Atul Kumar

Abstract
Purpose: The paper presents a current vision of cloud computing and identifies various commercially available cloud services promising to deliver infrastructure as a service (IaaS).
Design/Methodology/Approach: The paper provides architectural detail of cloud computing and surveys different types of clouds. We studied different cloud-based architectures, such as Blue Cloud, built on IBM's massive-scale computing initiatives; Google Cloud, which claims businesses can get started using Google Apps online almost instantly; and Salesforce.com, which offers development as a service, a set of development tools and APIs that enables enterprise developers to easily harness the promise of cloud computing.
Findings: It was found that cloud computing is changing the way we provision hardware and software for on-demand capacity fulfillment, and changing the way we develop web applications and make business decisions.
Keywords: Cloud Computing; Amazon Elastic Compute Cloud; Google App Engine; Microsoft Azure; Salesforce.com
Paper Type: Survey

* Department of Computer Science, Shri Ramswaroop Memorial College of Engg. & Mgmt., Lucknow, U.P., India. email: [email protected]
** Department of Computer Science, Shri Ramswaroop Memorial College of Engg. & Mgmt., Lucknow, U.P., India. email: [email protected]

Introduction
The term "cloud", as used in this paper, appears to have its origins in network diagrams that represented the internet, or various parts of it, as schematic clouds. "Cloud computing" was coined for what happens when applications and services are moved into the internet "cloud." Cloud computing is not something that suddenly appeared overnight; in some form it traces back to a time when computer systems remotely time-shared computing resources and applications. Today, though, cloud computing refers to the many different types of services and applications being delivered in the internet cloud, and to the fact that, in many cases, the devices used to access these services and applications do not require any special software.

Cloud computing refers both to the applications delivered as services over the Internet and to the hardware and systems software in the data centers that provide those services. A cloud computing platform dynamically provisions, configures, reconfigures, and deprovisions servers as needed. Cloud applications are those that are extended to be accessible through the Internet; the datacenter hardware and software is what we will call a cloud. Cloud computing is changing the way we provide hardware and software for on-demand capacity fulfillment, and changing the way we develop web applications and make business decisions. Cloud computing is a computing paradigm in which tasks are assigned to a combination of connections, software and services accessed over a network. This network of servers and connections is collectively known as "the cloud." Computing at the scale of the cloud allows users to access supercomputer-level power, drawing on resources as they need them.

Understanding Cloud Computing
Cloud computing describes how computer programs are hosted and operated over the Internet. The key feature of cloud computing is that both the software and the information held in it live on centrally located servers rather than on an end-user's computer. How does cloud computing work? The concept is fairly simple. First, consider the traditional means of running an application: the application appears to run on a dumb terminal or, these days, your PC, but in practice this is only the front end of the application. Your computer is connected to a server that actually runs the program and returns the output to the personal computer. The server constitutes the back end, and it may or may not be located in the same building as you. With cloud computing, the application program runs somewhere within the cloud; ideally, the user is concerned only with the applications that are available and need not be aware of the underlying technology or the physical location of the application's computer. The user's desktop is connected via the internet to a server farm, a collection of remote servers that runs many, many applications at once.
Which server or servers an application runs on is determined by the application programs already running on the machines; there is an attempt to balance the load so that all of the programs run optimally. A number of companies offer cloud computing services: Amazon offers the Amazon Elastic Compute Cloud (EC2), Google has its own cloud computing offering, Google App Engine, and Microsoft offers Microsoft Azure. When a cloud is made available in a pay-as-you-go manner to the public, we call it a public cloud; the service being sold is utility computing. We use the term private cloud to refer to the internal datacenters of a business or other organization that are not made available to the public. Thus, cloud computing is the sum of SaaS (Software as a Service) and utility computing, but does not normally include private clouds. From a hardware point of view, three aspects are new in cloud computing (Vogels, 2008):
1. The illusion of infinite computing resources available on demand, thereby eliminating the need for cloud computing users to plan far ahead for provisioning.
2. The elimination of an up-front commitment by cloud users, thereby allowing companies to start small and increase hardware resources only when their needs increase.
3. The ability to pay for the use of computing resources on a short-term basis as needed (e.g., processors by the hour and storage by the day) and to release them as needed, thereby rewarding conservation by letting machines and storage go when they are no longer useful.
As a successful example (Armbrust et al., 2009), the Elastic Compute Cloud (EC2) from Amazon Web Services (AWS) sells 1.0-GHz x86 ISA "slices" for 10 cents per hour, and a new "slice", or instance, can be added in 2 to 5 minutes. Amazon's Scalable Storage Service (S3) charges USD 0.12 to USD 0.15 per gigabyte-month, with additional bandwidth charges of USD 0.10 to USD 0.15 per gigabyte to move data into and out of AWS over the Internet.
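As a back-of-the-envelope illustration of this pay-by-use pricing (the rates are those quoted above; the workload figures are invented for the example), renting many machines briefly costs the same as renting one machine for a long time:

```latex
% 1000 EC2 instance-hours at USD 0.10 per hour cost USD 100, whether
% consumed as 1000 machines for 1 hour or 1 machine for 1000 hours:
\[
  1000 \times 1\,\mathrm{h} \times \$0.10/\mathrm{h}
  \;=\; 1 \times 1000\,\mathrm{h} \times \$0.10/\mathrm{h}
  \;=\; \$100
\]
% Storing 50 GB in S3 for one month at USD 0.15 per GB-month, plus moving
% 50 GB in and 50 GB out at USD 0.10 per GB each way:
\[
  50 \times \$0.15 \;+\; (50 + 50) \times \$0.10
  \;=\; \$7.50 + \$10.00 \;=\; \$17.50
\]
```

This property is the "cost associativity" that the parallel batch-processing opportunity discussed below exploits.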
Commercially Available Cloud Services
1) Google: The core of Google's business is all in cloud computing. Services delivered over network connections include search, e-mail, online mapping, office productivity (including documents, spreadsheets, presentations, and databases), collaboration, social networking, and voice, video and data services. Users can subscribe to these services for free or pay for increased levels of service and support.
2) Amazon: As the world's largest online retailer, the core of Amazon's business is e-commerce. While e-commerce itself can be considered cloud computing, Amazon has also been providing capabilities which give IT departments direct access to Amazon's compute power. Key examples include S3 (Simple Storage Service) and EC2. Any internet user can access storage in S3 and access stored objects from anywhere on the Internet. EC2 is the Elastic Compute Cloud, a virtual computing infrastructure able to run diverse applications ranging from web hosts to simulations, or anything in between. All of this is available at a very low cost per user.
3) Microsoft: Traditionally, Microsoft's core business has been in device operating systems and office automation software. Since the early days of the Internet, Microsoft has also provided web hosting, online e-mail and many other cloud services. Microsoft now also provides office automation capabilities via a cloud ("Office Live") in an approach referred to as "Software Plus Services" as opposed to "Software as a Service", allowing synchronous/asynchronous integration of online cloud documents with their traditional offline desktop-resident versions.
4) Salesforce.com: The core mission of Salesforce.com has been the delivery of capabilities centered on customer relationship management. In pursuit of this core, however, Salesforce.com has established itself as a thought leader in the area of Software as a Service and is delivering an extensive suite of capabilities via the Internet. A key capability provided is Force.com, which enables external developers to create add-on applications that integrate into the main Salesforce.com application and are hosted on Salesforce.com's infrastructure.
5) VMware: Provides several technologies of critical importance to enabling cloud computing, and has also started offering its own on-demand cloud computing capability, called vCloud. This type of capability allows enterprises to leverage virtualized clouds inside their own IT infrastructure or hosted with external service providers.

New Application Opportunities
1) Mobile interactive applications: Tim O'Reilly believes that "the future belongs to services that respond in real time to information provided either by their users or by nonhuman sensors" (Li et al., 2009). Such services will be attracted to the cloud not only because they must be highly available, but also because they generally rely on large data sets that are most conveniently hosted in large datacenters. While not all mobile devices enjoy connectivity to the cloud 100% of the time, the challenge of disconnected operation has been addressed successfully in specific application domains, so we do not see this as a significant obstacle to the appeal of mobile applications.
2) Parallel batch processing: Cloud computing presents a unique opportunity for batch-processing and analytics jobs that analyze terabytes of data and can take hours to finish. If there is enough data parallelism in the application, users can take advantage of the cloud's new "cost associativity": using hundreds of computers for a short time costs the same as using a few computers for a long time. For example, programming abstractions such as Google's MapReduce (Dean & Ghemawat, 2004) and its open-source counterpart Hadoop (Bialecki, Cafarella, Cutting & O'Malley, 2005) allow programmers to express such tasks while hiding the operational complexity of choreographing parallel execution across hundreds of cloud computing servers.
3) Analytics: A special case of compute-intensive batch processing is business analytics. While the large database industry was originally dominated by transaction processing, that demand is leveling off. A growing share of computing resources is now spent on understanding customers, supply chains, buying habits, ranking, and so on. Hence, while online transaction volumes will continue to grow slowly, decision support is growing rapidly, shifting the resource balance in database processing from transactions to business analytics.
4) Extension of compute-intensive desktop applications: The latest versions of the mathematics software packages Matlab and Mathematica are capable of using cloud computing to perform expensive evaluations.
Other desktop applications might similarly benefit from seamless extension into the cloud. Again, a reasonable test is comparing the cost of computing in the Cloud plus the cost of moving data in and out of the Cloud to the time savings from using the Cloud (a rough break-even sketch appears at the end of this section).
Cloud Architectures and Infrastructure
Cloud computing architecture comprises two components, hardware and application, and these two components have to work together seamlessly or cloud computing will not be possible. Cloud computing requires an intricate interaction with the hardware, which is essential to ensure uptime of the application. If the application fails, the hardware cannot push data or carry out the required processes; conversely, a hardware failure means a stoppage of operation.
Applications built on Cloud Architectures are such that the underlying computing infrastructure is used only when it is needed (for example, to process a user request); they draw the necessary resources on demand (such as compute servers or storage), perform a specific job, then relinquish the unneeded resources, often disposing of them after the job is done. While in operation, the application scales up or down elastically based on resource needs. Applications built on Cloud Architectures run "in the cloud", where the physical location of the infrastructure is determined by the provider. They take advantage of simple APIs of Internet-accessible services that scale on demand and are industrial-strength, where the complex reliability and scalability logic of the underlying services remains implemented and hidden inside the cloud. The usage of resources in Cloud Architectures is as needed, sometimes ephemeral or seasonal, thereby providing the highest utilization and optimum bang for the buck. Instead of building applications on fixed and rigid infrastructures, Cloud Architectures provide a new way to build applications on on-demand infrastructures.
Cloud Architectures address key difficulties surrounding large-scale data processing. First, in traditional data processing it is difficult to get as many machines as an application needs. Second, it is difficult to get the machines when one needs them. Third, it is difficult to distribute and coordinate a large-scale job on different machines, run processes on them, and provision another machine to recover if one machine fails. Fourth, it is difficult to auto-scale up and down based on dynamic workloads. Fifth, it is difficult to get rid of all those machines when the job is done. Cloud Architectures solve such difficulties.
A. The on-demand, self-service, pay-by-use model
The on-demand, self-service, pay-by-use nature of cloud computing is an extension of established trends. From an enterprise perspective, the on-demand nature of cloud computing helps to support the performance and capacity aspects of service level objectives (Sun Microsystems, 2009). The self-service nature of cloud computing allows organizations to create elastic environments that expand and contract based on the workload and target performance parameters, and the pay-by-use nature of cloud computing may take the form of equipment leases that guarantee a minimum level of service from a cloud provider. Virtualization is a key feature of this model.
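To make the pay-by-use test above concrete, here is a minimal Python sketch; the prices and figures are assumptions for illustration, not quotes from any provider. It simply weighs the cloud bill (compute plus data movement) against the value of the time saved, and the comment notes the cost associativity point from the previous subsection.

```python
# Illustrative break-even test; all rates below are assumptions.
def cloud_worth_it(machine_hours: float, price_per_hour: float,
                   gb_transferred: float, price_per_gb: float,
                   hours_saved: float, value_per_hour: float) -> bool:
    """Cloud compute cost plus data-movement cost versus the value
    of finishing sooner, as described in the text."""
    cloud_cost = machine_hours * price_per_hour + gb_transferred * price_per_gb
    return hours_saved * value_per_hour > cloud_cost

# Cost associativity: 1000 machines for 1 hour cost the same as
# 1 machine for 1000 hours, but finish three orders of magnitude sooner.
# Example: 1000 machine-hours at $0.10, 500 GB moved at $0.10/GB,
# finishing 16 working hours sooner for work valued at $50/hour.
print(cloud_worth_it(1000, 0.10, 500, 0.10, 16, 50.0))  # True: $800 > $150
```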
IT organizations have understood for years that virtualization allows them to quickly and easily create copies of existing environments, sometimes involving multiple virtual machines, to support test, development, and staging activities. The cost of these environments is minimal because they use few resources and can coexist on the same servers as production environments. Likewise, new applications can be developed and deployed in new virtual machines on existing servers, opened up for use on the Internet, and scaled if the application is successful in the marketplace (Sun Microsystems, 2009). The ability to use and pay for only the resources used shifts the risk of how much infrastructure to purchase from the organization developing the application to the cloud provider.
B. Cloud computing infrastructure models
There are many considerations for cloud computing architects when moving from a standard enterprise application deployment model to one based on cloud computing. There are public and private clouds that offer complementary benefits, there are three basic service models to consider, and there is the value of open APIs versus proprietary ones. IT organizations can choose to deploy applications on public, private, or hybrid clouds, each of which has its trade-offs. The terms public, private, and hybrid do not dictate location: while public clouds are typically "out there" on the Internet, private clouds are typically located on an organization's own premises.
1) Public clouds are run by third parties, and applications from different customers are likely to be mixed together on the cloud's servers, storage systems, and networks. Public clouds are most often hosted away from customer premises, and they provide a way to reduce customer risk and cost by providing a flexible, even temporary extension to enterprise infrastructure.
2) Private clouds are built for the exclusive use of one client, providing the utmost control over data, security, and quality of service. The company owns the infrastructure and has control over how applications are deployed on it. Private clouds may be deployed in an enterprise datacenter, and they may also be deployed at a collocation facility.
3) Hybrid clouds combine both public and private cloud models. They can help to provide on-demand, externally provisioned scale. The ability to augment a private cloud with the resources of a public cloud can be used to maintain service levels in the face of rapid workload fluctuations. This is most often seen with the use of storage clouds to support Web 2.0 applications. A hybrid cloud can also be used to handle planned workload spikes: sometimes called "surge computing", this approach uses a public cloud to perform periodic tasks that can be deployed there easily (a simple placement sketch follows below). Hybrid clouds introduce the complexity of determining how to distribute applications across both a public and a private cloud.
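As a concrete reading of the "surge computing" idea, here is a minimal Python sketch with a hypothetical capacity figure; it is not a real scheduler, only an illustration of keeping the baseline workload on the private cloud and overflowing the excess to a public cloud.

```python
# Hypothetical capacity; a real deployment would query monitoring data.
PRIVATE_CAPACITY = 100  # units of work the private cloud can absorb

def place_workload(total_units: int) -> dict:
    """Surge-computing placement: fill the private cloud first,
    then send only the overflow to the public cloud."""
    private = min(total_units, PRIVATE_CAPACITY)
    return {"private": private, "public": total_units - private}

print(place_workload(80))   # {'private': 80, 'public': 0}
print(place_workload(250))  # {'private': 100, 'public': 150}
```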
C. Architectural layers of cloud computing
Cloud computing can describe services being provided at any of the traditional layers from hardware to applications. In practice, cloud service providers tend to offer services that can be grouped into three categories: software as a service, platform as a service, and infrastructure as a service.
1) Software as a service (SaaS) features a complete application offered as a service on demand. A single instance of the software runs on the cloud and services multiple end users or client organizations. The most widely known example of SaaS is salesforce.com, though many other examples have come to market, including the Google Apps offering of basic business services such as email and word processing (Sun Microsystems, 2009).
2) Platform as a service (PaaS) encapsulates a layer of software and provides it as a service that can be used to build higher-level services. There are at least two perspectives on PaaS, depending on whether one is the producer or the consumer of the services: someone producing PaaS creates middleware, application software, and even a development environment that is then provided to a customer as a service, while someone using PaaS sees an encapsulated service presented to them through an API. The customer interacts with the platform through the API, and the platform does what is necessary to manage and scale itself to provide a given level of service (Sun Microsystems, 2009).
3) Infrastructure as a service (IaaS) delivers basic storage and compute capabilities as standardized services over the network. Servers, storage systems, switches, routers, and other systems are pooled and made available to handle workloads that range from application components to high-performance computing applications.
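One way to see the difference between the three service models is in terms of which layers of the stack the provider manages and which the customer manages. The Python sketch below is only a schematic summary of the paragraphs above; the layer names and the exact split are illustrative assumptions, not definitions from the paper.

```python
# Schematic only: responsibility split per service model.
STACK = ["application", "middleware/runtime", "operating system",
         "virtualization", "servers", "storage", "networking"]

PROVIDER_MANAGES = {
    "IaaS": set(STACK[3:]),   # provider stops at virtualized infrastructure
    "PaaS": set(STACK[1:]),   # provider also runs the platform layers
    "SaaS": set(STACK),       # provider runs the complete application
}

for model, provided in PROVIDER_MANAGES.items():
    customer = [layer for layer in STACK if layer not in provided]
    print(f"{model}: customer manages {', '.join(customer) or 'nothing'}")
```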
Conclusion and Future Work
Cloud computing promises significant benefits, but today there are security, privacy, and other barriers that prevent widespread enterprise adoption of an external cloud. In addition, the cost benefits for large enterprises have not yet been clearly demonstrated. The usage of resources in Cloud Architectures is as needed, sometimes ephemeral or seasonal, thereby providing the highest utilization and optimum bang for the buck. For the broader vision of Cloud interoperability to work, ranging from VM mobility to storage federation to multicast and media-streaming interoperability to identity and presence and everything in between, analogous core network extension (or replacement) technologies need to be invented. Finally, we need improvements in bandwidth and costs for both datacenter switches and WAN routers. While we are optimistic about the future of Cloud Computing, cloud platforms are not yet at the center of most people's attention. The attractions of cloud-based computing, including scalability and lower costs, are very real. If you work in application development, whether for a software vendor or an end user, expect the cloud to play an increasing role in your future. The next generation of application platforms is here: the cloud, a computing infrastructure on demand.
References
Armbrust, M., et al. (2009). Above the clouds: A Berkeley view of cloud computing (Technical Report No. UCB/EECS-2009-28). Berkeley, CA: Electrical Engineering and Computer Sciences, University of California at Berkeley. Retrieved from http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-28.pdf
Bialecki, A., Cafarella, M., Cutting, D., & O'Malley, O. (2005). Hadoop: A framework for running applications on large clusters built of commodity hardware. Retrieved from http://lucene.apache.org/hadoop/
Dean, J., & Ghemawat, S. (2004). MapReduce: Simplified data processing on large clusters. In Proceedings of the 6th Symposium on Operating Systems Design & Implementation (Vol. 6, p. 10). Berkeley, CA, USA: USENIX Association. Retrieved from http://dl.acm.org/citation.cfm?id=1251254.1251264
Li, H., et al. (2009). Developing an enterprise cloud computing strategy. Intel Information Technology. Retrieved from http://www.intel.com/en_US/Assets/PDF/whitepaper/icb_cloud_computing_strategy.pdf
Sun Microsystems. (2009). Introduction to cloud computing architecture: White paper (1st ed.). Retrieved from http://eresearch.wiki.otago.ac.nz/images/7/75/Cloudcomputing.pdf
Vogels, W. (2008). A head in the clouds: The power of infrastructure as a service. In First Workshop on Cloud Computing and its Applications (CCA'08).

Einstein's Image Compression Algorithm: Version 1.00
Yasser Arafat*, Mohammed Mustaq, Mohammed Mothi
* Student, Department of Electronic Sciences, Sathyabama University, Chennai-119, India. email: [email protected]
Abstract
Purpose: The Einstein's compression technique is a new method of compression and decompression of images based on matrix addition and the possible sequences summing to a given total. The main purpose of implementing a new algorithm is to reduce the complexity of the algorithms used for image compression today. The major advantages of this technique are that the compression is highly secure and achieves a high compression ratio. The method does not build on earlier compression techniques; it is a raster compression. It can be used for astronomical and medical images because the compression is considered lossless.
Design/Methodology/Approach: The idea uses the previous literature as a base to explore the use of the image compression technique.
Findings: This type of compression can be used to reduce the size of a database holding infrequently used but important data. The technique will in future be extended to the compression of colour images and will also be researched for file compression.
Social Implications: This idea of image compression is expected to create a new technique of image compression and to prompt more researchers to work on this type of compression.
Originality/Value: The idea intends to create a new technique of compression in image compression research.
Keywords: Image Compression; Einstein's Image Compression; New Compression Technique; Matrix Addition Based Compression.
Paper Type: Technical
Procedure
The image is taken as input, preferably black and white (grayscale). The value of each pixel ranges from 0 to 255, where 0 is completely black and 255 is fully white. The image, preferably a .jpg or .bmp, is read into the system and converted into a table of rows and columns of pixel values. The input image will be in the form of Fig. 1 and the converted values will look like Fig. 2 (both courtesy of www.mathworks.com). The image may have any number of rows and columns, but all the rows must have the same number of columns and vice versa.
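As a minimal sketch of this input step, assuming the Pillow imaging library (the paper does not prescribe a toolkit, and the file name here is hypothetical), the image can be read into such a table of pixel values as follows.

```python
# Sketch of the input step; assumes Pillow (pip install Pillow)
# and a hypothetical input file name.
from PIL import Image

img = Image.open("input.bmp").convert("L")  # "L" = 8-bit grayscale, 0-255
rho, chi = img.height, img.width            # rows and columns, as in the paper
matrix = [[img.getpixel((x, y)) for x in range(chi)] for y in range(rho)]
# Every row now has the same number of columns, as the method requires.
```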
In our example we take a matrix of 255 × 255.
Compression
Calculation of Rows and Columns
A counter is assigned to calculate the number of rows, stored in a variable ρ. Another counter is assigned to calculate the number of columns, stored in a variable χ. Here ρ is the number of cells in each column and χ is the number of cells in each row. In our example, ρ = χ.
The Database
A database of all the possible sums is created; this is a one-time creation. As our example image is 255 × 255, when processed it becomes a [1 × (255 × 255)] matrix, and since the image is black and white the maximum pixel value is 255, so the maximum possible sum of the matrix is 255 × 255 × 255 = 16,581,375. The database is therefore created for sums ranging from 0 to 16,581,375. For every possible sum there is a number of possible row matrices: by the stars-and-bars counting argument, for a sum σ spread over µ columns of the row matrix we get ν combinations, where
ν = C(σ + µ − 1, µ − 1)
For example, if there are 4 columns and the sum of the matrix is 10, we get C(13, 3) = 286 combinations. An extra column is added to the table for the generated sequence number Λ. The table is stored in ascending order, reading each row matrix as a string of digits (Table 1).
Table 1: Look-up table for 4 columns and sum σ = 2
Matrix values    Λ
0 0 0 2          1
0 0 1 1          2
0 0 2 0          3
0 1 0 1          4
0 1 1 0          5
0 2 0 0          6
1 0 0 1          7
1 0 1 0          8
1 1 0 0          9
2 0 0 0          10
Table 1 forms a look-up table: for example, if the sum of the matrix is 2 and the matrix is [0 2 0 0], then the sequence number of the matrix is 6, i.e. Λ = 6. Similarly, a database of all the possible values is generated. The database is so important that it is also required for decompression.
The Second Step
The second step involves the conversion of the [ρ × χ] image into [1 × (χρ)]. The actual image will be in the form of Fig. 3. In the first stage of conversion, the image is cut into its row matrices, giving ρ matrices of size [1 × χ] (Fig. 4). The row matrices so formed are then lined up one after the other to form a [1 × (χρ)] row matrix (Fig. 5).
Adding for σ and Generation of Sequence Λ
Once the row matrix has been generated in the previous step, the values of its cells are added and the total is stored in σ; this forms a new cell in the compressed image. The next cell in the compressed image holds the sequence number Λ. This number is generated by a search algorithm that refers to the table created as the database and to the original image.
Extra Cells
Some extra cells are added: τ, which records the type of image compressed (the extension of the uncompressed image, stored as ASCII values); two cells containing the counter values ρ and χ; and an extra cell α for the number of colours or layers present, where 1 denotes a black and white image and 3 denotes RGB. The output image for a black and white image will be in the form of Fig. 6, where σ is the sum of the matrix cells, Λ the sequence number generated for that sum, τ the type of image, ρ the number of rows in the original matrix, χ the number of columns in the original matrix, and α the number of colours in the image.
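The counting and look-up just described can be made concrete with a small Python sketch. This is an illustration, not the authors' implementation: it counts the row matrices for a given sum with the stars-and-bars formula, computes the sequence number Λ directly instead of storing the full database, and inverts the mapping, as needed for the decompression step described below.

```python
from math import comb

def count_compositions(sigma: int, mu: int) -> int:
    # nu = C(sigma + mu - 1, mu - 1): ordered mu-tuples of
    # non-negative integers summing to sigma (stars and bars).
    return comb(sigma + mu - 1, mu - 1)

def sequence_number(cells: list[int]) -> int:
    # 1-based rank (Lambda) of `cells` among tuples of the same
    # length and sum, in the digit-ascending order of Table 1.
    sigma, rank = sum(cells), 1
    for i, v in enumerate(cells[:-1]):      # the last digit is forced
        remaining = len(cells) - i - 1
        for d in range(v):                  # tuples with a smaller digit here
            rank += count_compositions(sigma - d, remaining)
        sigma -= v
    return rank

def cells_from_sequence(sigma: int, mu: int, rank: int) -> list[int]:
    # Inverse look-up used in decompression; assumes a valid triple.
    cells, rank = [], rank - 1
    for i in range(mu - 1):
        remaining, d = mu - i - 1, 0
        while rank >= count_compositions(sigma - d, remaining):
            rank -= count_compositions(sigma - d, remaining)
            d += 1
        cells.append(d)
        sigma -= d
    cells.append(sigma)                     # last digit carries the rest
    return cells

print(count_compositions(2, 4))         # 10, the rows of Table 1
print(sequence_number([0, 2, 0, 0]))    # 6, the paper's example
print(cells_from_sequence(2, 4, 6))     # [0, 2, 0, 0]
```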
Decompression
An image compressed by the above technique can be decompressed as follows. A compressed black and white image is received (Fig. 7). From the value stored in the last cell, the program determines whether the image is colour or black and white: if the value is 1, the image is black and white and a single table is allotted for pixel storage; otherwise three tables are allotted, one each for red, green and blue.
For Black and White Images
The table or matrix allotted has the form of [1 × (χρ)] cells (Fig. 8). The program then searches the database for the sum σ, goes to the entry Λ, and fills the table with the values stored there. The [1 × (χρ)] matrix is then cut at every χ cells to make new rows (Fig. 9), and the rows are joined to form the complete image (Fig. 10). Finally, the value of τ is converted into the file extension and the image is stored with that extension following the dot, e.g. .jpeg.
Other Notes
This type of compression is calculated to achieve high compression and forms a lossless image. The software is expected to be heavy, as the database is large. Thumbnails of the compressed image are not possible because the image is stored as a table. The compression can also be made secure if the value of Λ is sent separately. Multiple compressions of the same image are not possible, as a single compression already compresses to the maximum. The technique does not use any previous method of compression.
Conclusion
The simplicity of matrix addition is the major advantage of the Einstein's image compression algorithm. The compressed images can be stored in a database using less space. The technique is based purely on a new idea and does not contain any previous type of compression. The next version of the technique will investigate the compression of colour images.
Acknowledgement
First of all I would like to thank Dr. Jeppiaar, Chancellor of Sathyabama University; Dr. C.D. Suriyakala, Head, Department of ETCE, Sathyabama University; and Ms. Ulagamudhalvi for providing me the opportunity to write this paper. This algorithm could not have been possible without Dr. N.M. Nanditha, Professor, Sathyabama University, who not only served as my supervisor but also encouraged and challenged me throughout the completion of the paper. She and the other faculty members, Mr. Selvakumar and Mr. Vedhanarayanan, guided me through the completion process, never accepting less than my best efforts. I thank them all.
Additional Readings
Carpentieri, B., Weinberger, M. J., & Seroussi, G. (2000). Lossless compression of continuous-tone images. Proceedings of the IEEE, 88(11), 1797-1807.
Xu, D., et al. (2005). Parallel image matrix compression for face recognition. In MMM '05: Proceedings of the 11th International Multimedia Modelling Conference. Washington, DC, USA: IEEE Computer Society.
Open Source Software (OSS): Realistic Implementation of OSS in School Education
Gunjan Kotwani*, Pawan Kalyani**
* Department of Computer Science and Information Technology, Management and Commerce Institute of Global Synergy, Ajmer, Rajasthan, India. email: [email protected]
** Department of Computer Science and Information Technology, Management and Commerce Institute of Global Synergy, Ajmer, Rajasthan, India. email: [email protected]
Abstract
Purpose: Freedom to think, for the generation of new ideas, and to act, to conceptualize them, are concepts which are revolutionizing today's world. The software world has not been left untouched. Open Source Software (OSS) has brought the idea of sharing ideas for the betterment of Computer Science to the forefront. With the passage of time, open source software has not only gained prominence in the server software segment but is also penetrating the desktop segment. Open source software is attracting attention all over the world; in particular, governments of developing nations are working on the promotion and spread of OSS. The advantages of localization, the freedom to modify the software, and easy availability are factors attracting people towards OSS. The impact of OSS is felt in many arenas. Education is one of them; in India itself, Kerala and Goa have pioneered the use of OSS in school education.
Design/Methodology/Approach: In this research paper, the authors focus on OSS in education and its realistic implementation in school education. The authors conducted an empirical study on school students to study the effect of OSS on their learning curve.
Findings: The authors propose a curriculum for schools that is based on OSS.
Research Implications: The apt usage of information and communication technologies (ICTs) has the potential to improve the quality of education. However, educational institutions face many constraints: financial, equipped staff, resources, etc. The high cost of software, along with the hardware, poses a major challenge. OSS, with its unique features, proves to be of great help by lowering the cost of the software. OSS not only provides financial benefits; it has many other advantages which prove to be a boon for the education sector.
Value: This research paper will aid policy-makers and decision-makers in understanding the potential use of OSS in education: how and where it can be used, why it should be used, and what issues are involved in its implementation. In particular, officials in ministries of education, school and university administrators and academic staff should find this research useful.
Keywords: Open Source Software; Education; School Education; Information and Communication Technology (ICT); Realistic Implementation of OSS.
Paper Type: Empirical
Introduction
OSS is software that gives users the freedom to use, study, and modify the software based on local needs and preferences. This freedom is vital for the growth and development of the Computer Sciences.
Certain distinctive advantages of OSS are:
- Lower costs
- Reliability, performance and security
- Building of long-term capacity
- An open philosophy
- Encouragement of innovation
- An alternative to illegal copying
- The possibility of localization
- Learning from source code
Previous studies show that an OSS-based educational infrastructure, in comparison to proprietary software used to facilitate the process of teaching and learning, has proved more beneficial in stimulating cross-boundary learning and in adapting the technologies to the desires of the users (Pearson & Koppi, 2002). Many more studies propagate the use of OSS in education. The next step, then, is to design an age-appropriate syllabus based on local needs and environment that could be implemented in schools. This also requires the development of course material as an aid to teachers. Through this research paper we propose an OSS-based curriculum built on the recommendations of the National Curriculum Framework (NCF) 2005 proposed by the National Council of Educational Research and Training (NCERT), India. We have also developed study material which can be instrumental in the realistic implementation of OSS in schools of India. The paper investigates the need for OSS in education and its merits for students, educational institutions and nations, especially developing ones. It further presents an empirical study of the effects of OSS inside the classroom environment. The paper also gives an overview of the proposed comprehensive integrated curriculum plan based on the recommendations of the NCF 2005. Appendix A gives an introduction to the proposed software included in the curriculum, with a sample of the course material developed. Appendix B shows samples of the work done by students using OSS.
Need of OSS in Education
As Computer Science educators, we constantly seek new channels, methods, and technologies to reach and intrigue our students. We hope first to capture their interest, then to develop their understanding, work towards retention of concepts, and finally encourage their own independent creative work. Throughout this process, we try to teach them skills that they can apply in the real world. The breadth of our field and the variety of pedagogical approaches make this process very difficult. We believe that OSS can serve as a channel, method, and technology to teach and learn Computer Science. OSS has the potential to expand group work beyond the classroom to include much larger projects and more distributed teams. OSS can also be used to introduce our students to the larger Computer Science community and to the practice of peer review. Finally, OSS can provide us with free or lower-cost technology in the classroom, permitting us to use technology that we might otherwise be unable to afford.
Merits for the Students
Students who use open source in school substantially shorten their learning curve when they go to work for software companies. Students who are encouraged to build projects on top of OSS bases can build more interesting and exciting systems than they might have developed from scratch. The foothold of OSS is increasing in the industrial sector. Today's learner will be tomorrow's professional; learners not equipped with the desired skills will find it difficult to adapt to tomorrow's job market. Teaching OSS from the elementary years of education prepares the child for future market and job requirements.
Students who take up Computer Science as a subject in higher secondary school, and who take up professional computing courses in under-graduate and post-graduate programs, remain largely removed from the actual coding taking place in the software industry. Use of OSS will help them work with and see actual software code, learn how to modify it, and become part of the larger online community working on OSS.
Merits for the Educational Institutions
Free and open source software can save a school money, in a context where schools, even affluent ones, are short of money. Teaching students a way of life is the aim of education; schools should promote "open source software just as they promote recycling", which will benefit society as a whole. OSS does not demand high-end hardware configurations, which results in "lowered carbon footprints". OSS opens the code to students, permitting them to learn how software works, thus helping to build good future coders; proprietary software rejects their thirst for knowledge by keeping knowledge secret and "learning forbidden". Schools teach students to be good citizens, to cooperate and share with others who need their help: this is the philosophy of open source. Training in the use of free software, and encouragement to participate in the free software community, generates a sense of the importance of sharing and collaborative development amongst the students.
Merits for the Nation
- Sovereignty and security
- Promotion of the growth of a local software industry
- Economic development tapping local talent and human resources
- Encouragement of the use of local software at national level
- Reduced costs and dependency on imported technology and skills
- Affordable software for individuals, enterprises and government
- Access to government data without the barrier of proprietary software and formats
- The ability to customise software to local languages and cultures
- Lowered barriers to entry for software businesses
Research Undertaken: Effects of OSS inside the Classroom (Subject: Mathematics)
We, along with a mathematics teacher, prepared a research plan for students of Class III, Sections A and B. The strength of each section was 36 students.
Methodology: Research Plan 2010
Action 1: Collect data on the understanding the students of Class III already have of the topics Multiplication and Money, and identify the student groups who are struggling with the concepts.
 Timescale: 2nd week of September.
 Resources / sources of support and challenge: Worksheets and photocopies of students' class-work.
 Success criteria: The worksheets will be completed individually.
 Comments / amendments to plan: The worksheet assessment and the oral assessment gave different outputs for certain students who were good in oral work but poor in comprehension.
Action 2: Explain the concepts of Multiplication and Money using multimedia modules.
 Timescale: Mid September.
 Resources: The computer and the modules available on the topics.
 Success criteria: All the children will have access to a computer and the module.
 Comments: Learners were keen to watch the multimedia modules. The idea of taking a mathematics class in the computer lab was enough to excite the students.
Action 3: Provide students with opportunities to use their concept knowledge to play computer games and to improve their skills by trying to improve their scores. Software used: Tux Math and GCompris.
 Timescale: 3rd week of September.
 Resources: Free and Open Source Software; the computer teacher will also act as a resource person. The challenge will be to adjust the timetable so that the computer lab is available to this group of students.
 Success criteria: All the students will be able to play the games at increasing difficulty levels.
 Comments: The game play of Tux Math provided ample opportunities for oral and mental mathematics calculations. The results were saved and the game play could be continued in the next lesson, which gave the learners something to look forward to in the upcoming mathematics class.
Action 4: Assessment to gauge the students' level of learning. Software used: Tux Paint (with a grid).
 Timescale: Last week of September.
 Resources: Assessment sheets, classroom observation, interviews with students.
 Success criteria: To see that students have achieved the expected learning outcome.
 Comments: Using the capabilities of the free and open source software Tux Paint, a grid was designed and included as a stamp in the software. The teacher gave questions that had to be solved using the grid, with answers noted in the grid. This was later used by the teacher for assessments.
Action 5: Feedback.
 Timescale: 1st week of October.
 Resources: Feedback form.
 Success criteria: To get the learner's point of view.
 Comments: The learners gave positive responses about the whole exercise.
The above methodology was adopted in Section 'A' of Class III. In Section 'B', with the same teacher, the approach was kept conventional. To gauge the performance of the students, periodic assessments were conducted; in this study, we conducted four (4) assessments. The results of the assessments of both sections were compiled and tabulated, and a comparative study was then conducted.
Results
The study clearly showed that the number of students who grasped the concepts in less time and with better quality was higher in Section 'A', where certain open source software had been adopted in conformance with the syllabus of the class (Fig. 1: Comparative analysis of students of Class III A and III B).
Discussion
After the completion of the study, feedback was taken from the students as well as from the concerned subject (Mathematics) teacher (Fig. 2: Sample of student feedback forms; see Appendix A).
Review of the Mathematics Teacher
Before starting my lesson on multiplication using computer-aided technology, I assessed the previous knowledge base and the level of understanding of my Class III students through a worksheet. I found that the majority of students understood that multiplication was grouping of objects but were not clear about multiplication as repeated addition. I also talked to my colleagues teaching Class III, and all of them unanimously agreed that the students of Class III (A) were very restless, with a short attention span, and that they were finding it difficult to keep them engaged for longer periods. At this point I would like to mention that I follow the activity-based method of teaching, and I teach every topic through some activity to make it interesting to students. Yet we were all facing the challenge of keeping Class III-A engaged. I also observed the computer lesson of this class and was surprised to see the level of engagement of the same students. This made me decide that using the computer as a tool for teaching mathematics would not only help in improving student performance but would also increase student engagement.
I had discussions with our computer teacher, Ms. Gunjan Kotwani, who has been working with OSS (Open Source Software) for the past few years and is also working on an integrated learning approach for students of Classes I to V. She went through the Mathematics syllabus of Class III and gave me valuable inputs on which topics could be taught using certain software. We both took Mathematics lessons in the computer laboratory and shared tips on how to help students when they were facing difficulty in carrying out their Mathematics assignments on computers. I started my lesson on multiplication using multimedia modules and explained the concept of repeated addition using this software. We then took the class to the computer laboratory, where the students would have access to individual computers and could apply whatever they had grasped from their previous lesson in the given assignment. We noticed that the level of student engagement was very high; in fact they did not want to return to their classroom at the end of the lesson. After three lessons in the computer lab we assessed the student learning and were surprised at the result, as we found that there was no significant improvement in their learning. After discussions with other mathematics teachers, we realized that what the students also needed in Mathematics was daily practice and drilling, including pen and paper exercises. We made some basic changes in our plan and interspersed Mathematics lessons with assignments on the computer as well as exercises in notebooks, worksheets and home assignments. As we progressed we noticed that the students were responding better.
Curriculum Planning
This study aims to provide a realistic implementation of OSS in schools. The major problem faced by schools willing to adopt OSS in the Computer Science curriculum is the lack of study material; the teachers are not equipped to handle OSS in their classrooms. A series of training sessions with adequate support in the form of study material and services can play a defining role in the implementation of OSS. This curriculum has been designed keeping in mind the recommendations of the National Curriculum Framework (NCF 2005); samples of student work are given in Appendix B. The suggested software by class and age group is:
Class I (ages 6-7): TuxType, TuxPaint, GCompris
Class II (ages 7-8): TuxPaint, Tux Math, Introduction to OpenOffice.org Word Processor
Class III (ages 8-9): OpenOffice.org Word Processor, Introduction to Logo programming using KTurtle
Class IV (ages 9-10): Introduction to OpenOffice.org Presentation, Basics of Logo programming using KTurtle
Class V (ages 10-11): Advanced OpenOffice.org Word Processor and Presentation, Logo programming basics using KTurtle
Class VI (ages 11-12): Internet browser (Firefox), raster graphics editor (GIMP), OpenOffice.org Calc (spreadsheet package)
Class VII (ages 12-13): Vector graphics editor (Inkscape), Introduction to databases using MySQL, HTML programming using Bluefish
Class VIII (ages 13-14): Database concepts using MySQL, Introduction to programming using Java NetBeans, database connectivity between Java and MySQL, page-layout program Scribus
Class IX (ages 14-15): Advanced Java programming using NetBeans
Class X (ages 15-16): Introduction to programming in C++ using the GCC compiler
Class XI (ages 16-17): Based on recommendations of CBSE
Class XII (ages 17-18): Based on recommendations of CBSE
Conclusion
The study lays stress on the need for the use of OSS in a developing nation like India.
The use of OSS will promote free thinking, innovation and the development of new software models, and the field of Computer Science can thereby reach great heights. Students need to be exposed to these software packages at an early stage of their mental development. Use of OSS teaches the usage of tools rather than attachment to particular software: for example, a document can be created in any word processor, and the student should be comfortable adapting to the various word processors available. Ultimately the tools of a word processor will be similar; only their placement and arrangement might differ. Since Computer Science is a rapidly evolving field in which new software and technologies keep emerging, this kind of flexibility with software is essential, and acceptance of and adaptability to changing software is necessary for students. This research aims to provide a practical, feasible and working model of OSS in education. For this, the development of study tools like course material, resource CDs, etc. is essential to support the teaching community and to help remove the hesitation to adopt OSS in education. Still, there are many challenges in the implementation of OSS in school curricula, the major one being reluctance to change: the teaching fraternity first needs to be convinced of the benefits OSS can give to their students. The unavailability of teaching resource material for OSS is another hitch, and lastly, teacher training and OSS maintenance are challenges which need to be overcome for effective implementation of OSS in school education.
References
National Curriculum Framework (NCF). (2005). Retrieved from http://www.ncert.nic.in/html/pdf/schoolcurriculum/framework05/prelims.pdf
Pearson, E. J., & Koppi, A. J. (2002). A WebCT course on making accessible online courses. WebCT Asia Pacific Conference, Melbourne, Australia, March 2002.

Measurement of Processes in Open Source Software Development
Parminder Kaur*, Hardeep Singh**
* Department of Computer Science and Engineering, Guru Nanak Dev University, Amritsar-143005, India. email: [email protected]
** Department of Computer Science and Engineering, Guru Nanak Dev University, Amritsar-143005, India. email: [email protected]
Abstract
Purpose: This paper attempts to present a set of basic metrics which can be used to measure basic development processes in an OSS environment.
Design/Methodology/Approach: A review of the earlier literature helped in exploring the metrics for measuring development processes in an OSS environment.
Results: OSSD is different from traditional software development because of its open development environment. The development processes are different, and the measures required to assess them have to be different.
Keywords: Open Source Software (OSS); Free Software; Version Control; Open Source Software Metrics; Open Source Software Development
Paper Type: Conceptual
Introduction
Free software (FS), a term given by Richard Stallman and introduced in 1984, can be obtained at zero cost; that is, it is software which gives the user certain freedoms. Open Source Software (OSS), a term coined by Eric Raymond in 1998, is software for which the source code is freely and publicly available, though the specific licensing agreements vary as to what one is allowed to do with that code. In the case of FS, only the executable file is made available to the end user, through the public domain; the end user is free to use that executable software in any way, but not to modify it.
The alternative term Free/Libre and Open Source Software (FLOSS) refers to software whose licenses give users four essential "freedoms":
- To run the program for any purpose;
- To study the workings of the program, and modify the program to suit specific needs;
- To redistribute copies of the program at no charge or for a fee; and
- To improve the program, and release the improved, modified version (Perens, 1999; 2004).
The free software movement is working toward the goal of making all software free of the intellectual property restrictions which hamper technical improvement. OSS users do not pay royalties, as no copyright exists, in contrast to proprietary software applications, which are strictly protected through patents and Intellectual Property Rights (IPRs) (Asiri, 2003; Wheeler, 2003). OSS is software for which the source code is publicly available, though the specific licensing agreements vary as to what one is allowed to do with that code (Stallman, 2007).
Open Source Software Development
Open Source Software Development (OSSD) produces reliable, high-quality software in less time and at less cost than traditional methods. Adelstein (2003) is even more evangelical, claiming that OSSD is the "most efficient" way to build applications. Schweik and Semenov (2003) add that OSSD can potentially "change, perhaps dramatically, the way humans work together to solve complex problems in computer programming". Even allowing for a degree of exaggeration, OSS can be used as an alternative to traditional software development. Raymond (1998) compares OSSD to a "bazaar": a loosely centralized, cooperative community where collaboration and sharing enjoy religion status. Conversely, traditional software engineering is referred to as a "cathedral", where hierarchical structures exist and little collaboration takes place.
Problems with Traditional Development
Traditional software development projects suffer from various issues, such as time and cost overruns, largely unmaintainable systems, and questionable quality and reliability. The 1999 Standish Group report revealed that 75% of software projects fail on one or more of these measures, with a third of projects cancelled due to failure. In addition, systems often fail to satisfy the needs of the customer for whom they are developed (Sommerville, 1995). These failures are ascribed to:
- Inadequate understanding of the size and complexity of IS development projects, coupled with inflexible, unrealistic timeframes and poor cost estimates (Hughes & Cotterell, 1999; McConnell, 1996);
- Lack of user involvement (Addison & Vallabh, 2002; Frenzel, 1996; Hughes & Cotterell, 1999; McConnell, 1996);
- Shortfalls in skilled personnel (Addison & Vallabh, 2002; Boehm, 1991; Frenzel, 1996; Hughes & Cotterell, 1999; Satzinger, Jackson & Burd, 2004).
Project costs are further increased by the price of license fees for the software and tools required for application development, as well as add-on costs for exchange controls.
Benefits of Open Source Software
The benefits of OSS (Feller & Fitzgerald, 2001; FLOSS Project Report, 2002) are as follows:
- Collaborative, parallel development involving source code sharing and reuse;
- A collaborative approach to problem solving through constant feedback and peer review;
- A large pool of globally dispersed, highly talented, motivated professionals;
- Extremely rapid release times;
- Increased user involvement, as users are viewed as co-developers;
- Quality software;
- Access to existing code.
Despite these benefits, perceived disadvantages of OSS are:
- In the rapid development environment, the result could be slower, given the absence of formal management structures (Bezroukov, 1999; Levesque, 2004; Valloppillil, 1998).
- Strong user involvement and participation throughout a project can become problematic, as users tend to create bureaucracies which hamper development (Bezroukov, 1999).
- OSS is premised on rapid releases and typically has many more iterations than commercial software. This creates a management problem, as each new release needs to be implemented in order for an organization to receive the full benefit (Farber, 2004).
- The user interfaces of open source products are not very intuitive (Levesque, 2004; Valloppillil, 1998; Wheatley, 2004).
- As there is no single source of information and no help desk, no "definitive" answers to problems can be found (Bezroukov, 1999; Levesque, 2004).
- System deployment and training are often more expensive with OSS, as it is less intuitive and does not have the usability advantages of proprietary software.
Open Source Software Development Models
There are several basic differences between OSSD and traditional methods. The System Development Life Cycle (SDLC) of traditional methods has generic phases into which all project activities can be organized, such as planning, analysis, design, implementation and support (Satzinger, Jackson & Burd, 2004). The open source life cycle of the OSSD paradigm, by contrast, demonstrates several common attributes: parallel development and peer review, prompt feedback on user and developer contributions, highly talented developers, parallel debugging, user involvement, and rapid release times.
Vixie (1999) holds that an open source project can include all the elements of a traditional SDLC. Classic OSS projects such as BSD, BIND and SendMail are evidence that open source projects utilize the standard software engineering processes of analysis, design, implementation and support. Mockus, Fielding & Herbsleb (2000) describe a life cycle that combines a decision-making framework with task-related project phases; their model comprises six phases: roles and responsibilities, identifying work to be done, assigning and performing development work, pre-release testing, inspections, and managing releases. Jorgensen (2001) provides a more detailed description of specific product-related activities that support the OSSD process. The model (Fig. 1: Jorgensen life-cycle, 2001) explains the life cycle for changes that occurred within the FreeBSD project. Jorgensen's model is widely accepted (Feller & Fitzgerald, 2001; FLOSS Project Report, 2002) as a framework for the OSSD process, on both the macro (project) and micro (component or code segment) levels. However, flaws remain.
When applied to an OSS project, the model does not adequately explain where or how the processes of planning, analysis and design take place. Schweik and Semenov (2003) propose an OSSD project life cycle comprising three phases: project initiation; going "open"; and project growth, stability or decline. Each phase is characterized by a distinct set of activities. Wynn (2004) proposes a similar open source life cycle but introduces a maturity phase, in which a project reaches the critical mass, in terms of the numbers of users and developers it can support, set by administrative constraints and the size of the project itself. Roets, et al. (2007) expand the Jorgensen life-cycle model and incorporate aspects of previous models, particularly that of Schweik and Semenov (2003). In addition, this model attempts to encapsulate the phases of the traditional SDLC (Fig. 2: Roets, et al. life-cycle model of OSSD projects, 2007). The model facilitates OSS development in terms of improved programming skills, availability of expertise and model code, as well as software cost reduction.
Comparison of the Traditional Life-Cycle with the OSSD Life-Cycle
Fig. 3 compares the phases of the traditional software development life-cycle with the OSSD life-cycle of Fig. 2. The initiation phase of the OSSD life-cycle combines three phases of the traditional life-cycle, namely planning, analysis and design, since it may be more important to get the design right prior to actual programming, so that all developers are working towards a clearly defined common purpose. The implementation phase combines aspects like review, contribution, pre-commit testing and production release. As multiple users as well as skilled personnel are involved in OSSD, parallel debugging and the different versions of one piece of code can be grouped together with the support phase of the traditional software development life-cycle.
Proposed Metrics for the Selected Model
Keeping in view the OSSD life-cycle model proposed by Roets, et al. (2007), the following set of metrics is proposed to keep a check on the generation of multiple processes in OSSD.
Total Number of Contributors
Under the considered model, a large number of users in an open environment contribute towards the development of the project. This metric assesses the total number of contributors for a project. Some contributors may be unique to the project while others may be associated with multiple projects; the metric counts the contributors to a given project irrespective of their affiliations to other projects.
Average Domain Experience of Contributors
A particular project is developed in a specific domain, and usually the contributors having some expertise and experience in that domain contribute to the project. This metric evaluates the average experience of all the contributors taken together. It can be represented as the cumulative experience of contributors,
E = ∑ ei, i = 1 … N
(where ei is the experience of an individual contributor in that domain), and the average experience of contributors,
Eavg = E / N
(where N is the total number of contributors). This metric measures the extent of support to the development of a project by its contributors. It is safe to assume that the greater the average experience of the contributors, the more robust the development of the project will be.
Average Time for Completion of a Version of the Project
Quick versioning is the essence of OSS development. However, different versions are completed at different rates, depending upon factors like the number of contributors, their experience, and the nature and complexity of the project. The average time for completion of a version of the project can be calculated as
Tavg = Ttotal / Nversion
(where Ttotal is the total time taken to develop all the versions and Nversion is the total number of versions generated). A greater Tavg would indicate slower development processes, resulting from factors like a low number of contributors, their lack of experience, or the complexity of the project.
Bugs Tracked per Version
The quality of OSS is always in question. However, with a proper bug tracking mechanism and tools in place, bug tracking can be made very effective and the quality of OSS can be enhanced. The number of bugs tracked per version is an indication of the quality and reliability of the product. Hence, this measure can be put to effective use for enhancing the quality of the final product.
Patch Accept Ratio
Every contributor sends patches for the enhancement of the product. However, not every patch sent by a contributor is accepted for updating the product. The patch accept ratio can be defined as
Pratio = Total no. of patches accepted / Total no. of patches submitted
A high patch accept ratio effectively argues for a high competence of contributors, and the reverse is true for low patch ratios.
Number of Effective Reviews Received
In addition to developing patches, some contributors send reviews of the products in the making. A large number of effective reviews indicates that some functionality was not taken care of by either the developing contributor or the mentors: the greater the number of effective reviews, the more gaps there were in the development process. Tracking the number of effective reviews can therefore feed back into a more effective development methodology.
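A minimal Python sketch of these measures follows; the function names and example figures are illustrative assumptions, not data or code from the paper.

```python
# Illustrative computation of the proposed metrics; example values are made up.
def average_experience(experience_years: list[float]) -> float:
    """Eavg = E / N, where E is the cumulative domain experience."""
    return sum(experience_years) / len(experience_years)

def average_version_time(total_days: float, n_versions: int) -> float:
    """Tavg = Ttotal / Nversion."""
    return total_days / n_versions

def patch_accept_ratio(accepted: int, submitted: int) -> float:
    """Pratio = patches accepted / patches submitted."""
    return accepted / submitted

print(average_experience([2.0, 5.5, 8.0, 1.5]))  # Eavg = 4.25 years
print(average_version_time(540.0, 9))            # Tavg = 60.0 days per version
print(patch_accept_ratio(42, 60))                # Pratio = 0.7
```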
Conclusion
OSSD is different from traditional software development because of its open development environment. The development processes are different, and the measures required to assess them have to be different as well. This paper has attempted to present a set of basic metrics which can be used to measure basic development processes in an OSS environment. However, these need to be validated and established by using them on a large number of systems.
References
Addison, T., & Vallabh, S. (2002). Controlling software project risks: An empirical study of methods used by experienced project managers. In Proceedings of the 2002 annual research conference of the South African Institute of Computer Scientists and Information Technologists on enablement through technology (SAICSIT '02) (pp. 128-140). Republic of South Africa: South African Institute for Computer Scientists and Information Technologists. Retrieved from http://dl.acm.org/citation.cfm?id=581506.581525
Adelstein, T. (2003). How to misunderstand open source software development. Retrieved from http://www.consultingtimes.com/ossedev.html
Asiri, S. (2003). Open source software. SIGCAS Computers and Society, 33(1), 2.
doi: 10.1145/966498.966501
Bezroukov, N. (1999). Open source software development as a special type of academic research: Critique of vulgar Raymondism. First Monday, 4(10). Retrieved from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/696/606
Boehm, B. (1991). Software risk management: Principles and practices. IEEE Software, 8(1), 32-41.
Farber, D. (2004). Six barriers to open source adoption. Retrieved from http://www.zdnetasia.com/six-barriers-to-open-source-adoption-39173298.htm
Feller, J., & Fitzgerald, B. (2001). Understanding open source software development. London: Addison-Wesley.
FLOSS Project Report. (2002). FLOSS Project Report: Free/Libre and open source software (FLOSS): Survey and study. Retrieved from http://www.infonomic.nl/floss/report/
Frenzel, C. (1996). Management of Information Technology (2nd ed.). Cambridge, MA: CTI.
Hughes, B., & Cotterell, M. (1999). Software project management (2nd ed.). Berkshire, United Kingdom: McGraw-Hill.
Jorgensen, N. (2001). Putting it all in the trunk: Incremental software development in the FreeBSD open source project. Information Systems Journal, 11(4), 321-336. doi:10.1046/j.1365-2575.2001.00113.x
Levesque, M. (2004). Fundamental issues with open source software development. First Monday, 9(4). Retrieved from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1137/1057
McConnell, S. (1996). Avoiding classic mistakes. IEEE Software, 13(5), 111-112. doi: 10.1109/52.536469
Mockus, A., Fielding, R. T., & Herbsleb, J. D. (2000). Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11(3), 309-346. doi:10.1145/567793.567795
Perens, B. (1999). The open source definition. In M. Stone, S. Ockman & C. Dibona (Eds.), Open sources: Voices from the open source revolution. Sebastopol, California: O'Reilly & Associates.
Perens, B. (2004). The open source definition. Retrieved from http://opensource.org/docs/def_print.php
Raymond, E. (1998). The cathedral and the bazaar. First Monday, 3(3). Retrieved from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1488/1403
Roets, Minnaar, et al. (2007). Towards successful systems development projects in developing countries. In Proceedings of the 9th International Conference on Social Implications of Computers in Developing Countries, São Paulo, Brazil, May 2007.
Satzinger, J. W., Jackson, R. B., & Burd, S. D. (2004). Systems analysis and design in a changing world (3rd ed.). Boston: Course Technology.
Schweik, C. M., & Semenov, A. (2003). The institutional design of open source programming: Implications for addressing complex public policy and management problems. Retrieved from http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1019/2426
Sommerville, I. (1995). Software engineering (5th ed.). Harlow: Addison-Wesley Longman Limited.
Stallman, R. (2007). Why "Free Software" is better than "Open Source". Retrieved from http://www.gnu.org/philosophy/free-software-for-freedom.html
Valloppillil, V. (1998). Open Source Initiative (OSI) Halloween I: A (new?) software development methodology. Retrieved from http://www.opensource.org/halloween/halloween1.php#comment28
Vixie, P. (1999). Software engineering. In M. Stone, S. Ockman & C.
Open Source Systems and Engineering: Strengths, Weaknesses and Prospects

Javaid Iqbal (Assistant Professor, P.G. Department of Computer Science, University of Kashmir, North Campus, India; email: [email protected])
S. M. K. Quadri (Head, P.G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India; email: [email protected])
Tariq Rasool (Lecturer, P.G. Department of Computer Science, Islamic University of Science and Technology, India; email: [email protected])

Abstract
Purpose: This paper reviews open source software systems (OSSS) and open source software engineering with reference to their strengths, weaknesses and prospects. Though it is not possible to single out the better of the two software engineering processes, the paper outlines the areas where the open source methodology holds an edge over conventional closed source software engineering. The weaknesses, which tilt the balance the other way, are also highlighted.
Design/Methodology/Approach: The study is based on earlier works by scholars regarding the potentialities and shortcomings of OSSS.
Findings: A mix of strengths and weaknesses makes it hard to pronounce open source the panacea. However, open source does have a very promising prospect: owing to its radical approach to established software engineering principles, it has spectacularly managed to carve out a "mainstream" role, and that in just over a few decades.
Keywords: Open Source Software (OSS); Open Source Development Paradigm; Software Engineering; Open Source Software Engineering.

Introduction
Open source traces back to the early 1960s, yet the term "open source" was coined, and the Open Source Initiative founded, only in 1998 (Open Source, n.d). The history of open source is closely tied to that of UNIX. The rise of the open source paradigm marks the end of the dominance of the proprietary, closed source software setup that ruled the arena for many decades. A new ideology, one that promises a lot in terms of economics, development environment and unrestricted user involvement, has been evolving in a big way, thrust into the big picture by loosely-centralized, cooperative, and gratis contributions from individual developer-users, startling the purists in the field of software engineering. The open source software phenomenon has systematically metamorphosed from a "fringe activity" into a more mainstream and commercially viable form. The open source initiative has succeeded spectacularly well.
Defining Open Source Software
The term open source software (OSS) refers to software equipped with licenses that provide existing and future users the right to use, inspect, modify, and distribute (modified and unmodified) versions of the software to others. It is not only the concept of providing "free" access to the software and its source code that makes OSS the phenomenon it is, but also the development culture (Raymond, 1999). Kogut and Metiu (2001) likewise describe open source as a right offered to users to change the source code without making any payment. Nakakoji, Yamamoto, Nishinaka, Kishida and Ye (2002) refer to OSS as software systems that are free to use and whose source code is fully accessible to anyone who is interested in them.

Open Source Software Engineering
The open source development (OSD) model fundamentally changes the approach and economics of traditional software development, marking a paradigm shift in software engineering. Open source is a software development methodology that makes source code available to a large internet-based community. Typically, open source software is developed by an internet-based community of programmers. Participation is voluntary and participants do not receive direct compensation for their work. In addition, the full source code is made available to the public. Developers also devolve most property rights to the public, including the rights to use, redistribute and modify the software free of charge. This is a direct challenge to the established assumptions about software markets and threatens the position of commercial software vendors (Hars & Ou, 2001). Torvalds et al. (2001) acknowledge that OSS is not architected but grows by directed evolution. An open source software system must have its source code freely available for use, custom-tailoring, or evolution in general by anyone whosoever is interested. Thus, from the point of view of a purist in traditional software engineering, open source is a break-away paradigm in its defiance of conventional software engineering and its non-adherence to the standardized norms and practices of the maturing software engineering process that we have carried, with so much devotion, all the way through our legacy systems. The open source development model breaks away from the normal in-house commercial development process. The self-involved, self-styled open source developer-user uses the software and contributes to its development as well, giving birth to a user-centered participatory design process.

What Leads to the Success of OSS?
Many important factors have catapulted the OSS development paradigm to the forefront of the software industry; these include cost, time, manpower, resources, quality, credit acknowledgement, and the spirit of shared enterprise:

Cost: OSS products are usually freely available for public download.

Time: OSS development is a massively parallel development and debugging environment wherein the parallel but collaborative efforts of globally distributed developers allow OSS products to be developed much more quickly than conventional software systems, considerably narrowing the gestation period.
Manpower: With the development environment spread across the globe, the best-skilled professionals work within a global development environment. This means more people are involved in the process.

Resources: Again, more skilled professionals offer their resources for the development of OSS products.

Quality: OSS products are recognized for their high standards of reliability, efficiency, and robustness. Raymond (2001) suggests that the high quality of OSS can be attributed to the high degree of peer review and user involvement in bug/defect detection.

Credit Acknowledgement: People across the globe who work on OSS get a chance to collaborate with their peers, gaining immediate credit acknowledgement for their contributions.

Informal Development Environment: The informal development environment, unlike organizational settings, liberates developers from formal ways and conduct; more students see a chance to work on real projects from their own places.

Spirit of Shared Enterprise: Organizations that deploy OSS products freely offer advice to one another, sharing insights regarding quality improvements and lessons learnt.

Open Source Software Development Process versus Conventional Software Development Process
Open source development is attracting considerable attention in the current climate of outsourcing and off-shoring (globally distributed software development). Organizations are seeking to emulate open source success on traditional development projects through initiatives variously labeled inner source, corporate source, or community source (Dinkelacker & Garg, 2001; Gurbani, Garvert & Herbsleb, 2005). The conventional software development process encompasses the four phases comprising the software development life cycle: planning, analysis, design, and implementation. In the open source software development process, these phases are accomplished in a way that is somewhat blurred, as the first three phases of planning, analysis, and design are largely blended and performed by a single developer or a small core group. Given the ideal that a large number of globally distributed developers of different levels of ability and domain expertise should be able to contribute subsequently, the requirement analysis phase is largely superseded. Requirements are taken as generally understood and not in need of interaction between developers and end-users. Design decisions also tend to be made in advance, before the larger pool of developers starts to contribute. Modularization of the system is the basis for distributing the development process. Systems are highly modularized to allow distribution of work and thereby reduce the learning effort required of new developers (they can focus on particular subsystems without needing to consider the system in its totality). However, over-modularization can have the reverse effect by increasing the risk of common coupling, an insidious problem in which modules unnecessarily refer to variables and structures in other modules (illustrated in the sketch below). Thus, there has to be a balanced approach vis-à-vis modularization.
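As a minimal illustration of the common coupling hazard just described: in the Python sketch below, two hypothetical modules communicate through a shared global structure rather than an explicit interface, so a change made by one silently alters the behavior of the other. All function and variable names are invented for the example.

# Common coupling, sketched: both "modules" read and write one shared
# global, so neither can be understood or changed in isolation.
SHARED_CONFIG = {"buffer_size": 1024}

def network_send(data: bytes) -> int:
    # Tunes the shared state as a hidden side effect.
    SHARED_CONFIG["buffer_size"] = max(len(data), 512)
    return len(data)

def storage_write(data: bytes) -> None:
    # Depends on whatever value the network module last left behind.
    chunk = SHARED_CONFIG["buffer_size"]
    for i in range(0, len(data), chunk):
        _persist(data[i:i + chunk])

def _persist(chunk: bytes) -> None:
    pass  # stand-in for real I/O

network_send(b"x" * 2000)    # side effect: buffer_size becomes 2000
storage_write(b"y" * 4096)   # now implicitly chunked by 2000

A decoupled design would pass buffer_size as an explicit parameter; preserving such module boundaries is exactly what a balanced approach to modularization is meant to achieve.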
In proprietary software, quality testing is limited to a controlled environment and specific scenarios (Lerner & Tirole, 2002). OSS development, however, involves much more elaborate testing, as OSS solutions are tested in various environments, by programmers of various skills and experience, and in various geographic locations around the world (Lakhani & Hippel, 2003; Lerner & Tirole, 2002; Mockus, Fielding & Herbsleb, 2002; West, 2003). In the OSS development life cycle, the implementation phase consists of several sub-phases (Feller & Fitzgerald, 2002); a short sketch of this pipeline follows the stage classification below:

Code: Writing code and submitting it to the OSS community for review.
Review: A strength of OSS is its independent and prompt peer review.
Pre-commit Test: Contributions are tested carefully before being committed.
Development Release: Code contributions may be included in the development release within a short time of having been submitted; this rapid incorporation is a significant motivator for developers.
Parallel Debugging: The so-called Linus's Law, "given enough eyeballs, all bugs are shallow", applies: the large number of potential debuggers on different platforms and system configurations ensures that bugs are found and fixed quickly.
Production Release: A relatively stable, debugged production version of the system is released.

A common classification of the stages of open source software is: planning (only an idea, no code written), pre-alpha (first release; the code written may not compile or run), alpha (released code works and takes shape), beta (feature-complete code released, but of low reliability, with faults present), stable (code is usefully reliable; minor changes) and mature (final stage; no changes).
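The sub-phases above can be read as a simple pipeline. The Python sketch below models them as a sequence of gates that a contribution passes through; the phase names follow the list above, while the pass/fail flags are invented stand-ins for real reviewer approval, test results and bug reports.

# The OSS implementation sub-phases as a pipeline (sketch).
PHASES = ["code", "review", "pre_commit_test",
          "development_release", "parallel_debugging", "production_release"]

def run_pipeline(outcomes: dict) -> str:
    """Advance a contribution phase by phase; stop at the first failure."""
    for phase in PHASES:
        if not outcomes.get(phase, False):
            return f"halted at {phase}: returned to the contributor"
    return "shipped in a production release"

# A hypothetical patch that fails its pre-commit tests:
print(run_pipeline({"code": True, "review": True, "pre_commit_test": False}))
# A patch that clears every gate:
print(run_pipeline({phase: True for phase in PHASES}))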
Strengths of Open Source Software
According to Feller and Fitzgerald (2000), OSS is characterized by an active developer community living within a global virtual boundary. OSS has emerged to address common problems of traditional software development, including software exceeding its budget in terms of both time and money, and it makes the production of quick, inexpensive, high-quality and reliable software possible. The advantages and unique strengths of open source software systems include release frequency, a solution to the software crisis, scalability, learnability, customer input and so on.

Release Frequency
One of the basic tenets of open source is "release early, release often" (Raymond, 1999). It is this tenet which channels significant feedback, at a global level, into shaping the open source product. With exceptional, globally distributed test-users who report their fault findings back, the frequent-release policy is very feasible. However, high release frequencies are infeasible for production environments; for these types of use, stable releases are provided, leaving the choice about tracking new releases and updates in the hands of the users.

Solution to the Software Crisis
The recurring problems of exceeded budgets, missed development deadlines, and general dissatisfaction when the product is eventually delivered, especially in highly complex systems, have always demanded an alternative that circumvents these problems so that the so-called "software crisis" is dealt with. The open source software model does promise a solution in this regard. The source of its advantage lies in the concurrence of development and debugging (Kogut & Metiu, 2001); in fact, OSS is massively parallel development and debugging.

Scalability
According to Brooks's Law, "adding people to a late project makes it later". The logic underlying this law is that, as a rule, software development productivity does not scale up as the number of developers increases. However, the law may not hold when it comes to software debugging and quality assurance activities. Unlike software development productivity, quality assurance productivity does scale up as the number of developers helping to debug the software increases. Quality assurance activities scale better since they do not require as much interpersonal communication as software development activities (particularly design activities) often do. In an OSS project there is a handful of core developers (who need not be centralized but may be spread across the globe) responsible for ensuring the architectural integrity of the software system. Then there is a multitude of user-developers who form a user community across the globe. This community conducts the testing and debugging activities on the software released periodically by the core team. There is an obvious dynamism in the roles of the developer at the core and the user in the community, in the sense that their roles may change in the context of the above discussion.

Learnability
A very good thing about open source software development is that it is an inherent learning process for anyone involved in it. A member contributes to the software's development but at the same time learns from the community. Thus, open source is a global campaign for skill-set development. According to Edwards (2000), "open source software development is a learning process where the involved parties contribute to, and learn from the community".

Customer Input
The informal organizational structure of core and community introduces no delay between the reporting of a bug by a user and its fixing by the core. Moreover, the use of impressive internet-enabled configuration management tools [e.g. GNU CVS (Concurrent Versions System)] allows the community to synchronize quickly with updates issued by the core. This mechanism of immediate reward, by way of rapid bug-fixing, in the open-source user community helps uphold the quality assurance activity. There are no restrictions on bug-fixing by users when the source code is open, or they can design a test case for the core to use. Such a positive influence of the user community supplements the debugging process in its entirety, leading to a visible improvement in software quality. This discussion should not create the notion that OSSS are a panacea; such systems have their weaknesses too, in fact plenty of them.

Weaknesses of Open Source Software
Open source is by no means a panacea and does have its own weaknesses. As expected, most of the weaknesses are the result of the lack of formal organization or clear responsibilities.

Diversity in Communication
The globally distributed development environment brings in developers from different cultural backgrounds and differing time zones who have never met in person. Moreover, even if they cross these barriers, they hit a stumbling block when skillful community members find it hard to communicate in English. As a result, misunderstandings do crop up and the communicated content may be misconstrued. This can create a sense that cooperation, good manners, and useful information are lacking in the community.
Uncoordinated Redundant Efforts
With little coordination within an open source team, independent parties sometimes carry out tasks in parallel without knowing about each other. This consumes additional resources, though it may prove a blessing in disguise, since there may then be several solutions to choose from; the choice among the alternatives, however, makes selection difficult.

Absence of Organizational Formalism
This surfaces as a multi-pronged weakness. The absence of laid-down formal rules and conventions makes it hard for the community to work along systematic lines. It may manifest as a lack of organizational commitment to a time-schedule and as a diverging organizational focus. Without a time-schedule and without a concerted, spearheading focus, the distributed nature of the work unsettles priorities, which may be either nonexistent or severely skewed towards the personal biases of influential contributors. In this un-organizational setting, where no one is boss and no one is bossed, forcing the prioritization of certain policy matters is not possible.

Non-Orientation of Newcomers
Newcomers do not undergo skill-setting and behavior-shaping orientation training; they have to pick up the nitty-gritty involved very subtly on their own. The tightly-knit community can do well, sharing its cultural background, but newcomers are a problem. In fact, everyone competes for attention and talent, and these barriers to entry are very damaging to a project.

Dependency on Key Persons
The bulk of the work is done by a few dedicated members or a core team, what Brooks calls a "surgical team" (Jones, 2000). Many projects thus critically depend on a few key persons who have the level of intimate knowledge required to understand all parts of a large software system. It is usually the core contributors who are the key persons. This dependency becomes a liability if these key persons are unable to continue work on the project for some reason. It may be impossible to reconstruct the implicit knowledge of these persons from their artifacts (source code, documentation, notes, and emails) alone, and this often leads to project failure.

Leadership Traits
Open source leaders, who lead by persuasion alone, are judged on the basis of their technical skills, vision and communication skills. Raymond (1999) points out that the success of the Linux project was to a large degree due to the excellent leadership skills demonstrated by its founder, Linus Torvalds. The scarcity of good leaders is one of the growth-inhibiting factors in open source enterprises.

Prospects
The open source phenomenon raises many interesting questions. Its proponents regard it as a paradigmatic change in which the economics of private goods, built on the scarcity of resources, are replaced by the economics of public goods, where scarcity is not an issue. Critics argue that open source software will always be relegated to niche areas and that it cannot compete with its commercial opponents in terms of product stability and reliability (Lewis, 1999). Moreover, they also argue that open source projects lack the capability to innovate. The OSSS prospect sounds encouraging given that the absence of direct pay (compensation), monetary rewards and property-right claims has never been a bottleneck to its pervasiveness. It has direct implications for social welfare. Open source may hold the key to the so-called "software crisis".
The flourishing of this model to the extent of a significant market share, in the absence of any marketing or advertising, makes the prospect even sounder. OSS, long known for operating systems and development tools, has already stepped into the arena of entertainment application development. The actively growing interaction between academic institutions and the IT industry has contributed significantly to the research and development of such systems, and the progress continues apace. Open source is internet-based and hence, together with ICT (Information and Communication Technology), has a lot of scope in terms of development and economics.

Conclusion
As an emerging approach, the open source paradigm provides an effective way to create a globally distributed development environment wherein the community around a specific open source software project interacts constantly, providing feedback through activities such as defect identification, bug fixing, new feature requests, and support requests for further improvement. This activity is rewarded by peer recognition and immediate credit acknowledgement, creating a promotional influence on effective development practices across the community.

Open source has its strengths and weaknesses. The strengths come from its innovative development in and across a global development community of users-turned-developers. The weaknesses stem from its daring defiance of established and matured conventional software engineering principles and practice. However, though a good mix of strengths and weaknesses holds open source in balance, the prospects of this paradigm are promising. Fostering innovation to improve productivity seems to be the mission statement of open source.

References
Dinkelacker, J., & Garg, P. (2001). Applying open source concepts to a corporate environment. In Proceedings of the 1st Workshop on Open Source Software Engineering, Toronto, May 15, 2001. Retrieved from http://opensource.ucc.ie/icse2001
Edwards, K. (2000). Epistemic communities, situated learning and open source software development. Department of Manufacturing Engineering and Management, Technical University of Denmark.
Feller, J., & Fitzgerald, B. (2002). Understanding open source software development. London: Addison-Wesley.
Feller, J., & Fitzgerald, B. (2000). A framework analysis of the open source software development paradigm. In Proceedings of the 21st Annual International Conference on Information Systems (pp. 58-69), Brisbane, Australia.
Gurbani, V. K., Garvert, A., & Herbsleb, J. D. (2005). A case study of open source tools and practices in a commercial setting. In Proceedings of the 5th Workshop on Open Source Software Engineering (pp. 24-29), St. Louis, MO, May 17, 2005.
Open Source. (n.d). The open source definition. Open Source Initiative. Retrieved from http://www.opensource.org
Hars, A., & Ou, S. (2001). Working for free? Motivations of participating in open source projects. In Proceedings of the 34th Hawaii International Conference on System Sciences.
Jones, P. (2000). Brooks' Law and open source: The more the merrier? IBM DeveloperWorks, May 2000.
Kogut, B., & Metiu, A. (2001). Open source software development and distributed innovation. April 2001.
Lakhani, K. R., & Hippel, E. von. (2003). How open source software works: "free" user-to-user assistance. Research Policy, 32(6), 923-943.
Lewis, T. (1999). The open source acid test.
Computer, 32(2), 125-128. doi:10.1109/2.745728
Lerner, J., & Tirole, J. (2002). Some simple economics of open source. Journal of Industrial Economics, 50(2), 197-234.
Mockus, A., Fielding, R. T., & Herbsleb, J. D. (2002). Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11(3), 309-346.
Nakakoji, K., Yamamoto, Y., Nishinaka, Y., Kishida, K., & Ye, Y. (2002). Evolution patterns of open-source software systems and communities. In Proceedings of the International Workshop on Principles of Software Evolution (IWPSE 2002) (pp. 76-85), Orlando, FL.
Raymond, E. S. (1999). The cathedral & the bazaar: Musings on Linux and open source by an accidental revolutionary. Beijing: O'Reilly.
Raymond, E. S. (2001). The cathedral and the bazaar: Musings on Linux and open source by an accidental revolutionary. Beijing: O'Reilly.
Torvalds, L., et al. (2001). Software development as directed evolution. Linux Kernel Mailing List, December 2001.
West, J. (2003). How open is open enough? Melding proprietary and open source platform strategies. Research Policy, 32(7), 1259-1285.

Appraisal and Dissemination of Open Source Operating Systems and Other Utilities

Satish S. Kumbhar (Department of Computer Engineering and Information Technology, College of Engineering Pune, India; email: [email protected])
Santosh N. Ghotkar (Department of Computer Engineering and Information Technology, College of Engineering Pune, India; email: [email protected])
Ashwin K. Tumma (Department of Computer Engineering and Information Technology, College of Engineering Pune, India; email: tummaak08.comp)

Abstract
Purpose: In recent years there has been substantial development in the arena of open source software (OSS). Both academia and industry are focusing on developing their software in the open source genre. This paper presents a survey of the open source operating system GNU/Linux and discusses its intricacies at length. It also throws light on some extremely popular open source utilities used in diverse sub-domains of Computer Science and Engineering.
Methodology: A thorough survey and analysis of open source software was undertaken to build this compilation.
Findings: The appraisal and dissemination find a marked increase in the usage of OSS in academia as well as in industry, and throw light on the near future of OSS usage.
Research Implications: Any appraisal or survey conducted today is bound to be outdated tomorrow. Nevertheless, OSS will continue to remain in the market, with newer trends coming in. The paper can be a motivation for further contributions to OSS.
Originality/Value: The paper brings together OSS from diverse domains in use in the market and highlights their usage, with emphasis on their other features as well.
Keywords: Open Source; GNU/Linux; Utilities; Debian; Mozilla; Apache Web Server; MySQL.
Paper Type: Survey

Introduction
There has been a gigantic rise in the use of computers and, consequently, in the software used on them. Computers have made their presence felt in almost all domains of human activity: name any field, and we are bound to find the influence of computers in it. As the use of computers grew, the software required for them also started growing at an exponential pace, to the point that software development today seems without end. With the rapid rise of the computer industry, novel products keep creeping into the market, adding complexity for diligent customers or
end users. Now the end user has an array of options available at his service which can be used for his needs and/or business purposes. Engineers and developers have assiduously been on a quest to push the boundaries of engineering and develop high-quality software. This development has mainly revolved around two broad categories, viz. open source and closed source software. A recent trend in the arena of software development is the open source genre. OSS are software which are publicly available in the form of source code and are distributed under software licenses that allow users to study the software, make changes to it as per their requirements and convenience, improve it in terms of quality or to cater to their necessities, and even distribute it, with due diligence to the owner and in conformance with the license of the software. The rationale behind opening the source code is that users require access to un-obfuscated source code, because it is exceedingly implausible to evolve programs without modifying them. Since the main motive behind software development is to make evolution easy, it is essential that modification be made easy. As such, numerous kinds of software are being developed using the terminologies of open source. Open source development has not left any aspect of software untouched. From operating systems to benign utilities, open source holds a prevalent share in every field. It would not be an exaggeration to say that many of the open source utilities available are more efficient than the proprietary ones. The prime reason behind this is that, in most cases, there is scarcely any monotony across open source development, quality and usage. One of the path-breaking developments was the GNU/Linux operating system, an open source operating system. Its underlying source code can be used freely, modified and redistributed, both commercially and non-commercially, by anyone, under licenses such as the GNU General Public License. GNU/Linux, which falls within the UNIX-like operating system family, continuously evolves, with various releases in play along with support for multilingual environments. This paper provides a survey of this operating system and the intricacies involved in its development, distribution, usage and market share. Open source has also made its presence felt in other domains of software technology, like web browsers, database management systems, web servers, web application development, data mining, artificial intelligence, virtualization, network-related tools, proxy servers, office suites, web cache daemons, bug trackers, etc.
Some of the widely known software in the above-mentioned technologies are: the Mozilla Firefox web browser, the MySQL database management system, the Apache Web Server, the LAMP package or suite for web development, Oracle VirtualBox as a virtualization suite, the Weka data mining tool built in Java, the Squid proxy server, open office suites, and fields like dynamic, lightweight web application development using AJAX. The paper presents an introduction to some of the above-listed technologies in the form of a survey and also explicates their involutions.

Open Source Technologies

GNU/Linux
GNU/Linux falls within the UNIX family of operating systems. It is the most popular open source software, whose underlying source code can be used, freely modified as per the user's requirements, and redistributed, in both commercial and non-commercial domains. The license on which it is built is the GNU General Public License (About GNU, 2009). Linux was first conceived by a Finnish software engineer and hacker, Linus Torvalds, in the year 1991; the name Linux comes from the Linux kernel written by him. Later, the primary user-space system tools and libraries were taken from the GNU Project. Even after Linux development has come a long way, the naming issue remains controversial. The Free Software Foundation holds that Linux distributions that use GNU software should be referred to as GNU/Linux or as Linux-based GNU systems, but the media and most of the population refer to it simply as Linux. The authors hold no biased opinion for either side, and from here onwards GNU/Linux and Linux refer to the same system. Typically, Linux is distributed in a packaged format called a Linux distribution. A Linux distribution comprises the monolithic Linux kernel, which handles process control, networking and file system access, along with its supporting utilities and libraries. Linux has undoubtedly made a spectacular presence on a wide variety of computer hardware, ranging from handheld mobile devices, phones and tablet machines to mainframes and high-end servers, including supercomputers. It is typically available in two variants, one for desktop machines and the other for high-end servers. Henry (2010) lists Linux as the leading server operating system, flawlessly running the 10 fastest supercomputers in the world without any compromise or degradation in performance. As regards the development taking place in Linux, Linus Torvalds still continues to direct the development of the Linux kernel. The Linux kernel (Torvalds, 1992) has undergone numerous versions, the current stable version being 3.1.1. For years the Linux kernel carried 2.6.x version numbers, with x being a numeric value representing the release; the version 3 kernel has brought in a significant shift in kernel versioning (Linux Kernel, 2011). Richard Stallman, initiator of the GNU Project, heads the Free Software Foundation, which supports the GNU components in Linux distributions. Countless programmers worldwide develop third-party components that are integrated into the distributions. With respect to user interface amenities, too, Linux stands at the apex.
It provides a powerful command-line interface as well as graphical user interfaces with many outstanding features, built on the KDE desktop, GNOME, etc., the X window system being a popular foundation. Today, Linux distributions hold a major share in most domains. Linux has been successful in securing a place in server installations, in homes, and in academia. Various local and national governments have also started supporting and promoting Linux; in India, the state of Kerala has mandated that all high schools and other academic organizations run Linux. The market share of Linux is shooting up at a high pace, with a gigantic increase in revenue from servers, desktops and packaged software. Linux holds an overall 12.7% market share, with more than 60% of webservers running Linux as against other leading operating systems (IDC Report, 2009). In many surveys conducted worldwide, seniors in this domain recommend Debian Linux distributions for servers because of their sturdiness and power. Analysts and proponents of Linux attribute this success to its security, reliability, low cost and freedom.

Debian: The Universal Linux Distribution
As mentioned earlier, Debian is one of the most abundantly available Linux distributions today. It is a distribution composed of software packages released under the GNU General Public License and other free software licenses. The Debian OS is very well known for its conformity to UNIX and free software terminology, as well as for its collaborative software development and testing processes. Debian was first released in August 1993, and since then it has earned wide popularity because of its ease of operation. The aesthetic beauty of the graphical user interfaces has been the charm of Debian and its variants. The most promising feature Debian offers is that it is currently available in more than 65 languages, along with support for many Indian vernacular languages; this has helped end the tyranny of linguistic burdens on the masses. Debian also uses the monolithic Linux kernel; its current release is 6.0.3, named "Squeeze". Debian managers strictly track bugs and perform rigorous testing, and do not release their product unless it is bug-free from their perspective (Distro Watch, 2008). Software in Debian is available in the form of .deb packages and can be downloaded and installed easily; even a novice user of a Linux box can cope with it. A major reason for the wide popularity of this OS is its package manager, "dpkg", the simplest of all the package managers available. Debian maintains repositories at various geographical locations around the world, and users can download the software they require with a single command or click (for example, dpkg -i <package>.deb installs an already-downloaded package, while a repository-aware front end can fetch and install one with apt-get install <package>), a classic example of simplicity at its best. One of the major variants of Debian is the Ubuntu operating system. Ubuntu is primarily designed for desktop, notebook and server usage. It follows the Debian philosophy and inherits its style. Ubuntu is one of the most popular and favored operating systems among the student community because of its ease of use, free availability and simplicity of software development (Ubuntu, 2011). Web statistics portray that Ubuntu holds more than 50% of the Linux desktop usage share in the world.
Ubuntu is also gaining ground in its server editions (Stat Owl, 2010). Recently, Ubuntu has also stepped into the world of cloud computing, which is at its peak today. It allows users to build their own cloud infrastructure, be it public or private. Its sophisticated orchestration tools assist users in deploying, managing and scaling their cloud-related services within seconds, thereby reducing the total downtime of an enterprise, with long-run benefits for the enterprise's capital costs (Debian FAQ, 2008).

The Mozilla Project
As its tag line, "We are building a better Internet", suggests, the Mozilla project is focused on the development of internet-based software, its primary and most popular product being the open source web browser Mozilla Firefox. The current version of Mozilla Firefox is 8.0, and within a matter of days of a release its downloads have exceeded a hundred million, a path-breaking record of its kind. Statistics point to the fact that no other browser has ever achieved such high acclaim in such a short span of time; Firefox enjoys worldwide popularity of more than 25% usage share (Synder, 2011). Mozilla Firefox has also been the first web browser to roll out rapid releases/versions to its users, the aim of this faster process being to get new functions to users sooner. The primary factor behind this browser's high popularity is that, being open source, users and programmers can customize it as per their requirements; one of its most vivid features is the availability of add-ons. Programmers from all around the world write add-on features which can be freely downloaded from the World Wide Web and integrated with the current browser. The aesthetic beauty of the browser is far ahead of comparison with others in this domain. The charm of open source development can be clearly seen in the case of this browser. The Mozilla Foundation sets the policies that govern development, operates key infrastructure and controls the project's trademarks and other intellectual property. The Mozilla Foundation was founded by the Netscape-affiliated Mozilla Organization in the year 2003. Since then its growth has been magnificent, owing to its ideas of open source releases and user satisfaction. The most significant contribution the Mozilla Foundation has made to the world is its dedication to preserving and promoting a healthy online space through the versions of Firefox it develops. The Mozilla Foundation has also partnered with Knoxville Zoo in an effort to raise awareness about endangered red pandas (Knoxville Zoo, 2011).

Apache: The Open Source Web Server
The Apache HTTP Server Project designs and implements an open source HTTP server for modern operating systems, including all the UNIX families as well as the Windows operating systems (Apache, 2011). The main aim of the project is to provide a secure, efficient and extensible server offering HTTP services in synchrony with current HTTP standards. Its initial release was back in 1995, and it has been shaping the web ever since. Its current stable release is 2.2.17. The Apache web server is written in the C language and is a cross-platform server.
The license under which it is distributed is the Apache Software Foundation's very own Apache License 2.0. It is a web server that has made significant contributions to the tremendous growth of the World Wide Web. In 2009, Apache was regarded as the only web server software to surpass the 100-million-website milestone (NetCraft, 2009). Studies have revealed that Apache is undoubtedly the server with the maximum market share, with more than 60% of the servers in the world running Apache on UNIX machines (Servers, 2011). The combination of Apache and UNIX has long proved the most efficient for deploying web servers. The main reason for this widespread acceptance and usage is the ease and simplicity of deploying the web server software: the configuration steps are such that even a novice user can set up a web server on his or her desktop machine and serve web pages. The Apache web server also provides strong security; no severe attacks related to its use have been reported to date. In a nutshell, statistics show that the Apache web server has so far been the most promising and reliable web server. A recent study (Web Server, 2011) reveals that Apache serves over 59.13% of all websites and more than 66.62% of the million busiest ones.

MySQL Database Management System
In the domain of database management systems, MySQL stands at the apex. MySQL, also pronounced "My Sequel", is a relational database management system that runs as a server providing multi-user access to a number of databases. It was developed by MySQL AB, now a subsidiary of Oracle; it was first released in the year 1995 and its current stable release is 5.5.18. MySQL has since captured the open source database management system market. It is primarily written in C and C++ and is also cross-platform software. The license under which it is distributed is the GNU General Public License. MySQL stands as the world's most popular open source database (MySQL, 2011), with a very high download rate compared to others in this domain. MySQL also offers high performance and scalability in all aspects of relational database management, as do many other enterprise databases. Recently, the MySQL Query Analyzer has also been introduced, built for the performance optimization of Java and .NET applications. MySQL also offers many profiling tools that generate reports or profiles of the back-end databases. MySQL has been so successful in its domain because of its ease and simplicity of usage and administration. MySQL is offered with both character-based and graphical user interfaces; the graphical interfaces are built on top of MySQL servers and are used to manipulate the data in the back-end database servers. The current releases of MySQL claim 1500% faster performance on the Windows operating system (MySQL Stats, 2011) and 37.0% faster performance on Linux operating systems. Moreover, the scalability, performance schema and partitioning options have been enhanced to the point of being way ahead of much other such software. MySQL also offers superior protection of the database, using strong authentication facilities and strong internal security algorithms; this is the reason that, to date, no severe database-related attacks have been reported with the usage of MySQL.
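As a small illustration of the client/server, multi-user model described above, the sketch below connects to a MySQL server and runs one query from Python. It assumes the mysql-connector-python package and uses invented host, credential, database and table names; it is a minimal sketch, not a production setup.

# Minimal sketch: querying a MySQL server from Python.
# Assumes: pip install mysql-connector-python, a running server, and a
# hypothetical database "shop" containing a table "orders".
import mysql.connector

conn = mysql.connector.connect(
    host="localhost",     # the server may equally be remote: MySQL is client/server
    user="app_user",      # hypothetical account
    password="app_pass",  # hypothetical password
    database="shop",
)
try:
    cur = conn.cursor()
    # Parameterized query: the connector escapes the value for us.
    cur.execute("SELECT id, total FROM orders WHERE total > %s", (100,))
    for order_id, total in cur.fetchall():
        print(order_id, total)
finally:
    conn.close()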
MySQL very capably powers the web, e-commerce and online transaction processing applications that are most often required. It is a fully integrated, transaction-safe, ACID-compliant database management system. It delivers the ease of use, scalability and performance that have made it the world's most popular open source database; some of the world's most trafficked websites run MySQL for their business and other critical applications.

Future Enhancements and Conclusion
Since the inception of the idea of developing software in an open source way, the concept has come a long way, yet awareness of open source terminologies and technologies is not what it should be. One lacuna of open source development is that, with the source code in the hands of multiple personnel, there are many bolts for the same nut: everyone comes up with their own approach to software development, and this can at times result in chaos. No doubt there are versioning systems and profiling systems available, but more management is still needed in this domain so that open source's influence can grow beyond what it is today. There is a need for enhancement with due respect and due diligence to the current, perfectly working community of OSS systems. Open source still has a very long distance to travel, and it will eventually diminish the software monopolies of some proprietary giants in the software world. Open source technologies are now being made mandatory in most academic as well as government organizations, but their use is still not up to the mark. The ideas and terminologies of developing software in the open source way need to be inculcated among the masses from a basic level. If this is done, end-users will surely get more effective and convenient software. The main beauty of open source is that users are able to edit and play with the source code of the system as they wish. When a user is given the privilege of editing the source code as per their convenience and requirements, there is obviously a high probability of acceptance of the software system on a large scale. Understanding the users' perspective and needs is the key factor that crowns open source development.

References
About GNU. (2009). GNU operating system: Initial announcement. Retrieved from http://www.gnu.org/gnu/initial-announcement.html
Apache. (2011). Apache Web Server Project. Retrieved from http://httpd.apache.org/
Debian FAQ. (2008). Debian GNU/Linux FAQ. Retrieved from http://www.debian.org/doc/manuals/debian-faq/
Distro Watch. (2008). Linux distributions: Facts and figures. DistroWatch.com. Retrieved from http://distrowatch.com/stats.php?section=popularity
Henry Burkhart, KSR. (2010). TOP 500 supercomputer sites. Retrieved from http://www.top500.org/lists/2010/06
IDC Report. (2009). IDC Q1 report: Linux for devices.
Knoxville Zoo. (2011). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Knoxville_Zoo
Linux Kernel. (2011). Linux kernel archives download.
Synder, R. (2011). Glow 1.0: Firefox 4 download stats. Mozilla Website Archive.
Retrieved from http://blog.mozilla.com/website-archive/2011/06/14/glow-1-0/
MySQL Stats. (2010). MySQL statistics. Retrieved from http://www.mysql.com/
MySQL. (2011). MySQL official website. Retrieved from http://www.mysql.com/
NetCraft. (2009). February 2009 web server survey. Retrieved from http://news.netcraft.com/archives/2009/
Servers. (2011). Welcome to the world of web server usage statistics. Retrieved from http://www.greatstatistics.com/serverstats.php
Stat Owl. (2010). Operating system version usage: Market share of operating system versions (OS analysis). StatOwl. Retrieved from http://www.statowl.com/operating_system_market_share_by_os_version.php
Torvalds, L. (1992). Release notes for Linux volume 0.12. Linux Kernel Archives. Retrieved from http://www.kernel.org/pub/linux/kernel/Historic/old-versions/RELNOTES-0.12
Ubuntu. (2011). In Wikipedia. Retrieved from http://en.wikipedia.org/wiki/Ubuntu_(operating_system)
Web Server. (2011). January 2011 web server survey. Retrieved from http://news.netcraft.com/archives/2011/january-2011-web-server-survey-4.html

Morphological Analysis from the Raw Kashmiri Corpus Using the Open Source Extract Tool

Manzoor Ahmad Chachoo (Faculty, P.G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India; email: [email protected])
S. M. K. Quadri (Head, P.G. Department of Computer Sciences, University of Kashmir, Jammu and Kashmir, 190 006, India; email: [email protected])

Abstract
Purpose: Morphological information is a key consideration in the design of any machine translation engine, information retrieval system or natural language processing application. Since manual development can be a highly time-consuming task, it is important to investigate how lexicon development can be automated while maintaining the quality that makes the lexicon usable by applications. The paper describes how we can simply provide extraction rules along with raw texts to guide the computerized extraction of morphological information with the help of an extract tool like Extract v2.0.
Design/Methodology/Approach: We used Extract v2.0, an open source tool for extracting linguistic information, in particular inflectional information on words, from raw text based on the word forms appearing in it. The input to Extract is a file containing an un-annotated Kashmiri corpus and a file containing the Extract rules for the language. The tool's output is a list of analyses; each analysis consists of a sequence of words annotated with an identifier that describes some linguistic information about the word.
Findings: The study presents the fundamental extraction rules which can guide Extract v2.0 in extracting inflectional information and in developing a full lexicon usable by different natural language applications. The major contributions of the study are an orthography component (a Unicode infrastructure to accommodate the Perso-Arabic script of Kashmiri) and a morphology component (a type system that covers the language abstraction, and an inflection engine that covers word-and-paradigm morphological rules for all word classes).
Research Implications: The study does not include all the rules, but can be taken as a prototype for extending the functionality of the lexicon. An attempt has been made to automate the extraction of morphological information using the Extract tool.
Originality/Value: Kashmiri is the most widely spoken language in the state of Jammu and Kashmir, yet the language has very scarce software tools and applications.
The study provides a framework for the development of a full-sized lexicon for the Kashmiri language from raw text, and is an attempt to provide lexicon support for applications which make use of the Kashmiri language. The study can be extended to develop a spoken lexicon of Kashmiri for use in spoken dialogue systems.
Keywords: Natural Language Processing; Morphology; Lexicon; Kashmiri Morphology; Extract Tool; Logic
Paper Type: Design

Introduction
Morphological information is a key consideration in the design of any machine translation engine, information retrieval system or natural language processing application. It is important to investigate how lexicon development can be automated while maintaining the quality which makes it usable by applications, since manual development can be a highly time-consuming task. Attempts have been made to use unsupervised learning to automate the process (Forsberg & Ranta, 2004; Creutz & Lagus, 2005), but under the supervision of humans, who simply have to provide knowledge about the rules along with raw texts, the computerized extraction of morphological information can be guided with the help of the Extract tool. Extract v2.0 is an open source tool for extracting linguistic information from raw text, in particular inflectional information on words based on the word forms appearing in the text. The input to Extract is a file containing an un-annotated Kashmiri corpus and a file containing the Extract rules for the language. The tool's output is a list of analyses; each analysis consists of a sequence of words annotated with an identifier that describes some linguistic information about the word.
A morphological lexicon with wide coverage, especially of new words as used in newspapers, texts and online sources, is a key requirement of information retrieval systems, machine translation and other natural language applications. It would be a time-consuming task to extract morphological information manually, so it is natural to investigate how lexicon development can be automated. Since large collections of raw language data in the form of technical texts, newspapers and online material are available, either free or cheap, it is an intelligent idea to exploit the raw data to obtain a high-quality morphological lexicon (Forsberg & Ranta, 2004). Clearly, attempts to fully automate the process using supervised learning techniques do not return the expected quality (Creutz & Lagus, 2005; Sharma, Kalita & Das, 2002). However, instead of using some form of machine learning for lexicon extraction, language experts can use a suitable open source tool like Extract v2.0, wherein their role is to write intelligent extraction rules. The Extract tool starts with a large corpus and a description of the word forms in the paradigms, with the varying parts, referred to as technical stems, represented by variables.
In the tool's syntax, we could describe the first declension noun of Kashmiri with the following definition:

paradigm decl1 = x+"r" { x+"i" & x+"iv" & x+"I" & x+"in" } ;

All the forms are given in the curly braces, called the constraint. For some prefix x, the tool outputs the head x+"r" tagged with the name of the paradigm; for example, Ka:r can have other forms like Kar:iv, Ka:ri and Kar:in. Given that we have the lemma and the paradigm class label, it is a relatively simple task to generate all word forms. The paradigm definition has a major drawback, namely that very few lemmas appear in all word forms; the tool offers a solution by supporting propositional logic in the constraint.

Related Work
The most important work dealing with the very same problem, i.e. extracting a morphological lexicon given a morphological description, is the study of the acquisition of French verbs and adjectives by Clément, Sagot & Lang (2004). Likewise, they start from an existing inflection engine and exploit the fact that a new lemma can be inferred with high probability if it occurs in raw text in predictable morphological form(s). Their algorithm ranks hypothetical lemmas based on the frequency of occurrence of their (hypothetical) forms as well as part-of-speech information signaled by surrounding closed-class words. They do not make use of human-written rules but reserve an unclear, yet crucial, role for the human, who hand-validates parts of the output and then lets the algorithm re-iterate. Given the many differences, the results cannot be compared directly to ours but rather illustrate a complementary technique. Testing on Russian and Croatian, Oliver (2004) and Oliver and Tadic (2004a) describe a lexicon extraction strategy very similar to ours. In contrast to human-made rules, they have rules extracted from (part of) an existing morphological lexicon and use the number of inflected forms found to choose heuristically between multiple lemma-generating rules (additionally also querying the Internet for the existence of forms). The resulting rules appear not nearly as sharp as hand-made rules with built-in human knowledge of the paradigms involved and their respective frequency (the latter being crucial for recall). Also, in comparison, our search engine is much more powerful and allows for greater flexibility and user convenience. For the low-density language Assamese, Sharma, Kalita & Das (2002) report an experiment to induce both the morphology, i.e. the set of paradigms, and a morphological lexicon at the same time. Their method is based on segmentation and alignment using string counts only, involving no human annotation or intervention inside the algorithm. It is difficult to assess the strength of their acquired lexicon as it is intertwined with the induction of the morphology itself. We feel that inducing a morphology and extracting a morphological lexicon should be performed and evaluated separately. Many other attempts to induce morphology from raw corpus data, usually with some human tweaking (Goldsmith, 2001), do not aim at lexicon extraction in their current form. There is a body of work on inducing verb subcategorization information from raw or tagged text (Faure & Nedellec, 1998; Gamallo, Agustini & Lopes, 2003; Kermanidis, Nikos & Kokkinakis, 2004). However, the parallel between a subcategorization frame and a morphological class is only loose.
The latter is a simple mapping from word forms to paradigm membership, whereas in verb subcategorization one also has the onus of discerning which parts of a sentence are relevant to a certain verb. Moreover, it is far from clear that verb subcategorization comes in well-defined paradigms; instead, the goal may be to reduce the number of parse trees in a parser that uses the extracted subcategorization constraints.

Methodology
Kashmiri mixes the agglutinating and inflectional language types. An agglutinating language builds polymorphemic words in which each morpheme corresponds to a single lexical meaning or grammatical function, while in an inflectional language lexical meanings and grammatical functions are at times fused together. Morphemic processes across most lexical categories, such as nouns, verbs, adjectives and adverbs, were studied and converted into rules which are input to the Extract tool. For example, nouns in Kashmiri are not marked for definiteness; there is an optional indefinite marker -a:h. Animate nouns follow the natural gender system, and the gender of a large number of inanimate nouns is predictable from their endings. The suffixes -da:r, -dar, -vo:l, -ul and -ur are added to nouns to derive masculine forms:

paradigm decl2 = x+"r" { x+"da:r" & x+"dar" & x+"vo:l" & x+"ul" & x+"ur" } ;

The suffixes -en, -in, -e:n, -ba:y, -ir and -va:jen are added to nouns to derive feminine forms:

paradigm decl3 = x+"r" { x+"en" & x+"in" & x+"e:n" & x+"ba:y" & x+"ir" & x+"va:jen" } ;

Morphology
Morphology is the study of morphemes. Morphemes are words, word stems and affixes, basically the units of language one level up from phonemes. They are often understood as units of meaning, and also as part of a language's syntax or grammar. It is in their morphology that we most clearly see the differences between languages that are isolating (such as Chinese, Indonesian, Krewol...), ones that are agglutinating (such as Turkish, Finnish, Tamil...), and ones that are inflectional (such as Kashmiri, Russian, Latin, Arabic...). Isolating languages use grammatical morphemes that are separate words. Agglutinating languages use grammatical morphemes in the form of attached syllables called affixes. Inflectional languages change the word at the phonemic level to express grammatical morphemes.

All languages are really mixed systems; it is all a matter of proportions. English, for example, uses all three methods: to make the future tense of a verb, we use the particle will (I will see you); to make the past tense, we usually use the affix -ed (I changed it); but in many words, we change the word for the past (I see it becomes I saw it). Looking at nouns, sometimes we make the plural with a particle (three head of cattle), sometimes with an affix (three cats), and sometimes by changing the word (three men). But, because we still use a lot of non-syllable affixes (such as -ed, usually pronounced as d or t, and -s, usually pronounced as s or z, depending on context), English is still considered an inflectional language by most linguists.

Paradigm File Format
A paradigm file consists of two kinds of definitions: regexp and paradigm. A regexp definition associates a name (Name) with a regular expression (Reg).
A paradigm definition consists of a name (Name), a set of variable-regular expression associations (VarDef), a set of output constituents (Head) and a constraint (Logic). The basic unit in Head and Logic is a pattern that describes a word form. A pattern consists of a sequence of variables and string literals glued together with the '+' operator. An example of a pattern given previously was x+"r".

Propositional Logic
Propositional logic appears in the constraint to enable a more fine-grained description of what word forms the tool should look for. The basic unit is a pattern, corresponding to a word form, which is combined with the operators & (and), | (or), and ~ (not). The syntax for propositional logic is given in Fig. 1, where kPattern refers to one word form; one alternative of the grammar is given per line, since the object-language operator | would otherwise be confused with the grammar's own alternation.

Fig. 1: Propositional logic grammar
kLog ::= kLog & kLog
       | kLog | kLog
       | ~ kLog
       | kPattern
       | ( kLog )

The additional operators allow the paradigm decl1 given earlier to be rewritten with a disjunction, reflecting that it is sufficient to find one singular and one plural word form. The middle vowel /o/ of such nouns changes to a central vowel and the final consonant is palatalized.

paradigm decl1 = x+"r" { (x+"I" | x+"ur") } ;

Regular Expressions
The variable part of a paradigm description enables the user to associate every variable with a regular expression. The association dictates which (sub-)strings a variable can match. An unannotated variable can match any string, i.e. its regular expression is Kleene star over any symbol. As a simple example, consider German, where nouns always start with an uppercase letter. This can be expressed as follows:

regexp UpperWord = upper letter*;
paradigm n [x:UpperWord] = ... ;

The syntax of the tool's regular expressions is given in Fig. 2, with the normal connectives: union, concatenation, set minus, Kleene star, Kleene plus and optionality. eps refers to the empty string, digit to 0-9, letter to an alphabetic Unicode character, lower and upper to a lowercase and an uppercase letter respectively, and char to any character. A regular expression can also contain a double-quoted string, which is interpreted as the concatenation of the characters in the string. Again, one alternative of the grammar is given per line.

Fig. 2: Regular expression grammar
kReg ::= kReg | kReg
       | kReg - kReg
       | kReg kReg
       | kReg *
       | kReg +
       | kReg ?
       | eps
       | kChar
       | digit
       | letter
       | upper
       | lower
       | char
       | kString
       | ( kReg )

Multiple Variables
The Extract tool allows multiple variables, i.e. a pattern may contain more than one variable. The use of multiple variables may reduce the time performance of the tool, since every possible variable binding is considered. Multiple variables should therefore be used in moderation, and the variables should be restricted as much as possible by their regular expression associations to reduce the search space. A variable does not need to occur in every pattern, but the tool only performs an initial match with patterns containing all variables. The reason for this is efficiency: the tool considers one word at a time, and if the word matches one of these patterns, it searches for all other patterns with the variables instantiated by the initial match. For obvious reasons, an initial match is never performed under a negation, since this would imply that the tool searches for something it does not want to find.
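Before turning to multiple arguments and the tool's algorithm below, the matching-and-constraint mechanism can be illustrated in a few lines of Python. This is a minimal sketch of the idea under simplifying assumptions, a single variable, suffix-only patterns and a purely conjunctive constraint; it is not the Extract tool's implementation, and the function name and test data are ours.

# Minimal sketch of paradigm-based extraction with one variable x and a
# conjunctive constraint; illustrative only, not the Extract tool's code.
def extract(corpus_words, head_suffix, constraint_suffixes, paradigm_name):
    words = set(corpus_words)               # the word types W of the corpus
    lexicon = []                            # the extracted lexicon L
    for w in sorted(words):
        if not w.endswith(head_suffix):
            continue                        # the initial match fails
        x = w[: len(w) - len(head_suffix)]  # bind the variable x
        # The instantiated constraint: every required form must occur in W.
        if all(x + suffix in words for suffix in constraint_suffixes):
            lexicon.append((w, paradigm_name))
    return lexicon

corpus = ["Ka:r", "Ka:i", "Ka:iv", "Ka:I", "Ka:in", "gagar"]
print(extract(corpus, "r", ["i", "iv", "I", "in"], "decl1"))
# [('Ka:r', 'decl1')]  (gagar is rejected: its other forms are absent)

A disjunctive constraint such as (x+"I" | x+"ur") would simply replace all(...) with any(...) over the alternatives, and negation corresponds to requiring that a form does not occur in the corpus.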
Repeated variables, i.e. non-linear patterns, are also allowed; these are equivalent to backreferences in the programming language Perl. An example where a sequence of bits is reduplicated is given below. This language is known to be non-context-free (Hopcroft & Ullman, 2001).

regexp ABs = (0|1)*;
paradigm reduplication [x:ABs] = x+x { x+x } ;

Multiple Arguments
The head of a paradigm definition may have multiple arguments to support more abstract paradigms. An example is Swedish nouns, where many nouns can be correctly classified by just detecting the word forms in nominative singular and nominative plural. An example is given in Fig. 3, where the first and second declensions are handled with the same paradigm function, whose head consists of two output forms. The constraints are omitted.

Fig. 3
paradigm regNoun = gag+"ar" gag+"ir" {...} ;
paradigm regNoun = kot+"ur" ko+":tar" {...} ;

The Algorithm
Fig. 4 presents the algorithm of the tool in pseudo-code notation.

Fig. 4
let L be the empty lexicon
let P be the set of extraction paradigms
let W be all word types in the corpus
for each w : W
  for each p : P
    for each constraint C with which w matches p
      if W satisfies C with the result H, add H to L endif
    end
  end
end

The algorithm is initialized by reading the word types of the corpus into an array W. A word w matches a paradigm p if it matches any of the patterns in the paradigm's constraint that contain all variables occurring in the constraint. The result of a successful match is an instantiated constraint C, i.e. a logical formula with words as atomic propositions. The corpus W satisfies a constraint C if the formula is true, where the truth of an atomic proposition "a" means that the word "a" occurs in W.

Conclusion
The paper describes the open source Extract tool as a means to build a morphological lexicon with relatively little human work. Given a morphological description, typically an inflection engine and a description of the closed word classes such as pronouns and prepositions, and access to raw text data, a human with knowledge of the language can use a simple but versatile tool that exploits word forms alone. It remains to be seen to what extent syntactic information, e.g. part-of-speech information, can further enhance the performance. A more open question is whether the suggested approach can be generalized to collect linguistic information of kinds other than morphology, such as verb subcategorization frames.

References
Forsberg, M., & Ranta, A. (2004). Functional morphology. In Proceedings of the Ninth ACM SIGPLAN International Conference on Functional Programming (ICFP '04) (pp. 213-223). Snow Bird, UT, USA. New York: ACM. doi: 10.1145/1016850.1016879
Creutz, M., & Lagus, K. (2005). Inducing the morphological lexicon of a natural language from unannotated text. In Proceedings of the International and Interdisciplinary Conference on Adaptive Knowledge Representation and Reasoning (AKRR '05), 15-17 June, Espoo, Finland (pp. 106-113). Retrieved from http://research.ics.tkk.fi/events/AKRR05/papers/akrr05creutz.pdf
Sharma, U., Kalita, J., & Das, R. (2002).
Unsupervised learning of morphology for building lexicon for a highly inflectional language. In Proceedings of the ACL-02 Workshop on Morphological and Phonological Learning (MPL '02) (pp. 1-10). Stroudsburg, PA, USA: Association for Computational Linguistics. doi: 10.3115/1118647.1118648
Hopcroft, J. E., & Ullman, J. D. (2001). Introduction to automata theory, languages, and computation (2nd ed.). Reading, MA: Addison-Wesley.
Clément, L., Sagot, B., & Lang, B. (2004). Morphology based automatic acquisition of large-coverage lexica. Retrieved from http://hal.archives-ouvertes.fr/docs/00/41/31/89/PDF/LREC04.pdf
Oliver, A. (2004). Adquisició d'informació lèxica i morfosintàctica a partir de corpus sense anotar: aplicació al rus i al croat. (PhD thesis). Universitat de Barcelona.
Oliver, A., & Tadic, M. (2004a). Enlarging the Croatian morphological lexicon by automatic lexical acquisition from raw corpora. In Proceedings of LREC '04, Lisbon, Portugal (pp. 1259-1262). Retrieved from http://www.hnk.ffzg.hr/txts/aomt4lrec2004.pdf
Goldsmith, J. (2001). Unsupervised learning of the morphology of a natural language. Computational Linguistics, 27(2), 153-198. doi: 10.1162/089120101750300490
Kermanidis, K. L., Nikos, F., & Kokkinakis, G. (2004). Automatic acquisition of verb subcategorization information by exploiting minimal linguistic resources. International Journal of Corpus Linguistics, 9(1), 1-28. doi: 10.1075/ijcl.9.1.01ker
Faure, D., & Nedellec, C. (1998). Asium: learning subcategorization frames and restrictions of selection. In Y. Kodratoff (Ed.), 10th European Conference on Machine Learning (ECML 98), Workshop on Text Mining, Chemnitz, Germany, April 1998. Berlin: Springer-Verlag.
Gamallo, P., Agustini, A., & Lopes, G. P. (2003). Learning subcategorisation information to model a grammar with "co-restrictions". Traitement Automatique des Langues, 44(1), 93-177.

Open Access Research Output of the University of Kashmir
Asmat Ali
Tariq Ahmad Shah
Iram Zehra Mirza

Abstract
Purpose: Open Access (OA) promises to make scholarly content available to everyone free of cost. It has widened the information exchange market and is becoming a worldwide effort to provide free online access to scientific and scholarly research literature in diverse formats, including open access journals. The present study attempts to provide an overview of open access publishing in the University of Kashmir.
Design/Methodology/Approach: The study is based on data extracted from Scopus, the leading science, technology and medicine (STM) citation database of the world-leading publisher Elsevier.
Findings: The study reveals that OA publishing is gaining popularity across the university, with a substantial amount of research output already available through OA journals.
Keywords: Open Access (OA); Open Access Publishing; University of Kashmir
Paper Type: Survey

Introduction
The Association of Research Libraries (ARL) refers to open access as any dissemination model created with no expectation of direct monetary return which makes works available online at no cost to readers (ARL, 2008). In India, poor access to international journals and the low visibility of papers are the main problems faced by researchers. OA is viewed as a solution to these problems.
OA signifies the democratization of knowledge and supports a socially responsible way to distribute knowledge. OA makes the same knowledge and information available to scholars in wealthy, first world nations, in developing ex-communist, second world nations, and in under-developed third world nations (Yiotis, 2005). Open access has thus proved a blessing to scholars in one way or the other; whether the scholar is an author or a user of scholarly content, it has democratized them in a real sense. OA to scholarly articles can be achieved in two main ways: by being published in an open access journal, or by being deposited in an open access repository (OAR) or open access archive (OAA) (Fernandez, 2006; Chan & Costa, 2005).

Librarian, Nawa Kadal Degree College, Srinagar, Jammu and Kashmir, India. email: [email protected]
Research Scholar, Department of Library and Information Science, University of Kashmir, 190 006, India. email: [email protected]
Faculty, Department of Library and Information Science, University of Kashmir, 190 006, India. email: [email protected]

Open access journals make their quality-controlled content freely available to all corners of the world, using funding models that do not charge readers or their institutions for access. There are several operational models in place, the simplest being a journal set up and run by a university department, published electronically using only the institution's server space, and edited and administered, including peer review, by interested scholars. A modification of this is where the journal receives some funding, either by grants or sponsorship, to support some of the editorial or management costs (Correia & Teixeira, 2005). Scholars all over the globe are actively involved in the open access publishing process. Because of the innumerable benefits attached to open access, scholarly networks all over the globe are adding their scholarly content to it. Whether the scholars are from the developed or the developing world, they have a broadly common story to tell: they have contributed to open access journals because these provide a better and healthier platform for their work. The present study attempts to ascertain the trends in open access publishing at the University of Kashmir.

Objectives
The objectives of the study are:
To assess the OA research output of the University of Kashmir
To assess the growth and trends of OA publications
To gauge the geographical scattering of OA articles

Problem
The scholars associated with the University of Kashmir have been active right from its inception, and they have also contributed towards the OA movement. The present study tries to explore the level of OA contribution from the University of Kashmir.

Methodology
Elsevier's Scopus database was used to identify the research contribution from the University of Kashmir. Scopus claims to be the world's largest abstract and citation database of peer-reviewed literature and quality web resources. After ascertaining the journals in which the authors have published, the journal titles were further checked against the OA journal lists maintained by the Directory of Open Access Journals (DOAJ) and Open J-Gate.

Scope
The study was confined to the publications published during the 11 years from 2000 to 2010.
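The title cross-checking step described above is straightforward to mechanize. The following is a minimal Python sketch under stated assumptions: a CSV export of the Scopus records with 'Year' and 'Source title' columns, and a plain-text file of OA journal titles compiled from DOAJ and Open J-Gate. The file names and column labels are hypothetical, not artefacts of the study.

# Hedged sketch of the cross-checking step; scopus_export.csv and
# oa_journal_titles.txt are hypothetical local files, not the study's data.
import csv

def load_oa_titles(path):
    # One journal title per line; normalize case for matching.
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def count_oa_by_year(scopus_csv, oa_titles):
    counts = {}  # year -> [total publications, open access publications]
    with open(scopus_csv, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            year = row["Year"]
            pair = counts.setdefault(year, [0, 0])
            pair[0] += 1
            if row["Source title"].strip().lower() in oa_titles:
                pair[1] += 1
    return counts

oa_titles = load_oa_titles("oa_journal_titles.txt")
for year, (total, oa) in sorted(count_oa_by_year("scopus_export.csv", oa_titles).items()):
    print(year, total, oa, round(100 * oa / total, 2))

Counting per year in this way yields exactly the kind of breakdown reported in Table 1 below.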
Literature Review
Arunachalam (2008) stresses the need for OA mandates by various research organizations in India, covering their own research output and that of the projects they fund. Herb and Muller (2008) discovered that scientists, after becoming familiar with open access services, use them to a greater extent. Haider (2007) considers open access a way to connect the developing world to the system of science. McCulloch (2006) observes that the open access initiative is dramatically transforming the process of scholarly communication, bringing great benefits to the academic world. Prosser (2004) believes that OA journals and institutional repositories hold out the promise of a fairer, more equitable and more efficient system of scholarly communication that can better serve the international research community. Chan and Costa (2005) argue that OA enriches the global knowledge base by incorporating the missing research from the less developed world and improves south-north and south-south knowledge flows. Falk (2004) observes that open access is gaining momentum with very broad support from library and professional groups, university faculties and even journal publishers. Lawrence (2001) demonstrates that open access can substantially increase the impact of articles and, implicitly, the impact factor of the source journals.

Results and Discussions
Open Access Publication Status
During the study period, a total of 448 articles were published, of which only 137 (30.58 per cent of the total) are of an open access nature. Contribution to open access publications at the University of Kashmir is gaining momentum, as the number of OA articles increases with every passing year. The share of OA output was highest in 2010, when 44.04 per cent of the total output was OA, followed by 34.24 per cent in 2008 (Table 1).

Table 1: Open Access Publication Status
Year     Total Publications    Open Access Publications
2010     84                    37 (44.04)
2009     87                    21 (24.13)
2008     73                    25 (34.24)
2007     60                    17 (28.33)
2006     43                    14 (32.5)
2005     29                    6 (20.6)
2004     32                    8 (25.00)
2003     20                    7 (35.00)
2002     5                     1 (20.00)
2001     5                     0 (0.00)
2000     10                    1 (10.00)
Total    448                   137 (30.58)
Figures in parentheses indicate the percentage of that year's output

Preferred Open Access Journals
The authors have made use of 56 different journals to make their scholarly content freely available on the public web. Among these, a maximum of 22 articles were published in the Indian Journal of Pure and Applied Physics, followed by 5 articles each in Current Science, the Pakistan Journal of Biological Sciences and the Journal of Inequalities in Pure and Applied Mathematics. Table 2 shows the top 12 OA journals in which the authors have made the maximum contribution.

Table 2: Preferred Open Access Journals
Journal Title                                               No. of Papers
Indian Journal of Pure and Applied Physics                  22
Current Science                                             5
Pakistan Journal of Biological Sciences                     5
Journal of Inequalities in Pure and Applied Mathematics     5
Indian Journal of Animal Sciences                           4
Indian Journal of Medical Microbiology                      4
Pakistan Journal of Nutrition                               4
Asian Journal of Plant Sciences                             4
International Journal of Botany                             4
Library Philosophy and Practice                             4
Pharmacology online                                         4
Tropical Ecology                                            4

Geographical Pattern of Publications
The authors' OA scholarly work is available in journals published from 23 different nations.
From Table 3 it is clear that a maximum of 49 articles were published in 11 Indian journals, followed by 26 publications in 9 Pakistani journals. At the other extreme, one article each was published in journals from Chile, China, Germany, Hungary, Romania, South Korea and the United Kingdom.

Table 3: Geographic pattern of publications
Country                  No. of Journals    No. of Articles
India                    11                 49
Pakistan                 9                  26
United States            6                  14
Thailand                 3                  3
Iran                     3                  3
Serbia                   2                  5
Nigeria                  2                  4
Turkey                   2                  3
Brazil                   2                  3
Poland                   2                  2
Croatia                  2                  2
Australia                1                  5
Italy                    1                  4
United Arab Emirates     1                  3
Taiwan                   1                  2
Netherlands              1                  2
United Kingdom           1                  1
South Korea              1                  1
Romania                  1                  1
Hungary                  1                  1
Germany                  1                  1
China                    1                  1
Chile                    1                  1

Conclusion
Open access promises to make scholarly content available to everyone free of cost. It has widened the information exchange market and is becoming a worldwide effort to provide free online access to scientific and scholarly research literature in diverse formats, including open access journals. Open access is found to be quite popular at the University of Kashmir. It is hoped that, with the benefits of OA becoming clearer day by day, more publications from the University of Kashmir will become available through open access channels. Different stakeholders, such as library professionals and open access advocates, also have a key role in bringing the benefits of open access to the notice of researchers through extension and awareness programs.

References
Arunachalam, S. (2008). Open access to scientific knowledge. DESIDOC Journal of Library & Information Technology, 28(1), 7-14.
ARL. (2008). Association of Research Libraries. Retrieved from http://www.arl.org/OSC/models/oa.html
Chan, L., & Costa, S. (2005). Participation in the global knowledge commons: challenges and opportunities for research dissemination in developing countries. New Library World, 106(3/4), 141-163. doi: 10.1108/03074800510587354
Correia, A. M. R., & Teixeira, J. C. (2005). Reforming scholarly publishing and knowledge communication: from the advent of the scholarly journal to the challenges of open access. Online Information Review, 29(4), 349-364. doi: 10.1108/14684520510617802
Falk, H. (2004). Open access gains momentum. The Electronic Library, 22(6), 527-530. doi: 10.1108/02640470410570848
Fernandez, L. (2006). Open access initiatives in India: an evaluation. Partnership: The Canadian Journal of Library and Information Practice and Research, 1(1). Retrieved from http://journal.lib.uoguelph.ca/index.php/perj/article/viewArticle/110/171
Haider, J. (2007). Of the rich and the poor and other curious minds: on open access and development. Aslib Proceedings, 59(4/5), 449-461. doi: 10.1108/00012530710817636
Herb, U., & Muller, M. (2008). The long and winding road: institutional and disciplinary repository at Saarland University and State Library. OCLC Systems & Services, 24(1), 22-29. doi: 10.1108/10650750810847215
Lawrence, S. (2001). Free online availability substantially increases a paper's impact. Nature. Retrieved from http://www.nature.com/nature/debates/eaccess/articles/lawrence.html
McCulloch, E. (2006). Taking stock of open access: progress and issues. Library Review, 55(6), 337-343. doi: 10.1108/00242530610674749
Prosser, D. (2004). The next information revolution: how open access repositories and journals will transform scholarly communications. LIBER Quarterly: The Journal of European Research Libraries, 14(1).
Retrieved from http://liber.library.uu.nl/
Yiotis, K. (2005). The open access initiative: a new paradigm for scholarly communications. Information Technology and Libraries, 24, 157-162. Retrieved from www.find.galegroup.com/itex/start.do?prodId=ITOF&usergroupnave=bcdelhi

BOOK REVIEW EDITOR
Prof. M.P. Satija
G.N.D University (Punjab), India
[email protected]

Book Review
Theimer, Kate. (2010). Web 2.0 tools and strategies for archives and local history collections. London: Facet Publishing. xvii + 246 p. ISBN: 978-1-85604-687-9.

To connect with and successfully serve the growing generation of native Web 2.0 users, archivists, librarians and other professionals responsible for historic collections must learn how to accommodate their changing information needs and expectations. In this clearly written, jargon-free guide, Kate Theimer demystifies essential Web 2.0 concepts, tools and buzzwords, and provides a thorough introduction to how the transition from Web 1.0 to 2.0, including new ways to interact with traditional audiences and attract new ones, has provided greater visibility and increased opportunity for resource discovery.

The author briefly explains a variety of Web 2.0 technologies and their functionalities in Chapter 1, and also addresses popular fears about using them. Chapter 2 discusses the evaluation of Web 2.0 tools, how they can fit into an overall outreach plan, and helps readers assess their current web presence. Chapters 3-9 each focus on one important and widely used Web 2.0 tool or service, namely blogs, podcasts, Flickr, YouTube, Twitter, wikis and Facebook. The chapters follow a common structure, discussing the functionalities of the Web 2.0 tools in archives and their implementation requirements. In most cases, screenshots, checklists and interviews with practitioners who have successfully utilized Web 2.0 tools are also included. Chapter 10 draws the reader's attention to the lesser-used mashups, widgets, online chat and Second Life. The experiences of archivists from institutions in the US, UK and Australia are also highlighted. Chapter 11 raises the issue of measuring the success of Web 2.0 implementations in archives; it provides a useful distinction between measuring outputs and outcomes, and some practical tips are also given. Chapter 12 reviews the range of management and policy concerns for a successful web project and the factors to consider when planning implementation. The book also includes suggested readings, incorporated in an appendix corresponding to the Web 2.0 tools covered, highlighting additional resources for further consultation.

The book is thus a good read for anyone working with historical and cultural collections (archivists, local history librarians and information professionals) who wants to take advantage of Web 2.0 technologies. The author provides a detailed look at the latest technologies, with real-world examples of archives and libraries using them to enhance their online presence, showcase services and increase patronage. Professionals will find this guide valuable for promoting their services in a digital age and attracting even the most tech-savvy of patrons.

Iram Zehra Mirza
Faculty
Department of Library & Information Science
University of Kashmir
Hazratbal, Srinagar

NEWS SCAN EDITOR
Dr. Sumeer Gul
Assistant Professor
Department of Library and Information Science
University of Kashmir
[email protected]

Department of Library and Information Science, University of Kashmir to Host Seminar on Emerging Frontiers of Digital Libraries: Perspectives, Empowerment and Advocacy (EFDL-I)
The Department of Library and Information Science, University of Kashmir, in collaboration with the University Grants Commission, is organizing a two-day seminar on Emerging Frontiers of Digital Libraries: Perspectives, Empowerment and Advocacy. The aim of the seminar is to explore different perspectives on digital libraries playing an increasingly active role in learning and sharing in an open environment, to empower stakeholders to understand the attendant opportunities and challenges, and to advocate and promote the prospects likely to open up in the world of scholarship. For details log on to http://lis.uok.edu.in/

Authors Sue Universities over Digital Libraries: Alleged Copyright Infringement over Grey Area 'Orphan' Books
The Authors Guild, the Australian Society of Authors, the Union Des Ecrivaines et des Ecrivains Quebecois (UNEQ) and eight individual authors have filed lawsuits against the University of Michigan, the University of California, the University of Wisconsin, Indiana University and Cornell University for copyright infringement, according to the Associated Press. The issue revolves around the use of 'orphan' works: texts that are out of print, with no known whereabouts for the author, leaving the books in a kind of grey area on the outskirts of copyright law. These texts were uploaded to the University of Michigan's HathiTrust online library, to which other universities subsequently signed up. The books were scanned from the University's physical library by Google, with five million done so far and several million left to go, but the authors and author societies claim that the scanning was unauthorised and illegal.
Source: The Inquirer. September 13, 2011. Available at: http://www.theinquirer.net/inquirer/news/2108766/authors-sue-universities-digital-libraries

UK Higher Education Funding Bodies Choose Elsevier's SciVerse Scopus as Data Provider for 2014 Research Excellence Framework
The four UK Higher Education Funding Bodies (representing England, Northern Ireland, Scotland and Wales) will use Elsevier's SciVerse Scopus database as the sole bibliometric provider for the 2014 Research Excellence Framework (REF). The Framework was developed to assess the quality of research in UK higher education institutions.
Source: The Wall Street Journal. Market Watch. September 19, 2011. Available at: http://www.marketwatch.com/story/uk-higher-education-funding-bodies-choose-elseviers-sciverse-scopus-as-data-provider-for-2014-research-excellence-framework-2011-09-19

Robot to Bring Virtual Rolling Tours of Campus Library
Baylor Libraries has committed to purchasing VGo, a remotely controlled robot that will be used for virtual tours of the Armstrong Browning Library. Baylor hopes the tool will help enhance learning for students in grades K-12.
Source: The Baylor Lariat. November 18, 2011.
Available at: http://baylorlariat.com/2011/11/18/robot-to-bring-virtual-rolling-tours-of-campus-library/

Libraries in Gujarat to Digitally Bridge the Knowledge Gap
Libraries in Gujarat are taking steps to digitally unify and share their resources. The boundaries of libraries are expanding beyond their four walls, and library professionals are gearing up to the challenge of using IT to disseminate authentic, up-to-date and relevant information to the right users.
Source: Daily News and Analysis (DNA). Friday, November 18, 2011. Available at: http://www.dnaindia.com/india/report_libraries-in-gujarat-to-digitally-bridge-the-knowledge-gap_1614404

MIT Launches Open Source Software Program
The Massachusetts Institute of Technology (MIT) announced the launch of a new program called MITx. Under this initiative, students will be able to take online classes through an open source software program provided by the university. While school officials believe the software will be of great use to its on-campus students, they eventually hope it will also foster a virtual community of learners from around the world. The classes will be freely available to anyone who has Internet access. However, individuals who want to demonstrate their mastery of the material and earn credentials for their work must pay a small fee. For more information log on to: http://www.usnewsuniversitydirectory.com/articles/mit-launches-open-source-software-program_12019.aspx

Archives Department Takes Up Digitisation of Padmanabhaswamy Temple Records
The State Archives Department is digitising the Mathilakom records (old palm leaf manuscripts of the Padmanabhaswamy temple in Thiruvananthapuram) as part of the second phase of digitisation of old records. The records throw light on the history of the temple, and digitisation might help in researching the records and finding missing links. There is renewed interest in the records because of the discovery of a large quantum of wealth in the temple vaults. Assistant Archivist Ashok Kumar told The Hindu that the State Archives had the largest collection of palm leaf records in the whole of Asia. The Department had plans to digitise all of them so that the information could be preserved (the cadjan manuscripts were susceptible to climatic conditions). The process involved cleaning and scanning of the records and conversion into portable document format.
Source: The Hindu. December 25, 2011. Available at: http://www.thehindu.com/news/states/kerala/article2746895.ece

Islamic Manuscripts Conference at Cambridge University
The Islamic Manuscript Association, the Thesaurus Islamicus Foundation and the Al-Waleed bin Talal Islamic Research Center will jointly hold an Islamic manuscript conference at Cambridge University from July 9 to 11, 2012. According to the Miras Maktoub (written heritage) Research Center, the association invites all interested researchers to submit their articles to the secretariat of the conference on Islamic manuscripts, codicology and conservation, as well as the management of manuscript collections.
Source: The Islamic Manuscript Association. http://www.islamicmanuscript.org/

NOTE FOR CONTRIBUTORS
The journal accepts papers of original research in the field of Library and Information Science which are not under consideration by any other journal, conference proceedings, etc. All papers are subject to blind peer review.
Papers may be submitted, preferably online, to the email id [email protected], or can be sent to the editor on CD. The paper should include an abstract highlighting the problem, methodology and findings, along with three to five keywords. The softcopy should be submitted in MS-Word, and figures, if any, should be submitted as separate graphic files (GIF, JPEG, or PNG format). Authors should follow APA (American Psychological Association) style, 6th edition, for citations and references.

A. For Citation
Busha and Harter (1980) or (Busha & Harter, 1980)

B. For Reference
Book
Busha, C.H., & Harter, S.P. (1980). Research methods in librarianship: techniques and interpretation. New York: Academic Press.

Journal Article
Wei, J., Stankosky, M., Calabrese, F., & Lu, L. (2008). A framework for studying the impact of national culture on knowledge sharing motivation in virtual teams. VINE, 38(2), 221-231. doi: 10.1108/03055720810889851

Essays or Chapters in Edited Books
Mangla, P.G. (1985). Library and Information Science in India. In Gupta, B.M., Guha, B., Rajan, T.N., & Satyanarayana, R. (Eds.), Handbook of libraries, archives & information centers in India (pp. 229-256). New Delhi: Information Industry.

An Internet Document
Applegate, Lynda. M. (2009). Building businesses in turbulent times. Working knowledge: A first look at faculty research. Retrieved March 29, 2011 from http://hbswk.hbs.edu/item/6159.html

The Editor
TRIM, Department of Library and Information Science
University of Kashmir, Hazratbal, Srinagar
India 190 006