D20: Final report

IST-2001-33127
SciX
Open, self organising repository for scientific
information exchange
D20: Final report
Responsible author: Žiga Turk
Co-authors: all partners
Type: Report
Access: public
Version: 1.0
Date: March 18, 2004
D20: Final report
version of 30-Mar-04 17:27 page 2/68
EXECUTIVE SUMMARY:
This document provides the first draft of the final report on the SciX project. It addresses all
major aspects of the SciX work:
• SciX work was based on the business process reengineering described in Section 1.
• In contrast to other work in this area, SciX focused its work and contacts on the
institutional repositories. Digital publishing can re-establish their role in the scientific
publishing process. Section 2 discusses this aspect of SciX.
• Based on open source tools and some development, the SciX project created the SciX Open
Publishing Services, a set of services from which various electronic publishing media,
such as journals, conference proceedings, and personal and institutional repositories, can
be built. The services are described in Section 3.
• Open Access publication can bring the results of scientific work closer to the end user in
industry. Section 4 describes how SciX handled this.
• Finally, Section 5 outlines what will happen with SciX after the project has formally
ended. The partners in the project worked together before the project; the SciX work
provides an excellent platform to continue to do so.
© SciX Consortium 2004
www.scix.net
RELEASE HISTORY
date               changes
March 30th, 2004   this draft version
May 31st, 2004     updated final report based on feedback from the review
TABLE OF CONTENTS:
EXECUTIVE SUMMARY
RELEASE HISTORY
TABLE OF CONTENTS
1. BUSINESS PROCESS REENGINEERING IN SCIX
1.1 INTRODUCTION
1.2 THE PROCESS MODEL WORK
1.3 THE COST STRUCTURE OF OPEN ACCESS JOURNAL PUBLISHING
1.4 THE ANALYSIS OF BARRIERS TO CHANGE
1.4.1 Legal framework
1.4.2 IT infrastructure
1.4.3 Business models
1.4.4 Academic Reward System
1.4.5 Marketing and Critical Mass
1.5 CONCLUSIONS
2. A CHANGING ROLE OF PROFESSIONAL ASSOCIATIONS
2.1 INTRODUCTION
2.2 COMMERCIAL COMPETITION AND RELATED WORK
TABLE: COMMERCIAL INDEXES AND BIBLIOGRAPHIC DATABASES
2.3 EXEMPLARS IN THE FIELD OF SOFTWARE
2.4 EXEMPLARS IN THE FIELD OF STANDARDS
2.5 THE OPEN ARCHIVES INITIATIVE
2.6 COLLABORATION WITH PROFESSIONAL ORGANIZATIONS
2.7 SUMMARY
2.8 CONCLUSIONS
3. THE SCIX OPEN PUBLISHING PLATFORM - SOPS
3.1 ARCHITECTURE
3.1.1 Applications
3.1.2 Overview of Services
3.1.3 Overview of the Protocols
3.2 USER EXPERIENCE
3.2.1 Digital libraries
3.2.2 International digital libraries
3.2.3 Electronic journals
3.2.4 Conferences
3.3 OPEN SERVICES
3.3.1 WSDL/SOAP interfaces
3.3.2 Open towards OAI-PMH protocol
3.3.3 Open for use by Citation Management Software or Office applications
3.4 IMPLEMENTATION
3.4.1 SOPS physical architecture
3.5 CONCLUSIONS
4. DIGITAL PUBLISHING – BEYOND R&D COMMUNITY
4.1 INTRODUCTION
4.2 ROLES AND ACTORS IN KNOWLEDGE TRANSFER
4.3 CONSTRUCTION INDUSTRY - THE PILOT IMPLEMENTATION
4.4 EXTENDING THE PUBLISHING MODEL
4.5 VALUE ADDED PUBLICATIONS
4.6 VAP ARCHITECTURE OVERVIEW
4.7 COLLABORATION AMONG THE VAP APPLICATIONS
4.8 EXAMPLE OF A VAP END USER APPLICATION - IBRI RHEOCENTER PORTAL
4.9 CONCLUSIONS
5. SCIX AFTER SCIX
5.1 PERSPECTIVES BY THE PARTNERS
5.1.1 LJU
5.1.2 SHH
5.1.3 TUW
5.1.4 USAL
5.1.5 INDRA
5.1.6 IBRI
5.1.7 FGGI
5.2 CONCLUSIONS
6. CONCLUSIONS
1. BUSINESS PROCESS REENGINEERING IN SCIX
1.1 INTRODUCTION
This section describes the results of WP 1 of the SciX project, and is structured in the following
way:
• The scientific publishing process model
• The cost analysis of Open Access Journals
• The analysis of barriers to change
• Recent important developments
• Conclusions
1.2 THE PROCESS MODEL WORK
The aim of the modelling was to help us understand the scientific publishing process and how it
is affected by the Internet, in order to provide a basis for a cost and performance analysis of
various alternative ways of organizing it. The model can also work as a roadmap for positioning
various new initiatives, such as e-print repositories and harvesting tools, within the overall
system of scholarly communication.
The model explicitly includes the activities of all the stakeholders involved in this system,
including the activities of the:
• Researchers who perform the research and write the publications
• Publishers who manage and carry out the actual publication process
• Academics who participate in the process as editors and reviewers
• Libraries that archive the publications and provide access to them
• Bibliographic services which facilitate the identification and retrieval of publications
• Readers who search for, retrieve and read publications
• Practitioners who implement the research results directly or indirectly
In the model the unit of observation is the single publication, how it is written, edited, printed,
distributed, archived, retrieved and read, and how eventually it may affect practice. The model
depicts publishing and value added services using both paper and electronic formats. Pure
electronic or pure paper-based publishing could be described by subsets of the model. The same
goes for free publishing on the web (“open access”), which resembles traditional publishing, but
where certain activities such as negotiating, keeping track of and invoicing subscriptions can be
almost entirely left out.
In order to read the model the reader needs some familiarity with the IDEF0 modelling
methodology. IDEF0 stands for Integration Definition for Function Modelling. The main
concepts are the activity and the flow. The flow can be used as input, output, control or
mechanism. An input represents something which in an activity is consumed to produce an
output. Typical inputs could be raw materials, energy, human labor, but also information when
the purpose of the activity is to transform the information to provide added value. Outputs can be
used as inputs to further activities, and feedback loops are possible. Activities are controlled by
controls. Typical examples could be laws, guidelines and instructions for carrying out an
activity etc. Mechanisms, which point at activities from below, are persons, organisations,
machines, software etc. which carry out the activities. The presentation of the IDEF0 diagrams is
hierarchical in that diagrams on lower levels provide detailed breakdowns of those from the
higher ones.
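The IDEF0 concepts described above can be made concrete with a small data-structure sketch. The Python below is illustrative only and is not part of the SciX deliverables: the class name `Activity`, its fields and the helper `feeds` are hypothetical, and the assignment of labels to inputs, controls, outputs and mechanisms is our reading of the top-level life-cycle diagram, so it should be treated as an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Activity:
    """One IDEF0 activity box together with its four kinds of flows."""
    name: str
    inputs: list[str] = field(default_factory=list)      # consumed or transformed
    controls: list[str] = field(default_factory=list)    # laws, guidelines, instructions
    outputs: list[str] = field(default_factory=list)     # what the activity produces
    mechanisms: list[str] = field(default_factory=list)  # persons, organisations, tools
    children: list["Activity"] = field(default_factory=list)  # lower-level breakdown

    def depth(self) -> int:
        """Number of hierarchy levels at and below this activity."""
        return 1 + max((child.depth() for child in self.children), default=0)


def feeds(upstream: Activity, downstream: Activity) -> list[str]:
    """Flows that leave `upstream` as outputs and enter `downstream` as inputs."""
    return [flow for flow in upstream.outputs if flow in downstream.inputs]


# Assumed role assignments, for illustration only.
perform = Activity(
    name="Perform the Research",
    inputs=["Scientific Problems"],
    controls=["Scientific Method"],
    outputs=["New Scientific Knowledge"],
    mechanisms=["The Researcher"],
)
publish = Activity(
    name="Publish and Disseminate the Results",
    inputs=["New Scientific Knowledge"],
    controls=["Publishing Practice"],
    outputs=["Retrievable Publication"],
    mechanisms=["The Researcher", "The Publisher"],
)
root = Activity(
    name="Do Research, Publish, Study and Apply the Results",
    children=[perform, publish],
)

print(feeds(perform, publish))
print(root.depth())
```

Representing the breakdown as nested `children` mirrors the hierarchical presentation of the diagrams, where each lower-level diagram details one box of the level above.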
The current version of the SPLC-model includes 22 separate diagrams, arranged in a hierarchy
up to seven levels deep. There are typically three activity boxes on each diagram, although there
are a couple of diagrams with more activities and some with only two. There are altogether 64
activity boxes and around 200 arrows. The model is discussed in more detail in deliverable 2.
The diagram in Figure 1 is crucial for understanding the life-cycle view adopted in this
modelling effort. The whole life-cycle is seen as consisting of five separate stages. The perform
the research stage is probably the most expensive part, usually consisting of several man-months
of work effort per resulting publication, but it is the one least affected by the reengineering efforts
facilitated by the Internet (at least directly; indirectly the effect can be substantial in terms of
better quality of the research). The publish and disseminate the results and study the results
stages constitute the main object of study in this project. The fifth activity is evaluate the
researcher, acting as a control mechanism of the perform the research activity. From the
perspective of the public bodies that to a large part finance research, it is the efficiency of the
total process, including both the production and “consumption” of publications, that should be
optimized. In a life-cycle analysis, the cost and efficiency of both the
publish the results activity and the study the results activity are important; optimizing only one
of these may lead to a suboptimal solution for the total process. Here the Internet has changed the
situation dramatically, as it has for other information goods that can be delivered in a digital format.
[Figure 1: IDEF0 diagram, not reproduced. Activities: Perform the Research; Publish and Disseminate the Results; Study the Results; Evaluate the Researcher; Implement the Results. Other labels: Scientific Method; Appointment & grant decisions; Publishing Practice; Industrial R&D Policies; Reading Habits; Scientific Problems; Industrial Problems; New Scientific Knowledge; Performance measures; Retrievable Publication; Disseminated Scientific Knowledge; Improved Productivity and Quality of Life; The Researcher; The Publisher; Libraries, indexing services etc.; Readers; Funding and Academic bodies; Society and Industry.]
Figure 1: Do Research, Publish, Study and Apply the Results, Breakdown (Diagram A0)
The Publish the Results (Diagram A2) part of the model (Figure 2) has been split up into three
distinct activities, which to a large extent are carried out by different stakeholders. Based on the
results of the research, the researcher writes a manuscript, which then in the next stage through a
number of transformations is changed into a publication (on paper or electronic). The model will
at this stage take the generic “Manuscript” as input. Typical examples of monographs include
working papers, research reports and Ph.D. theses. Conference papers are subjected to some sort
of external review either for the abstract or the full paper, and are usually presented orally in
addition to the printed version. Conference proceedings are published as one-off books or
typically annually. Articles in scientific periodicals are subjected to rigorous peer review. It is
important to note that periodical articles have a much higher likelihood of being referenced in
bibliographical services than the other types of documents. Also journals are usually available by
subscription whereas the access to monographs and conference proceedings is predominantly
acquired on an individual basis.
The last activity Archive and Index is extremely important from a life-cycle viewpoint and
involves the activities of libraries, bibliographic services etc. to make the publication easily
available to researchers and practitioners worldwide. Of the publication types only the Publish as
journal article has at this stage been further detailed. This is because of its relative importance in
scientific publishing and also because the cost modelling effort will be concentrated
there.
[Figure 2: IDEF0 diagram, not reproduced. Activities: Write Manuscript; Perform Publishing Activities; Archive and Index. Other labels: Scientific Writing Style; Publishing Practice; Library and Indexing Practice; New Scientific Knowledge; Manuscript; Publication; Retrievable Publication; The Researcher; The Publisher; Libraries and Bibliographic Services.]
Figure 2 Publish the Results (Diagram A2)
The diagram Publish as Scholarly Journal Article (A223) (Figure 3) may at first sight be
difficult to understand. The idea is to show all the activities which are carried out by the
publishing organization, and thus have a direct cost implication for them. This is the reason for
separating activities such as do general publisher’s activities and do journal specific activities. Both
of these demand resources, which cause overhead costs, which then are added on top of the basic
variable costs caused by the processing of each individual article (in the activity do article
specific activities). For instance setting up and maintaining the IT-technical infrastructure for a
portfolio of journals could be such an overhead causing item. The main pipeline in the model is,
however, the input arrow manuscript, which directly enters the activity do article and issue
specific activities.
[Figure 3: IDEF0 diagram, not reproduced. Activities: Do General Publisher's Activities; Do Journal Specific Activities; Do Article and Issue Specific Activities. Other labels: Publisher's Business Strategy; Journal Review Policy; Plan for Running Journals; Issue Schedule; Manuscript; Journal Article; Infrastructure for Running Journal.]
Figure 3 Publish as Scholarly Journal Article (Diagram A223).
General publishing activities are typical for most commercial publishers and for those professional
associations that publish several journals. Activities can include general management and
financial functions as well as the setting up of the IT-technical structure for the production of
journals (both on paper and the web).
Like many of the diagrams in this model, this diagram represents a choice of viewpoint. For the
activity Do Journal Specific Activities an important aspect is that commercial journals may spend
a lot of money on marketing, and also on the management of subscribers (invoicing, setting up
ways of checking access to electronic versions). For open access electronic journals, the latter
activity is almost non-existent.
This diagram considers the two major modes for publishing scientific journals. In the paper-based
world prior to the Internet, articles were as a rule bundled into issues and had to wait for
publishing until the whole issue was ready. Electronic publishing, however, provides the
possibility to publish each article as soon as it is ready. Today many journals are published in both
print and electronic formats but still retain the issue-based structure.
Figure 4 depicts the activities carried out during the peer-review process as well as after the peer
review to technically format the paper for printing. Before the advent of personal computing the
copy editing activity used to be an activity incurring considerable cost, but nowadays most
researchers produce text which already is formatted according to the needs of the journal. Also
the actual final typesetting is much easier since almost all information is acquired in a digital
format. The negotiation of copyright usually takes place after the manuscript has been accepted
for publication. The output is a signed copyright agreement where the author and publisher agree
to the terms of publishing rights. For open access journals a copyright transfer does not usually
take place since no economic rights are involved.
[Figure 4: IDEF0 diagram, not reproduced. Activities: Manage the Review Process; Review Manuscript; Revise Manuscript; Negotiate Copyright; Copyedit Article. Other labels: Journal Review Policy; Choice of reviewers; Manuscript; Rejected manuscript; Accepted Manuscript; Reviewers Comments; Copyright agreement; Copyedited manuscript; The Editor; Reviewers; The Researcher; Publisher.]
Figure 4: Do Article Specific Activities (Diagram A22331).
Once an article is accepted for publishing, it enters an activity called queue for publishing, which
typically takes from half a year to a year for traditional issue- and paper-based journals (the worst
case the first author has experienced was three years). Waiting does not imply a direct cost, but
there may be a significant opportunity cost involved from the viewpoint of the researcher and
society, since the results are held back before the actual publishing.
Archive and Index (Figure 5) is the part of the overall process which has traditionally to a large part
been handled by research libraries with public funding. Note also that from a cost viewpoint,
hundreds of libraries all over the world have been performing the same archiving function
for each paper version of an article. The primary activity here is make publication available,
which ensures that a publication is available either in print or electronically within a particular
organization (such as a university), and that the publication can be found in different
bibliographical search services. In the perform value-adding services activity, a third party analyzes the
data to calculate citation indexes, impact factors etc., or writes news bulletins about research
results that practitioners can digest more easily. The Archive activity is currently receiving
increasing attention, since archiving electronic versions of journals for decades raises a
number of difficult problems.
[Figure 5: IDEF0 diagram, not reproduced. Activities: Make Publication Available; Perform Value-adding Services; Archive. Other labels: Library and Practice; Publication; Formats for Long Digital Storage; Retrievable Publication; Value-added Services; Long-term Archival Copy; Libraries and Bibliographic Services; Information Brokers; National Libraries etc.]
Figure 5 Archive and Index (Diagram A213)
Figure 6, Make Publication available (A231), includes both the activity of making the paper
publication available (placing it on the shelves of the library) and making the electronic version
available. In both cases it is preceded by the longer term activity of securing subscriptions and
access rights to the material, an activity which is even more visible today due to the large library
consortia that negotiate “big deals” with the large publishers. An additional value adding activity
is the integration of the metadata about the publication in databases, which facilitates finding out
about the existence of the publication.
[Figure 6: IDEF0 diagram, not reproduced. Activities: Secure Access Rights and Subscription; Make Paper Publication Available; Make Electronic Copy Available; Integrate Meta Data into Search Services. Other labels: Local Demand for Publications; Subscription or Pay per View Facility; Publication; Paper Publication; Electronic Publication; Retrievable Publication; Meta Data of the Publication; Searchable Metadata; Alerting Messages.]
Figure 6 Make Publication Available (Diagram A231)
There are at least two major mechanisms for making an electronic copy of a publication
available. Firstly this can be done through standard commercial services, which necessitate that
the reader or normally the local university library has secured a subscription and makes the
publication visible via the university intranet. A second possibility which partly bypasses this is
if the author has sent a copy of the publication to an open access e-prints repository as
exemplified by the Los Alamos preprints server for physics.
Traditionally commercial indexing services have dominated the function of integrating metadata
for journal articles into search services, and libraries have paid subscriptions to them (Figure 7).
Over the past years researchers have increasingly started to use general web search engines to
identify interesting publications. An effort to overcome the quality problems related to this is the
definition of the Open Archives Initiative standard for tagging scientific content on the
web, which will enable dedicated harvesting search engines to maintain much more focused
databases of metadata about relevant publications.
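As a rough illustration of the harvesting idea, the sketch below builds a request in the style of the OAI Protocol for Metadata Harvesting (OAI-PMH) and extracts Dublin Core fields from a response. The repository URL is hypothetical and the XML is a hand-written sample, so no network access is needed. The function names are our own, and a real harvester would also have to handle resumption tokens and error responses.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "http://repository.example.org/oai"  # hypothetical endpoint

# XML namespaces used by OAI-PMH responses carrying unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written sample response, standing in for what a repository might return.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample article</dc:title>
          <dc:creator>Author, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL of a ListRecords request for the given metadata format."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def extract_records(xml_text):
    """Return identifier, title and creator for each record in a response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext("oai:metadata/oai_dc:dc/dc:title", namespaces=NS),
            "creator": rec.findtext("oai:metadata/oai_dc:dc/dc:creator", namespaces=NS),
        })
    return records


print(list_records_url(BASE_URL))
print(extract_records(SAMPLE_RESPONSE))
```

Because every repository exposes the same request verbs and a common metadata format, a dedicated search service can collect records from many sources with one piece of code, which is the focus the paragraph above refers to.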
A byproduct of the heavy use of IT for these purposes is the possibility for readers to subscribe
to services, which based on the interest profiles they define, can send them alerting messages
when something they might be interested in is published.
[Figure 7: IDEF0 diagram, not reproduced. Activities: Manage Centrally Edited Bibliographic Index; Manage Automated Search Engine; Manage Local Front End to Meta Data Services. Other labels: Subscription or Pay per View Facility; Meta Data of the Publication; Searchable Metadata; Alerting Messages; General Web Search Engines; Dedicated Harvesting Tools; Bibliographic Services; The Local University Library.]
Figure 7 Integrate Meta Data into Search Services (Diagram A2314).
The Study the Results diagram (Figure 8) structures the activities of the readers of scientific
publications. Note that to arrive at a per publication cost, the activities of individual readers all
over the world and in different periods should be summed up. The find out about publication
activity results in the output metadata of interesting publication (including the location
from which a paper or electronic version can be retrieved). This output is used as the control of
the retrieve publication activity. Finally the publication is read and the scientific information in
question has been disseminated. Note that researchers often self-archive interesting publications
they have read, either as paper copies or, today increasingly, as bookmarks or in a database.
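The summation over readers described above can be written out as a small calculation. The sketch below is illustrative only: the names `ReaderEffort`, `study_cost` and `life_cycle_cost`, the split into three cost items per reader, and all figures are hypothetical placeholders, not data from the SciX cost study.

```python
from dataclasses import dataclass


@dataclass
class ReaderEffort:
    """Cost incurred by one reader for one publication (arbitrary currency units)."""
    find: float      # finding out about the publication
    retrieve: float  # retrieving a paper or electronic copy
    read: float      # reading the publication


def study_cost(readers):
    """Per-publication cost of the study-the-results stage, summed over all readers."""
    return sum(r.find + r.retrieve + r.read for r in readers)


def life_cycle_cost(publish_cost, readers):
    """Publishing cost plus the summed cost of all readers."""
    return publish_cost + study_cost(readers)


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
readers = [
    ReaderEffort(find=2.0, retrieve=1.0, read=10.0),
    ReaderEffort(find=0.5, retrieve=0.5, read=10.0),
]
total = life_cycle_cost(publish_cost=100.0, readers=readers)
print(total)
```

Even in this toy form, the reader-side term grows with every additional reader while the publishing term does not, which is why the text treats both stages together.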
[Figure 8: IDEF0 diagram, not reproduced. Activities: Find Out About Publication; Retrieve Publication; Read Publication. Other labels: Searchable Metadata; Alerting Message; Metadata of Interesting Publication; Retrievable Publication; Retrieved Publication; Distributed Paper Copy; Self-archived Copy; Disseminated Scientific Knowledge.]
Figure 8 Study the Results (Diagram A3)
Space does not allow describing the model in all its details here; readers are referred to
the earlier report (D2) published on the SciX website.
The initial experiences of using a formal process modelling language for studying this
phenomenon have been very positive. The studied process is by its very nature rather linear
(contrary to for instance architectural design), which makes the modelling easier than for
processes involving a lot of networking or iterative procedures. Also colleagues to whom the
model has been shown have quite easily grasped the fundamentals of the IDEF0-notation and
have been able to follow the logic of the model.
1.3 THE COST STRUCTURE OF OPEN ACCESS JOURNAL PUBLISHING
The cost studies related to the scientific publication life-cycle model
aim at presenting the costs to society of the activities in the model. In this part the costs for the
activity "publish and disseminate the results" are presented. We aim at answering the question
whether open access publishing as a new form of publishing differs in cost structure from
traditional publishing.
The optimisation of cost for the whole process is interesting from the point of view of society,
which to a large part is financing the production of research results as well as the publishing and
use of research publications. The production of research results is probably the most expensive
part, but the one least affected by the reengineering efforts facilitated by the Internet. These costs
are therefore relatively stable and unchanged from the "as-is model" presented in deliverable 1 to
the "to-be model" presented in deliverable 2. On the other hand the activities "publish and
disseminate the results" and "study the results" have undergone considerable reengineering with
the change from print publishing to electronic. The cost efficiency of both the production and
dissemination of publications should be optimised from the perspective of society.
For the cost calculations we have divided the journals into two main categories: the
traditional business model, which is subscription based (that is, based on paid access), and the
relatively new open access model facilitated by the Internet.
[Figure 9: value chain diagram, not reproduced. Activities: Review manuscript; Revise manuscript; Queue for publishing; Publish article in journal issue; Publish article on the web; Index article; Index web source; Acquire access rights; Make journal available inside university; Search for and retrieve article; Read article. Other labels: Manuscript; Reviewers' comments; Revised manuscript; Accepted manuscript; Subscription; article indexed in general search engine; article indexed in bibliographic index; Disseminated research results.]
Figure 9. A model of the value chain of scientific publication. The model contains both the
activities of traditional mainstream publishing as well as Open Access publishing. Those
elements, which can be bypassed in the Open Access model have been indicated by a darker
background.
Figure 9 is an illustration of the value chain for delivering a scientific article to its potential
readers. The essential differences between current mainstream publishing and the Open Access
models as practiced by most independent OA journal publishers are:
• OA publishing is usually much faster, and unnecessary delay of publication in order to
stick to a regular issue schedule is usually avoided.
• Traditional publishing relies to a large extent on commercial indexing services for
spreading information about an article to potential readers.
• OA publishing has until now mainly relied on general search engines as a means of
"marketing" its content to readers who do not regularly follow the journal.
• In traditional publishing there is a need for an intermediary between the publisher and the
readers for setting up the subscription arrangements, and this has even been accentuated in
the electronic environment, leading to the setting up of library consortia.
• In traditional publishing there is a cost to society in the form of many potential readers
who do not get access to research results they would have needed, due to high subscription
cost or the amount of extra effort needed to get access to a publication.
Although the Open Access mode solves this problem, potential readers may fail to find out about
interesting OA articles because these are marketed efficiently only to a select community of
researchers and because general web search engines are rather inefficient tools for finding
relevant and quality assured material.
Characteristically, open access journals receive a large share of their funding from the publisher’s own
institution. A web survey to collect data on the cost structure of open access journals was conducted
in May 2003. Sixty answers to the questionnaire were collected, a response rate of
20 % of the target population.
In the comments to the questionnaire reported in the to-be report (Deliverable 2), many of the
editors mentioned “volunteerism” and “it’s a part of my normal work” as other sources of
funding. Grants are also an important source of funding. In some cases professional institutions
provided funding for journal publishing. Advertisements, member fees and author charges were
the main source of funding in only a few cases. The participants were able to give more than one main
source of funding. Figure 10 gives a breakdown of the main sources of funding.
Main sources of funding of open access journals
Source                     Share
Other funding              19 %
Grants                     15 %
Author charges              4 %
Member fees                 5 %
Advertisements              1 %
Publisher's institution    56 %
Figure 10. Main sources of funding of open access journals
The average number of published articles in 2002 for the open access journals was 20. The
relatively low number of articles compared to major journals in the respective fields also indicates
that, if the work of editing a journal is done as voluntary community service, this is the maximum
number of articles that can be processed without the help of employed staff (20 published articles
means on average 40 submissions to guide through the review process).
Since open access publishing in its current form is presumed to be “volunteerism” to at least
some extent, we knew that exact cost data covering the whole publishing process would probably
be very hard to get from all participants. The difficulty of providing cost data was also mentioned
and regretted by several of the participants in the free comment section. Editing a journal is
considered part of the ordinary work at a university, and a good portion of idealism also goes
into carrying out the editorial work.
The participants had three options for giving cost data; they could use one or all three. They
could give the information as direct expense numbers or as budgetary cost. As an alternative they
could give the approximate time spent on a task. The last alternative was used in most answers. It
is also the most relevant measure, and the one that best reflects the character of the work of publishing
an open access journal. The editor puts in a lot of his own time and this is not calculated as direct or
budgetary cost. Instead of giving information on individual tasks such as administration,
marketing etc., the participants could give an approximation of the total time spent on running the
journal.
Of the 60 participants in the survey, 50 provided cost data in some form,
mostly as an approximation of the time spent on a task. Direct expenses and budgetary cost
figures greater than zero were provided by relatively few. For most general management tasks
the reported value is zero, which means that the editors state that they have no direct expenses or
budgetary costs. Editors who could not answer a question were asked to leave the answer
field blank. The median values for time spent on general management tasks per year are shown
in Table 1. The editors' general approximation of the total time spent on general management is
considerably higher. This option was used when the editor was not able to provide numbers for
the individual tasks.
Due to the low number of observations of direct expenses and budgetary costs, time spent
is used as the reporting unit.
Table 1. General management, measured by the unit "time spent per year", for open access
scientific journals
Time spent on general management / year
Task                      Median
Administration             50 h
IT-infrastructure          40 h
Planning issues            50 h
Marketing to authors       20 h
Marketing to readers        3 h
Other                       0 h
SUM                       163 h
Editor's approximation of the total time
spent on general management per year     250 h
For open access journals the median time spent on article specific activities, that is, on the
processing of a single article, is 22 h (see Table 2).
Table 2. Article specific activities for open access journals
Article specific activities    Time spent (median value)
Editor's review                10 h
Peer review                     6 h
Technical editing               4 h
Placement of article            2 h
Sum                            22 h
For a comparison of the production costs of open access journals with those of traditional journals,
the measure “time” could be transformed into monetary value by calculating the salary costs per hour
for the editor and assisting persons.
The following example indicates how this could be done. To convert the labour time spent
into monetary value we have used an approximation of salary costs for the editor, the reviewers
and an assistant.
The labour time spent, 22 h, is divided between the editor (professor), reviewers (professor) and
an assistant (student or other office staff) approximately so that the editor and reviewers work 16
h (median value) on the review process, and the assistant works 6 h (median value) on technical
editing and placement of the article. The salary per hour, including social costs, is approximated
to 45 € for the editor and reviewers and to 20 € for the assistant.
Thus the calculated costs for the article processing would be:
Review process: 16 h (10 + 6) * 45 € = 720 €
Technical editing and placement of the article: 6 h (4 + 2) * 20 € = 120 €
The first copy cost exclusive of time spent on the review process is 120 € and including the
review process 840 €.
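The conversion above can also be written as a short calculation. The following Python lines are only a sketch of the arithmetic in the example: the hours are the median values from Table 2 and the hourly rates are the approximations given in the text, while the function and variable names are invented for this illustration.

```python
# Median hours per article, taken from Table 2.
HOURS = {
    "editor_review": 10,
    "peer_review": 6,
    "technical_editing": 4,
    "placement": 2,
}

# Approximate salary per hour in euros, including social costs (from the text).
RATE_ACADEMIC = 45   # editor and reviewers
RATE_ASSISTANT = 20  # student or other office staff


def first_copy_cost(hours, rate_academic, rate_assistant):
    """Convert the time spent on one article into euros (hypothetical helper)."""
    review = (hours["editor_review"] + hours["peer_review"]) * rate_academic
    production = (hours["technical_editing"] + hours["placement"]) * rate_assistant
    return {"review": review, "production": production, "total": review + production}


costs = first_copy_cost(HOURS, RATE_ACADEMIC, RATE_ASSISTANT)
print(costs)
```

Changing the two rates shows how sensitive the first copy cost is to the salary assumptions, since the review process accounts for most of the total.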
In a similar way the time spent on general management could be converted into costs.
The constraint regarding this question of cost structure is that a typical open access journal does
not have a budget, and actual direct expenses or out-of-pocket money are reported in only a few
cases. The calculation of overhead costs for using the space, computer equipment and
network of the publisher’s or editor’s institution is an alternative to the direct expenses and
budgetary costs of a traditional publisher. On the other hand the time spent by the editor and the
reviewers on the review process is not normally counted as a cost in traditional publishing. The
elements of the value chain reported in Figure 1 that relate to queuing time for publishing and to
subscription and access rights handling are production costs that can be bypassed in the open
access model. The main immediate impact of open access publishing would be a reduction
of subscription costs in library budgets.
1.4 THE ANALYSIS OF BARRIERS TO CHANGE
After a decade of experimenting there is now quite a lot of evidence about the possibilities and
difficulties in making Open Access a real alternative. Table 3 below can be used as a starting
point for a discussion about the prerequisites and barriers for open access publishing. The three
channels discussed here are open access journals, which function as primary outlets, and
subject-specific and institutional repositories, which mainly function as secondary outlets
complementing the mainstream channels of journals and conference proceedings. Self-posting on
the web is left outside the discussion, even though it is a rather important channel at present.
The barriers and means have been classified into six different categories: Legal framework,
IT-infrastructure, Business models, Indexing services and standards, Academic reward system,
and Marketing and critical mass. In Table 3 the number of asterisks (from zero to three) denotes
the importance of a particular item in hindering a rapid transition process.
Table 3. A classification of different types of barriers for increased open access publishing and
their relative importance
                                   Open access   Subject-specific   Institutional
                                   journals      repositories       repositories
Legal framework                    -             *                  **
IT-infrastructure                  **            *                  **
Business models                    ***           **                 *
Indexing services and standards    **            -                  ***
Academic reward system             ***           *                  *
Marketing and critical mass        ***           **                 ***
In the following these barriers will be discussed one by one.
1.4.1 LEGAL FRAMEWORK
Open access journals
In the case of traditional journals, typically published by commercial publishers or learned
societies, the author usually grants the publisher a rather exclusive copyright, in return for the
services that the publisher renders the author. Many copyright forms grant the author the right to
limited distribution of copies to colleagues. The emergence of the Internet has brought to light
a particular problem concerning non-commercial distribution by posting copies on the web.
In many of the copyright forms which publishers ask authors to sign, this area is not properly
addressed and constitutes a grey zone.
Open Access journals, on the other hand, have from the start adopted a rather liberal approach
reminiscent of the licensing schemes used by the Open Source programming community (often
referred to as copyleft). As a rule the author retains the copyright to the work. What the open
access journals typically are interested in is that the paper, if made available elsewhere in the
exact format of the journal, is attributed to primary publication in the journal, and also that
nobody else (except the author) can resell the content. Currently used copyright agreements for
OA journals are quite satisfactory from both the author’s and the journal’s viewpoint.
Subject-specific repositories
A strong impulse for subject-specific repositories was the long lead-time between submission of
a draft manuscript and the publication of the full paper. In some areas of science, such as
high-energy physics, a tradition of scientists exchanging preprints on paper already existed and the
new repositories just developed this mechanism further.
One of the problems with low cost subject-specific repositories is that due to the high number of
papers in the successful ones, the managers of the service have no resources to check the legality
of the papers posted. It is left to the authors' discretion to take out papers once they have
been accepted for publication, if they have signed copyright agreements which prohibit keeping
the copies on the server. The legal problems resemble the situation with institutional repositories
and will thus be discussed below.
Institutional repositories
Institutional repositories will in their early stages get their initial content from works of the faculty for
which the university itself or the authors retain the copyright, such as Ph.D. theses and the working
paper series of departments. These entail no legal problems. In the longer run, however, the
critical mass of institutional repositories depends on the inclusion of the best work of each
university’s faculty, that is, the journal papers published elsewhere. From a legal viewpoint this
constitutes a challenge, since university administrations will be very careful not to break any
copyright contracts.
Many of the major publishers have recently, if the author asks for it, granted authors
permission for parallel non-commercial electronic publishing on the web pages of the
author's university. In a project conducted by Loughborough University several leading
publishers were asked about their official view concerning the publishing of the manuscript or
the finalized paper on open access servers (Romeo). Of these, 33 publishers agreed in some form,
whereas 49 gave a negative answer. Together the publishers who participated in the survey
represented 7169 journals. When the results were weighted according to the number of titles, 49
% of the journals permitted the publishing of either or both versions.
Although many publishers currently are quite liberal in their attitudes towards parallel
non-commercial web posting in subject-specific and institutional repositories, there is a lot of
uncertainty in the longer run. If parallel OA-publishing gains momentum and starts to have a
negative effect on subscription income, it is possible that the copyright agreements will become
tighter and also that compliance with existing agreements will be more actively monitored.
1.4.2 IT INFRASTRUCTURE
Open access journals
Most Open Access Journals have so far been individual efforts created by single academics and
groups of academics, often managing the journals on a part-time basis. Thus the technical IT
infrastructure of these journals is quite varied, ranging from rather rudimentary static HTML
versions to quite sophisticated database driven systems, depending on the skills and resources of
the creators. The platforms have seldom been bought from outside companies or larger
publishers. One of the drawbacks of these systems is that they are very vulnerable if the
person in charge for some reason or other stops working with the journal.
The notable exceptions to this are provided by two major efforts utilising new business models
for running portfolios of OA journals. The technical infrastructure of BioMed Central is on a par
with the leading commercial publishers and includes coding of the papers in XML as well as
workflow management of reviews. BioMed Central gets considerable economies of scale since
it publishes almost 100 journals (BioMed Central). The Public Library of Science recently
launched its first journal (PLoS).
In the longer run the publishers of individual journals would benefit a lot from pooling resources,
for instance by sharing software applications, or using collaborative web hosting. Such
discussions are for instance under way in the Nordic countries for smaller national or Nordic
peer reviewed journals. Another possibility is to use open source applications for running such
journals.
Subject-specific repositories
Like OA journals most subject-specific repositories are the results of individual efforts and the
corresponding IT systems have been made by the academics themselves. Although there would
be benefits of sharing IT-resources also for subject-specific repositories, this might be more
difficult to achieve in practice, since such repositories often are bundled with other web-services
to the research community in question, thus necessitating quite a lot of customisation.
Institutional Repositories
Institutional repositories present a rather different picture from current OA journals and
subject-specific repositories. University libraries have considerable funds at their disposal and are used
to outsourcing part of the work in building their IT-infrastructure. They also take a very
long-term perspective in the setting up of institutional repositories.
When universities start planning for such systems they are likely to use one of the following
solutions or perhaps combinations of these:
· Plan for joint national collaboration platforms
· Use well-proven open source applications
· Buy the software from outside IT-consultants
· Outsource the whole service to commercial publishers
An example of the first option is the Dutch DARE project. The currently best-known open
source solution is the DSpace system, originally developed by MIT for its own internal use but
currently offered for use to other universities.
1.4.3 BUSINESS MODELS
Open Access journals
Most Open Access Journals have so far been established by individual pioneers or groups of
academics. The main business model has been to minimise costs and to fund the operations as a
form of open source project, where hardly any transfer of money is involved and all costs are
absorbed by the employers of the individuals participating. This business model is poorly
suited to sustaining operations in the longer term and to scaling up from a few papers per year
to larger publication volumes, since that might necessitate employing staff. It is also not well
suited for journals where copyediting and layout work of graphics etc. cannot be handled by
the authors themselves.
Other possible business models, which would provide more funding for professional-level
operations (such as the employment of staff), include advertisement, subsidies from learned
societies or research funding agencies, or author charges, in order to keep the end product freely
available on the web rather than take recourse to subscription fees. All of these have been and are
being tried out, in different combinations. The most controversial is the one involving author
charges (for instance used by the BioMed Central journals), since this reverses the role of a
publisher from a seller of a commodity to consumers to a provider of services to authors. Getting
individual researchers to pay sums in the order of 500-1500 euros for publication might be very
difficult unless a journal already is regarded as a top-level journal in its field. A way around this
dilemma, which is being tried out by BioMed Central, is for the publisher to enter into “umbrella
agreements” with universities, which pay a yearly fee covering all article submissions of their own
faculty.
Yet another model is to publish in a hybrid way with a mixture of subscription only and open
access. Each author decides whether his article will be open access, by paying an author charge.
This business model is currently being pioneered by Oxford University Press who recently
announced that they will start using this model for one of their most prestigious journals, Nucleic
Acids Research.
Advertisement can work in some limited fields of science such as medicine, where for instance
drug companies may have an interest. A very important group of players are the learned
societies, which historically were the ones to start scientific journals as we know them now.
They could see Open Access as an important service for their constituency and society in
general. Unfortunately many learned societies see journal publishing as an internal profit centre
generating finance for other activities, or as an activity which should at least generate enough
income to cover its cost. From this perspective open access through author charges would still be
acceptable. A further problem is, however, that many offer journal subscriptions bundled with
their membership fees and fear that going open access would threaten the income from such fees.
The business model issue is central to the further proliferation of Open Access journals. The
currently dominating model based on volunteer work only does not easily scale up to large-scale and
sustainable operations, and the other business models have yet to demonstrate their strengths.
Through co-operation or outsourcing of part of the work to commercial companies the publishers
of individual journals could obtain the same economies of scale, branding etc. which large
commercial publishers have today. This would however require changing the business model
from the currently dominating Open Source model.
Table 4. A classification of journal business models including examples
                              PAPER                              ELECTRONIC
PAID ACCESS
  per article                 Document delivery                  Pay-per-view
  per journal                 Traditional journal subscription   Electronic journal subscription
  bundled                     -                                  "The Big Deal" or Science Direct
HYBRID ACCESS
  delayed                     -                                  The Association of Learned and Professional Society
                                                                 Publishers for the journal Learned Publishing
  limited functionality       -                                  For example, read-only possibility.
  individual article basis    -                                  Oxford University Press for the journal Nucleic Acids
                                                                 Research (NAR).
OPEN ACCESS
  community service           -                                  Majority of small OA journals
  advertising                 -                                  British Medical Journal
  grants                      -                                  Public Library of Science
  author charges              -                                  Public Library of Science Biology
  institutional membership    -                                  BioMedCentral
Subject-specific repositories
Subject-specific repositories have evolved in a few select fields. The selection of fields has come
about through a combination of existing behavioural infrastructure, individual entrepreneurship
and pure chance. The best-known example of such a service is the arXiv service in high-energy
physics. Despite its very low running costs, it still costs money to run and the whole service has
recently been transferred to Cornell University.
Discipline-specific repositories are usually rather tightly aligned with pre-existing communities
of researchers who communicate a lot with each other, meet at regular conferences and publish
in a limited number of journals. It would be very difficult for such repositories to start to either
charge subscription fees or to start to levy fees on authors uploading their papers (one aspect is
that the payments would be rather small per upload and the transaction costs easily could
consume most of the generated gross income). Thus the main options left would be subsidies
from hosting universities or advertisement. Probably the dependency on voluntary work will
prevail.
Institutional Repositories
Discipline-specific repositories will in the longer run be "threatened" by institutional
repositories, since both compete for the same material. If institutional repositories gain
momentum and are indexed effectively through standards such as the Open Archives Initiative
Protocol for Metadata Harvesting (OAI-PMH), they will offer a parallel channel to the same
content as subject-specific repositories, and have clear advantages in their business models.
From the business model viewpoint the development of institutional repositories will depend a
lot on the political decisions universities have to make concerning the future roles in the
electronic world of their libraries and publishing departments. Since the need for storing and
handling paper copies of material from the outside is decreasing very rapidly, the funds thus freed
could be used to finance the institutional repositories instead.
Indexing services and standards
Open Access Journals
One of the major drawbacks of Open Access Journals has so far been that they rarely have been
indexed in the commercial indexing services for searching quality assured publications, which
universities provide to their researchers and students. Information about the publications in the
journals has instead been spread through direct email marketing among select communities of
academics and through being indexed by general web search engines.
Indexing services fulfil in this connection a dual role in helping the marketing of the journal and
its content. Firstly they help in attracting occasional readers who may not even be aware of the
journal’s existence. Secondly the fact that a journal can claim being “indexed in” lends prestige
to the journal and thus helps in attracting more and better submissions. A particularly important
one is the Science Citation Index (and the accompanying Social Sciences and Arts and
Humanities indexes). Academic appointment and grant committees take the impact factors
produced from the SCI into consideration when ranking the output of academics and there are thus
high rewards for publishing in such journals.
This creates a vicious circle. It is very difficult to get new journals accepted in SCI before they
have established a track record, and being outside the “core literature” of SCI makes it very
difficult to get good submissions and establish a track record.
Subject-specific repositories
Subject-specific repositories have usually not experienced any need to be indexed by third
parties. Firstly it would be very difficult since most of the material in them is or will be
published elsewhere, and thus the references should be to those primary sources. Secondly
subject-specific repositories strive to compete (in terms of coverage) with commercial indexing
and full-text services rather than work in symbiosis with them.
The emergence of the OAI-Protocol for Metadata Harvesting (discussed below) may however
change the situation in the near future.
Institutional Repositories
In the same way as for subject-specific repositories it would be difficult for Institutional
Repositories to be indexed in current established indexes for any of the content of which they are
the secondary outlet.
The solution to this dilemma lies in general web search engines or in a new type of search engine
dedicated to scientific web content. If an author puts an electronic copy of his own publications
on his web pages, the main channel to this is already today through general search engines such
as Google. Dedicated Open Access search engines for scientific content which is tagged
according to the OAI Protocol for Metadata Harvesting (OAI-PMH) are currently under development.
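As an illustration of what such harvesting involves, the following Python sketch builds an OAI-PMH ListRecords request and extracts Dublin Core titles from a response. It is a minimal sketch under stated assumptions and not the implementation of any particular search engine: the repository address, the helper names and the sample record are invented for this example, while the verb, the metadataPrefix parameter and the XML namespaces are those defined by the protocol.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"

# XML namespaces defined by OAI-PMH 2.0 and by Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build the URL of a ListRecords request (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # only records changed since this date
    return base_url + "?" + urlencode(params)


# Invented sample of what a repository might return for one record.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An invented example title</dc:title>
          <dc:creator>Example, Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_titles(xml_text):
    """Return the Dublin Core titles of all records in a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.iterfind(".//oai:record//dc:title", NS)]


print(list_records_url(BASE_URL, from_date="2004-01-01"))
print(extract_titles(SAMPLE_RESPONSE))
```

A real harvester would in addition fetch the URL over HTTP and follow the resumption tokens that repositories use to split long result lists. Both steps are left out here to keep the sketch self-contained.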
1.4.4 ACADEMIC REWARD SYSTEM
Open Access Journals
The behaviour of academics as they choose to which journals and conferences they submit their
papers is to a very high degree conditioned by the academic reward system. In most universities,
publishing in the leading established journals of one’s field is highly rewarded. Prestige counts
much more than wide and rapid dissemination, and easy access. This system naturally puts
academics (and in particular the younger ones) in a situation where primary publishing of their
best work in relatively unknown Open Access journals is a very low priority.
A system such as this places any new journal, whether subscription-based or Open Access, in a
very disadvantaged position. Only if the journal manages to get a sufficient influx of high-quality
papers does it stand a chance of entering the group of journals with high prestige, and even then
only after a delay of a few years.
It is probably idealistic to expect the whole academic community to change its evaluation
system, to take better account of the benefits offered by open access. The experiences of the past
ten years show also that it is very difficult for totally new OA journals to become first rank
journals in their fields. An obvious shortcut would be for established journals to change their business
models and become open access, but despite isolated examples, this is unlikely to happen on a
larger scale as long as publishing is as profitable a business as it is today.
Subject-specific repositories
The success or failure of subject-specific repositories has relatively little to do with the academic
reward system, since few important items are published in the repositories alone. Authors upload
their manuscripts to these in order to get more efficient and faster dissemination of their
publications, which also appear elsewhere for reasons outlined above. If such dissemination
leads to them being read, and in particular referenced more often, this could indirectly have a
positive effect on their academic status and provide an incentive for uploading. It is difficult to
envisage more direct rewarding mechanisms.
Institutional repositories
Institutional repositories can function both as primary and secondary channels. As for the first
function (for instance Ph.D. theses and working paper series), filling them with content is
unproblematic. The wide use of institutional repositories for secondary publication will however
demand a number of measures. Scientists and their departments can be rewarded financially for
posting electronic copies of their work to institutional repositories, or posting copies can be made
mandatory, although the latter solution might be very difficult to implement in practice.
1.4.5 MARKETING AND CRITICAL MASS
Open Access Journals
Since journal publishing is very dependent on getting authors to submit their best papers to the
journal in question, marketing and branding are very important for long-term success. Most OA
journals have not yet been established as brands and on the whole the marketing of such journals
has been very poor, partly due to a lack of resources for marketing, partly because of a lack of
understanding of the need for marketing. Many editors of OA journals have idealistically
believed that the merits of Open Access and spreading the word via email lists etc. are enough.
The recent launch of BioMed Central, which houses around a hundred OA journals, is in this
respect an exception and this hub might in the near future become a sort of brand in itself. Even
more spectacular has been the start of the Public Library of Science journal PLoS Biology in
October 2003, which managed to become headline news in many media.
There are many ways in which newly established journals can build their prestige. Firstly the
reputation of the editor and the constitution of the editorial board are important. Secondly
attracting enough papers from leading academics early on is important. This can again lead to a
positive chain reaction of citations in other articles and journals and eventually (in the long term)
inclusion in the SCI.
In the summer of 2002 researchers at Hanken identified 317 active OA-journals (Gustafsson
2002). In the study three different sources were used, the most important of which was the
Ulrichsweb database. By comparing the number of journals with the total number of scientific
peer-reviewed journals in Ulrichsweb, it was found that the share of OA-journals of the total
number of journals was only 0.7 % and of electronically available titles 1.5 %. Of the new
journals founded in the period 1996-99, about every tenth was, however, open access. In Table 5
below the ten most popular topic areas for OA-journals are listed.
Table 5. The most popular areas for Open Access journals
Scientific domain      Number of journals
Medicine               36
Mathematics            36
Education              27
Law                    20
Sociology              16
Economics              16
Computer Science       15
History                14
Biology                12
Information Science    11
The real number of open access peer-reviewed journals can be assumed to be substantially bigger
than the numbers in Hanken's study. It is very difficult to get information about smaller journals
and journals publishing in languages other than English. Most of these journals are only known
to interested readers in their respective communities. The recently launched Directory of Open
Access Journals tries to improve their marketing by providing university libraries (and scientists)
world-wide with up-to-date information about available journals (DOAJ). Currently the directory
includes some 700 journals.
Subject-specific repositories
Subject-specific repositories are also highly dependent on the behaviour of authors, but here the
dilemma is slightly different. The papers uploaded are typically drafts intended for final
publication elsewhere (as conference papers, journal articles, etc.). Thus the researcher is not
really forced to make an either/or choice. Rather he has to decide whether it is worth
the extra effort to upload his paper to the server. This depends on his perception of how
widely used the repository is in his research community.
From a marketing viewpoint it is thus very useful if subject-specific repositories can be bundled
with other information which is useful for researchers in the domain, such as conference
announcements, email discussion lists, directories of scientists and link lists to freely available
educational resources.
Institutional Repositories
Institutional repositories will have much less of a problem with branding than the other two
channels. When MIT for instance announces the development of its new institutional repository
and that it is offering the software for free to other universities (DSpace), it is in fact using its
brand as a university as a marketing tool and is at the same time hoping to further strengthen its
brand through this action.
In contrast to the individual Open Access journal and the subject-specific repository, the success
of which is measured within a limited community of researchers, the success of an institutional
repository depends very much on universities world-wide starting similar repositories, and
on the contents of these being comprehensively accessible to interested readers. Thus IRs follow
the same sort of logic as mobile phones or email addresses, where each new connection adds to
the value of each already existing connection. Only when IRs achieve a critical mass on the
global scale (both in terms of number and content) will they offer a competitive channel for
scientific content.
The development of institutional repositories is just beginning and thus their impact is still at this
moment extremely marginal.
Recent developments
Since the planning of the SciX project, almost three years ago, some very important
developments have taken place. In the following a few of the most important ones are discussed:
• The emergence of institutional repositories as an important OA channel
• The emergence of new professionally operating publishers using OA with author charges as their business model
• The interest that some society publishers have shown in moving established journals into the OA realm
• The political commitment of high-level research policymakers to OA and the interest shown by organisations such as the OECD, the UK parliament and the European Commission in understanding the problems at hand and defining their policies with regard to OA.
Three years ago the two main OA channels were OA journals and subject specific repositories,
both categories almost exclusively run as community effort type activities in an Open Source
like manner. One of the primary aims of the SciX project was explicitly to establish guidelines
and collect experiences of how a subject specific repository can be organised in a more
professional and sustainable manner. But during the past two to three years a new category of
repository, the institutional one, has emerged as a serious alternative to the subject specific one.
Universities and research institutes have also in the past published certain types of research
publications in-house, such as Ph.D. theses and research reports. As they move
to open access electronic publishing of these, they are now also showing an interest in putting copies of
journal articles and conference papers by their faculty in these repositories. Perhaps the
best known example of such a repository is the one created by MIT. The software used for this,
DSpace, is also made available to other organisations under an open source license.
BioMed Central is the publisher that has pioneered Open Access funded by author charges
as a viable business model, and it now publishes around 100 journals on its very professional
platform. After initial experience indicated that it is difficult to get scientists to
pay the 500 dollar author charge, BioMed Central changed strategy and entered into
institutional membership deals with individual universities and university consortia (notably
JISC in the UK). The members pay yearly fees which cover all submissions from their member
scientists, in the case of JISC all English authors. This seems a very promising route, and the
Public Library of Science (PLoS) is also starting to explore this revenue model.
Both BioMed Central and PLoS are start-up companies. In addition a number of established
publishers are starting to experiment with a hybrid model, in which each author decides
whether his article is open access or not, by paying an author fee. Table 6 contains information
about some known examples (source: e-mails from Thomas Walker and Peter Suber).
Table 6. Examples of journals practicing the hybrid approach to OA publishing
JOURNAL                                        SINCE        PRICE LEVEL
Limnology and Oceanography                     Jan. 1999    > 500 USD
Entomological Society of America (4 journals)  Jan. 2000    120 USD
Physiological Genomics                         July 2003    1500 USD
Company of Biologists (3 journals)             Jan 2004     800 USD
The Scientific World                                        150 – 600 USD
Nucleic Acids Research                         2003         500 USD
The Journal of Limnology and Oceanography claims on its web pages that articles published as
open access in 2003 have been downloaded 2.8 times as often as those that require a subscription
(the figure for 2002 articles was 3.4).
One way to tackle the barriers to more widespread OA publishing is action on a political level,
involving decision-makers from organisations that fund the research, employ the researchers etc.
A concrete manifestation of such action is the Berlin Declaration, which was signed by a number
of very high-ranking university representatives from Germany and France. The following extract
from the text, signed in October 2003, speaks for itself:
Our organizations are interested in the further promotion of the new open access paradigm to gain the most benefit
for science and society. Therefore, we intend to make progress by
• encouraging our researchers/grant recipients to publish their work according to the principles of the open access paradigm.
• encouraging the holders of cultural heritage to support open access by providing their resources on the Internet.
• developing means and ways to evaluate open access contributions and online-journals in order to maintain the standards of quality assurance and good scientific practice.
• advocating that open access publication be recognized in promotion and tenure evaluation.
• advocating the intrinsic merit of contributions to an open access infrastructure by software tool development, content provision, metadata creation, or the publication of individual articles.
We realize that the process of moving to open access changes the dissemination of knowledge with respect to legal
and financial aspects. Our organizations aim to find solutions that support further development of the existing legal
and financial frameworks in order to facilitate optimal use and access.
(source: http://www.zim.mpg.de/openaccess-berlin/berlindeclaration.html)
Another current action is the UK parliamentary inquiry, in which both commercial publishers and
OA proponents such as BioMed Central and PLoS have been giving testimony regarding their
views on the efficiency of the current system. This action indicates the concern of legislative
bodies about the state of affairs. The European Commission has also shown an interest in further
investigating the issue by issuing a tender request in September 2003 for a report on a “study of
the economic and technical evolution of the publications markets in Europe”.
Very closely related to open access to research publications is access to research data
produced with public funding, a very important issue for instance in biomedicine. Here the
OECD has recently been active and in March 2004 published a final report entitled
“Promoting Access to Public Research Data for Scientific, Economic and Social Development”.
1.5 Conclusions
General conclusions:
• The impact of Open Access channels on the whole flow of scientific publications is still
very small. General awareness of the advantages of OA publishing is naturally a
prerequisite for scientists choosing to use OA channels both for primary and secondary
publishing and much remains to be done to achieve this. On the other hand the
emergence of OA channels has put mainstream publishers on their toes actively looking
at new business models.
• The enthusiasm and iconoclastic spirit of the early days is now changing into a more
realistic search for sustainable business models, and a better understanding of the
formidable barriers to change. The most common form of OA is in fact still self-publishing
by authors who put copies of their own publications on their own home pages.
This is, however, not an efficient long-term solution.
• The dominating business model for OA journals and subject repositories is still the
community service model. In the long run this model does not look sustainable. The
author charge model for OA journals could be a solution, but there are still many open
questions.
• The costs per article for OA journals are clearly lower than for mainstream
print+electronic journals, but not as radically lower as some proponents of OA have
suggested.
• Institutional repositories in principle offer many advantages for parallel publishing
(archival security, sustainable financing) but the copyright challenges need to be
resolved. The central lever for change is the point at which the author of a publication
decides where to submit it (and also whether to upload a copy to a repository).
• In Europe there are numerous regional or national journals published in English or other
European languages, often published on shoestring budgets with public subsidies. These
would definitely benefit from going OA and would need support with IT infrastructure,
advice etc.
• OA journals have not been very good at marketing. Solutions such as the Directory of
Open Access Journals (DOAJ), which has been set up by the Lund university library with
the help of SciX data, can be helpful. Branding is also extremely important from a
marketing viewpoint. A key issue for marketing and awareness is the efficient indexing
of Open Access material, which depends on the success of the OAI Protocol for Metadata Harvesting.
Conclusions from the SciX work
• It is very important to get a critical mass of initial content. The easiest way to achieve this
is via partnership arrangements with organisations that have a legacy repository of
existing publications. In the case of SciX this has been achieved through organisers of a
number of recurring conference series.
• These associations have membership fees, which include getting the proceedings of such
conferences for free or at a reduced price. It is difficult to persuade them that totally free
access is a much better solution, because they fear a loss of revenue if access is not
restricted to members. There is a one-time cost of digitising and handling existing
material which cannot be funded as a “community service” type activity. The longer term
running costs will be much lower though. In the case of OA journals, such as ITcon, there
is no such legacy material. Here the challenge is getting authors to submit their best
material for publication.
• Making the products developed available as Open Source solutions is a fruitful extension
of the community service ideal, and will help accelerate the developments. Thus SciX
solutions have been successfully applied for other domains.
• Raising awareness is very important, and this has to be done partly on the national level.
Librarians, authors, publishers etc. are to a large extent inspired by concrete examples of
how OA works. The timely dissemination of SciX results has for instance resulted in the
founding of the FinnOA committee by the National Library of Finland. The initial
experiences of using a formal process modelling language for modelling scientific
publishing have also been very positive.
Trying to get researchers to support the move towards Open Access, which most agree in
principle would be good for the advancement of science, is like trying to get people to behave in
a more ecological way. While most people recognise the need to save energy and recycle waste it
takes much more than just awareness to get them to change their habits on a large scale. It takes
a combination of measures of many different kinds, such as technical waste disposal
infrastructure, legislation and taxation to get massive behavioural changes underway. The same
is true for making the results of science openly available on the web.
2. CHANGING ROLE OF PROFESSIONAL ASSOCIATIONS
Academic publishing started in the hands and under the control of learned societies. Open access
to digital publishing has the potential of returning control to these societies; SciX made a
major contribution in this context. While most open access activities have focused on
institutional repositories, SciX managed to establish good collaboration with several societies
and associations. This section summarizes the SciX experience in this area.
2.1 INTRODUCTION
For a long time, scientific publishing remained largely in the hands of learned societies and
similar, scientist-driven associations. Publishers have been entering the market since the mid-19th
century, but their role remained marginal and their profits negligible until the 1960s, when the Science
Citation Index (http://www.isinet.com/) was introduced and the number of universities around
the developed world grew quickly.
An important goal of SciX was therefore to engage several of these communities. Thousands
of papers have been published in conference proceedings. The reach of these
publications, and therefore the potential impact of the work reported, is limited to the conference
participants. Most of these proceedings are not available electronically or archived in libraries. It
is in the interest of the associations and societies, as well as of the individual author, to make sure
that publications reach as many readers as possible.
In the SciX approach, the digital library is just one of the services that a
community requires to set up an electronic publication process and build efficient collaboration and
knowledge management around it. These services are discussed in Section 3 of this report.
2.2 COMMERCIAL COMPETITION AND RELATED WORK
The idea to use the Internet for scientific publication is not new. Existing solutions are of the
following types:
• Preprint archives offer drafts of papers that have been submitted for publication in
paper-based journals. No quality control is provided. Often, the papers are quite similar to the
final works published. Perhaps the best known such archive is the Los Alamos or arXiv
preprints archive (http://www.arxiv.org/).
• Electronic journals (eJournals) and magazines (eZines). Like ITcon, they provide
quality control mechanisms similar to those of paper-based publications. Some 400 such journals
were believed to exist in 1999, including a Journal on Electronic Publishing.
Today this number is estimated at over 1000.
• On-line bibliographies are collections of papers (usually without full text) from a certain
discipline. An example from a domain related to W78's is the ARCOM collection of
abstracts from construction management and economics
(http://www2.auckland.ac.nz/lbr/prop/propres.htm#ARCOM). After having been
published as a booklet for a number of years the abstracts are currently freely available
through a database on the web. Another well known example is the CiteSeer service
offering full texts of some 2.5 million papers related to computer science. CiteSeer
gathers the papers from the Web, copying them from authors' websites to one
central location where they are classified, indexed and cross-referenced.
Professional organisations, groups of publishers and specialised companies all
provide added value services related to scientific publishing. One example is the CIB's database
ICONDA; there are several others. Several bibliographical databases provide
sophisticated search engines on bibliographic information about publications (such as titles and
abstracts). Full texts are, as a matter of principle, not available.
Table: Commercial indexes and bibliographic databases.
                    Ei Compendex   ICONDA    RSWB      CumInCAD   CiteSeer
Number of records   6.000.000      500.000   575.000   3.000      2.500.000
Availability        $              $         $         Free       Free
A number of web-based services can be found that provide references to scientific material. Add-on
services associated with these are typically discussion forums, reviews of journals, books,
conferences and papers, and some generic editorial work. As the support for these services is
normally provided on a volunteer basis, the outcome is, firstly, often less focused, less complete and
less coherent when dealing with very complicated issues and, secondly, mostly targeted at peers in the
scientific community rather than at the general population of industry.
2.3 EXEMPLARS IN THE FIELD OF SOFTWARE
The policy of ARPA and the NSF in the United States was that all research funded
through public funding should make its results available for free. This has not been entirely true
for published papers, but has worked excellently with software. Programs written in the context
of research projects were made available on the Internet for free, usually including source code.
In fact, the software to run the Internet in the first place was available for free. This
created the critical mass for the so-called open-source initiative (http://www.opensource.org/).
An increasing number of operating systems, application programs and tools are available for
free. The market share of those systems is growing and they are being used as a platform for vertical
applications by companies such as IBM.
On the other hand, the European funded research projects (such as the 4th and 5th Framework
projects) never made a requirement for making the results publicly available. The excuse used
was that commercial companies are co-funding this work and that they are not interested in
making available what could be their competitive advantage. We are not aware of the scientific
community challenging this system. Labelling most of the reports "restricted" actually restricted
the readership to the project officers and the reviewers.
2.4 EXEMPLARS IN THE FIELD OF STANDARDS
Standards organisations, similarly to journal publishers, do not fund the writing of new
standards, yet they are given the copyright of a standard. They support their organisational
activities by selling paper copies of the standards. Several research efforts that deal
with the computerisation of the building codes stopped at a prototype level, because of the
problems with the copyright to the text of the standard.
Contrary examples are the standards that govern the Internet and the Web. The well known
"request for comments" documents (RFCs) are the results of the work of groups of individuals
and are made available, for free, on the Internet, both for comment and for writing compatible software.
In the early 1990s there was competition between ISO and Internet based networking,
best demonstrated through the use of two different email addressing schemes. The Internet
solution, based on the open, freely accessible RFC 822 standard, prevailed.
The development of product modelling standards also started with the restricted publication
model. Only recently has the IAI begun to correct the mistake by making the entire IFC standard
available on the Internet for free.
2.5 COLLABORATION WITH PROFESSIONAL ORGANIZATIONS
A strategy to create a complete and relevant topical digital library depends on good relationships
with the associations, whose members were involved in research work and wrote the
publications. It is of utmost importance that these associations consider such a repository as
"their own". Such an approach is also contributing to the reversal of the process that took
scientific publishing away from the professional societies and associations in the early 20th
century. Currently SciX is hosting digital libraries in the fields of CAAD, Construction
Informatics, Electronic Publishing and Environmental Studies.
CUMINCAD - an acronym for CUMulative INdex of CAD / cumincad.scix.net - was created as
a response to limited, difficult access to scientific information in the field of CAAD. At that point
associations such as ACADIA, CAAD futures and eCAADe had already been in their second
decade of existence, whereas CAADRIA and SiGraDi had only just been founded to serve the
“regions” Asia and Latin America more effectively. The conference proceedings can be regarded
as the only tangible result of an association's activity. In those days, however, they were paper-based
publications with fewer than 250 copies, which made them practically unavailable. They became “gray
literature”.
What was clear from the very beginning was that the success of a library like CUMINCAD
would depend strongly on its contents, i.e. on reaching a critical mass. For CUMINCAD this was achieved
quite soon: publications were provided by the
associations mentioned, which mainly supplied digital datasets of the most recent conferences but
also supported retrospective digitisation (going back even to the first conferences held).
In general, obtaining electronic data from the “younger” associations turned out to be easier, as
datasets have been archived since the mid-nineties. In many cases, however, this depended on the
individuals concerned and their differing habits regarding data archiving.
Some data carriers and software packages are no longer in use, and the corresponding computer
equipment has been discarded.
Even though the "open-access" idea remained the partners' main focus, three classes of users
were defined for CUMINCAD. Users entering anonymously see only bibliographic information,
which is an incentive to register free of charge as a "friend", as this status entitles
the user to view summaries and use advanced research features. Retrieval of full-text versions, where
available, has so far been restricted to the members of the CAAD associations (700 persons).
In this way the continuing contribution of full papers in PDF format was secured, and the associations
could offer access as an additional service for valid memberships. At the time of writing over 2.000
"friends" are registered and we can assume that a significant segment of the scientific community
throughout the field of CAAD will be reached this way.
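The three user classes described above amount to a simple ordered access policy. The following is a minimal sketch of that idea in Python, not the actual CUMINCAD implementation; the names `UserClass`, `REQUIRED` and `may_access`, and the content labels, are hypothetical and chosen only for illustration.

```python
from enum import IntEnum


class UserClass(IntEnum):
    """User classes in increasing order of privilege (illustrative names)."""
    ANONYMOUS = 0  # not logged in
    FRIEND = 1     # registered free of charge
    MEMBER = 2     # member of one of the CAAD associations


# Minimum user class assumed for each kind of content (hypothetical labels).
REQUIRED = {
    "bibliographic": UserClass.ANONYMOUS,
    "summary": UserClass.FRIEND,
    "advanced_features": UserClass.FRIEND,
    "full_text": UserClass.MEMBER,
}


def may_access(user: UserClass, content: str) -> bool:
    """Return True if the user class is high enough for the content type."""
    return user >= REQUIRED[content]
```

For example, `may_access(UserClass.FRIEND, "summary")` is true, while `may_access(UserClass.FRIEND, "full_text")` is false. Because the classes are ordered, opening up a content type only requires lowering its entry in `REQUIRED`.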
SciX is also providing hosting for digital libraries in other fields of science. Several communities
such as the CIB W78, the ELPUB conference series, the IAPS association, etc. are already using the
SciX platform to implement their digital archives. Ironically, although ELPUB stands for ELectronic
PUBlishing, the ELPUB proceedings had not been published electronically. The
"history" of ELPUB started in 1997. Nearly all papers from the early days of ELPUB are
archived (on paper) and have recently been digitized to produce e-prints. Digitization of
paper-based materials by means of scanning and OCR is much more labor-intensive than
"reusing" archived digital data sources. The latter also allow for a more compact and high-quality
reproduction including full color, which is not available in print form.
The previous situation of ELPUB regarding the availability of e-papers is not unique and
basically applies to many other academic associations, which function mainly on the basis of
volunteer work. The International Association for People-Environment Studies (IAPS), for example,
has a much longer history than ELPUB, as its activities started more than three decades ago.
IAPS is aware that its presence, particularly in Eastern Europe, could be improved,
and a digital library could serve as a strong accompanying measure. The kick-off with a
self-organizing repository has already been accomplished (http://iaps.scix.net). In a first attempt,
scientific contributions from 1996 on have been recorded, as far back as data in an electronic
format could be gathered. Instead of searching individual proceedings one by one, the end-user
can perform an overview search more conveniently.
Since 1988, about one thousand papers have been published in the CIB-W78 proceedings. Most
of these proceedings are so-called grey literature: published by the workshop organisers and
not generally available to a broader audience. And yet in this community valuable contributions
have been made, particularly in relation to computer integrated construction and product
modelling.
2.6 SUMMARY
The table below summarizes the impact that SciX is having on some communities.
association    # of papers   # of papers with full text   # of registered users   access model                          sustainability
ACADIA         565           505                          2215                    members only                          papers for free by organizers
CAAD futures   421           411                          2215                    members only                          papers for free by organizers
CAADRIA        401           393                          2215                    members only                          papers for free by organizers, repository maintenance compensated through free participation at the conference
eCAADe         992           950                          2215                    members only                          papers for free by organizers, repository maintenance compensated through free participation at the conference
SiGraDi        534           525                          2215                    members only                          papers for free by organizers
CIB W78        929           886                                                  Open Access after free registration   papers for free by organizers
ELPUB          250           211                          0                       Open Access after free registration   papers for free by organizers
IAPS           2338          1048                         8                       Open Access after free registration   papers for free by organizers
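The share of papers available in full text follows directly from the first two numeric columns of the table. The short Python sketch below illustrates the calculation for a subset of the rows; the figures are copied from the table, while the names `RECORDS` and `coverage` are hypothetical and not part of any SciX software.

```python
# (number of papers, number of papers with full text), copied from the table
RECORDS = {
    "ACADIA": (565, 505),
    "CAAD futures": (421, 411),
    "CAADRIA": (401, 393),
    "eCAADe": (992, 950),
    "SiGraDi": (534, 525),
    "CIB W78": (929, 886),
    "ELPUB": (250, 211),
}


def coverage(papers: int, full_text: int) -> float:
    """Percentage of recorded papers for which the full text is available."""
    return 100.0 * full_text / papers


if __name__ == "__main__":
    for name, (papers, full_text) in RECORDS.items():
        print(f"{name:<14}{coverage(papers, full_text):5.1f} %")
```

Run as a script, this prints one percentage per association, which makes it easy to see where retrospective digitisation is still incomplete.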
2.7 CONCLUSIONS
One of the advantages of long-established communities is their track record and prestige,
represented also in the thousands of papers published by scientists who may now be the
authorities in the field. This track record, however, is remembered only by the few dozen people who have
been attending the conferences regularly. It is usually documented in the
proceedings, which are mostly locked in the cabinets of those who attended. Everyone else could follow
the achievements of an association if the proceedings were electronically and freely available. In the SciX
project a modular infrastructure was made available that allows societies and associations to put
their history on-line, and in some cases (ELPUB and IAPS) full Open Access has been granted
from the very beginning. Negotiations towards open access with the CAAD associations are ongoing
and seem promising. Open Access could comprise all materials
older than two years. Thus the CAAD associations could offer their members exclusive access to
recently issued publications as an additional bonus. All the associations involved have made the same
"discovery": their publication "capital" is available for dissemination.
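The two-year rule under discussion can be expressed as a "moving wall": everything published before a cut-off date that advances with the calendar is open, and newer material is reserved for members. The Python sketch below illustrates this under the assumption that the publication date of each paper is known; the names `EMBARGO_YEARS` and `is_accessible` are hypothetical, and the rule itself was still under negotiation.

```python
from datetime import date

EMBARGO_YEARS = 2  # assumed length of the members-only period


def is_accessible(published: date, today: date, is_member: bool = False) -> bool:
    """Members see everything; others see only material older than the embargo."""
    if is_member:
        return True
    try:
        wall = today.replace(year=today.year - EMBARGO_YEARS)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        wall = today.replace(year=today.year - EMBARGO_YEARS, day=28)
    return published < wall
```

With a reference date of 30 March 2004, a paper published on 1 September 2001 would be openly accessible, whereas one published on 1 September 2003 would be available to members only.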
On completion of the SciX project the repositories that have been started are to be continued. ELPUB and
IAPS have been self-organizing from the very beginning. CUMINCAD, which is supported by
the CAAD associations, will be carried on in a similar way as in its infancy (1998-2001).
Integration of the new conference proceedings therefore always plays a major role. As the required
metadata are usually issued by the conference organizers in a predefined quality, this will
require only slight effort for subsequent editing and linking with the full papers in
PDF format (Portable Document Format).
3. THE SCIX OPEN PUBLISHING PLATFORM - SOPS
SciX Open Publishing Services (SOPS) is software that allows setting up various on-line
scientific publishing media such as:
• personal archives,
• institutional archives,
• topic & society archives,
• electronic journals,
• electronic conference proceedings,
• workflow support for the above.
SOPS provides building blocks, such as repository, user management, discussions, ratings,
reviews, review process support etc. out of which the above publications can be built.
SOPS is open in the sense that it provides:
• WSDL definitions of all available functions,
• metadata harvesting according to the OAI-PMH 2.0 standard,
• compatibility with citation management software such as Reference Manager, Citation Manager and EndNote,
• compatibility with the Microsoft Office 2003 Research Task Pane,
• Really Simple Syndication (RSS) feeds and Office Smart Tags (coming soon).
SOPS is multilingual. It exists in English, German and Slovenian languages.
The SciX Open Publishing Platform (SOPS) demonstrates the approach to open archiving and
open publishing that implements the Web services paradigm.
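To make the OAI-PMH item in the list above more concrete, the following Python sketch shows how a third-party harvester might build a `ListRecords` request and read Dublin Core titles from the response. It uses only the standard library and parses an invented sample response rather than contacting a server; the base URL, the sample record and the function names are assumptions for illustration, not part of SOPS.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"  # OAI-PMH 2.0 namespace
DC_NS = "http://purl.org/dc/elements/1.1/"       # Dublin Core elements


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL for a repository base URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    return base_url + "?" + urlencode(params)


def parse_titles(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        pairs.append((identifier, title))
    return pairs


# Invented sample response, shortened to the elements used above.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:0001</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An invented sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""
```

A real harvester would fetch the URL returned by `list_records_url`, follow any resumption token in the response, and store the parsed records; those steps are omitted here.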
3.1 ARCHITECTURE
The SciX system has a modular architecture allowing modules to be included, left out, added, or
replaced relatively easily in any particular implementation or application. This is made possible
because the schema shown in Figure 8 is not implemented in a monolithic relational database
application but rather by a number of services and applications. The collaboration among them
in applications is established by programmers and system integrators at runtime. The
communication among the modules uses either a more efficient proprietary mechanism (if the
modules are physically on the same server) or XML/SOAP if they are running on different
servers.
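The choice between the two communication mechanisms can be pictured as a small dispatch step: a direct call when both modules run on the same server, and an XML/SOAP message otherwise. The Python sketch below illustrates this under simplifying assumptions. The `Module` class, the `getRecord` operation and the handler table are hypothetical, the "proprietary mechanism" is stood in for by a plain function call, and the SOAP message is only built, not sent.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


@dataclass(frozen=True)
class Module:
    name: str
    host: str  # server on which the module runs


def soap_envelope(operation, params):
    """Serialise a call as a minimal SOAP envelope (no header, no namespaces on the body)."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, operation)
    for key, value in params.items():
        ET.SubElement(call, key).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def dispatch(caller, callee, operation, params, local_handlers):
    """Call directly on the same host; otherwise prepare an XML/SOAP message."""
    if caller.host == callee.host:
        return "local", local_handlers[operation](**params)
    return "soap", soap_envelope(operation, params)
```

Keeping the decision in one place means that application code does not need to know where a service is deployed, which is what allows modules to be moved or replaced without changing the applications built on them.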
[Figure 8, diagram not reproduced. Layers from top to bottom:
Users: Other User, SciX Scientific User, SciX Industry User.
Application layer providing functionality to end user (Web/servlet based): 3rd party application 1, 3rd party application 2, electronic journal, conference support, digital library, web service administration, Word/OpenOffice.org client (optional), value added publications (VAP).
XML protocols over HTTP.
Business object layer providing web services to applications: Metadata Harvester, OAI adapter, Repository, Knowledge management, Collections, User management, Reviews, Annotation, Discussion, VAP content management application, SciX Metadata Harvester, Syndication server (RSS).
Data layer: personal, society, institution and hosted archives; external 3rd party archives (OAI-PMH compliant); external VAP.]
Figure 8: Logical architecture of the SciX pilot showing all major SciX and 3rd party
components and ways of communication among them. SciX developments are shown as dark
boxes with white text. 3rd party components are shown as white boxes with black text. XML over
HTTP communication is denoted by orange, OAI-PMH by red and proprietary or other
communication by black arrows.
Figure 8 depicts all major components of the SciX pilots. It is important to stress, that it does not
show one system, but various components that, when properly combined, result in different
applications offering different functionality to the end user. The figure is described top to
bottom.
3.1.1 APPLICATIONS
Applications are described in relation to Figure 8, left to right.
Each entry below gives the application (in the order of the boxes on the top layer in Figure 8), its description and a sample application.

Application: 3rd Party Application 1
Description: This example application's relation to a SciX based system is only
through accessing the data in the SciX digital library (managed by the
SciX repository service) via its metadata harvester, which
harvests a SciX repository using the OAI-PMH protocol. The SciX
repository includes a module, the OAI Adapter, that makes it
possible to harvest by that protocol.
Sample application: OAI Metadata Harvester

Application: 3rd Party Application 2
Description: This example application is using SciX services. For example, it
could use the SciX repository service but build around it some
particular workflow system to support a particular publishing
model. This application is talking to SciX services using an XML
protocol.
Sample application: itc.scix.net topic repository harvesting the ITcon.org journal

Application: Electronic journal
Description: The electronic journal application provides the functionality to
support an electronic journal. It supports the submission,
reviewing, rewriting, publishing, reading, citing, annotating,
discussing etc. of electronic journal papers. The application
pulls together, in a particular way, the resources offered by
some of the services in the services layer. The application is used
to support the ITcon journal at www.itcon.org.
Sample application: www.itcon.org

Application: Conference support
Description: The conference application provides the functionality to support
the organization of a conference or workshop. It supports the
registration of participants, submission of abstracts, reviewing of
abstracts, submission of papers, reviewing and publishing. The
application pulls together, in a particular way, the resources
offered by some of the services in the services layer. In particular,
the workflow in the conference support is different from that of the journal.
In SciX this application is demonstrated by supporting the
ECPPM 2004 and ELPUB 2004 conferences at
extranet.2004.ecppm.org.
Sample applications: extranet.2004.ecppm.org, arw-2004.scix.net, iaps2004.scix.net

Application: Digital library
Description: This is the most basic of all the applications and mainly provides
access to the repository service. Other services may or may
not be included, depending on how feature-packed the digital
library application is desired to be. In SciX this application is
demonstrated by digital libraries such as cumincad.scix.net,
itc.scix.net and several others.
Sample applications: cumincad.scix.net, iaps.scix.net, elpub.scix.net,
architekturinformatik.scix.net, raumplanung.scix.net, filozofija.scix.net, europia.scix.net

Web service
administration
Although Web services are only used through the applications,
they do need an administrative user interface that is represented
by this box. The administration functions include extending or
customizing the schema of the data that the service is handling,
setting up access controls, monitoring use through service level
log files, doing large scale data management (such as backing up
data etc). All SciX applications use this module.
each of the above at /cgibin/appname/AdmMenu
© SciX Consortium 2004
www.scix.net
D20: Final report
version of 30-Mar-04 17:27 page 42/68
Application as
in the order of
the boxes on
the top layer in
Figure 8)
Value Added
Publications
(VAP)
Description
VAP applications provide the normal industry user with access to
industry specific articles such as digests, reviews and summaries
that are produced, by VAP editors and publishers, from SciX
digital archives. The VAP provides the functionality for
maintaining edited articles, collaborative authoring and
versioning and publication of articles to Web sites and other
VAPs. The application enables e.g. digest editors to use the SciX
repository service to search and browse the digital archives, to
read and upload papers, produce citations and bibliographic notes.
Sample application
reocener.rbygg.is
3.1.2 OVERVIEW OF SERVICES
The 3rd layer from the top of Figure 8 shows the SciX services. This section briefly describes the
services included in the SOPS. The name in [brackets] is the name of the software module
supporting this service.
Each entry below gives the service name followed by a short description.

Repository service: The repository service provides access to “works” and the basic functionality
such as upload article, check for uniqueness, create unique article ID, remove article from
repository, cross-reference articles, create new version of article, bulk upload articles, harvest
articles from external sources, export articles in standard format, enter & maintain author
details, enter & maintain institution details, retrieve an article by unique id, retrieve article
by key-word search, browse repository via citations, search for similar articles etc.

OAI server: This is an optional module for the works service that provides OAI harvesting
functionality for the works in the repository.

Repository contributions: This service inherits much from the repository [works] service. Its main
distinction is that it allows users to add information, which is normally not allowed in the works
service.

Knowledge management: The KM service provides enhanced abilities for categorising, classifying,
browsing and searching out information and is based on knowledge management techniques,
particularly statistical text analysis and machine text learning.

Selections: A selection is defined as a bag of works elements. It is created by a user, for example
to define a reading list for her students or to pick, from the bibliography, a selection of papers
on a given topic.

User management: While anonymous access is important because many people do not like to be bothered
with log-ins and passwords, any active use of the services requires identification and other
related actions. This is managed by the users service. Several repositories may, of course, share
the same users service, allowing a single password for several services.

Conference reviews: The service handles peer reviews of conference papers.

Journal reviews: The service handles peer reviews of journal papers.

Technical reviews: The service handles technical review of the work, such as formatting, citation
styles etc.

Discussion forum: The service provides support for threaded discussion about works or any other
item.

Ratings: The service allows users to rate (excellent-poor) and justify their rating of the works in
the repository.

Series: This service allows the creation of a series of works; for example, a particular conference
track could be in one series and papers related to a particular topic in another. Series may
overlap.

News: The news service allows the managers of an application to post news to the end users about
new features, newly added works, announcements of conferences etc.
Each service is typically available through a distinct URL; for example, the repository service,
handled by the works.pl module, is available at http://somewhere.com/cgi/works/
3.1.3 OVERVIEW OF THE PROTOCOLS
This section describes how the various elements of Figure 8 communicate with each other.
• The orange "bus" in the figure denotes the communication among the SciX Web services and
applications. If two components that need to collaborate are on different machines, they
communicate using XML protocols over HTTP. If the two components are on the same machine, a more
efficient mechanism is used that relies on system calls and avoids the HTTP protocol overhead. The
selection between the two mechanisms is made automatically and transparently to the user (or system
integrator), who does not need to care about the physical locations of the services.
• The red arrows denote the OAI-PMH protocol. This protocol is increasingly popular for the
exchange of digital library metadata.
• Black arrows denote proprietary and other protocols that are private to one SciX service and of
no interest to anyone else. SQL is an example of such a protocol for data access.
• The SciX architecture is compatible with OAI-PMH. SciX uses existing open source metadata
harvesting software to feed 3rd party archives into the SciX Repository service. SciX also provides
an OAI-PMH adapter to the Repository Service, so that SciX repositories can be harvested by 3rd
party harvesters.
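The automatic choice between a local call and XML over HTTP described in the first bullet can be
illustrated with a minimal sketch. It is written in Python for brevity and does not reproduce the
SOPS implementation; the names LOCAL_SERVICES, ServiceRef and call_service are hypothetical, and the
rule "same host name and service registered locally" is an assumption about how locality might be
detected.

```python
from dataclasses import dataclass
from typing import Callable, Dict
from urllib.parse import urlencode, urlparse
from urllib.request import urlopen

# Hypothetical registry of services that run on this machine: name -> handler.
LOCAL_SERVICES: Dict[str, Callable[[dict], str]] = {}


@dataclass
class ServiceRef:
    name: str  # e.g. "works"
    url: str   # e.g. "http://somewhere.com/cgi/works/"


def http_get(url: str) -> str:
    """Remote path: fetch the XML reply over HTTP."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def call_service(ref: ServiceRef, params: dict, this_host: str,
                 fetch: Callable[[str], str] = http_get) -> str:
    """Pick the transport automatically; the caller never chooses one."""
    same_machine = urlparse(ref.url).hostname == this_host
    if same_machine and ref.name in LOCAL_SERVICES:
        # Local path: a direct call, no HTTP overhead.
        return LOCAL_SERVICES[ref.name](params)
    # Remote path: parameters are encoded into the request URL.
    return fetch(ref.url + "?" + urlencode(params))
```

The caller passes the same ServiceRef in both cases, which is the point of the design: an
integrator can move a service to another machine without changing the applications that use it.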
3.2 USER EXPERIENCE
3.2.1 DIGITAL LIBRARIES
Figure 9: Metadata display of ITC.SciX.net. Note the built-in clustering features.
Figure 10: Author index of the IAPS.SciX.net
Figure 11: ElPub.SciX.net showing links to ratings and structured discussions in otherwise
default style.
Figure 12: Search results in BibTeX format, shown in the EuropIA digital library.
3.2.2 INTERNATIONAL DIGITAL LIBRARIES
These demonstrate that the SciX platform can be efficiently translated to other languages and set
up:
• filozofija.scix.net - (institutional archive) A rather bare-bones institutional library with
philosophical texts in the Slovenian language.
• architektur-informatik.scix.net - (topic archive) Support of a German-speaking association on the
use of IT in architecture.
• raumplanung.scix.net - (institutional archive) Institutional archive of the TU Vienna.
Figure 13: Slovenian philosophical archive - showing the main window as well as a selection of
favourite papers.
Figure 14: Raumplanung.scix.net - the main menu of a German-speaking digital library. Some
translations may be less than perfect because they were done by translators unaware of the full
context of the translation.
3.2.3 ELECTRONIC JOURNALS
The journal that has been publishing electronically for the last 8 years was transferred to the
SciX platform. It implements a very liberal access model: even anonymous users have full access to
the full content. It does not (yet) implement reviewing, which is otherwise part of the D12
"e-journal infrastructure".
Figure 15: Display of the editorial board that uses the user management service to manage all
people related to the journal.
3.2.4 CONFERENCES
iaps2004.scix.net is a conference where the full workflow of the submitted papers is managed by
the SciX platform.
Figure 16: Assignment of a reviewer based on keywords in the paper and expertise of the
reviewers, in iaps2004.scix.net.
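The matching shown in Figure 16 is not specified in this report. As an illustration only, one
simple way to rank candidate reviewers is by the number of keywords that a paper shares with each
reviewer's declared expertise. The function rank_reviewers, its parameters and the sample data are
hypothetical, and the scoring rule is an assumption, not the SciX algorithm.

```python
def rank_reviewers(paper_keywords, reviewers, exclude=()):
    """Rank reviewers by keyword overlap with the paper.

    reviewers maps a reviewer name to a list of expertise keywords.
    exclude lists names that must not be assigned (e.g. the paper's authors).
    """
    paper = {k.lower() for k in paper_keywords}
    scored = []
    for name, expertise in reviewers.items():
        if name in exclude:
            continue
        overlap = paper & {k.lower() for k in expertise}
        if overlap:  # reviewers with no shared keyword are not proposed
            scored.append((len(overlap), name))
    # Highest overlap first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


reviewers = {
    "Ana": ["CAD", "design"],
    "Ben": ["ontology"],
    "Cyd": ["cad", "Design", "IT"],
}
proposed = rank_reviewers(["CAD", "design"], reviewers, exclude=("Ana",))
```

A conference chair would still confirm the proposal by hand; the ranking only narrows the list.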
3.3 OPEN SERVICES
The "open" in the “SciX Open Publishing Services” stands for the following:
• open access to information in the archives,
• open for end user / reader input,
• open for programmatic/service based use by WSDL/SOAP interfaces,
• open for indexing using the OAI-PMH protocol,
• open for use by software such as Citation Management Software or Office Programs,
• open source code for all, ready for modifications, customizations and upgrades.
3.3.1 WSDL/SOAP INTERFACES
Figure 17: Validation of the WSDL/SOAP interface by XML Spy software.
3.3.2 OPEN TOWARDS OAI-PMH PROTOCOL
Figure 18: Validation of the OAI-PMH harvester interface.
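For readers unfamiliar with OAI-PMH, the sketch below shows what a harvester does against any
OAI-PMH compliant repository: it issues a ListRecords request and reads Dublin Core fields from the
reply. The verb, the oai_dc metadata prefix and the two XML namespaces come from the OAI-PMH 2.0
specification; the base URL and the sample record are made up, and nothing here reproduces the
SciX OAI adapter itself.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the request URL for the OAI-PMH ListRecords verb."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract identifier, title and creators from a ListRecords reply."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter("{%s}record" % OAI_NS):
        identifier = rec.findtext("{%s}header/{%s}identifier" % (OAI_NS, OAI_NS))
        titles = [e.text for e in rec.iter("{%s}title" % DC_NS)]
        creators = [e.text for e in rec.iter("{%s}creator" % DC_NS)]
        records.append({
            "identifier": identifier,
            "title": titles[0] if titles else None,
            "creators": creators,
        })
    return records


# Made-up reply with a single record, for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:0001</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample paper</dc:title>
          <dc:creator>Author, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = list_records_url("http://example.org/oai")  # hypothetical base URL
records = parse_records(SAMPLE_RESPONSE)
```

A full harvester would also follow resumption tokens and handle error replies, which are omitted
here.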
3.3.3 OPEN FOR USE BY CITATION MANAGEMENT SOFTWARE OR OFFICE APPLICATIONS
Figure 19: Interaction with Office 2003 and SOPS. Left: A SOPS service listed among other
reference material. Right: Consulting a SOPS bibliographic service while writing a paper with
Word.
Figure 20: Exporting references from a SOPS DL application to a commercial citation
management program (right).
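Citation managers typically import plain-text formats such as BibTeX (see also Figure 12). As a
rough illustration of such an export, the sketch below turns one metadata record into a BibTeX
entry. The record layout, the field list and the function name to_bibtex are assumptions made for
this example; the actual SOPS export code is not shown in this report.

```python
def to_bibtex(key, record):
    """Format one metadata record (a dict) as a BibTeX entry."""
    field_order = ["author", "title", "booktitle", "journal", "year", "pages"]
    lines = ["@%s{%s," % (record.get("type", "misc"), key)]
    for field in field_order:
        value = record.get(field)
        if not value:
            continue  # skip fields the record does not provide
        if field == "author" and isinstance(value, (list, tuple)):
            value = " and ".join(value)  # BibTeX separates authors with "and"
        lines.append("  %s = {%s}," % (field, value))
    lines.append("}")
    return "\n".join(lines)


entry = to_bibtex("doe2004", {
    "type": "inproceedings",
    "author": ["Doe, J.", "Roe, R."],
    "title": "A sample paper",
    "year": 2004,
})
```

The sketch does not escape special characters such as braces or percent signs in field values; a
production exporter would need to.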
3.4 IMPLEMENTATION
SOPS needs the following infrastructure:
• a server connected to the Internet with about 20 megabytes of free space, plus some 4 megabytes
for each language version, plus space for data. Minimal hardware would be an Intel Pentium 133 MHz
class machine.
• an httpd server, such as Apache or Xitami.
• the WODA Database and Web services generator.
• a person with some knowledge of Web servers and the Perl language.
The above infrastructure should be installed and tested. The WODA environment is a key
requirement: it keeps the footprint of the individual services very small and therefore manageable
by the person installing and managing SOPS. WODA is documented at www.ddatabase.com; the reader
should consult those documents on how to install it. The next section provides some essential
information.
3.4.1 SOPS PHYSICAL ARCHITECTURE
SOPS handles HTTP requests in this way (Figure 21):
1) A Web browser¹ makes a request to the Web server to generate a page, for example one listing
papers with a certain keyword.
2) The Web server maps the URL to the name of a file in the server’s CGI directory, for example
a script called works. This script has to assemble the declaration of a service from different
sources so that it can pass the processing of the service to Woda.
3) The script first calls a file usually called common.pl. It includes the definition of the
application, of which the works service is a small part; it defines how the various services (like
works) fit together, as well as the common layout and appearance.
4) The common application definition includes the definitions that are common to all generic
SOPS services.
5) The common application as well as all common SOPS settings are now defined.
6) The declaration of a generic SOPS works service is re-used.
7) At this point, cgi/works may define how it differs from a plain vanilla works service; it may
extend it in any way that the Woda service generator or the Perl language allows.
8) After the declaration of the service has been assembled, it is passed to Woda for processing
and …
9) … the generation of a reply, which is first sent to the HTTP daemon that …
10) … passes the generated information to the client that made the original request.
¹ Any HTTP client can take this role, including OAI harvesters, Office applications, Citation
Managers or other SOPS services.
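Steps 3 to 7 above amount to layering definitions: server-wide SOPS settings, then the application's
common settings, then the generic service, then the customizations of one service script. The
sketch below illustrates that layering with plain dictionaries. It is written in Python rather than
Perl, every key and value is hypothetical, and it does not reflect the actual format of a WODA
service definition.

```python
def assemble_definition(*layers):
    """Merge definition layers; later layers override or extend earlier ones."""
    definition = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(definition.get(key), dict):
                # Nested settings (e.g. the data schema) are extended, not replaced.
                definition[key] = {**definition[key], **value}
            else:
                definition[key] = value
    return definition


# Hypothetical stand-ins for the four sources named in the steps above.
SOPS_COMMON = {"charset": "utf-8", "fields": {}}                      # role of sops/scix.pl
APP_COMMON = {"site_title": "Example library", "layout": "default"}   # role of cgi/common.pl
GENERIC_WORKS = {"service": "works",
                 "fields": {"title": "text", "authors": "text"}}      # role of sops/works.pl
CUSTOM_WORKS = {"fields": {"keywords": "text"}, "layout": "compact"}  # role of cgi/works

definition = assemble_definition(SOPS_COMMON, APP_COMMON, GENERIC_WORKS, CUSTOM_WORKS)
# The assembled definition would then be handed to the generic engine (step 8).
```

Because each custom script only states its differences, it can stay very small, which is
consistent with the typical sizes reported for these components.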
[Figure 21 labels: http browser, http server, cgi/works, cgi/common.pl, sops/scix.pl, sops/works,
Woda; groupings: generic off-the-shelf technology, application specific, SOPS specific, generic Web
service; arrows numbered 1 to 10.]
Figure 21: Serving an HTTP request by SOPS.
SOPS applications therefore have these main components:
Each entry below gives the component, followed by its role, an example, its typical size and the
extent to which it is reused.

custom sops service script
role: assembles service definition, customizes generic SOPS service
example: cgi/works
typical size: 100 bytes
reused: not at all, custom in each application

application script
role: defines application and common application features
example: cgi/common.pl
typical size: 10000 bytes
reused: by 5-10 sops service scripts

SOPS script
role: defines common features of all SOPS services on this server
example: sops/scix.pl
typical size: 500 bytes
reused: by all sops cgi scripts on this server

SOPS generic service definition
role: defines a SOPS service as discussed in Section 3.1.2
example: sops/works.pl
typical size: 10000 bytes
reused: by all custom sops service scripts on this server

WODA
role: interprets the service definition as assembled from the definitions above
example: woda/uk/woda.pl
typical size: 500 kbytes
reused: by SOPS and other services on this server
3.5 CONCLUSIONS
The presented services and applications are used on a daily basis by thousands of registered and
several unregistered users. The applications are up and running, populated with real data. The
software created during this demonstration project has some advantages as well as some
disadvantages compared with existing solutions. The SciX Assessment and Evaluation report details
those findings.
4. DIGITAL PUBLISHING – BEYOND R&D COMMUNITY
4.1 INTRODUCTION
Although digital publishing is coming of age, and the availability and volume of digital content on
the global network are increasing daily, the work that industry professionals invest in finding
meaningful and relevant information is not decreasing correspondingly when compared with more
traditional ways of working with paper-based material.
The industry's need for quality over quantity remains a challenge for knowledge transfer from the
research and scientific community to industry professionals. Publication of scientific and research
results is directed to conventional dissemination channels, e.g. scientific journals and
conferences, which have become more and more expensive for the normal industry professional in
terms of resource use and economic value. Relatively few papers published in journals and
conference proceedings are of actual value. In industry terms, the crucial factor is the time spent
searching for and obtaining relevant information: professionals want relevant information just in
time, and cannot afford the luxury of browsing and reading volumes of papers just in case they
contain useful information.
To support today's industry knowledge workers in pursuing continuing education and scientific
knowledge, an alternative path to the current scientific publication model is required. It should
enable more seamless knowledge transfer and access to up-to-date, state-of-the-art reporting in the
form of commentaries, summaries, reviews, editorials and technology trend analyses that are branded
for authenticity and integrity and can be reliably acted upon.
4.2 ROLES AND ACTORS IN KNOWLEDGE TRANSFER
The main roles in the knowledge transfer process are creators, providers and consumers.
Knowledge creators, e.g. academia and research laboratories, are those that conduct research and
scientific developments and disseminate information, typically through conventional scientific
channels in the form of research and scientific papers.
Providers, e.g. conference organizers, journal and magazine publishers and scientific and
professional communities, provide the channels to disseminate the information. Emerging business
models on the Internet also introduce on-line services such as e-Prints, digital archives,
e-journals and e-learning sites as dissemination channels.
Intermediary providers, e.g. industry-specific professional organizations, associations and
interest groups, are actors looking to elevate the exchange of information and knowledge among
their members in a subject-specific community.
Consumers in the scientific publication model are mostly scientists, academics and research
professionals and, to a lesser extent, industry professionals.
4.3 CONSTRUCTION INDUSTRY - THE PILOT IMPLEMENTATION
The construction industry is in many respects different from other manufacturing industries.
The industry, predominantly comprised of small companies (97% with fewer than 20 employees),
operates in dynamic, temporary, multi-organizational project teams forming virtual organizations on
a project-to-project basis. The construction business process is also characterized by
one-of-a-kind products manufactured in unique geographical locations, which results in low
repetition in construction projects. Working together in short-term project teams and under
unfamiliar site conditions introduces a degree of uncertainty into all construction projects that
must be dealt with in a just-in-time fashion, requiring information sharing between project team
partners unparalleled in other industries.
The construction business process is information intensive. Obtaining and managing all the
information needed by the project team presents many challenges. This information is very
diverse in nature – it includes technical, organizational, business and financial sources, together
with health & safety, government regulations, national and international Standards, as well as the
detailed drawings and specifications specific to the facility that evolves through the conceptual,
design, construction, handover, operational and maintenance phases.
Architects, engineers and contractors normally work on project-based assignments. Because the
project life cycle extends over several project phases, information requirements vary quite
extensively. The common denominator, however, is that information search, retrieval and analysis
are conducted within a limited project time frame, which requires efficient and effective
just-in-time mechanisms to find, evaluate and access the information needed to support rapid
decision-making. This includes finding the right and most cost-effective technical solutions, often
based on innovative foundations, which need to be referenced and validated through scientific
literature.
Manufacturers of building components and software developers, on the other hand, normally work on
product-based assignments. The core difference is that a product-based development cycle normally
produces a prototype before the final product is released into production. The cyclic manner of
prototype development extends the learning process, whereas in project-oriented development you get
what you design, with little room for corrections along the way.
Designers constantly need to follow technological and scientific developments. For example,
innovations in concrete design have enabled new approaches to building design and the construction
of facilities, which have required designers and contractors to acquire new skills and knowledge.
Other learning processes in a competitive organisation include following the progress of
competitors and evaluating new product materials and production methods, many of which are reported
in scientific and research papers.
Scientific journals are oriented toward high specialisation and embody the corpus of knowledge in a
particular field. Journals have ceased to report community information and current awareness, which
has increasingly turned professionals toward industry magazines for market watch, state of the art
and community news. However, content in professional magazines does not target professionals alone,
but also a large group of advertisers who finance the publication costs; magazines are thus
targeting two market groups with different needs. Still, in the few useful articles they print in
each issue, magazines are more likely to report on one of the hot topics of the day, contrary to
what you might expect to find in a journal. The downside of many magazine articles is that they are
written in a language to suit the majority of the community. Scientific journal papers, on the
other hand, contain detailed, in-depth and complex discussions of the attributes leading up to the
research, the research methodology and the experimentation process. These may not be of particular
interest to professionals who are, in many cases, looking for the short condensed version or the
concrete conclusions in a language easily understood by the professional community. A delicate
balance must be preserved between attracting a larger audience of readers and meeting the
requirements for scientific accuracy, so that the merit of the research being reported is not lost
in oversimplification or generalisation and the results remain easily accessible to be validated
and exploited.
In the construction industry there exists a clear need for cross-discipline knowledge and
information sharing that mainstream media such as journals or professional magazines cannot address
properly.
4.4 EXTENDING THE PUBLISHING MODEL
The extension of the SciX digital publishing model to include industry communities will introduce
new actors and enable new business and value-creation models in the scientific publishing process.
The strategic objective is to provide industry readers with more efficient mechanisms for obtaining
context-specific information of interest and thereby achieve more efficient knowledge transfer from
the scientific community to construction industry practitioners.
[Figure 22 labels: Scientific Community (writing of scientific papers); SciX Digital Archive; Model
extension: Value Adding Publication (article editing, article distribution, article presentation);
re-structuring information and knowledge from scientific papers to suit industry; Industry (finding
new information).]
Figure 22: Extended digital publishing model
The principle of the model is to separate information creation, information distribution and
information presentation. This separation of concerns introduces a new knowledge worker in the
value chain, the content creator. This party prepares specialized content and predefined views that
require a high level of specialization and expertise. The content creator then brokers the content
to one or more content distributors or presenters, which will be the connection point for the end
user (e.g. construction information portals, building information centres, learning centres or
professional association web sites). The supported model aims at providing the greatest flexibility
in the operation of the value added publication services.
4.5 VALUE ADDED PUBLICATIONS
Value added publications developed under SciX are designed to provide industry practitioners with
views onto the SciX digital scientific archives that are properly translated, edited and structured
to be of added value to target audience groups, by bringing together the capabilities of
information technology and human expertise and creativity.
The main stakeholders in the operation of a SciX VAS are consumers, who digest the outputs from a
SciX VAP, and providers, who set up and operate a SciX VAP.
Consumers of a SciX VAP can be either electronic consumers (e.g. other SciX VAPs, web applications
or e-Portals) or human consumers (e.g. industry practitioners, university students etc.). The
electronic consumers are themselves providers, acting as intermediary value creation services that
take the output of one or more SciX VAPs as input and present it to other consumers. The added
value can be in presentation, localisation, specialisation or aggregation of content across
different disciplines.
The providers, i.e. the stakeholders involved in operating a SciX VAP service, can be subdivided
into two categories: those that operate free (moderated) services and those that operate commercial
(accredited) services. The first group may contain non-profit organisations providing services to
industry communities (e.g. professional organisations, special interest groups, industry
associations and centres of excellence) as well as academic institutions and research facilities.
These stakeholders are currently in the business of educating and disseminating information to
designated groups of industry practitioners through their web sites or information portals.
Normally they have some obligation to their members to disseminate information, stated either in
their bylaws or in their strategic policies. Such sites may be freely open or closed to members
only, but they are paid for from a common budget consisting of member fees and contributions and
not by the individual user of the information. The second group may contain stakeholders such as
building information centres, best practice sites and on-line training and learning sites. These
are normally for-profit organisations that package information and resell it to the industry; they
know the requirements of the industry, what information is needed, how it should be presented and
to which target group. These sites normally charge users registration fees or a price per use for
accessing information. Additionally, new actors are invited to join the publishing chain as
intermediary processors of SciX content that create, re-format, transform or aggregate content and
syndicate it to the above stakeholders, and that are capable of forming flexible business
relationships in the publishing process chain.
Value Added Publication (VAP) is a SciX application that demonstrates how SciX content may be used
in vertical markets or subject domains (Figure 23). The aim of a VAP application is to provide the
normal industry user with access to industry-specific articles that are created by VAP operators
(e.g. editors and publishers) based on and edited from SciX digital archive content.
[Figure 23 labels: SciX Digital Library Service – Service Integration; three SciX VAPs:
Engineering, Architecture, Construction IT.]
Figure 23: Moderated subject specific VAP applications that sit between the industry
practitioner and SciX Digital libraries
These articles, in the form of digests, reviews and summaries, are more suited to industry-specific
requirements and thereby extend the reader base of scientific publications. The main
characteristics of a SciX VAP application are outlined below:
• The operators of a VAP application are typically industry organizations, federations and
associations, commercial information providers or special interest groups.
• VAP moderated articles are targeted at industry practitioners, while content in SciX Digital
Libraries is more specific to the scientific and academic community.
VAP applications can cooperate in a peer-to-peer network to disseminate, aggregate or reuse content
created by one VAP in another VAP. The logical architecture of the VAP handles content creation and
content presentation as separate concerns. For example, a content author (an expert) who writes
about a specific domain of knowledge can make the content available to several VAP content
consumers for presentation. Similarly, a content consumer (e.g. an association) can aggregate and
re-publish domain-specific content from several independent VAP publishers to a single audience
group.
[Figure 24 labels: Architect Association; Architect Information Portal with a SciX VAP; SciX VAP
Architecture; SciX VAP Construction IT.]
Figure 24: SciX VAP content aggregation
Figure 24 illustrates an example of the VAP publishing model. The Architect Information Portal
(AIP) subscribes to content from two independent subject-specific services, the architecture
subject service and the construction informatics subject service. The content is aggregated by the
AIP’s own SciX VAP and made available to the portal software for publication to end users.
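The aggregation step in this example can be sketched as follows, assuming that each subject service
publishes an RSS 2.0 feed (RSS is the syndication format named in Figure 8). The feed contents, the
URLs and the function names are made up for illustration; the sketch merges two feeds, drops items
that appear in both, and orders the result newest first.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def read_items(rss_text, source):
    """Read the items of one RSS 2.0 feed into plain dictionaries."""
    channel = ET.fromstring(rss_text).find("channel")
    items = []
    for item in channel.findall("item"):
        items.append({
            "source": source,
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            # RSS 2.0 dates use the RFC 822 format, which email.utils can parse.
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def aggregate(feeds):
    """Merge several feeds (source name -> RSS text) into one list, newest first."""
    seen, merged = set(), []
    for source, rss_text in feeds.items():
        for item in read_items(rss_text, source):
            if item["link"] not in seen:  # keep the first copy of a shared article
                seen.add(item["link"])
                merged.append(item)
    merged.sort(key=lambda item: item["published"], reverse=True)
    return merged


# Two made-up subject feeds standing in for the services subscribed to by the portal.
ARCHITECTURE_FEED = """<rss version="2.0"><channel><title>Architecture</title>
<item><title>Digest A</title><link>http://example.org/a</link>
<pubDate>Mon, 01 Mar 2004 10:00:00 GMT</pubDate></item>
</channel></rss>"""

CONSTRUCTION_IT_FEED = """<rss version="2.0"><channel><title>Construction IT</title>
<item><title>Digest B</title><link>http://example.org/b</link>
<pubDate>Tue, 02 Mar 2004 09:00:00 GMT</pubDate></item>
<item><title>Digest A</title><link>http://example.org/a</link>
<pubDate>Mon, 01 Mar 2004 10:00:00 GMT</pubDate></item>
</channel></rss>"""

merged = aggregate({"architecture": ARCHITECTURE_FEED,
                    "construction-it": CONSTRUCTION_IT_FEED})
```

The merged list is what a portal would hand to its presentation layer; how the portal renders it is
deliberately outside the sketch, in line with the separation of content management and
presentation.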
4.6 VAP ARCHITECTURE OVERVIEW
An overview of the VAP architecture is shown in Figure 25. The SciX VAP architecture is designed to
promote flexible use case scenarios and a widely accessible system in a peer-to-peer arrangement
suitable for handling the varied needs of industry communities and end-user requirements.
Components of the VAP application are shown in coloured boxes (Figure 25), while external
applications are shown in white. The architecture is based on the principle of separation of
concerns: a) content creation and content presentation are handled by external applications and
b) content management and content delivery are handled by VAP applications. In other words, the VAP
application makes no assumptions about the tools used for creating a VAP article or the format in
which it is stored, nor about the method or technology by which it will be presented to the end
user, as demonstrated in Figure 25.
[Figure 25 labels: industry editor, publisher; client applications (MS-WORD, Open Office); Web
client; WORD DOC, HTML, XML; HTTP file upload; Value added publications: SciX API proxy, Content
Management System, Content Store, Syndication server, data; SciX Services: Repository service; XML
protocols over HTTP; Syndication client; HTTP / XML_RPC / RSS; HTTP / RSS & File Download; RSS
enabled Web-application, Web-portal or Web-server; industry user.]
Figure 25: The SciX VAP software
The main components of the VAP architecture are three types of applications, a Content
Management system, Syndication Server and Syndication Client and an interface proxy to the
SciX digital archives.
Scix Interface proxy (SIP). The SIP handles the interaction of a VAP application with the SciX
repository service. The repository service provides access to metadata and full text of scientific
papers in a Digital archive. The SIP uses the repository Dublin Core compatibility to provide
meta-data about scientific papers.
VAP editors use the SIP to search and browse the digital archives, to read and upload papers,
produce citations and bibliographic notes. More specifically the SIP provides functionality for
VAP editors to:
• Do keyword searches or build detailed searches for scientific publications based on
  numerous parameters
• Browse the digital archive (e.g. list of scientific publications, metadata etc.) by author,
  class and keyword
• Download scientific publication full text and metadata for building citations and
  bibliographic notes
• Bookmark repository scientific publications
VAP editors can select a digital archive from a pre-configured registry of available sites. The
functionality of the SIP has been implemented and is available in the CMS application, apart
from the browsing function, which will also draw on the Knowledge management service.
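The way Dublin Core metadata can be turned into a citation is sketched below. This is only an illustration: the record, its field values and the `format_citation` helper are hypothetical, and the actual SIP interface is not described in this report. Only the Dublin Core element namespace is standard.

```python
import xml.etree.ElementTree as ET

# Standard namespace of the Dublin Core Metadata Element Set.
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical record, as a repository service might return it.
RECORD = """
<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:creator>Author, A.</dc:creator>
  <dc:creator>Writer, B.</dc:creator>
  <dc:title>An example paper</dc:title>
  <dc:date>2003</dc:date>
  <dc:identifier>http://example.org/papers/1</dc:identifier>
</record>
"""


def format_citation(xml_text):
    """Build a one-line citation from a Dublin Core record (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    creators = [element.text for element in root.findall(DC + "creator")]
    title = root.findtext(DC + "title", default="(untitled)")
    date = root.findtext(DC + "date", default="n.d.")
    url = root.findtext(DC + "identifier", default="")
    return f"{' and '.join(creators)} ({date}). {title}. {url}".strip()


print(format_citation(RECORD))
```

Because the citation is built only from Dublin Core elements, the same helper would work against any archive in the registry that exposes Dublin Core compatible metadata.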
Content Management System (CMS). The CMS provides support for maintaining edited articles, for
collaborative authoring and versioning, and for the publication of articles to end-users.
The CMS provides two user interfaces (UI), one for authors and the other for managing the
CMS. The author UI is called “Workspace Manager” (WM). In the WM, authors manage
their private article repository, user access permissions and the publication of articles.
Similarly, the CMS manager UI manages central CMS resources, e.g. users, roles, access
privileges and the public repository that can be accessed by all users.
The CMS includes a logical repository structure consisting of nested collections and files,
similar to OS file systems; a locking mechanism that implements checkout/checkin functionality
for collaborative authoring; and centrally managed access control to namespaces, collections and
files. The VAP repository consists of three types of namespaces: an owner private namespace,
where users store their personal articles and documents; a project namespace, where sharable
articles are stored; and a public namespace, where syndicated content feeds and downloaded
public documents are stored.
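The checkout/checkin locking described above can be illustrated with a minimal sketch. The class and method names, and the choice of path prefixes for the three namespaces, are assumptions made for illustration; they are not taken from the SciX code base.

```python
class LockedError(Exception):
    """Raised when a file is checked out by another user."""


class Repository:
    """Toy repository: paths map to content, locks map paths to users."""

    # Assumed prefixes for the three namespace types named in the text.
    NAMESPACES = ("private", "project", "public")

    def __init__(self):
        self.files = {}
        self.locks = {}

    def _check_namespace(self, path):
        if path.split("/", 1)[0] not in self.NAMESPACES:
            raise ValueError(f"unknown namespace in {path!r}")

    def checkout(self, path, user):
        """Lock a file for editing and return its current content."""
        self._check_namespace(path)
        holder = self.locks.get(path)
        if holder not in (None, user):
            raise LockedError(f"{path} is checked out by {holder}")
        self.locks[path] = user
        return self.files.get(path, "")

    def checkin(self, path, user, content):
        """Store new content and release the lock."""
        if self.locks.get(path) != user:
            raise LockedError(f"{user} has not checked out {path}")
        self.files[path] = content
        del self.locks[path]


repo = Repository()
repo.checkout("project/concrete/article.xml", "anna")
try:
    repo.checkout("project/concrete/article.xml", "ben")
except LockedError as err:
    print(err)  # ben must wait until anna checks the article in
repo.checkin("project/concrete/article.xml", "anna", "<article/>")
```

The sketch leaves out versioning and access control, which the CMS manages centrally; it shows only how a lock keeps two authors from editing the same file at once.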
The basic CMS functionality has been implemented, but some features have yet to be exposed in
the user interface: some are available in a developer version, while others are only partially
finished.
Syndication Server. The Syndication server consists of a Manager Web application and an
HTTP XML-RPC server application. The Manager application handles the management of
content delivery. Content delivery is subscription based and automated by a syndication
schedule (e.g. when, where, what). The Syndication server constantly scans the CMS repository
for published updates and new articles. When such updates are available, the Syndication server
sends the designated client a notification of an available update in the form of an XML-RPC
request. Notifications of updates are formatted as RSS channel headlines. Upon receiving the
notification, the client downloads the published RSS channels containing the information about
the new and updated articles and optionally retrieves the actual full text articles for
publication to end-users or further processing.
Syndication Client. The Syndication Client is an HTTP server application in the P2P network that
listens for XML-RPC requests. When updates are available, the syndication server sends the
client a notification in an XML-RPC request. The client responds by calling an XML-RPC
method on the Syndication server to retrieve an RSS channel that contains information on the
updates. The syndication client can then, depending on the VAP configuration, store the RSS
channel in the CMS repository and/or store the RSS channel in a file where it can be picked up
by any RSS enabled application for presentation. Depending on preferences, the client can
optionally retrieve the actual articles advertised in the RSS channel from the Syndication server
by successive XML-RPC method calls and persist them in the CMS public repository namespace.
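The notify-then-fetch exchange between the two applications can be sketched with the XML-RPC modules of the Python standard library. The method names `notify_update` and `get_channel`, the channel identifier and the sample RSS text are hypothetical; the report does not list the actual XML-RPC methods, and the real applications are separate programs rather than two threads in one process.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical RSS channel published by the syndication server.
RSS_CHANNEL = """<rss version="2.0"><channel>
<title>VAP updates</title><link>http://example.org/vap</link>
<description>New and updated articles</description>
<item><title>Article 1</title><link>http://example.org/vap/1</link></item>
</channel></rss>"""


def start(server):
    """Serve XML-RPC requests on a background thread; return the URL."""
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    return f"http://{host}:{port}/"


def run_demo():
    received = []  # stands in for the CMS public namespace

    # Syndication server side: exposes the RSS channel for download.
    def get_channel(channel_id):
        return RSS_CHANNEL

    synd_server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    synd_server.register_function(get_channel, "get_channel")
    server_url = start(synd_server)

    # Syndication client side: listens for update notifications.
    def notify_update(channel_id):
        # Respond to the notification by calling back for the channel.
        with xmlrpc.client.ServerProxy(server_url) as proxy:
            received.append(proxy.get_channel(channel_id))
        return True

    synd_client = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    synd_client.register_function(notify_update, "notify_update")
    client_url = start(synd_client)

    # The server tells the subscribed client that an update is available.
    with xmlrpc.client.ServerProxy(client_url) as proxy:
        accepted = proxy.notify_update("vap-updates")

    for server in (synd_server, synd_client):
        server.shutdown()
        server.server_close()
    return accepted, received


if __name__ == "__main__":
    accepted, channels = run_demo()
    print(accepted, len(channels))
```

Note that both peers act as HTTP servers in this exchange: the client must be reachable for notifications, and the server must be reachable for the download that follows.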
4.7 COLLABORATION AMONG THE VAP APPLICATIONS
The SciX VAP framework is based on a syndication model. SciX VAPs exist in a peer-to-peer
network where the content delivery mechanism allows communication of content between
independent SciX VAP applications and publication applications such as Web servers. A VAP
application may contain a syndication server for delivering content, a syndication client for
receiving and storing syndicated content locally, or both.
The server and client applications can cooperate in a side-by-side configuration (co-exist) on a
single machine or as independent applications on separate machines. The server and client can
schedule automatic updates of content over the network between any VAP peers. Scheduling is
controlled by a contract that exists between a server and a client. The contract consists of a
subscription, which describes the business relationship between the two, the terms of the
contract, and a catalogue, which describes the content being subscribed to. The method used to
deliver content updates and full content between the independent VAPs is based on the RSS
(“Rich Site Summary”) standard. RSS has gained wide acceptance among Internet developers as the
standard way to publish headline news over the Internet.
Both the syndication server and the client share the same data repository as the CMS, which must
therefore also be present for the syndication process to operate smoothly.
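A contract of this kind could be represented as a small data structure. The sketch below is an assumption about its shape: the field names, the use of a delivery interval as the only contract term, and the peer and channel names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Subscription:
    """The business relationship: who delivers to whom."""
    server: str
    client: str


@dataclass
class Contract:
    subscription: Subscription
    interval: timedelta  # assumed term: how often content is delivered
    catalogue: list = field(default_factory=list)  # channels subscribed to

    def is_due(self, last_delivery, now):
        """True when the schedule calls for another delivery."""
        return now - last_delivery >= self.interval

    def covers(self, channel):
        """True when the channel is part of the subscribed content."""
        return channel in self.catalogue


contract = Contract(
    Subscription(server="vap-a", client="vap-b"),
    interval=timedelta(hours=24),
    catalogue=["concrete-news"],
)
last = datetime(2004, 3, 1, 8, 0)
print(contract.is_due(last, datetime(2004, 3, 2, 8, 0)))
print(contract.covers("concrete-news"))
```

A syndication server holding such contracts would notify a client only when `is_due` holds and the updated channel is covered by the catalogue.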
Since the content contained in the VAP is fundamentally document oriented (e.g. Word
documents, HTML and XML documents), it is presented to publication services without any
consideration of how it is eventually displayed to the end-users. Any web application capable of
accepting RSS news feeds can publish content from any VAP and reformat the retrieved
content, based on the metadata supplied in the RSS headline, to comply with the visual look
and feel of the web application front end.
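How an RSS-enabled web application might reformat a received channel for its own front end is sketched below. The feed content, the `render_headlines` helper and the CSS class name are hypothetical; the sketch assumes an RSS 2.0 style channel with `item`, `title` and `link` elements.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical RSS channel as stored by the syndication client.
RSS = """<rss version="2.0"><channel>
<title>VAP updates</title>
<link>http://example.org/vap</link>
<description>New and updated articles</description>
<item><title>Mixing &amp; curing</title><link>http://example.org/vap/1</link></item>
<item><title>Rheology basics</title><link>http://example.org/vap/2</link></item>
</channel></rss>"""


def render_headlines(rss_text, css_class="headlines"):
    """Turn RSS items into an HTML list styled by the host application."""
    channel = ET.fromstring(rss_text).find("channel")
    rows = []
    for item in channel.findall("item"):
        title = escape(item.findtext("title", default=""))
        link = escape(item.findtext("link", default=""))
        rows.append(f'<li><a href="{link}">{title}</a></li>')
    return f'<ul class="{css_class}">' + "".join(rows) + "</ul>"


print(render_headlines(RSS))
```

The VAP side supplies only the headline metadata; the look and feel is decided entirely by the receiving application, here through the `css_class` parameter.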
[Figure: two systems connected over the Internet. System 1: a Content Management System (manage
content) with its CMS repository, and a Syndication Server that discovers updates and 1) notifies
the update, 2) sends the RSS channel and articles. System 2: a Syndication Client that 1) accepts
the update, 2) receives the RSS channel and articles; it stores the RSS channel and articles in
the Content Management System (CMS repository) and stores the RSS channel file in the OS
filesystem, where an RSS enabled application reads it.]
Figure 26: An example of the cooperation of two VAP applications on separate systems.
4.8 EXAMPLE OF A VAP END USER APPLICATION - IBRI RHEOCENTER PORTAL
The example below shows an end-user RSS enabled application. RSS files are downloaded
directly into the portal file directory by the syndication client, and the portal software
renders and displays the XML RSS channel automatically.
Figure 27: IBRI Rheocenter portal homepage.
5. SCIX AFTER SCIX
This section places the work done in SciX in the broader perspective of the partners' earlier
efforts and their plans for life after SciX.
The partners have been active in the field of electronic publishing since the mid 1990s.
Bo-Christer Bjoerk and Ziga Turk have been the editor and one of the co-editors of the Electronic
Journal of Information Technology in Construction (ITcon). The average time between submission
of a paper and its publication has been below 6 months. Each published paper had an average of
about 1000 readers that viewed the abstract and about 1400 that downloaded the full text.
Since 1999, Bob Martens and Ziga Turk have been managing CUMINCAD - the Cumulative
Index of CAD - the largest freely available database of papers related to computer aided
architectural design, particularly to education in this area. Thousands of papers have been
published at conferences organised by regional organisations of CAAD teachers (ECAADE in
Europe, ACADIA in North America, Sigradi in South America and CAADRIA in Australasia).
The proceedings were rarely published by a professional publisher; the texts were therefore
neither entered into commercial indexes nor sold commercially. The full texts were not broadly
available; only conference attendees had copies. On the other hand, the professional
organisations retained the copyright to this work and could therefore allow its
publication and archiving in CUMINCAD. In this way the work is available on the net and
rescued from oblivion. At the time of writing, CUMINCAD includes 3831 papers with abstracts;
883 of these papers are available in full text as well.
A similar collection (with a couple of hundred papers) was created with the EGSEAAI community.
The CIB-W78 conference in 1996 in Slovenia was the first to use the Internet as the only medium
to support the workflow of conference organisation. The W78 conference in 2000 in Iceland made
the workflow almost 100% web based.
5.1 PERSPECTIVES BY THE PARTNERS
5.1.1 LJU
The main value gained by LJU from its participation in the project will be the enhancement of
its research presence and the prestige gained as a publisher and digital librarian of
material related to construction and architectural informatics. For a university institution,
the creation of institutional, national and discipline repositories with services as easy to
use, quick to set up and simple to maintain as the SciX platform will remain a viable and most
important non-core task of the research group there. LJU will use its leverage and experience to
broaden the impact of open access publishing in Slovenia and neighbouring countries, as well as
in the scientific domains where it excels.
5.1.2 SHH
The Information Systems Science unit of SHH both teaches and conducts research on how
the Internet affects business processes, and on the modelling of the scientific publishing
process. To SHH, a useful output of the project will therefore be a database containing
information about active online open access refereed journals. This database will be gathered
and updated at the end of the project.
In addition, the SciX platform would be suitable for implementing a validation process
for open access refereed journals. The validation process would cover the peer-review
procedure, editorial practice and other well-known qualitative measures, thus providing a
certificate of high scientific quality for a journal. A quality label would give readers and
authors an assurance of quality in the same way that the name of a renowned commercial
publisher or journal does.
5.1.3 TUW
The SciX demonstrator repositories have shown that digital libraries related to specific
academic areas are used by the corresponding scientific community. For TUW, the SciX
repository will support dissemination and, together with various other features, will allow
real access to the content of dissertations within specified areas from many locations.
Setting up a SciX digital library will require minimal time and effort. Adaptation to the
needs of individual associations will be possible through its modular architecture. Scientific
associations and their individual members are encouraged to contribute their scientific
information to a repository, so that typical grey literature, for example conference
proceedings, would become accessible. As soon as a critical mass has been gathered, the
interest in using it will grow. Self-organization can be regarded as a stimulus to fill the
repository with a minimal investment of time.
5.1.4 USAL
The IS Research Institute of Salford University, which currently holds a 6* rating as a
Research Institute for IS research, will enhance its research presence through its
participation in the project.
In addition, an institutional repository can easily be built and maintained with the SciX
platform developed during the project.
The technology developed will be extended to other University units, such as SCRI, BuHu and the
Faculty of B&I, and to the whole University.
5.1.5 INDRA
Indra had not previously been involved in the field of digital publishing. Being an
IT service company, its main interest in the work done in the project has been to obtain the
developed software and include it in its portfolio of offerings. Because the platform
has been built as open source software, Indra, like any other partner in the project, will not
be able to license the product, but it will be able to offer customization of the platform as a
professional service.
Additionally, from a more internal point of view, another envisioned benefit will be to provide
the company with a new tool for managing the knowledge dispersed among its corporate markets
and competence centers.
Finally, installing the platform at Indra in order to offer it to clients as an
Application Service Provider will be another way to exploit it and to generate significant
revenue.
5.1.6 IBRI
Over the years, clients from the USA, Europe and Asia have contracted IBRI for various research
work, leading to the establishment of a Centre of Excellence at IBRI, the IBRI Rheocentre. The
SciX pilot wrapper service developed in the project will become part of the web portal to be set
up for the Centre of Excellence. The strategic objectives of the wrapper service will be to:
• Maintain state of the art competences through scanning and collecting knowledge from
  SciX hosted papers in the form of short lists, extracts and summaries.
• Make this knowledge easily available to its employees and industry clients.
• Utilise and prepare e-Learning content for employees.
5.1.7 FGGI
• Through its participation in the project, the FGG Institute has further developed its code
  base for the rapid creation of Web services. It is reusing that code in the commercial efforts
  that it is undertaking both in Slovenia and world-wide.
• Even though it will give away the code for setting up repositories of scientific papers for
  free, so that researchers in other fields can easily set up services similar to SciX, previous
  experience with other services and tools has shown that this typically generates
  consulting revenue. Although the code may be free, the organisation setting it up
  needs help with the installation and may also be interested in extensions and
  customization.
Having participated in such a high-profile project, FGGI expects to gain technical and
commercial contacts as well as promotional value for the company. It has, for example,
secured a job to provide Web services for the most important Slovenian engineering association,
which positions it excellently for contacts with the engineering community in Slovenia, the
community in which its core business lies.
5.2 CONCLUSIONS
The partners in SciX were already involved in limited open digital publishing activities before
the SciX project; the ITcon journal, after all, will enter its 10th volume next year. What had
been done on a shoestring budget, at marginal cost and through the voluntary afternoon and
evening effort of individuals became a full time job for two years. In addition to contributing
new findings, increasing the content tenfold and demonstrating a Web services approach to
open access publishing, the SciX project provided an opportunity to rewrite the software and
bring it to a level that scales with the growing demand for open access publishing and that will
allow it to be, again, a sustainable and viable voluntary effort for the academic partners. The
commercial partners, INDRA, IBRI and FGGI, have gained expertise and software that allow them to
become content providers in the fields of technical publication and information management.
6. CONCLUSIONS
One of the first “business” process re-engineers, Niccolò Machiavelli, once explained:
"It must be remembered that there is nothing more difficult to plan, more doubtful of success, nor
more dangerous to manage than the creation of a new system. For the initiator has the enmity of
all who would profit by the preservation of the old institutions and merely lukewarm defenders in
those who would gain by the new one."
All key players in the scientific publishing process (publishers, librarians, bibliometrists,
funding bodies), with the possible exception of the scientists themselves, have a role to play
in the current system and therefore a vested interest in keeping it unchanged. They would profit
by the preservation of the old system.
The lukewarm supporters are the individual scientists, who are weak, and the professional
organizations, which are not all that weak. Empowered by the technology that SciX demonstrated
and encouraged by the advantages that the SciX analysis describes, they will be able to take
small, local steps towards a change that will have global consequences and will lead to a
knowledge community where knowledge moves freely, without borders and barriers.