Case study: Publishing a data paper - Research Data Service
Stephen Gray, PhD Archaeology
Copyright © 2015 University of Bristol
This case study outlines the basic process and motivations for publishing a data paper within
a data journal.
What is a data paper and what is a data journal?
A data paper is essentially a one- or two-page description of a publicly available dataset
which spells out its re-use potential. In addition to the dataset itself receiving a digital object
identifier (DOI), an associated data paper receives a DOI of its own. This means either can
be cited in an academic context.
Data journals publish data papers. The majority of data journals (including Internet
Archaeology) are actually ‘mixed’ (i.e. they publish both ‘traditional’ and data papers) but
some dedicated data journals are beginning to emerge (Nature’s Scientific Data1, for
My research involves conducting low-level aerial surveys of archaeological sites using an
unmanned aerial vehicle (UAV). My data consists of sets of vertical photographs and
associated metadata. The metadata allows each image to be georeferenced (located to
precise spatial coordinates) and explains how and when each image was captured.
This data can be used by myself or others to create orthorectified maps, 3D models of
structures or digital elevation models (DEMs) of landscape topography.
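As an illustration of what locating something to spatial coordinates involves, the sketch below converts a site reference of the form that appears with the dataset titles in this case study (for example ST56557330) into eastings and northings in metres. It assumes these codes are Ordnance Survey National Grid references; the function name is hypothetical, and the case study does not state which coordinate scheme the per-image metadata uses.

```python
from __future__ import annotations


def grid_ref_to_metres(grid_ref: str) -> tuple[int, int]:
    """Return (easting, northing) in metres for a reference like 'ST56557330'.

    The result is the south-west corner of the square the reference names.
    """
    ref = grid_ref.replace(" ", "").upper()
    letters, digits = ref[:2], ref[2:]
    if (not ref.isascii() or len(letters) != 2
            or not letters.isalpha() or "I" in letters):
        raise ValueError(f"unrecognised grid letters in {grid_ref!r}")
    if len(digits) % 2 or len(digits) > 10 or (digits and not digits.isdigit()):
        raise ValueError(f"unrecognised grid digits in {grid_ref!r}")

    # Position of each letter in the 25-letter alphabet that omits I.
    l1 = ord(letters[0]) - ord("A")
    l2 = ord(letters[1]) - ord("A")
    if l1 > 7:
        l1 -= 1
    if l2 > 7:
        l2 -= 1

    # The first letter picks a 500 km square, the second a 100 km square in it.
    e100km = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100km = (19 - (l1 // 5) * 5) - (l2 // 5)

    half = len(digits) // 2
    scale = 10 ** (5 - half)  # metres represented by one unit of the digits
    easting = e100km * 100_000 + (int(digits[:half]) * scale if half else 0)
    northing = n100km * 100_000 + (int(digits[half:]) * scale if half else 0)
    return easting, northing


print(grid_ref_to_metres("ST56557330"))  # (356550, 173300)
print(grid_ref_to_metres("ST77509320"))  # (377500, 193200)
```

An eight-figure reference like these resolves to a 10 m square, so it identifies a site rather than an individual image. The sketch does not check that the lettered square falls within the area the grid actually covers.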
Two datasets2 from two separate surveys have been submitted to FigShare3, a commercial
but free-to-use data repository. One of these, the Blackquarries Hill Long Barrow survey
dataset (see Fig 1 for the FigShare dataset record), was also written up as a data paper and
the paper was submitted to the journal Internet Archaeology4.
Motivations for publishing a data paper
Publishing a data paper is a similar process to publishing an academic journal article; data is
peer reviewed, amended and made publicly available under a citable unique identifier. This
process is familiar to most researchers and so contributes to the academic legitimacy of data
as a valuable research output.
It may be that data papers and the ‘data journals’ which publish them are a temporary
phenomenon, useful only until a new and truly data-centric form of review and citation
evolves. However, in my opinion publishing a data paper is one of the best ways we have at
the moment to demonstrate that data has value.
Because the low-level aerial survey generates so much data, typically far more than is
required for my immediate purposes, publishing data for open re-use is an attractive
prospect for me.
UAV Aerial Survey - Clifton Camp (ST56557330),
UAV Aerial Survey - Blackquarries Hill Long Barrow (ST77509320),
In order to make this possible I ask landowners for permission to publish the data. I also
favour non-proprietary formats, wherever possible, to avoid the need for data re-users to
have expensive software. Lastly, I ensure that the metadata I create is sufficient to support
the needs of data re-users and not only my own immediate needs.
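A minimal sketch of what per-image metadata in a non-proprietary format could look like is given below, using plain CSV written with the Python standard library. Every field name and every value here is a hypothetical placeholder chosen for illustration; none of it is taken from the published datasets.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class ImageRecord:
    # Hypothetical fields: what a re-user might need to georeference an image
    # and to understand how and when it was captured.
    filename: str
    captured_at: str   # ISO 8601 timestamp
    easting_m: float
    northing_m: float
    altitude_m: float
    camera: str


def write_metadata_csv(records, handle):
    """Write one CSV row per image, with a header row naming each field."""
    names = [f.name for f in fields(ImageRecord)]
    writer = csv.DictWriter(handle, fieldnames=names)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


# Placeholder values only, not real survey data.
records = [
    ImageRecord("image_0001.jpg", "2000-01-01T12:00:00", 0.0, 0.0, 0.0, "example camera"),
]
buffer = io.StringIO()
write_metadata_csv(records, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

CSV can be opened in any spreadsheet or text editor, which is the point of preferring it over a format tied to one software package.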
If a survey dataset can be made available for reuse, I will submit it to a data repository that
can provide ongoing access.
Fig 1, FigShare dataset record for Blackquarries Hill Long Barrow survey data
The data paper publication process
Publishing a data paper is very similar to publishing any peer-reviewed academic paper.
Internet Archaeology ask for the information listed below. The word limit is 2000 words.
Authorship (including contact details and ORCID identifier)
DOI of deposited dataset
Content of the dataset
Background to the dataset – include context, main aims/objectives of the dataset
(and/or project) and general data methodology
Summary description (if required e.g. if dataset is excavation data)
Scope (incl. period terms or dates/geographical context. You should also note any
data 'gaps'/what is not covered)
Future work and Re-use Potential of the dataset e.g. avenues of possible further
analysis, integration with other datasets etc.
Details of how the dataset relates to other publications/archives (including physical
Acknowledgements and Funding Statement
Internet Archaeology also provide helpful guidance (http://intarch.ac.uk/authors/datapapers.html) on the submission process.
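The requirements listed above can be turned into a simple pre-submission check. The sketch below is an assumption-laden illustration, not a tool provided by the journal: the section names are abbreviated from the list, the list is treated as flat, "Summary description" is treated as optional because it is marked "if required", and words are counted by splitting on whitespace, which may not match how the journal counts them.

```python
from __future__ import annotations

# Section names abbreviated from the list above (hypothetical labels).
REQUIRED_SECTIONS = [
    "Authorship",
    "DOI of deposited dataset",
    "Content of the dataset",
    "Background to the dataset",
    "Scope",
    "Future work and re-use potential",
    "Relationship to other publications/archives",
    "Acknowledgements and funding statement",
]
OPTIONAL_SECTIONS = ["Summary description"]
WORD_LIMIT = 2000


def check_draft(sections: dict[str, str]) -> dict:
    """Report required sections that are missing or empty, and the word count."""
    missing = [
        name for name in REQUIRED_SECTIONS
        if not sections.get(name, "").strip()
    ]
    word_count = sum(len(text.split()) for text in sections.values())
    return {
        "missing": missing,
        "word_count": word_count,
        "over_limit": word_count > WORD_LIMIT,
    }


draft = {"Authorship": "A. Author", "Scope": "one two three"}
report = check_draft(draft)
print(report["missing"])
print(report["word_count"], report["over_limit"])
```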
The peer review process
A data paper differs from a traditional journal article in that it explicitly credits the reviewer
and makes their comments, which include potential areas of future research, available to
all. Reviewers are also looking for factual accuracy (or an identified degree of error),
reusability of formats and clearly documented and reproducible data collection and
Corrections may be requested following peer review. If changes relate to the data paper,
they can be simply carried out. However, if they relate to the dataset itself, authors are
encouraged to speak with the host repository. In most cases it will not be possible to
remove already-published datasets, in case they have already been used and cited. Instead,
an amended dataset will be deposited with the repository and the metadata from the old
dataset will point to the new dataset as the current version.
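The idea of an old record pointing to its replacement can be sketched as a small lookup. This is a simplified model under stated assumptions: the `superseded_by` field and the identifiers are hypothetical, and a real repository will have its own way of recording versions.

```python
def resolve_current(records: dict, identifier: str) -> str:
    """Follow 'superseded_by' links until reaching the current version."""
    seen = set()
    current = identifier
    while True:
        if current in seen:
            raise ValueError(f"version links form a loop at {current!r}")
        seen.add(current)
        newer = records[current].get("superseded_by")
        if newer is None:
            return current
        current = newer


# Hypothetical records: the old dataset stays online but points to the new one.
records = {
    "dataset-v1": {"title": "Survey data", "superseded_by": "dataset-v2"},
    "dataset-v2": {"title": "Survey data (amended)", "superseded_by": None},
}

print(resolve_current(records, "dataset-v1"))  # dataset-v2
```

Keeping the old record in place means citations to it still resolve, while anyone following the link arrives at the amended data.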
In my case only one piece of factual information had to be changed (the National Record of
the Historic Environment (NRHE) monument number) and this was within the data paper.
Common to all ‘gold’ Open Access publications, an article processing charge is payable; for
Internet Archaeology this is £100. If research is RCUK-funded this cost is covered by the
The final version of the data paper was published in March 2015 (see Fig 2).
Fig 2, extract from the final data paper
Reflections on the process
Although I’ve shared my research data before, the data paper which accompanies the
Blackquarries Hill data is the first I’ve created. I created it because I feel strongly that this
particular site deserves to be better known and, by publishing the data paper in Internet
Archaeology, I’m encouraging archaeologists and heritage managers to carry out further
work on the barrow.
To some degree this whole exercise has been an experiment in finding an effective way to
communicate my research. It’s too early to tell but I’ll be very interested to see if publishing a
data paper leads to more downloads of the associated dataset or puts me in touch with
potential collaborators. If so, I will certainly publish more data papers.
Leonardo Candela’s Data Journals: a Survey (2014) is a comprehensive list of data journals
Re3Data (http://service.re3data.org) is a list of research data repositories, organised by
M.A. Parsons and P.A. Fox question the definition of data ‘publication’ in Is Data Publication
the Right Metaphor? (https://www.jstage.jst.go.jp/article/dsj/12/0/12_WDS-042/_article).