Building an archaeological project repository I: Open Science means Open Data
This is a guest post by Anthony Beck, Honorary
fellow, and Dave Harrison, Research fellow, at the University of Leeds School of
Computing.
In 2010 we authored a series of blog posts for the Open Knowledge Foundation subtitled ‘How open approaches can empower archaeologists’. These discussed the DART project, which is on the cusp of concluding.
The DART project collected
large amounts of data, and as part of the project, we created a
purpose-built data
repository to catalogue this and make it available, using CKAN, the Open Knowledge Foundation’s
open-source data catalogue and repository. Here we revisit the need
for Open Science in the light of the DART project. In a subsequent
post we’ll look at why, with so many repositories of different kinds,
we felt that to do Open Science successfully we needed to roll our
own.
Open data can change science
Open inquiry is at the heart of the scientific enterprise. Publication
of scientific theories – and of the experimental and observational data
on which they are based – permits others to identify errors, to support,
reject or refine theories and to reuse data for further understanding
and knowledge. Science’s powerful capacity for self-correction comes from
this openness to scrutiny and challenge. (The Royal Society,
Science as an open enterprise, 2012)
The Royal Society’s report Science as an open enterprise
identifies how 21st century communication technologies are changing
the ways in which scientists conduct, and society engages with,
science. The report recognises that ‘open’ enquiry is pivotal for the
success of science, both in research and in society. This goes beyond
open access to publications (Open Access), to include access
to data and other research outputs (Open Data), and the
process by which data is turned into knowledge (Open
Science).
The underlying rationale of Open Data is this: unfettered access to
large amounts of ‘raw’ data enables patterns of re-use and knowledge
creation that were previously impossible. The creation of a rich,
openly accessible corpus of data introduces a range of data-mining and
visualisation challenges, which require multi-disciplinary
collaboration across domains (within and outside academia) if their
potential is to be realised. An important step towards this is
creating frameworks which allow data to be effectively accessed and
re-used. The prize for succeeding is improved knowledge-led policy
and practice that transforms communities, practitioners, science and
society.
The need for such frameworks will be most acute in disciplines with
large amounts of data, a range of approaches to analysing the data,
and broad cross-disciplinary links – so it was inevitable that they
would prove important for our project, Detection of Archaeological
residues using Remote sensing Techniques (DART).
DART: data-driven archaeology
DART aims to develop analytical methods to differentiate
archaeological sediments from non-archaeological strata, on the basis
of remotely detected phenomena (e.g. resistivity, apparent dielectric
permittivity, crop growth, thermal properties etc). The data collected
by DART is of relevance to a broad range of different communities.
Open Science was adopted with two aims:
- to maximise the research impact by placing the project data and
the processing algorithms into the public sphere;
- to build a community of researchers and other end-users around the
data so that collaboration, and by extension research value, can be
enhanced.
‘Contrast dynamics’, the type of data provided by DART, is critical
for policy makers and curatorial managers, who need to assess both the
state and the rate of change in heritage landscapes. This need is
wrapped up in national commitments to the European Landscape
Convention (ELC). Making the best use of the data, however, depends on
openly accessible dynamic monitoring, along similar lines to that
proposed by the European Space Agency for the Global Monitoring for
Environment and Security (GMES) satellite constellations. What is
required is an accessible framework which allows all this data to be
integrated, processed and modelled in a timely manner. The approaches
developed in DART to improve the understanding and enhance the
modelling of heritage contrast detection dynamics feed directly into
this long-term agenda.
Cross-disciplinary research and Open Science
Such approaches cannot be undertaken within a single domain of
expertise. This vision can only be built by openly collaborating with
other scientists and building on shared data, tools and techniques.
Important developments will come from the GMES community, particularly
from precision agriculture, soil science, and well-documented data
processing frameworks and services. At the same time, the information
collected by projects like DART can be re-used easily by others. For
example, DART data has been exploited by the Royal Agricultural
University (RAU) for use in such applications as carbon sequestration
in hedges, soil management, soil compaction and community mapping.
Such openness also promotes collaboration: DART partners have been
involved in a number of international grant proposals and have
developed a longer term partnership with the RAU.
Open Science advocates opening access to data, and other scientific
objects, at a much earlier stage in the research life-cycle than
traditional approaches. Open Scientists argue that research synergy
and serendipity occur through openly collaborating with other
researchers (more eyes/minds looking at the problem). Of great
importance is the fact that the scientific process itself is
transparent and can be peer reviewed: as a result of exposing data and
the processes by which these data are transformed into information,
other researchers can replicate and validate the techniques. As a
consequence, we believe that collaboration is enhanced and the
boundaries between public, professional and amateur are blurred.
Challenges ahead for Open Science
Whilst DART has not achieved all its aims, it has made significant
progress and has identified some barriers to achieving such open
approaches. Key to this is the articulation of issues surrounding
data-access (accreditation), licensing and ethics. Who gets access to data, when, and under what conditions, is a serious ethical issue for the heritage sector. These are obviously issues that need co-ordination through organisations like Research Councils UK with
cross-cutting input from domain groups. The Arts and Humanities
community produce data and outputs with pervasive social and ethical
impact, and it is clearly important that they have a voice in these
debates.