Pre-OKCon Open Science and Social Science Workshop

July 1, 2011 in OKCon

On Wednesday, a diverse crowd of scientists, economists, coders and even a new media artist gathered at Kalkscheune in Berlin for an Open Science Workshop organised by Rufus Pollock and Francois Grey. The aim was to discuss how open science, and particularly citizen-based crowd-sourcing, can provide valuable sources of open scientific data.

You can view a full list of participants and ideas on the Etherpad, or view a more coherent and permanent record on the workshop Wiki as items are migrated over!

The session kicked off by introducing some successful citizen science projects:

  • – Distributed computing using volunteer CPU time to run simulations of malaria epidemiology under different parameters and incorporating different control measures.

  • Epicollect – A web application for generating data collection forms for mobile phone platforms and for submitting the collected data to project websites.
  • QuakeCatcher – brings seismology into homes and schools using internal accelerometers in laptops or external USB accelerometers in desktops to detect earthquakes. Aims to generate the world’s largest distributed network of seismometers.

Following this burst of inspiration, the group generated a total of nine hack ideas around the topic of generating quality open science/social science data, and took some of these forward in four groups:

Data Digitiser

A tool for volunteer-sourced transcription of data tables from scanned books/papers where OCR and machine automation are not an option. Suggested applications: Brazilian census data, and regression tables from economics articles, to allow comparisons across multiple articles examining the same variables.

How? See the dedicated Etherpad for a full rundown of what was suggested and achieved.

Output: Working demo of user interface! Data Digitiser displays an image opposite a Google spreadsheet ready for transcription, which can be supplemented by a separate metadata form.

Source code is available via GitHub.

BOINC on Phones (Android)

How to take advantage of the computing power of smartphones to run applications such as

How? Port malaria code to Google Android, probably via Android NDK to compile native C/C++ code and then write an Android Java “stub” to do the computation on Android devices (i.e. mobile phones, tablets, etc). On a grander scale we will also need BOINC to run on Android to launch the Android

Output: On the way! Far too much to do in one day.

Data Cleaning and Quality Control and Assurance

How to integrate data cleaning and general QA/QC steps with open databases and volunteer networks.

How? That was the question the group set out to answer, and they concluded that they’d like to see a web-based spreadsheet quality assurance tracker. Ideally this would take the form of an overlaid comment/issue flagging system, with flagged items then checked by the data provider.

Output: Lots of interesting discussions and ideas – a summary of these can be found here.
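
To make the overlaid comment/issue flagging idea a little more concrete, here is a minimal Python sketch. It is not the group's design: the names `Issue`, `QualityTracker`, `flag`, `open_issues` and `resolve` are all hypothetical, and the "spreadsheet" is assumed to be a simple list of rows held in memory rather than a real web-based sheet. The point it illustrates is that volunteers' flags live in a separate layer on top of the data, and the data itself only changes when the provider resolves a flag.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Issue:
    """One volunteer-raised flag attached to a single cell (hypothetical model)."""
    row: int
    column: str
    comment: str
    reporter: str
    resolved: bool = False


@dataclass
class QualityTracker:
    """Overlay of issues on top of a table; flags never alter the data directly."""
    rows: List[Dict[str, str]]
    issues: List[Issue] = field(default_factory=list)

    def flag(self, row: int, column: str, comment: str, reporter: str) -> Issue:
        # Reject flags that point at a cell that does not exist.
        if not 0 <= row < len(self.rows):
            raise IndexError(f"row {row} is out of range")
        if column not in self.rows[row]:
            raise KeyError(column)
        issue = Issue(row, column, comment, reporter)
        self.issues.append(issue)
        return issue

    def open_issues(self) -> List[Issue]:
        # The data provider's to-do list: everything not yet checked.
        return [issue for issue in self.issues if not issue.resolved]

    def resolve(self, issue: Issue, corrected_value: Optional[str] = None) -> None:
        # Only the provider's resolution step may change the underlying cell.
        if corrected_value is not None:
            self.rows[issue.row][issue.column] = corrected_value
        issue.resolved = True


# Example: a volunteer flags a suspicious value, then the provider corrects it.
tracker = QualityTracker(rows=[
    {"region": "A", "population": "1200"},
    {"region": "B", "population": "12OO"},
])
flagged = tracker.flag(1, "population", "Letter O instead of zero?", "volunteer1")
tracker.resolve(flagged, corrected_value="1200")
```

Keeping the issues in their own list, rather than editing cells in place, means the original data stays untouched until the provider has reviewed each flag, which is the review step the group asked for.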


This group discussed the idea of mapping images/photos published on the web using contextual information (detective work) and pattern recognition from satellite images. This approach is already used in humanitarian crises for damage assessment and to help plan investment in post-conflict reconstruction, and a version for the Tobacco Free Initiative was planned.

Output: See their discussion and suggestions at the bottom of this Etherpad.

Follow on and getting involved

There is now a new mailing list for people interested in developing apps/tools/datasets etc for open science/open data in science/citizen cyber science.

Sign up here if you’re keen!

