A recap of some of the activity at OKFestival 2012

September 27, 2012


The following post is by Ross Mounce, one of the two OKFN Panton Fellows.

Wow! Where to begin… In this post I shall attempt to summarise some of OKFestival 2012, which was held in Helsinki just last week, from the 17th to the 22nd of September.

Some Background:

I had been to the Open Knowledge Conference last year (in Berlin), where I gave an invited talk on Open Palaeontology and met lots of brilliant people in the Open Science community like Bjoern Brembs, Cameron Neylon & Peter Murray-Rust. But this year the event was even bigger, and even better – teaming up with the annual Open Government Data Camp for a mega-event.

The Event Itself:

The Aalto University buildings of the venue were wonderfully modern and well equipped for this event (inc. great WiFi, which was essential for such a digital event as this). I got to Helsinki with our other Panton Fellow – Sophie Kershaw – on the Tuesday, and caught the tail end of the Data Journalism session that day, including an excellent, inspirational talk that, amongst other things, detailed the amazing knowledge and insight gained from tracking the movement of ships with open data. I couldn’t help thinking that academics could learn a lot from these open data visualization experts (myself included!). This is one of the huge benefits of the conference – bringing together a melange of humanities scholars, scientists, economists, governments, the World Bank (there were at least 20 representatives from this organisation here!), corporates (e.g. IBM), journalists, entrepreneurs, and designers, all connected by a shared interest in openness.

An interesting example of Shippr data – ships turn off their beacons once they pass the point for fear of pirates…

Wednesday – the Science & Academia session

I really liked the way that the conference had an introductory session to the day’s parallel events in the morning from 10am – 11am. If one was unsure of which stream to go to, these Morning Plenaries gave each topic stream a chance to pitch their events in a short slot to the awaiting audience. I thought this was very helpful given there were 13 separate topic streams at the conference!

I was involved in two sessions on this Science day. Firstly the Open Access discussion panel chaired by Peter Murray-Rust, the video for which is here with Tim Hubbard (Sanger Institute), Carlos Russel (World Bank), Tom Olijhoek & Mark MacGillivray (Open Access Index) and myself (University of Bath & OKFN Panton Fellow):

It’s a long video. We covered many topics including altmetrics, the lack of access for independent researchers and ivory-tower academics, the role of libraries, ‘illegally’ posting one’s own work up on the internet, incentives for OA and much more… with excellent contributions from the audience including Puneet Kishor from Creative Commons and Matt Todd from the Open Source Drug Discovery team amongst many others.

Then after this there was the research data session with contributions from Mark Wainwright on CKAN, Mark Hahnel on Figshare and Joss Winn of the Orbital project.

Finally we finished with the Panton Fellowships Session with talks from myself on content mining for phylogenetic tree data, Open Access licensing and the various costs of Gold Open Access options:

and Sophie Kershaw on her Open Science Training Initiative (OSTI) at the Oxford University DTC:

The day was then rounded off with a hugely inspirational talk from Matt Todd who had travelled all the way from Australia(!), summarising his Open Source Drug Discovery work in the main lecture theatre, followed by a lovely traditional Finnish meal & social mixer afterwards in Ravintola Lasipalatsi.

Fast forward to 04:30 to see the start of Matt Todd’s talk


Probably everyone’s highlight of the conference was Hans Rosling’s fantastic keynote presentation, which I urge you all to watch – it was brilliant, and it was thrilling to be there live in the audience.


If there’s one thing that impresses me most of all about OKFestival, it’s this: it’s not just about talking – they do things here too. Lots of ‘hacking’ sessions on Friday to create new tools and collate awesome new data. Most conferences are extremely boring in that it’s just talk after talk after talk. Things get done here, new collaborations are started, fresh links across disciplinary boundaries are made connecting journalism with academia, economic development with open architectural design, and other incredible trans-disciplinary mashups. It’s a joy to behold.

I’m really glad I came to OKFestival. As ever, I got a lot out of it.

Next year it’ll be in Switzerland.

Perhaps we’ll see you there?


cross-posted and modified from a previously posted version here

#OKFest Open Science and Culture Hackday – Project 2 Louhos

September 19, 2012

Louhos have generated a tool called Sorvi with the aim of making R-based statistical computational tools and methods, traditionally used by scientists, available to people wrangling all sorts of data sets, from government and finance to weather and more.

Sorvi combines these resources by providing a centralized collection of general-purpose open-source tools for data manipulation, analysis and visualization. The project currently focuses on Finnish open data sets, but has far wider applications.

Members of the Louhos team hard at work.

The hackday project focused on writing documentation that is easy for newcomers to follow, together with two use cases, one for a regular user and one for a developer, to illustrate the abilities of the tools.

The team made significant advances in completing and improving documentation during the day. If you’d like to explore, check out the Louhos homepage and have a go with the tools yourself!

#OKFest Open Science and Culture Hackday – Project 1 pyBOSSA Feynman’s Flowers

September 19, 2012

This morning saw the start of the world’s biggest ever open knowledge meeting. Following an inspiring opening session, 40-odd hackers descended on the OKFest MAKE space to collaborate on building apps and tools to open up scientific and cultural data sets. The open science crowd was about 15 strong and quickly settled on three projects to focus on throughout the day.

We had our friendly mascot Chuff to help us on the way, sporting his own conference badge:

Creating a pyBOSSA App: Feynman’s Flowers

Daniel Lombraña González and Quentin Mazars-Simon are leading development of Feynman’s Flowers, a pyBOSSA app to crowdsource measurement of how individual molecules stick to surfaces. Read on for more details!

Spintronics with Individual Molecules

Example of a single atom diagram (Helium atom)

Traditional electronic devices work by moving charge around a circuit. This has produced astounding results over the last half century, but we are now at a point where further reducing the size of circuit elements is difficult because it would create too much heat in too small a space.

Our research group is studying magnetic molecules to understand how they can be used to make the smallest possible “spintronic” devices, in which charge (electronic) and spin (magnetic) properties can be used together. In the future, this would allow us to make devices that can do more and also use less energy.

As a simple example, we may be able to use the magnetic orientation of one molecule (the direction in which its internal compass needle points) to store a single bit (0 or 1) of information: this would potentially increase the storage density on hard drives by 100x.

How You Can Help: Measure How Molecules Stick to Surfaces

Using a special kind of “microscope”, operating at close to absolute zero temperature and based on the quantum mechanical principle of tunneling, we can measure single molecules on surfaces.

Many of a molecule’s magnetic properties are determined by how it binds (sticks) to the surface. For example, a group in Japan found that the magnetic stability energy of a molecule could change by 50% just by a 15˚ rotation in its binding angle.

It is therefore crucial for us to measure the distribution of binding angles of a particular molecule on a given surface. This will allow us to compare our results with theoretical predictions to better understand their properties. That’s where you come in …

The App

The centre of each attached molecule and its angle must be determined by hand, as there is no reliable algorithm. The Feynman’s Flowers app will allow users to mark the centre of the attached molecule and its angle, allowing researchers to analyse the data digitally.
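To make the measurement concrete, here is a minimal Python sketch of how two marked points could be turned into a binding angle, and how many such angles could be binned into a distribution. It is an illustration only: the function names, the idea of marking a second "tip" point on the molecule, and the 15˚ bin width are assumptions, not a description of how the Feynman’s Flowers app actually stores its answers.

```python
import math
from collections import Counter


def binding_angle(centre, tip):
    """Angle in degrees, in [0, 360), of the centre->tip direction.

    Both points are (x, y) pixel coordinates. Image rows grow downward,
    so y is flipped to make the angle run counter-clockwise.
    """
    dx = tip[0] - centre[0]
    dy = centre[1] - tip[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0


def angle_histogram(angles, bin_width=15):
    """Count angles per bin; each key is the lower edge of a bin in degrees."""
    return Counter(int(a // bin_width) * bin_width for a in angles)


# Hypothetical volunteer marks: (centre, tip) pairs in pixel coordinates.
marks = [((10, 10), (20, 10)), ((10, 10), (10, 0)), ((5, 5), (6, 4))]
angles = [binding_angle(c, t) for c, t in marks]
print(angle_histogram(angles))
```

With several volunteers marking the same molecule, the individual angles could be averaged before binning, which is one way a crowdsourced measurement can reduce the error of any single answer.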

What Did We Do?

We’re nearly there! Daniel and Quentin have the app online and with a few tweaks it should be fully up and running within the next day!

You can have a go and also check out the code if you’re keen to see under the bonnet or adapt the code to your own needs.

[Atom diagram from Wikimedia Commons under CC-BY-SA 3.0. All other images under CC-BY]

#OKFest Open Science and Culture Hackday – Project 3: Investigative Open Bibliography

September 18, 2012

The third hackday project aims to explore the links between corporations and researchers, starting with the area of organic food but extending more widely to research into pharmaceuticals, cosmetics, food and other consumer products. By extracting author and funding data from the full-text open access literature, funding links become clearer and can be visualised. Grouping the results by study outcome could then identify areas with a particularly strong bias towards publishing positive results, which may be influenced by commercial interests.

How to Do It?

Luckily the OKFN has several tools at its disposal to make this happen:

Our Open Biblio project has developed tools to extract full text literature (PubCrawler) and convert files into bibliographic metadata formats such as BibJSON whereby data can be easily manipulated and facet browsed (BibServer).
Therefore, we are extracting a subset of the open access BioMedCentral articles.

The next step will be to extract author information and affiliation (illustrated using Ahmet et al., 2011), which brings up a new problem of reconciling affiliations where no canonical list of institutions exists. Much discussion ensued about the possibility of using pyBOSSA and merging several publicly available lists of institutions from authoritative sources as a starting point.
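As a rough illustration of the reconciliation problem, the Python sketch below matches a free-text affiliation against a small canonical list using fuzzy string matching from the standard library. The canonical list, the function names and the 0.8 similarity cutoff are all assumptions made for the example; a real list would have to be merged from public sources as discussed above, and ambiguous matches could be handed to volunteers through pyBOSSA.

```python
import difflib
import re

# Hypothetical canonical list, for illustration only.
CANONICAL = ["University of Bath", "University of Oxford", "University of Helsinki"]


def normalise(name):
    """Lower-case, replace punctuation with spaces and collapse whitespace."""
    name = re.sub(r"[^\w\s]", " ", name.lower())
    return " ".join(name.split())


def reconcile(affiliation, canonical=CANONICAL, cutoff=0.8):
    """Return the closest canonical institution, or None if nothing is close enough."""
    lookup = {normalise(c): c for c in canonical}
    matches = difflib.get_close_matches(
        normalise(affiliation), list(lookup), n=1, cutoff=cutoff
    )
    return lookup[matches[0]] if matches else None


print(reconcile("University of Bath, UK"))
print(reconcile("Acme Widgets Ltd"))
```

Returning None rather than a weak guess keeps uncertain cases visible, so they can be reviewed by hand instead of silently polluting the database.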

We will also require funding data and competing interests sections from the articles, which can be mined using natural language processing tools to extract corporate names (possibly utilising Open Corporates) and references to other funders e.g. research councils and charities.
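A very simple stand-in for this step is sketched below in Python: instead of full natural language processing, it looks up a list of known funder names in the text of a funding or competing interests section. The lookup table and function name are hypothetical, and a real pipeline would need a much larger list (for example drawn from Open Corporates) plus proper entity recognition for names it has not seen before.

```python
import re

# Hypothetical lookup of funder names to funder types, for illustration only.
KNOWN_FUNDERS = {
    "GlaxoSmithKline": "corporate",
    "Wellcome Trust": "charity",
    "Medical Research Council": "research council",
}


def find_funders(text, known=KNOWN_FUNDERS):
    """Return (name, type) pairs for every known funder mentioned in the text."""
    found = []
    for name, kind in known.items():
        # Whole-word, case-insensitive match on the funder name.
        if re.search(r"\b" + re.escape(name) + r"\b", text, flags=re.IGNORECASE):
            found.append((name, kind))
    return found


# Made-up example sentence, not taken from a real article.
section = "This work was funded by GlaxoSmithKline and the Wellcome Trust."
print(find_funders(section))
```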

Once extracted, this information would be entered into a database for faceted searching by funder, institution and author, along with standard bibliographic metadata, e.g. list all articles funded by GlaxoSmithKline and published by UK institutions in 2011.
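The kind of query described here can be sketched in a few lines of Python. The record layout, field names and placeholder records below are assumptions for illustration; they are not real articles and not the actual BibServer schema or API.

```python
# Placeholder records with made-up values, for illustration only.
RECORDS = [
    {"title": "Placeholder article 1", "year": 2011, "country": "UK",
     "funders": ["GlaxoSmithKline"]},
    {"title": "Placeholder article 2", "year": 2011, "country": "UK",
     "funders": ["Wellcome Trust"]},
    {"title": "Placeholder article 3", "year": 2010, "country": "FI",
     "funders": ["GlaxoSmithKline", "Wellcome Trust"]},
]


def facet_search(records, **facets):
    """Return the records that match every given facet.

    For list-valued fields (such as funders) a record matches if the
    wanted value is one of the items in the list.
    """
    def matches(record, key, wanted):
        value = record.get(key)
        if isinstance(value, list):
            return wanted in value
        return value == wanted

    return [r for r in records if all(matches(r, k, v) for k, v in facets.items())]


hits = facet_search(RECORDS, funders="GlaxoSmithKline", country="UK", year=2011)
print([r["title"] for r in hits])
```

A search index would do this far more efficiently at the scale of the full literature, but the logic of combining facets is the same.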

What Can We Do With It?

This would lay the groundwork for linking to other data sets and further elucidating patterns in corporate funding of research. For example, abstracts and key words could be used to browse by topic, and a further pyBOSSA app could be generated to classify the articles as reporting a positive or negative result, or even to grade them, e.g. by the relative risk of a new clinical treatment. This could be used to look at positive publishing bias in different areas of science and in work supported by different funders – which appear to be the worst affected areas? Which funders very rarely publish negative results?
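Once articles have been classified, the last two questions reduce to a simple tally. The Python sketch below computes the share of positive-result articles per funder; the field names, the "positive"/"negative" labels and the placeholder funders are assumptions, and a high share on its own would only flag an area for closer study rather than demonstrate bias.

```python
from collections import defaultdict


def positive_rate_by_funder(articles):
    """Return {funder: fraction of that funder's articles marked positive}."""
    counts = defaultdict(lambda: [0, 0])  # funder -> [positive, total]
    for article in articles:
        for funder in article["funders"]:
            counts[funder][1] += 1
            if article["outcome"] == "positive":
                counts[funder][0] += 1
    return {funder: pos / total for funder, (pos, total) in counts.items()}


# Made-up classified articles, for illustration only.
articles = [
    {"funders": ["Funder A"], "outcome": "positive"},
    {"funders": ["Funder A", "Funder B"], "outcome": "positive"},
    {"funders": ["Funder B"], "outcome": "negative"},
]
print(positive_rate_by_funder(articles))
```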

Many studies on publication bias already exist in clinical trials but they are often limited to a few hundred articles e.g. Friedman and Richter, 2004, whereas the automated methodology described above could examine large portions of the literature. There are also different types of publication biases e.g. citation biases for positive articles which could possibly be investigated on a larger scale using these techniques than is currently possible, particularly involving Open Citation tools.

Incorporating geographic information on institutions and keyword analysis could reveal hot spots of research and other useful information for funders and research managers.

Hard at work hacking with the OKF Okapi

What Did We Do?

We completed liberating BMC articles by crawling the site and converting the records from BibTeX to BibJSON with parsers. By the end of the week we expect to have 140k articles uploaded to BibServer, including authors, affiliations, abstracts and more.