You are browsing the archive for Peter Kraker.

YEAR Conference 2015: Your chance to win 5000 Euros for your Open Science project idea

- February 27, 2015 in Announcements, Featured

Acknowledgements: Thanks to the YEAR Board for contributing to this blog post!

Are you a young researcher with an Open Science project idea? Here’s a chance to win 5000 Euros to make it happen: The Young European Associated Researchers (YEAR) Network organises its Annual Conference on 11-12 May 2015 at VTT in Helsinki/Espoo (Finland) with a focus on Open Science. Registration for the conference is now open.

The YEAR Annual Conference is a two-day event for young researchers, which offers a platform for exchange and training focused on key aspects of EU projects. This event provides young researchers with a solid basis for successfully integrating both open access and open research data concepts into Horizon 2020 projects as well as into current research workflows.

“Sharing is caring”! This is probably a good way to describe what Open Science really means: a new approach to science that shares ideas, research results, research data, and publications with the rest of the world through newly available network technologies.

Open science approaches are rather new concepts that many researchers are not yet familiar with. Young researchers in particular struggle when confronted with open access or open research data and the issues related to them. This is reinforced by a survey recently conducted by YEAR, according to which many of the surveyed young researchers are inexperienced with open science and unsure about its implications. About 80% of the survey participants named the integration of Open Science into research training as one of the most effective channels for awareness-raising. The aim of this training is to respond to this demand: to provide young researchers with a solid basis for successfully implementing both open access and open research data concepts in H2020 projects, and to highlight ways of integrating them into current research workflows.

Conference Day 1: invited international experts will introduce strategies for fulfilling open access requirements in H2020 projects and Open Data Pilots. The goal of Day 1 is to give the attendees the necessary background information and useful tools for publishing open access or open research data.

Conference Day 2: the young researchers are invited to come with a project idea relying on, or promoting, open research data/open science aspects. They will be challenged to defend their idea and to develop it with the other young researchers for a chance to win one of the two YEAR Awards. The goal is for the young researchers to gain hands-on experience in developing strong project ideas as well as to find potential project partners.

Confirmed speakers and trainers: Jean-Claude Burgelman (European Commission, DG Research and Innovation), Petr Knoth (The Open University, UK), Jenny Molloy (OKFN, University of Oxford, UK), Peter Kraker (KNOW Center, AT)

YEAR Awards: the two most outstanding project ideas defended and developed during the Conference Day 2 will be awarded. The YEAR Awards consist of a European Project Management training course and 5000 euros each to further develop the project ideas.

Please submit your project idea for the YEAR Annual Conference 2015 by Thursday 16 April 2015 (deadline extended from Thursday 2 April 2015).

The conference is supported by the EU project FOSTER and is organised by YEAR in cooperation with VTT, AIT Austrian Institute of Technology, KNOW Center Graz, and SINTEF. The Open Knowledge Foundation is a dissemination partner.

Conference links: http://www.year-network.com/homepage/year-annual-conference-2015

https://www.fosteropenscience.eu/event/year-annual-conference-2015-open-science-horizon-2020

Panton Fellowship Wrap-Up

- October 9, 2014 in Featured, Panton Fellowships

On stage at the Open Science Panel Vienna (Photo by FWF/APA-Fotoservice/Thomas Preiss)

It’s hard to believe that it has been over a year since Peter Murray-Rust announced the new Panton Fellows at OKCon 2013. I am immensely proud that I was one of the 2013/14 Panton Fellows and the first non-UK-based fellow. In this post, I will recap my activities during the last year and give an outlook on things to come after the end of the fellowship. At the end of the post, you can find all outputs of my fellowship at a glance. My fellowship had two focal points: the work on open and transparent altmetrics, and the promotion of open science in Austria and beyond.

Open and transparent altmetrics

The blog post entitled “All metrics are wrong, but some are useful” sums up my views on (alt)metrics: I argue that no single number can determine the worth of an article, a journal, or a researcher. Instead, we have to find those numbers that give us a good picture of the many facets of these entities and put them into context. Openness and transparency are two necessary properties of such an (alt)metrics system, as this is the only sustainable way to uncover inherent biases and to detect attempts at gaming. In my comment on the NISO whitepaper on altmetrics standards, I therefore maintained that openness and transparency should be strongly considered for altmetrics standards.

In another post on “Open and transparent altmetrics for discovery”, I laid out that altmetrics have a largely untapped potential for visualization and discovery that goes beyond rankings of top papers and researchers. In order to help uncover this potential, I released the open source visualization Head Start that I developed as part of my PhD project. Head Start gives scholars an overview of a research field based on relational information derived from altmetrics. In two blog posts, “New version of open source visualization Head Start released” and “What’s new in Head Start?”, I chronicled the development of a server component, the introduction of the timeline visualization created by Philipp Weißensteiner, and the integration of Head Start with Conference Navigator 3, a nifty conference scheduling system. With Chris Kittel and Fabian Dablander, I took first steps towards automatic visualizations of PLOS papers. Recently, Head Start also became part of the Open Knowledge Labs. In order to make the maps created with Head Start openly available to all, I will set up a server and website for the project in the months to come. The ultimate goal would be to have an environment where everybody can create their own maps based on open knowledge and share them with the world. If you are interested in contributing to the project, please get in touch with me, or have a look at the open feature requests.

Evolution of the UMAP conference visualized in Head Start. More information in Kraker, P., Weißensteiner, P., & Brusilovsky, P. (2014). Altmetrics-based Visualizations Depicting the Evolution of a Knowledge Domain. 19th International Conference on Science and Technology Indicators (STI 2014), 330-333.

Promotion of open science and open data

Regarding the promotion of open science, I teamed up with Stefan Kasberger and Chris Kittel of openscienceasap.org and the Austrian chapter of Open Knowledge for a series of events that were intended to generate more awareness in the local community. In October 2013, I was a panelist at the openscienceASAP kick-off event at University of Graz entitled “The Changing Face of Science: Is Open Science the Future?”. In December, I helped organize an OKFN Open Science Meetup in Vienna on altmetrics. I also gave an introductory talk on this occasion that got more than 1000 views on Slideshare. In February 2014, I was interviewed for the openscienceASAP podcast on my Panton Fellowship and the need for an inclusive approach to open science.

In June, Panton Fellowship mentors Peter Murray-Rust and Michelle Brook visited Vienna. The three-day visit, made possible by the Austrian Science Fund (FWF), kicked off with a lecture by Peter and Michelle at the FWF. On the next day, the two led a well-attended workshop on content mining at the Institute of Science and Technology Austria. The visit ended with a hackday organized by openscienceASAP, and an OKFN-AT meetup on content mining. Finally, last month, I gave a talk on open data at the “Open Science Panel” on board the MS Wissenschaft in Vienna.

I also became active in the Open Access Network Austria (OANA) of the Austrian Science Fund. Specifically, I am contributing to the working group “Involvement of researchers in open access”, where I am responsible for a visibility concept for open access researchers. Throughout the year, I have also contributed to a monthly sum-up of open science activities in order to make these activities more visible within the local community. You can find the sum-ups (only available in German) on the openscienceASAP stream.

I also went to a lot of events outside Austria where I argued for more openness and transparency in science: OKCon 2013 in Geneva, SpotOn 2013 in London, and Science Online Together 2014 in Raleigh (NC). At the Open Knowledge Festival in Berlin, I was session facilitator for “Open Data and the Panton Principles for the Humanities. How do we go about that?”. The goal of this session was to devise a set of clear principles describing what we mean by Open Data in the humanities, what such data should contain, and how to use them. In my role as an advocate for reproducibility, I wrote a blog post on why reproducibility should become a quality criterion in science. The post sparked a lot of discussion, and was widely linked and tweeted.

by Martin Clavey

What’s next?

The Panton Fellowship was a unique opportunity for me to work on open science, to visit open knowledge events around the world, and to meet many new people who are passionate about the topic. Naturally, the end of the fellowship does not mark the end of my involvement with the open science community. In my new role as a scientific project developer for Science 2.0 and open science at the Know-Center, I will continue to advocate openness and transparency. As part of my research on altmetrics-driven discovery, I will also pursue my open source work on the Head Start framework. With regards to outreach work, I am currently busy drafting a visibility concept for open access researchers in the Open Access Network Austria (OANA). Furthermore, I am involved in efforts to establish a German-speaking open science group.

I had a great year, and I would like to thank everyone who got involved. Special thanks go to Peter Murray-Rust and Michelle Brook for administering the program and for their continued support. As always, if you are interested in helping out with one or the other project, please get in touch with me. If you have comments or questions, please leave them in the comments field below.

All outputs at a glance

Head Start – open source research overview visualization
Blog Posts
Audio and Video
Slides
Reports
Open Science Sum-Ups (contributions) [German]

The third quarter of my Panton Fellowship in the rear view mirror

- July 3, 2014 in Featured, Panton Fellowships

Three quarters down in my Panton Fellowship, it is time again to review my activities.

The open source visualization Head Start, which gives scholars an overview of a research field, remained one of my focal points. In April, I released version 2.5 which includes a brand new server component that lets you manipulate the visualization after it has loaded. The new version also contains the timeline visualization created by Philipp Weißensteiner, along with a consolidated code base and many bug fixes. Furthermore, I worked on the integration of Head Start with Conference Navigator 3, a nifty scheduling system that allows you to create a personal conference schedule by bookmarking talks from the program. Head Start will be used as an alternate way of looking at the topics of the conference, and to give better context to the talks that you already selected and the talks that are recommended for you. Finally, in the wake of Peter Murray-Rust’s visit to Vienna in June (more on that later), I teamed up with Chris Kittel and Fabian Dablander to take first steps towards automatic visualizations of PLOS papers. The accompanying branch can be found here.

by opensource.com

I also continued to promote open and transparent altmetrics. In the blog post entitled “All metrics are wrong, but some are useful”, I argued that no single number can determine the worth of an article, a publication, or a researcher. Instead, we have to find those numbers that give us a good picture of the many facets of a paper and put them into context. In my comment on the otherwise excellent NISO whitepaper on altmetrics standards, I maintained that openness and transparency should be strongly considered for altmetrics standards. This is the only way to uncover the biases inherent in all metrics. It would also make it easier to uncover attempts at gaming the system.

A highlight of the last quarter was Peter Murray-Rust’s and Michelle Brook’s visit to Vienna. The three-day visit, made possible by the Austrian Science Fund (FWF), kicked off with a lecture by Peter and Michelle at the FWF. A video of the great talk, entitled Open Notebook Science, can be found here. On the next day, the two led a well-attended workshop on content mining at the Institute of Science and Technology Austria. The visit ended with a hackday organized by openscienceASAP, and an OKFN-AT meetup on content mining, with presentations by PMR, Andreas Langegger (Zoomsquare), Roman Kern (Know-Center), and Marion Breitschopf (meineabgeordneten.at). It was a very enlightening yet intense week, as you can also read in PMR’s account of the activities.

Last but not least, I attended a meeting of the Open Access Network Austria working group on outreach. There, I will lead an effort to come up with a concept for enhanced visibility of open access efforts. Finally, I also contributed to the open science sum-ups of activities in Austria, Germany and beyond. Here you can find the monthly summaries for March, April and May [in German].

Sadly, the next quarter will also be the last of my Panton Fellowship, but a first highlight is already lurking around the corner: the Open Knowledge Festival will kick off in Berlin on July 15. See you all there!

All metrics are wrong, but some are useful

- May 31, 2014 in Panton Fellowships

by Leo Reynolds

Altmetrics, web-based metrics for measuring research output, have recently received a lot of attention. Started only in 2010, altmetrics have become a phenomenon both in the scientific community and in the publishing world. This year alone, EBSCO acquired Plum Analytics, Springer included Altmetric info in SpringerLink, and Scopus augmented articles with Mendeley readership statistics.

Altmetrics have a lot of potential. They are usually available earlier than citation-based metrics, allowing for an early evaluation of articles. With altmetrics, it also becomes possible to assess the many outcomes of research besides just the paper: data, source code, presentations, blog posts, etc.

One of the problems with the recent hype surrounding altmetrics, however, is that it leads some people to believe that altmetrics are somehow intrinsically better than citation-based metrics. They are, of course, not. In fact, if we just replace the impact factor with some aggregate of altmetrics, then we have gained nothing. Let me explain why.

The problem with metrics for evaluation

You might know this famous quote:

“All models are wrong, but some are useful” (George Box)

It refers to the fact that all models are a simplified view of the world. In order to be able to generalize phenomena, we must leave out some of the details. Thus, we can never explain a phenomenon in full with a model, but we might be able to explain the main characteristics of many phenomena that fall in the same category. The models that can do that are the useful ones.

Example of a scientific model, explaining atmospheric composition based on chemical and transport processes. Source: Strategic Plan for the U.S. Climate Change Science Program (Image by Philippe Rekacewicz)

The very same can be said about metrics – with the grave addition that metrics have a lot less explanatory power than a model. Metrics might tell you something about the world in a quantified way, but for the how and why we need models and theories. Matters become even worse when we are talking about metrics that are generated in the social world rather than the physical world. Humans are notoriously unreliable, and it is hard to pinpoint the motives behind their actions. A paper may be cited, for example, to confirm or refute a result, or simply to acknowledge it. A paper may be tweeted to showcase good research or to condemn bad research.

In addition, all of these measures are susceptible to gaming. According to ImpactStory, an article with just 54 Mendeley readers is already in the 94-99th percentile (thanks to Juan Gorraiz for the example). Getting your paper into the top ranks is therefore easy. And even indicators like downloads or views that go into the hundreds of thousands could probably be gamed with a simple script deployed on a couple of university servers around the country. This makes the old citation cartel look pretty labor-intensive, doesn’t it?

Why we still need metrics and how we can better utilize them

Don’t get me wrong: I do not think that we can come by without metrics. Science is still growing exponentially, and therefore we cannot rely on qualitative evaluation alone. There are just too many papers published, too many applications for tenure track positions submitted and too many journals and conferences launched each day. In order to address the concerns raised above, however, we need to get away from a single number determining the worth of an article, a publication, or a researcher.

One way to do this would be a more sophisticated evaluation system that is based on many different metrics, and that gives context to these metrics. This would require that we work towards getting a better understanding of how and why measures are generated and how they relate to each other. In analogy to the models, we have to find those numbers that give us a good picture of the many facets of a paper – the useful ones.
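As a toy illustration of this idea, the sketch below reports a per-metric percentile profile instead of collapsing everything into one score. All metric names, counts, and the reference distributions are hypothetical; a real system would use field- and age-matched reference sets:

```python
from bisect import bisect_left

def metric_profile(paper_metrics, reference_distributions):
    """Return a per-metric percentile profile instead of a single score.

    paper_metrics: dict of metric name -> raw count for one paper.
    reference_distributions: dict of metric name -> sorted list of raw
    counts for a comparison set of papers (e.g. same field and year).
    """
    profile = {}
    for name, value in paper_metrics.items():
        dist = reference_distributions[name]
        # fraction of reference papers with a strictly smaller count
        rank = bisect_left(dist, value)
        profile[name] = round(100.0 * rank / len(dist), 1)
    return profile

# Hypothetical reference set of ten papers per metric
reference = {
    "tweets": sorted([0, 0, 1, 2, 3, 5, 8, 13, 40, 200]),
    "readers": sorted([1, 2, 2, 4, 6, 9, 15, 22, 30, 54]),
    "citations": sorted([0, 0, 1, 1, 2, 3, 5, 7, 12, 30]),
}
paper = {"tweets": 8, "readers": 20, "citations": 2}
print(metric_profile(paper, reference))
# → {'tweets': 60.0, 'readers': 70.0, 'citations': 40.0}
```

The point of the sketch is that the output stays a profile: the three percentiles are reported side by side, with their reference context, rather than being averaged away.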

As I have argued before, visualization would be a good way to represent the different dimensions of a paper and its context. Furthermore, the way the metrics are generated must be open and transparent to make gaming of the system more difficult, and to expose the biases that are inherent in humanly created data. Last, and probably most crucial, we, the researchers and the research evaluators must critically review the metrics that are served to us.

Altmetrics do not only give us new tools for evaluation, their introduction also presents us with the opportunity to revisit academic evaluation as such – let’s seize this opportunity!

What’s new in Head Start?

- April 29, 2014 in Panton Fellowships

The past couple of months I have been working on the open source visualization Head Start in the context of my research stay at University of Pittsburgh. Head Start is intended for scholars who want to get an overview of a research field. They could be young PhDs getting into a new field, or established scholars who venture into a neighboring field. The idea is that you can see the main areas and papers in a field at a glance without having to do weeks of searching and reading. You can find more information in my last blog post on the system.

If you read this post, you already know that Philipp Weißensteiner introduced a timeline visualization to the repository that lets you compare different datasets in a single view. I finished the integration of the timeline visualization, making it possible to review all datasets both in the regular and the timeline view. I was also busy consolidating the code and fixing the occasional bug along the way. The biggest change in version 2.5, however, is the introduction of a server component to Head Start. So far, Head Start consisted of a pre-processing system for generating the data, and an HTML5 interface for visualizing it. There was no way of manipulating the visualization after it had been loaded. The new server component consists of RESTful web services and a PHP backend that deals with dynamic requests.

Adaptive features of Head Start

The server component proved very useful during the integration of Head Start with Conference Navigator 3, developed by the great folks of the PAWS Lab here in Pittsburgh. Conference Navigator is a nifty scheduling system that allows you to create a personal conference schedule by bookmarking talks from the program. The system then gives you recommendations for further talks based on your choices. Head Start will be used as an alternate way of looking at the topics of the conference, and to give better context to the talks that you already selected and the talks that are recommended for you. To do that, Head Start dynamically loads bookmarking and recommendation data from the CN3 database.

What’s next? First of all, the system will be evaluated with users in one of the upcoming conferences that Conference Navigator supports. Furthermore, I would like to move the preprocessing systems from an offline to an online solution, thus enabling it to load live content from APIs.

If any of the above got you interested, here is the link to the Github repository. As always, please get in touch if you have any questions or comments, or in case you want to collaborate on this project.

Second Quarterly Report on my Panton Fellowship

- March 26, 2014 in Panton Fellowships

by Timothy Appnel

I am now almost halfway through my Panton Fellowship, so it is time to sum up my activities once again.

The most important activity in the last quarter was surely the work on the open source visualization Head Start. Head Start is intended to give scholars an overview of a research field. You can find out all about the initial release in this blog post. I was busy in the last few weeks with bugfixing and stability improvements. I also refactored the whole pre-processing system and further integrated the work of Philipp Weißensteiner with regards to time-series visualization. If you are interested in trying out Head Start, or – even better – would like to contribute to its development, check out the Github repository.

Furthermore, I attended the Science Online un-conference in Raleigh (February 27 to March 1). Scio14 was very inspiring and engaging. Cameron Neylon hosted a great session on imagining the far future of academic publishing. In Rachel Levy‘s workshop on visualizations, we reflected on our own visualizations, and there were tons of tips for improving one’s work. Other great sessions included post-publication peer review (with Ivan Oransky), altmetrics (facilitated by Cesar Berrios-Otero), and alternate careers in science (led by Eva Amsen). I also encourage you to check out the videos of the keynotes, which include a very inspiring talk by Rebecca Tripp and Meg Lowman on neglected audiences in science, and the awesome crowd-sourced 3D printing project for creating prosthetic hands by Nick Parker and Jon Schull.

Let’s move on to my work for the local Austrian community. Together with my fellow OKFN members Sylvia Petrovic-Majer, Stefan Kasberger, and Christopher Kittel, I became active (remotely for now) in the Open Access Network Austria (OANA). Specifically, I am contributing to the working group “Involvement of researchers in open access”. I am very excited about this opportunity, as one of the objectives of my Panton Fellowship is to draw more researchers into open science.

What else? Earlier this year, I was interviewed for the openscienceASAP podcast. In the interview, I talked about altmetrics, the need for an inclusive approach to open science, and the Panton Fellowships. You can find the podcast here (in German). If you have read my last report, you may remember that I spoke on a panel about open science at University of Graz. The video of the panel (in German) is now online and can be found here. Furthermore, I’d like to draw your attention to the monthly sum-ups of open science activities in the German speaking world and beyond: January, February.

So what will my next quarter look like? As you may remember from my last report, I am currently a visiting scholar at University of Pittsburgh. In the weeks to come, I will integrate Head Start with Conference Navigator 3, developed by the great folks of the PAWS Lab here in Pittsburgh. Conference Navigator is a nifty scheduling system that allows you to create a personal conference schedule by bookmarking talks from the program. The system then gives you recommendations for further talks based on your choices. Head Start will be used as an alternate way of looking at the topics of the conference, and to give better context to the talks that you already selected. I will return to Austria in June, just in time for Peter Murray-Rust‘s visit to Vienna. There are already a lot of activities planned around his stay, and I am very much looking forward to that. As always, please get in touch if you have any questions or comments, or in case you want to collaborate on one or the other project.

New version of open source visualization Head Start released

- February 24, 2014 in Panton Fellowships

In July last year, I released the first version of a knowledge domain visualization called Head Start. Head Start is intended for scholars who want to get an overview of a research field. They could be young PhDs getting into a new field, or established scholars who venture into a neighboring field. The idea is that you can see the main areas and papers in a field at a glance without having to do weeks of searching and reading.

Interface of Head Start

You can find an application for the field of educational technology on Mendeley Labs. Papers are grouped by research area, and you can zoom into each area to see the individual papers’ metadata and a preview (or the full text in the case of open access publications). The closer two areas are, the more related they are subject-wise. The prototype is based on readership data from the online reference management system Mendeley. The idea is that the more often two papers are read together, the closer they are subject-wise. More information on this approach can be found in my dissertation (see chapter 5), or if you like it a bit shorter, in this paper and in this paper.

Head Start is a web application built with D3.js. The first version worked very well in terms of user interaction, but it was a nightmare to extend and maintain. Luckily, Philipp Weißensteiner, a student at Graz University of Technology, became interested in the project. Philipp worked on the visualization as part of his bachelor’s thesis at the Know-Center. Not only did he modularize the source code, he also introduced a JavaScript finite state machine that lets you easily describe the different states of the visualization. Setting up a new instance of Head Start is now only a matter of a couple of lines. Philipp also developed a cool proof of concept for his approach: a visualization that shows the evolution of a research field over time using small multiples. You can find his excellent bachelor’s thesis in the repository (in German).

Head Start Timeline View

In addition, I cleaned up the pre-processing scripts that do all the clustering, ordination and naming. The only thing that you need to get started is a list of publications and their metadata as well as a file containing similarity values between papers. Originally, the similarity values were based on readership co-occurrence, but there are many other measures that you can use (e.g. the number of keywords or tags that two papers have in common).
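As a rough sketch of how readership co-occurrence can be turned into such similarity values, one can simply count how often two papers appear in the same user library. The data layout and paper IDs below are hypothetical and not Head Start's actual input format:

```python
from collections import Counter
from itertools import combinations

def coread_similarities(libraries):
    """Count how often each pair of papers appears in the same user
    library; the co-occurrence count serves as a similarity value."""
    pair_counts = Counter()
    for papers in libraries:
        # sorted() gives each pair a canonical order, set() drops duplicates
        for a, b in combinations(sorted(set(papers)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts

# Each inner list is one reader's library (made-up data)
libraries = [
    ["p1", "p2", "p3"],
    ["p1", "p2"],
    ["p2", "p3", "p4"],
]
sims = coread_similarities(libraries)
print(sims[("p1", "p2")])  # 2: read together by two users
```

The same pairwise-counting scheme works for the other measures mentioned above, e.g. counting shared keywords or tags instead of shared readers.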

So without further ado, here is the link to the Github repository. Any questions or comments, please send them to me or leave a comment below.

First Quarterly Report on my Panton Fellowship Activities

- January 15, 2014 in Panton Fellowships

by jakeandlindsay

I am now a little more than three months into my Panton Fellowship. This means it is time to give an overview of my activities so far. As outlined in my initial blog post, there are two main objectives of my fellowship: working on open and transparent altmetrics, and the promotion of open science.

Regarding the promotion of open science, I would like to highlight two local activities first. Since September, I have contributed to a monthly sum-up of open science activities in the German-speaking world and beyond in order to make these activities more visible within the local community. You can find the sum-ups (only available in German) here: September, October, November, December. At this point, I would like to give a big shout-out to the other contributors: Christopher Kittel, Stefan Kasberger, and Matthias Fromm.

I was also a panelist at the kick-off event of the openscienceASAP platform in Graz, entitled “The Changing Face of Science: Is Open Science the Future?”. openscienceASAP promotes open science as a practice, and this event was intended as a forum for interested students, researchers, and the general public. It ended up being a very lively discussion that covered a lot of ground, including open access, open peer review, altmetrics, open data, and so forth.

Regarding wider community work, I have started to develop an open data policy for the International Journal of Technology Enhanced Learning. IJTEL will become one of the first journals in the field that has such a policy, and hopefully this will inspire others to follow suit. Furthermore, in my role as an advocate for reproducibility I wrote a blog post on why reproducibility should become a quality criterion in science. The post sparked a lot of discussion, and was widely linked and tweeted.

The fellowship also enabled me to attend several other events related to open science: in September, I went to OKCon in Geneva, and in November I attended SpotOn in London. Furthermore, I attended a meeting of the Leibniz research network “Science 2.0” in Berlin. These events were a great experience for me. I learned a lot, and I met many new and wonderful people who are passionate about open science.

I also used these events to discuss my second objective: the need for open and transparent altmetrics. Altmetrics will be the main objective of the second quarter of my fellowship. I will be looking at different altmetrics sources and how they can be used for aggregation and visualization. To kickstart these activities, I have outlined my thoughts on the topic in this blog post. Furthermore, I helped to organize an OKFN Open Science Meetup in Vienna on the topic. I also gave an introduction to altmetrics on this occasion; the slides can be found here.

The first three months of my fellowship were a busy yet wonderful time. Besides the activities above, I finally finished my PhD on altmetrics-based visualization. Now I am off for a three-month visit to the Personalized Adaptive Web Systems Lab of University of Pittsburgh. I cannot wait to see what the second quarter has in store for me! As always, please get in touch if you have any questions or comments, or in case you want to collaborate on one or the other project.

Open and transparent altmetrics for discovery

- December 9, 2013 in Panton Fellowships, Research, Tools

Photo by AG Cann

Altmetrics are a hot topic in the scientific community right now. Classic citation-based indicators such as the impact factor are complemented by alternative metrics generated from online platforms. Usage statistics (downloads, readership) are often employed, but links, likes, and shares on the web and in social media are considered as well. The altmetrics promise, as laid out in the excellent manifesto, is that they assess impact more quickly and on a broader scale.

The main focus of altmetrics at the moment is evaluation of scientific output. Examples are the article-level metrics in PLOS journals, and the Altmetric donut. ImpactStory has a slightly different focus, as it aims to evaluate the oeuvre of an author rather than an individual paper.

This is all well and good, but in my opinion, altmetrics have a huge potential for discovery that goes beyond rankings of top papers and researchers – a potential that is largely untapped so far.

How so? To answer this question, it is helpful to shed a little light on the history of citation indices.

Pathways through science

In 1955, Eugene Garfield created the Science Citation Index (SCI), which later went on to become the Web of Knowledge. His initial idea – next to measuring impact – was to record citations in a large index to create pathways through science. This makes it possible to link papers that share no keywords. It makes a lot of sense: you can talk about the same thing using totally different terminology, especially when you are not in the same field. Furthermore, terminology has proven to be very fluid even within the same domain (Leydesdorff 1997). In 1973, Small and Marshakova realized – independently of each other – that co-citation is a measure of subject similarity and can therefore be used to map a scientific field.
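To make the co-citation idea concrete, here is a minimal sketch with made-up data (not the SCI's actual implementation): two papers are co-cited whenever they appear together in the same reference list, and the co-citation count over a corpus serves as a raw similarity signal.

```python
from itertools import combinations
from collections import Counter

def co_citation_counts(reference_lists):
    """Count how often each pair of papers is cited together.

    reference_lists: iterable of reference lists, one per citing paper.
    Returns a Counter mapping (paper_a, paper_b) pairs to co-citation counts.
    """
    counts = Counter()
    for refs in reference_lists:
        # Each unordered pair in one reference list is one co-citation.
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return counts

# Toy corpus: three citing papers and their (hypothetical) reference lists
refs = [
    ["P1", "P2", "P3"],
    ["P1", "P2"],
    ["P2", "P3"],
]
print(co_citation_counts(refs))  # (P1, P2) and (P2, P3) co-cited twice, (P1, P3) once
```

Mapping a field then amounts to clustering or laying out papers based on these pairwise counts (usually after normalization).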

Due to the fact that citations are considerably delayed, however, co-citation maps are often a look into the past and not a timely overview of a scientific field.

Altmetrics for discovery

In come altmetrics. Like citations, they can create pathways through science. After all, a citation is nothing but a link to another paper. With altmetrics, it is not so much which papers are often referenced together, but rather which papers are often accessed, read, or linked together. The main advantage of altmetrics, as with impact measurement, is that they are available much earlier.


Bollen et al. (2009): Clickstream Data Yields High-Resolution Maps of Science. PLOS One. DOI: 10.1371/journal.pone.0004803.

One of the efforts in this direction is the work of Bollen et al. (2009) on clickstreams. Using the sequences of clicks between different journals, they create a map of science (see above).

In my PhD, I looked at the potential of readership statistics for knowledge domain visualizations. It turns out that co-readership is a good indicator of subject similarity. This allowed me to visualize the field of educational technology based on Mendeley readership data (see below). You can find the web visualization, called Head Start, here and the code here (username: anonymous, leave password blank).


http://labs.mendeley.com/headstart
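As a toy illustration of the co-readership idea (this is a sketch with invented data and cosine similarity as one common choice of measure, not necessarily the exact computation used in Head Start): two papers are similar to the extent that they share readers.

```python
from math import sqrt

def co_readership_similarity(readers, p1, p2):
    """Cosine similarity between two papers based on shared readers.

    readers: dict mapping paper id -> set of reader ids.
    Returns a value between 0.0 (no shared readers) and 1.0 (identical readerships).
    """
    shared = len(readers[p1] & readers[p2])
    if shared == 0:
        return 0.0
    return shared / (sqrt(len(readers[p1])) * sqrt(len(readers[p2])))

# Hypothetical readership data: paper -> set of Mendeley-style reader ids
readers = {
    "A": {"r1", "r2", "r3"},
    "B": {"r2", "r3", "r4"},
    "C": {"r5"},
}
print(co_readership_similarity(readers, "A", "B"))  # 2/3: two shared readers
print(co_readership_similarity(readers, "A", "C"))  # 0.0: no overlap
```

Computed over all paper pairs, such similarities form the basis for clustering and laying out a knowledge domain map.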

Why we need open and transparent altmetrics

The evaluation of Head Start showed that the overview is indeed more timely than maps based on citations. However, it also provided further evidence that altmetrics are prone to sample biases. In the visualization of educational technology, computer-science-driven areas such as adaptive hypermedia are largely missing. Bollen and Van de Sompel (2008) reported the same problem when they compared rankings based on usage data to rankings based on the impact factor.

It is therefore important that altmetrics are transparent and reproducible, and that the underlying data is openly available. This is the only way to ensure that all possible biases can be understood.

As part of my Panton Fellowship, I will try to find datasets that satisfy these criteria. There are several examples of open bibliometric data, such as the Mendeley API and the figshare API, which have adopted CC BY, but most usage data is not publicly available or cannot be redistributed. In my fellowship, I want to evaluate the goodness of fit of different open altmetrics data sources. Furthermore, I plan to create more knowledge domain visualizations such as the one above.

So if you know any good datasets please leave a comment below. Of course any other comments on the idea are much appreciated as well.

“It’s not only peer-reviewed, it’s reproducible!”

- October 18, 2013 in Panton Fellowships, Panton Principles, Reproducibility

Peer review is one of the oldest and most respected instruments of quality control in science and research. Peer review means that a paper is evaluated by a number of experts on the topic of the article (the peers). The criteria may vary, but most of the time they include methodological and technical soundness, scientific relevance, and presentation.

“Peer-reviewed” is a widely accepted sign of quality of a scientific paper. Peer review has its problems, but you won’t find many researchers who favour a non peer-reviewed paper over a peer-reviewed one. As a result, if you want your paper to be scientifically acknowledged, you most likely have to submit it to a peer-reviewed journal – even though it will take more time and effort to get it published than in a non peer-reviewed outlet.

Peer review helps to weed out bad science and pseudo-science, but it also has serious limitations. One of these limitations is that the primary data and other supplementary material, such as documentation and source code, are usually not available. The results of the paper are thus not reproducible. When I review such a paper, I usually have to trust the authors on a number of issues: that they have described the process of achieving the results as accurately as possible, that they have not left out any crucial pre-processing steps, and so on. When I suspect a certain bias in a survey, for example, I can only note that in the review; I cannot test for that bias in the data myself. When the results of an experiment seem too good to be true, I cannot inspect the data pre-processing to see if the authors left out any important steps.

As a result, later efforts to reproduce research results can lead to devastating outcomes. Wang et al. (2010), for example, found that they could not reproduce almost any of the literature on a certain topic in computer science.

“Reproducible”: a new quality criterion

Needless to say this is not a very desirable state. Therefore, I argue that we should start promoting a new quality criterion: “reproducible”. Reproducible means that the results achieved in the paper can be reproduced by anyone because all of the necessary supplementary resources have been openly provided along with the paper.

It is easy to see why a peer-reviewed and reproducible paper is of higher quality than a merely peer-reviewed one. You do not have to take the researchers’ word for how they calculated their results – you can reconstruct them yourself. As a welcome side effect, this would make more datasets and source code openly available. Thus, we could start building on each other’s work and aggregating data from different sources to gain new insights.

In my opinion, reproducible papers could be published alongside non-reproducible papers, just like peer-reviewed articles are usually published alongside editorials, letters, and other non peer-reviewed content. I would think, however, that over time, reproducible would become the overall quality standard of choice – just like peer-reviewed is the preferred standard right now. To help this process, journals and conferences could designate a certain share of their space to reproducible papers. I would imagine that they would not have to do that for too long though. Researchers will aim for a higher quality standard, even if it takes more time and effort.

I do not claim that reproducibility solves all of the problems that we see in science and research right now. For example, it will still be possible to manipulate the data to a certain degree. I do, however, believe that reproducibility as an additional quality criterion would be an important step for open and reproducible science and research.

So that you can say to your colleague one day: “Let’s go with the method described in this paper. It’s not only peer-reviewed, it’s reproducible!”