IsItOpenData? tips

- November 26, 2010 in IsItOpenData

Are you wondering about your favourite publisher’s open data policies?  Check the website terms of use and author guidelines for policies.  If that doesn’t clear it up (and at the moment, it rarely does), head over to the IsItOpenData? site to see if someone has already asked your favourite publisher for clarification.  You can browse enquiries and drill into the responses.

Can’t find the publisher you are looking for?  Or you find an enquiry but no response?  No worries, you are empowered to do something about it:

  1. Register for a new account at the IsItOpenData? site
  2. Make an Enquiry
  3. Wait for a response
  4. Follow up with a thank you!

Not only will you learn more about the journal’s open data policies, but you will do so in a way that also clarifies the issues for others.

OK, it isn’t quite that simple, but almost.  Here is a guide with more info.  I recently used the system to send enquiries to three publishers.  Based on that experience, here are a few more tips:

  • Start with a well-considered email.  In fact, you can base your email on this template that we developed after several rounds of feedback.
  • Try to compose your email such that it isn’t mistaken for spam.  This probably means limiting links.
  • Recognize that your email may be identified as spam anyway.  Follow up with a short email from your personal email account, alerting the recipient that they have been sent an IsItOpenData email and it may be in their spam filter.
  • In the main email AND the personal email, highlight (in a central place in the main body of the email) that responses will be made public on the IsItOpenData site.  Emphasize this very clearly.  It is important, easily missed, and potentially very embarrassing if not clear.  I learned this the hard way.
  • Put the organization name in the email subject.  This will make your request easy to find in the enquiry list.
  • The “IsItOpenData” footer will automatically be appended to the bottom of your email.
  • Send the original email through the IsItOpenData site using the Make Enquiry link.  This email will be sent with an IsItOpenData reply address.  You will receive a copy of this email, I think as a bcc: recipient.
  • If people reply to the original email, replies will be automatically posted onto the website.  IsItOpenData? will email you an alert that you received a response.  Note that these alerts may be considered spam by your email program.
  • If publishers write back to your personal email address, send them an email thanking them and asking them to confirm that you may post their reply to the website.  If they agree, log back into the original enquiry on IsItOpenData and use “FollowUp” to send them another post, thanking them, with their response appended to the bottom.  This will archive the response at IsItOpenData.  example.
  • Be sure to sincerely thank the respondents.  Articulating these policies is not easy.
  • You will be able to change the Enquiry Status of enquiries that you initiate.
  • You know this already: keep the tone respectful, since the goal of IsItOpenData is to understand current policy.  Lobbying for more open policies is a different task for a different time and place.
  • Questions?  Email the OKFN open science list.  We’re friendly and will help out 🙂  Or have feedback (questions, bugs, etc)? Email info@okfn.org.
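
Several of the tips above can be turned into a quick pre-send check on a draft. The following is a minimal Python sketch and is not part of the IsItOpenData site: the function name `check_enquiry`, the link limit of 3, and the exact phrase it searches for are all assumptions chosen for illustration.

```python
import re
from typing import List

# Assumption: the tips only say to limit links; 3 is an arbitrary threshold.
MAX_LINKS = 3
# Assumption: any clear public-posting notice would do; this phrase is one example.
PUBLIC_NOTICE = "will be made public"


def check_enquiry(subject: str, body: str, organization: str) -> List[str]:
    """Return a list of warnings for a draft enquiry email (empty if none)."""
    warnings = []
    # Tip: put the organization name in the email subject.
    if organization.lower() not in subject.lower():
        warnings.append("subject does not name the organization")
    # Tip: limit links so the email is less likely to be mistaken for spam.
    link_count = len(re.findall(r"https?://", body))
    if link_count > MAX_LINKS:
        warnings.append(f"body has {link_count} links (more than {MAX_LINKS})")
    # Tip: state clearly that responses will be made public.
    if PUBLIC_NOTICE not in body.lower():
        warnings.append("body does not say responses will be made public")
    return warnings
```

An empty list means the draft passes these three checks; it says nothing about tone or content, which still need a human read.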

That’s it!  If you have more tips or suggestions, please add them in the comments below.

ETA:  Tips about updates from IsItOpen

Nature’s response to IsItOpenData

- November 26, 2010 in IsItOpenData

Thanks to Nature for responding to our IsItOpenData enquiry.  The enquiry is archived at the IsItOpenData website, and the response is appended below.

All three publishers to whom we sent enquiries in late August have now responded:  BMC, PLoS and Nature.  We appreciate the time they have taken to make their policies clear.


Many thanks for your email and, as a commercial publisher interested in ensuring scientific data is easily accessible and the rights of the author are preserved, we are happy to respond to your questions.

Where we have been unsure of your terminology we have referred back to the statement of the Association of Learned and Professional Society Publishers (ALPSP) and the International Association of Scientific, Technical and Medical Publishers (STM) (http://www.alpsp.org/ForceDownload.asp?id=129) when providing our answers:

1) YES – Supplementary Information is published under a non-exclusive license across all Nature-branded journals.

2) YES – Providing it is only data extracted and not text, such as figure legends.

3) YES – Providing it is only data extracted and not text.

4) YES – If substantial data is taken from a paper we believe there is a potential obligation on the extractor to request permission from the author (of the paper) about the re-use of their data, and a requirement to credit the original author.

5) YES – If substantial data is taken from a paper we believe there is a potential obligation on the extractor to request permission from the author (of the paper) about the re-use of their data, and a requirement to credit the original author.

6) YES/NO – Whilst all data in papers published under a CC license is available for downloading and reuse, data in subscription content (generally available via a Site License) is not. This is due to the significant traffic nature.com receives 24 hours a day, seven days a week from millions of scientists around the globe. If a user wishes to download data from content on nature.com, in a systematic way, please contact Jessica Rutt at NPG (j.rutt@nature.com). NPG specifically permits downloading and mining of data in archived NPG content on UKPMC.

7) YES – this is something we would consider if our answers to point 6 do not preclude us.

I hope this helps but if you need any more information please do not hesitate to contact me.

Jason

Dr Jason Wilde CPhys

Publishing Director

Nature Publishing Group

BMC and PLoS: All the data is CC-BY. Enjoy!

- September 13, 2010 in IsItOpenData

Great news:  our recent Is It Open Data? enquiries received rapid, clear, and positive responses from BMC and PLoS.

You can see the responses (or, due to technical difficulties, in some cases restatements of the responses) on the IsItOpenData site.

In summary:  research data published by PLoS and BMC, whether embedded in an article or as supplementary information, is available under CC-BY: it is available for use “without discrimination against users, groups, or fields of endeavor” with attribution.

Although this may seem obvious given the clear CC-BY terms BMC and PLoS apply to articles, the license of supplementary information was not explicit on their websites (aside from perhaps being part of “The Work” as discussed on copyright pages, and an explicit mention in “All articles and accompanying materials” in PLoS’s Terms Of Use page).

Both BMC and PLoS explicitly stated that they support the Panton Principles in theory.  Nonetheless, both require attribution for reuse and redistribution, and thus their data reuse and redistribution policies are not compatible with CC-Zero or other public domain licenses at this time.

Perhaps even more revelatory, however, is the fact that PLoS and BMC both welcome automated downloads of data they publish!  Work remains to clarify how this can be done without placing undue burden on their resources.
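
One common way to keep automated downloads from burdening a publisher's servers is to enforce a minimum delay between requests. The sketch below is illustrative only: the `PoliteFetcher` name, the 5-second default, and the injected `fetch` callable are assumptions, and neither BMC nor PLoS specified a rate in their responses, so any real limit should be agreed with the publisher.

```python
import time
from typing import Callable, Iterable, List, Optional


class PoliteFetcher:
    """Call `fetch` for each URL, waiting at least `min_interval` seconds between calls."""

    def __init__(self, fetch: Callable[[str], bytes], min_interval: float = 5.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.fetch = fetch                # does the actual download
        self.min_interval = min_interval  # assumed delay, not a publisher-stated value
        self.clock = clock
        self.sleep = sleep
        self._last_call: Optional[float] = None  # time of the previous request

    def get(self, url: str) -> bytes:
        if self._last_call is not None:
            # Sleep for whatever is left of the minimum interval.
            wait = self.min_interval - (self.clock() - self._last_call)
            if wait > 0:
                self.sleep(wait)
        self._last_call = self.clock()
        return self.fetch(url)

    def get_all(self, urls: Iterable[str]) -> List[bytes]:
        return [self.get(url) for url in urls]
```

In practice `fetch` could be something like `lambda url: urllib.request.urlopen(url).read()`; the clock and sleep functions are injectable only so the pacing can be tested without waiting.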

Public thanks to BMC and PLoS for giving our enquiries their time and attention, and for their leadership roles toward making scientific data fully Open.

Notes:  we haven’t had a response from Nature yet, though they did reply to say the enquiry is receiving attention.  Also, BMC has posted a draft statement on open data and they want your feedback!

For reference, here are the PLoS and BMC responses together, to make for a quick read:

  1. May users extract raw data and metadata (contextual facts about data collection) from supplementary information published in your journal?

BMC: Yes

PLoS:  Yes

  2. May users extract raw data and metadata from figures, tables, and text in the narrative of your published articles?

BMC:  Yes

PLoS:  Yes

  3. May users extract this information from freely available articles and supplementary information, as well as those that are available by subscription only? For the latter, users would obtain access through an existing subscription.

BMC:  Yes

PLoS:  Yes but all articles are OA

  4. May the extracted data be used as Open Data [1,2] without discrimination against users, groups, or fields of endeavor?

BMC:  Yes

PLoS:  Yes

  5. May users expose the extracted data as Open Data [1,2], in a manner consistent with the Panton Principles? Specifically, may they expose the extracted data on the internet under a Public Domain, PDDL (http://www.opendatacommons.org/licenses/pddl/) or CC0 license (http://wiki.creativecommons.org/CC0)?

BMC:  All BMC research content is published under the Creative Commons Attribution Licence (http://www.biomedcentral.com/info/about/openaccess), meaning that the copyrightable material within it can be freely redistributed as long as attribution is given. We recognize that the copyright-ability of data/facts varies by jurisdiction, creating potential obstacles to reuse, and so we support the Panton Principles goal of explicit open licensing of data, putting it into the public domain to ensure maximum interoperability. This is particularly necessary because providing full attribution for all facts/data in a large collection may not be practical. Putting the Panton Principles into practice needs to be done in careful consultation with the scientific community to ensure that researchers still receive appropriate credit for their contributions. Rather than restricting access to data through restrictive licensing terms, cultural norms need to be defined for the assignment of credit, priority with respect to initial publication and the determination of reasonable embargo periods. Field such as astronomy, economics and genomics have already made significant progress in this direction. BioMed Central has drafted a position statement on data sharing, Open Data and licensing, and we invite the wider scientific community to join the discussion to help us define an explicit Open Data licensing policy going forwards.

PLoS:  All the content that we publish, including datasets and so on, is made available under the terms of CCAL, and therefore reusable with attribution. We haven’t yet introduced an explicit statement about data being reusable under the CC zero waiver.

  6. May users obtain articles and supplementary materials (other than audio and video) from your website via automated means for the purposes of extracting raw data, if it is done in a manner that does not place undue burden on your resources? Users would obtain access through an existing subscription where necessary.

BMC:  Yes

PLoS:  Yes — there might be a need to discuss this with our IT folks, to ensure that the performance of the site is not compromised etc.

  7. Will you consider displaying the OKF’s “Open Data” button – http://opendefinition.org/buttons – as a means of clarifying to readers and users the Open parts of your material?

BMC:  Yes – we already do.

PLoS:  Yes – we will consider this. We are always looking for ways to improve the way in which PLoS content is presented to emphasize that it can be reused.

Dear Publisher, is the data open?

- September 13, 2010 in IsItOpenData

Hi all.  I’ve recently been using the OKF’s Is It Open Data? service to enquire about the openness of several data sources.  This is the first in a series of posts summarizing these enquiries and findings.

Background links on Is It Open Data?:

In August 2010, motivated by the GreenChainReaction project, Peter Murray-Rust suggested initiating several open data enquiries on behalf of the OKF.  We drafted the email enquiry text publicly, benefitting from community input.

Below is a summary of our initial enquiry (tweaked repost from my personal blog).  Stay tuned (ok, click here) to read about the publisher responses!

—————

Publishers make article text available under a variety of copyright terms. Data, however, are not copyrightable. So what are we allowed to do with them, these datums and datasets within and beside article text? It isn’t clear. Few publisher sites say. It matters. So let’s ask.

On behalf of the Open Knowledge Foundation and benefitting from very useful feedback from a number of colleagues, Peter Murray-Rust and I recently sent email to PLoS, BMC, and Nature, asking them to confirm the openness of their data. The email is below. All email queries and responses can be browsed at the Is It Open Data website. Furthermore, you can feel free to initiate your own enquiry from there. (And we’d love volunteers to help tweak the code to make the enquiry site even more useful.)

Peter Murray-Rust will highlight the responses-to-date in the #solo10 Green Chain Reaction session at the Science Online London conference later this week.

While this effort won’t answer all surrounding questions, hopefully it will clarify a few policies, illuminate outstanding issues, and liberate some text and data mining efforts on the way.

Subject: Enquiry about data openness at [Publisher]

Dear [Publisher],

I’m a postdoc researcher with NESCent, studying scientific data sharing and reuse. I’m writing to you, with Peter Murray-Rust, on behalf of the Open Knowledge Foundation. The Open Knowledge Foundation (OKF) is a non-profit global organization dedicated to the creation, dissemination and labelling of Open Knowledge.

On behalf of the OKF, we are writing to a large number of science publishers to ask for confirmation of their policies with respect to data published within their journals.

There is now great public interest in the Open availability of scientific data for validating scientific findings, detecting fraud and exploring new hypotheses. It is generally accepted by publishers that data per se are not copyrightable: several statements by publisher associations have made this point explicitly. The Association of Learned and Professional Society Publishers (ALPSP) and International Association of Scientific, Technical, & Medical Publishers (STM) issued a joint statement in 2006 recommending that “research data should be as widely available as possible.” (http://www.alpsp.org/ForceDownload.asp?id=129) The 2007 Brussels Declaration from the STM states in part:

“Raw research data should be made freely available to all researchers.
Publishers encourage the public posting of the raw data outputs of research.
Sets or sub-sets of data that are submitted with a paper to a journal should
wherever possible be made freely accessible to other scholars.”
http://www.stm-assoc.org/public_affairs_brussels_declaration.php

Combined with the acceptance and increasingly widespread adoption of the Panton Principles (https://pantonprinciples.org/), it is now possible to articulate policies that are consistent with the publication and reuse of Open Data.

We would like to ask you for clarification on several points with respect to your journals. It will help everyone if your answers are clear so that users of your material can know what they may and may not do without requesting further permission.

1. May users extract raw data and metadata (contextual facts about data collection) from supplementary information published in your journal?

2. May users extract raw data and metadata from figures, tables, and text in the narrative of your published articles?

3. May users extract this information from freely available articles and supplementary information, as well as those that are available by subscription only? For the latter, users would obtain access through an existing subscription.

4. May the extracted data be used as Open Data [1,2] without discrimination against users, groups, or fields of endeavor?

5. May users expose the extracted data as Open Data [1,2], in a manner consistent with the Panton Principles (https://pantonprinciples.org/)? Specifically, may they expose the extracted data on the internet under a Public Domain, PDDL (http://www.opendatacommons.org/licenses/pddl/) or CC0 waiver (http://wiki.creativecommons.org/CC0)?

6. May users obtain articles and supplementary materials (other than audio and video) from your website via automated means for the purposes of extracting raw data, if it is done in a manner that does not place undue burden on your resources? Users would obtain access through an existing subscription where necessary.

7. Will you consider displaying the OKF’s “Open Data” button (http://opendefinition.org/button) as a means of clarifying to readers and users the Open parts of your material?

Our questions are being asked through the OKF’s IsItOpen(Data) service (http://www.isitopendata.org), which has been designed to clarify in what sense published and online datasets are actually open. IsItOpen(Data) saves everyone time by allowing a question to be asked just once and making the reply permanently visible in a high-profile site.

On behalf of the scientific community, thank you in advance for your response. The clear labelling of Openness will save scientists hundreds of years’ work per year in asking permission and speculating. Enabling open access data, both for use and reuse, will help to validate published findings, discourage fraud and misconduct, and explore new research areas. Your clear support for these principles will demonstrate the value you place on these activities and surely benefits science.

We look forward to hearing from you. Could you let us know the timeframe in which we might expect a response?

Sincerely,

Heather Piwowar, hpiwowar@nescent.org

Peter Murray-Rust, pm286@cam.ac.uk

on behalf of the Open Knowledge Foundation, https://okfn.org/

[1] http://www.opendefinition.org/1.0

[2] http://www.opendefinition.org/licenses/

Sent by “Is It Open Data?” http://isitopendata.org/ A service which helps scholars (and others) to request information about the status and licensing of data and content.

Disclaimer: This message and any reply that you make will be published on the internet for anyone to access and copy. For more information see:

http://isitopendata.org/about/

—————-

Posted by Heather Piwowar, postdoc researcher with the DataONE project, affiliated with NESCent, Dryad, and the Zoology Dept at the U of British Columbia. http://researchremix.org