Last week, Peter Murray-Rust of the University of Cambridge and Cameron Neylon of the University of Southampton met with some colleagues in a pub (the Panton Arms, pictured, just round the corner from the Chemistry Department in Cambridge). I don't know if they sampled the IPA or the Abbot, but it must have been good, as they came up with a concise plan for open access science which has been baptised the "Panton Principles". The currently accepted statement of the Panton Principles is as follows:
- A simple statement is required along the forms of “best practice in data publishing is to apply protocol X”. Not a broad selection of licenses with different effects, not a complex statement about what the options are, but “best practice is X”.
- The purpose of publishing public scientific data and collections of data, whether in the form of a paper, a patent, data publication, or deposition to a database, is to enable re-use and re-purposing of that data. Non-commercial terms prevent this in an unpredictable and unhelpful way. Share-alike and copyleft provisions have the potential to do the same under some circumstances.
- The scientific research community is governed by strong community norms, particularly with respect to attribution. If we could successfully expand these to include share-alike approaches as a community expectation that would obviate many concerns that people attempt to address via licensing.
- Explicit statements of the status of data are required and we need effective technical and legal infrastructure to make this easy for researchers.
It might not be clear from a first reading – I am making this analysis based on the blog comments of the people involved – but there are two important points here that, up until now, have been stumbling blocks in the discussion of open scientific data:
- The separation of the decision to publish from the question of open access to published data. Not all data can be published, for example data which identifies a specific person in clinical research. The scientific process knows how to deal with this, usually by making such data available to a couple of trusted outsiders (referees), on request and on the basis of confidentiality, and letting the referees vouch for its veracity or verisimilitude.
- The idea that "best practices" might be different in different domains. This is related to the point above, but also allows a healthy diversity in approaches adapted to different circumstances. Does a chemist really have to run (and publish) an NMR spectrum of every brown-tar reaction product, or will a photo suffice?!
A third, more technical point is that the Panton Principles eschew "non-commercial" and "share-alike" restrictions on licences. I agree with the authors' arguments on this one, but I fear that we've not heard the last of that argument.
So, where now? PMR (as ever!) has launched a challenge: can we (scientists committed to open access science) condense this into a single paragraph that anyone can understand? Actually, PMR gives the example of the Budapest Open Access Initiative, which is neither a single paragraph nor really quite as comprehensible to mere mortals as all that… Mind you, this blog is always up for a challenge, so here goes:
This data has been obtained and made public for the benefit of Society as a whole. Anyone may use it for any purpose so long as the source is acknowledged.
These two sentences could be preceded by a reference to a Code of Practice from a learned society or funding body (e.g., the BBSRC data sharing policy), or could be completed with a reference to a specific licence, e.g. CC0 or the PDDL. And all of this needs to be fitted in with the parallel process at Science Commons, however much that process appears to be reinventing the wheel…