Wednesday, 13 May 2009

Wikipedia and CAS

*** OFFICIAL ***
The community of chemists who contribute to Wikipedia is happy to announce a novel collaboration with Chemical Abstracts Service, Inc. (CAS), a division of the American Chemical Society. Wikipedia is one of the top ten sources of online information; CAS is an acknowledged world leader in the provision of chemical information to professionals.

CAS has provided Wikipedia with access to some of its most widely used data – its CAS Registry Numbers®, which are recognized throughout the world as the most commonly used identifiers of chemical substances. The collaboration between CAS and Wikipedia provides a free and, more importantly, verified dataset of CAS Registry Numbers® for common substances for all users.

CAS has published 7800 CAS Registry Numbers®, along with a wide selection of synonyms for chemical names, on a new site. Wikipedia will continue in its role as a source of information for a wide audience (including professionals). The links between the two complementary systems will help to ensure a high quality of data for users of both sites, and both sides hope that their number will increase over the coming months.

The work leading up to this formal announcement has been going on for more than a year now. During that time, Wikipedia chemists have been able to audit the accuracy of the chemical data we present: our error rate before correction was comparable with printed compendiums of chemical data. Our aim is not to be authoritative, but to present a snapshot of knowledge in an accessible manner. We are sure that this new collaboration will help us make that snapshot even more accurate.


  1. I've been asked by a number of people about what the website is out to achieve. It allows validation of CAS number-name-structure relationships, with look-ups only possible for names and CAS numbers. These are of course validated. However, in my opinion they could go a lot further with minimal effort. Certainly growing the collection would be good, but why not allow downloads of molfiles, for example? Adding SMILES and InChIs is unlikely but would be useful. Adding links out to other resources such as ChemSpider and PubChem: why not, other than politics? And, most importantly, some licensing terms for the data. Is it Open, and can it be reused? I'd like to get the data to include in ChemSpider and link back to

  2. Well, in a way you'd have to ask CAS that! I can give you my opinion as someone who was involved on the WP side of the negotiations, but I cannot speak for CAS itself – or Wikipedia, for that matter!
    These data are published for people to use; that is quite obvious, as otherwise there'd be no point. I can't see any objection to ChemSpider linking to Common Chemistry; indeed, I think that Common Chemistry would like that.
    The individual data do not need licensing, as they cannot be copyrighted. The collection as a whole should be assumed to be under copyright: but, as it is hosted in the United States, a jurisdiction that does not support database rights, that should not be a big barrier to honest reuse.
    ChemSpider already hosts a huge number of CASRNs that it has legally obtained from many sources: the idea that it links a minute proportion of those CASRNs to Common Chemistry would seem to be common sense to me, not anything requiring lengthy lawyers' meetings.