Monday, June 22, 2009

Structure search in the IUPAC Gold Book

Last week I published a new release of the online IUPAC Compendium of Chemical Terminology (aka the Gold Book; goldbook.iupac.org).
One of the most interesting features of this new release is the structure search. Alongside the InChI and InChIKey metadata embedded in Gold Book pages and the ring index, it is another example of the benefits of storing the structures in a semantic format (we use BKChem's format, which describes both semantics and presentation).
The structure search uses ChemAxon's MarvinSketch plugin for the drawing interface (thanks to ChemAxon for making this possible) and AJAX (through jQuery) to query the server, which runs a custom-built system based on OpenBabel and Pybel. The system consists of a small database that stores the structures from the Gold Book along with their fingerprints. The fingerprints are used to screen candidates, and the final hits are determined with OpenBabel's SMARTS matcher.
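For readers who have not met the screen-then-verify idea before, here is a rough sketch of it in plain Python. It is not the Gold Book code: the real system gets its fingerprints and its final match from OpenBabel and Pybel, while this toy version describes each structure as a set of made-up feature names, folds them into a 64-bit mask, and uses a simple subset check as a stand-in for the SMARTS matcher. The names `FP_BITS`, `fingerprint`, `passes_screen`, `search` and the three sample entries are all hypothetical.

```python
from zlib import crc32

FP_BITS = 64  # hypothetical fingerprint width, chosen only for this sketch


def fingerprint(features):
    """Fold a set of feature strings into a fixed-width bitmask."""
    fp = 0
    for feature in features:
        fp |= 1 << (crc32(feature.encode("utf-8")) % FP_BITS)
    return fp


def passes_screen(query_fp, target_fp):
    """True if every bit set in the query is also set in the target.

    A real substructure hit always passes this test, but hash collisions
    can let non-hits through, so a second, exact check is still needed.
    """
    return query_fp & target_fp == query_fp


def search(query_features, database, verify):
    """Screen by fingerprint, then confirm each candidate with `verify`."""
    query_fp = fingerprint(query_features)
    candidates = [
        name for name, (fp, _) in database.items()
        if passes_screen(query_fp, fp)
    ]
    return [
        name for name in candidates
        if verify(query_features, database[name][1])
    ]


# Toy data: feature sets stand in for real chemical structures.
structures = {
    "benzene": {"aromatic_ring"},
    "phenol": {"aromatic_ring", "hydroxyl"},
    "ethanol": {"hydroxyl", "alkyl"},
}
# Fingerprints are computed once and stored next to the structures.
database = {name: (fingerprint(f), f) for name, f in structures.items()}


def subset_match(query, target):
    """Stand-in for the exact matcher (SMARTS matching in the real system)."""
    return query <= target


hits = search({"aromatic_ring", "hydroxyl"}, database, subset_match)
print(hits)
```

The point of the two stages is that the bitwise test is very cheap, so the expensive exact matcher only has to run on the few structures that survive the screen.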
Because the number of compounds in the Gold Book database is small (about 500 structures), the search is very fast.
Any comments are welcome.

