BKBlog: 2009

Wednesday, 14 October 2009

Simple debugging of Ajax in Django

One of the great features of Django is that when something goes wrong with your code in debugging mode, you will see the traceback of the error and much other useful information directly in your browser window.

This is cool and helps you to quickly debug your code. The problem comes when you start to put Ajax into your site. Then the result of a request is not displayed directly in the window, and all you can see is the HTTP return code 500 in the log of your server.

In this short post I will show a simple yet useful solution to this problem. The goal is to get the error messages somehow. In my case, it is sufficient to read them from the log of the development server. To accomplish this, I wrote a simple decorator (wrapper) that you can apply to your view function (or anywhere else if you wish). Like this:


@console_debug
def my_awesome_view(req,....

The code for the decorator, which I chose to place in a separate module (you can also place it somewhere in views.py), is as follows:


import sys
import traceback

def console_debug(f):
  def x(*args, **kw):
    try:
      ret = f(*args, **kw)
    except Exception, e:
      print >> sys.stderr, "ERROR:", str(e)
      exc_type, exc_value, tb = sys.exc_info()
      message = "Type: %s\nValue: %s\nTraceback:\n\n%s" % (exc_type, exc_value, "\n".join(traceback.format_tb(tb)))
      print >> sys.stderr, message
      raise
    else:
      return ret
  return x
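The decorator above is written for Python 2. For readers on Python 3, here is a minimal sketch of the same idea. It is my adaptation, not a drop-in replacement: the name failing_view is hypothetical and exists only to trigger an error, and functools.wraps is added so the wrapped view keeps its original name.

```python
import functools
import sys
import traceback


def console_debug(f):
    """Print the full traceback of any exception to stderr, then re-raise it."""
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        try:
            return f(*args, **kwargs)
        except Exception:
            # The development server shows stderr in its console log.
            print("ERROR in %s:" % f.__name__, file=sys.stderr)
            traceback.print_exc(file=sys.stderr)
            raise
    return wrapper


@console_debug
def failing_view(request):
    # Hypothetical view used only to demonstrate the decorator.
    return 1 / 0
```

Calling failing_view(None) still raises ZeroDivisionError, so Django answers with HTTP 500 as before, but the traceback now also appears in the server log.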

Wednesday, 22 July 2009

Pure and Applied Chemistry now uses automatic links to the IUPAC Gold Book

For a few years now, we have been using an automated system to insert links between different terms defined in the IUPAC Gold Book.
Now, for the first time, we also applied this approach to an external source - the Pure and Applied Chemistry journal. The result is about 10 links on average added to each abstract.
The software we use to do this was recently released to the public (under the BSD license) under the name goldify. It comes in two different flavors - an offline version written in Java and a client-side library written in JavaScript. In the case of PAC, we use the offline version on the server to add links on the fly before the page is rendered and sent to the reader. Because the server runs on Django, we use the Python client library to talk to the Java server (both are included in the goldify package).
Any comments are welcome.
P.S. I almost forgot - the goldify library is by no means limited to the Gold Book - you can use it with any dictionary of terms.
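To give a rough idea of what dictionary-based linking means, here is a small Python sketch. It is not the goldify code or its API; the function name link_terms, the term dictionary and the URLs are all made up for illustration, and a real system has to deal with much more (plurals, markup already present in the text, and so on).

```python
import html
import re


def link_terms(text, term_urls):
    """Wrap every known term found in `text` in an HTML link."""
    if not term_urls:
        return html.escape(text)
    # Try longer terms first so "chemical bond" wins over plain "bond".
    terms = sorted(term_urls, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    lookup = {term.lower(): url for term, url in term_urls.items()}
    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the plain text between matches, then emit the link.
        parts.append(html.escape(text[last:match.start()]))
        url = lookup[match.group(0).lower()]
        parts.append('<a href="%s">%s</a>' % (
            html.escape(url, quote=True), html.escape(match.group(0))))
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)


# Hypothetical dictionary of terms and link targets.
TERMS = {"bond": "/terms/bond", "chemical bond": "/terms/chemical-bond"}
print(link_terms("The chemical bond is a bond.", TERMS))
```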

Monday, 22 June 2009

Structure search in the IUPAC Gold Book

Last week I published a new release of the online version of the IUPAC Compendium of Chemical Terminology (aka the Gold Book; goldbook.iupac.org).
One of the most interesting features of this new release is the structure search. Alongside the InChI and InChIKey metadata hidden in Gold Book pages and the ring index, this is another example of the benefits of having the structures stored in a semantic format (we use BKChem's format, which describes both semantics and presentation).
The structure search uses ChemAxon's Marvin Sketch plugin for the user drawing interface (thanks to ChemAxon for making this possible) and AJAX (through jQuery) to query the server, which runs a custom-built system based on OpenBabel and Pybel. The system consists of a small database which stores structures from the Gold Book and their fingerprints. The fingerprints are used in the screening process and the final hits are determined by using OpenBabel's SMARTS matcher.
Because the number of compounds in the Gold Book database is small (about 500 structures), it works very fast.
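The two-step idea (cheap fingerprint screening first, exact matching only on the survivors) can be sketched in plain Python. This is a toy illustration, not the Gold Book code: the fingerprints are built from made-up feature strings rather than by OpenBabel, and the exact_match callback is a stand-in for the SMARTS matcher.

```python
import zlib


def fingerprint(features, bits=64):
    """Fold a set of feature strings into an integer bit mask (toy fingerprint)."""
    fp = 0
    for feature in features:
        fp |= 1 << (zlib.crc32(feature.encode("utf-8")) % bits)
    return fp


def may_contain(target_fp, query_fp):
    """Screening test: every bit set in the query must also be set in the target."""
    return (target_fp & query_fp) == query_fp


def search(database, query_features, exact_match):
    """Return names of entries that pass the screen and the exact match."""
    query_fp = fingerprint(query_features)
    hits = []
    for name, (features, fp) in database.items():
        # The screen may let false positives through, never false negatives,
        # so the expensive exact check runs only on plausible candidates.
        if may_contain(fp, query_fp) and exact_match(features, query_features):
            hits.append(name)
    return sorted(hits)


# Hypothetical data: each structure is described by a set of fragments.
STRUCTURES = {
    "phenol": {"aromatic ring", "O-H"},
    "aniline": {"aromatic ring", "N-H"},
    "ethanol": {"O-H", "C-C"},
}
DATABASE = {name: (feats, fingerprint(feats)) for name, feats in STRUCTURES.items()}


def is_subset(features, query):
    # Stand-in for a real substructure matcher.
    return query <= features


print(search(DATABASE, {"aromatic ring"}, is_subset))
```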
Any comments are welcome.

Wednesday, 20 May 2009

Cairo PDF - text vs. curves

When I started to use Cairo, I was at first surprised and then pleased to find that the text in PDF export was converted into curves. However, I later found that Cairo also supports normal text and embedding of fonts in PDF. Because both can be useful in different situations, I would like to show in this post how both can be achieved in PyCairo (the Cairo bindings for Python).
The basic difference is in the PyCairo context object method you use to create the text. The two possibilities are:

show_text(text)
text_path(text)

The first one puts normal text into the drawing (and embeds fonts), while the second creates a path for the text, which can later be filled (or stroked), thus converting the text into curves.
The following code demonstrates this feature:


import cairo
width, height = 200, 100
# create the surface and context
surface = cairo.PDFSurface( "output.pdf", width, height)
context = cairo.Context( surface)
# white background
context.set_source_rgb( 1, 1, 1)
context.rectangle( 0, 0, width, height)
context.fill()
# draw the text
context.set_font_size( 32)
context.select_font_face( "Arial")
context.move_to( 10, 40)
context.set_source_rgb( 0, 0, 0.5)
# text as curves
context.text_path( "Hello curves")
context.fill()
context.move_to( 30, 80)
# text as text
context.show_text( "Hello text")
# finalization
context.show_page()
surface.finish()

When you run this code, you will get a small PDF document called output.pdf in which part of the text will be selectable as normal text and part will not, because it is converted to curves.
As mentioned, the text path can also be used for stroking, which produces a nice text outline.

Cairo and EPS format - a short detective story

The Cairo library is a great tool that allows generation of high-quality graphics in both bitmap (PNG) and vector (PDF, EPS, SVG) formats. It is multi-platform and has bindings for several scripting languages, including Python. For this reason I use it for export in both OASA and BKChem.
In this post I would like to show one feature of Cairo that recently backfired on me - fallback rendering.

As several users reported, the BKChem EPS export via Cairo was not what it was expected to be - it was a bitmap rendering of the drawing embedded into an EPS. I tried to find out what was going on, and after much googling I quite unexpectedly found a clue in the source of the EPS file - one magical word, "fallback". I googled "fallback" in relation to Cairo and found these release notes mentioning that the PostScript export uses fallback rendering in case translucency is used.
This led to the discovery that the BKChem Cairo export inserts some invisible (completely transparent) rectangles into the drawing (these are used internally in BKChem for text borders), and these cause the fallback to bitmap rendering (even though they are invisible in the output). By removing these from the export, I was able to make the EPS export work properly. The fix will be part of the next BKChem release.

As always, I hope that this post will be useful to other people possibly struggling with the same problem.
The lesson is clear - Watch out for invisible rectangles :)

Tuesday, 17 March 2009

Default values of keyword arguments in Python

Today I stumbled upon a piece of code on the web which showed an addition to the Django framework. The code itself is not important, but it contains one serious flaw that is hard to spot when you are not aware of the problem.
In Python you can give function arguments default values - such arguments are commonly called keyword arguments. It works like this:


def hi(name="there"):
  print "Hi %s." % name

If you call the function without arguments, the default value will be used, producing the output:


Hi there.

Otherwise the supplied value of name will be used.
Thus far nothing special and certainly nothing dangerous...
The problem is that the value of the default argument is not created afresh each time the function is executed, but only once, when the function object is created (typically when the module is loaded). While this is not a problem for numbers, strings and other immutable types, it has unexpected side effects for mutable types, such as lists or dictionaries. For these types the content of the default object is preserved between function calls.
The following code shows how this works:


def hi(param={}):
  print param
  param['test'] = 1

hi()
hi()

The result is:


{}
{'test': 1}

Because of this, when the function is called without an explicit value for such a keyword argument, it might not get an empty dictionary as expected, but a dictionary that was already populated by a previous run of the function, or even by some other part of the code if the dictionary was returned as the result of a previous function call.
Even though this feature could be exploited deliberately to preserve state between function calls, it is most often undesired and unexpected.
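As an illustration of the deliberate use, here is a small sketch (written for Python 3, unlike the Python 2 snippets in this post) in which a mutable default serves as a cache that survives between calls. The function fib and the parameter name _cache are my own choices for this example.

```python
def fib(n, _cache={0: 0, 1: 1}):
    """Return the n-th Fibonacci number, remembering results between calls."""
    if n not in _cache:
        # The same dictionary object is reused on every call,
        # so each value is computed only once.
        _cache[n] = fib(n - 1) + fib(n - 2)
    return _cache[n]


print(fib(30))                   # 832040
print(len(fib.__defaults__[0]))  # 31 cached values (for n = 0 to 30)
```

The default object lives in fib.__defaults__, which is why it survives from one call to the next.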
There are several ways out of this problem. I prefer to use the following solution:


def hi(param=None):
  if param is None:
    param = {}
  print param
  param['test'] = 1

I believe that this problem is not well understood by many Python developers and belongs to the category of problems that you have to be bitten by to fully appreciate (at least that was my case). If this post saves at least one person from making the above mistake in their code, I will consider it worth the time it took me to write it :)

Sunday, 15 March 2009

Sending big files using Django

And now for something completely different...

Even though it is recommended not to serve static files using Django and to run a separate lightweight server (such as lighttpd) for this task, it is not uncommon that website systems written in Django sometimes need to send big files directly. In my case it was to restrict access to PDF files to specific IP addresses which are stored in the system's database.

The main reason you should not send static files directly from Django - by reading the content of the file and sending it out - is that the whole file would be read into memory before sending it. For larger files - in my case about one to several megabytes - this is very inefficient and could easily choke the system where 90% of the traffic is generated by those bulky PDFs.

My original idea was to serve this data like other static files and use some form of name mangling so that the user could not guess the right name. However, such security through obscurity just moves the problem somewhere else - once a user obtains the right URL, nothing restricts them from using it.

Because of this, I decided that it would be useful to find a way to serve static files from Django more effectively, even if I had to write it myself. Fortunately, I did not have to :)
After a quick Google search, I found this ticket on the Django website. It is an already approved patch that adds the HttpResponseSendFile function, which does exactly what I need - it sends static files very efficiently using the underlying system's optimized routines.
The patch attached to this ticket applied without problems to my Django 1.0.2 installation, and within fifteen minutes I was serving my static files to the world :)

I hope this information might be useful to other Django fans who stumble upon a similar problem.

Thursday, 26 February 2009

BKChem enters Trophees du Libre

I am proud to announce that BKChem has entered the competition for an important free software award - Trophees du Libre.

Monday, 23 February 2009

Find 9 differences

Can you find nine differences in the following two pictures?

The two pictures show the difference between double bonds whose second line is drawn using only the 2D coordinates of the end atoms and double bonds drawn using the complete 3D geometry.
Because chemical drawings, unlike real molecules, are often flat, there was never much need for me to implement the better drawing method, especially when there are always many more important things to do. However, this omission cropped up again when I rotated benzene in 3D during testing of new BKChem code that allows 3D rotation of a molecular fragment around a particular bond. Because I had had this feature on my to-do list for a long time anyway, I finally decided to dust off my basic knowledge of analytic geometry and implement it.
You can see the result in the pictures above.
Note: the pictures were created using OASA 0.12.7 and 0.13.0 respectively, and similar code is used in BKChem 0.13.0.
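For the curious, one way to do this kind of calculation is sketched below in Python 3. It is my own illustration of the geometry, not the code used in OASA or BKChem, and it assumes that a third atom (for example a ring neighbour) is available to define the plane in which the second line should lie. The offset is computed in 3D and only then projected to 2D by dropping the z coordinate.

```python
import math


def second_line_3d(a, b, ref, distance):
    """Return the 2D end points of the second line of a double bond a-b.

    a, b and ref are (x, y, z) tuples; ref is a neighbouring atom that
    defines the side (and plane) on which the second line is drawn.
    """
    bond = [q - p for p, q in zip(a, b)]
    to_ref = [q - p for p, q in zip(a, ref)]
    # Remove the component of to_ref that is parallel to the bond.
    scale = sum(u * v for u, v in zip(to_ref, bond)) / sum(u * u for u in bond)
    perp = [u - scale * v for u, v in zip(to_ref, bond)]
    norm = math.sqrt(sum(u * u for u in perp))
    if norm == 0:
        raise ValueError("reference atom lies on the bond axis")
    offset = [distance * u / norm for u in perp]
    # Project to the drawing plane by ignoring z.
    start = (a[0] + offset[0], a[1] + offset[1])
    end = (b[0] + offset[0], b[1] + offset[1])
    return start, end


# Flat molecule: the second line is shifted sideways as in a 2D drawing.
print(second_line_3d((0, 0, 0), (1, 0, 0), (0.5, 1, 0), 0.2))
# Same bond seen edge-on: the second line falls onto the first one.
print(second_line_3d((0, 0, 0), (1, 0, 0), (0.5, 0, 1), 0.2))
```

A purely 2D method would draw both cases with the same sideways shift, which is what makes a rotated benzene ring look wrong.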

Monday, 16 February 2009

How to embed PDF in LaTeX documents

In the previous entry on this blog, I showed that OASA can create pictures of molecules in PDF. However, a picture of a molecule as a separate PDF file is not very useful.
Today, I will show a snippet of LaTeX code that can be used to embed the exported PDF file inside a PDF document produced by pdflatex. (Please note that this post is not intended to be an in-depth tutorial; its main purpose is to point out that the possibility of using PDFs this way exists at all.)
The whole LaTeX source looks like this:


\documentclass[a4paper,12pt]{article}
\usepackage{graphics}
\usepackage[left=1.5cm,right=2cm,top=2cm]{geometry}

\begin{document}

\begin{center}
{\large Exam - \textbf{1A}}
\end{center}

Write down SMILES for the following molecule:

\includegraphics{test}

Name the following molecule:

\includegraphics{test2}

\end{document}

The most important commands are the following two - \usepackage{graphics} and \includegraphics{test} (please note that the name of the file to embed is used without the extension .pdf).
After running pdflatex on this code you will get a PDF file with the two images embedded inside (provided the two images exist in the same directory). The result looks like this; you can download the source files of this example here.

Monday, 9 February 2009

PDF and SVG support added to OASA

In the newest versions of OASA, I have added the possibility to create images not only in PNG, but also in SVG and PDF.
This functionality comes for free, courtesy of the Cairo library, and gives the user the possibility to create output much more suitable for printing.
Here is a small code example showing how to create a PDF export:


from oasa import smiles, cairo_out
mol = smiles.text_to_mol( "Oc1ccc(N)cc1Cl")
mol.normalize_bond_length( 30)
mol.remove_unimportant_hydrogens()
cairo_out.mol_to_cairo( mol, "example.pdf", format="pdf")

The output looks like this (of course, this is rendered to PNG, the PDF can be found here):

(Because SMILES was used as input, OASA generates the 2D structure for you.)
For SVG, just replace "pdf" with "svg" in the example above.
The cairo_out.mol_to_cairo function has many keyword arguments that allow changing almost every aspect of the output. For example, the following adds hydrogen symbols to hetero-atoms and changes the line width for bonds.


cairo_out.mol_to_cairo( mol, "example.svg", format="svg",
                     show_hydrogens_on_hetero=True,
                     line_width=2)

The output now looks like this (again in PNG, the SVG can be found here):

One big advantage of the SVG generated by Cairo is that all text is converted to curves. This means that the output will look the same on every machine, regardless of installed fonts.

Friday, 30 January 2009

Converting old InChI strings to the new standard InChI format

Today I was faced with the task of converting a list of InChI strings in the format used prior to the release of InChI 1.02 final to the new standard format.
Even though it would be a no-brainer to write a small script for OASA to do it, it seemed to me that the InChI software itself should allow something like this.
After some poking around the stdinchi-1 help messages, here is my solution:

cat test.inchi | stdinchi-1 -STDIO -InChI2Struct 2>/dev/null |
grep AuxInfo | stdinchi-1 -STDIO -InpAux 2>/dev/null |
grep InChI=

(the code was broken into multiple lines to display properly)

This command assumes you have the stdinchi-1 program in your path and that you are using a decent OS, with grep and cat commands available (or at least some approximation like Cygwin).
The input file test.inchi contains a list of InChIs - one per line. The output is in the same format.
The sequence of commands first converts the old InChI into the AuxInfo format used by the InChI program, then uses it as input to generate a new standard InChI.
Hopefully someone will find this information useful.
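If you would rather drive the same two steps from Python, a sketch might look like the following. It assumes Python 3.7 or newer and, like the shell version, the stdinchi-1 program in your path; the only options used are the ones from the command above, and the helper names (keep_lines, run_stdinchi, convert_to_standard_inchi) are my own.

```python
import subprocess


def keep_lines(text, needle):
    """Keep only the lines containing `needle` (the role grep plays above)."""
    return "\n".join(line for line in text.splitlines() if needle in line)


def run_stdinchi(input_text, *options):
    """Run stdinchi-1 on `input_text` via stdin and return its stdout."""
    result = subprocess.run(
        ["stdinchi-1", "-STDIO"] + list(options),
        input=input_text,
        capture_output=True,  # stderr is captured and ignored, like 2>/dev/null
        text=True,
        check=False,
    )
    return result.stdout


def convert_to_standard_inchi(old_inchis):
    """Convert old InChI strings (one per line) to standard InChI strings."""
    aux_info = keep_lines(run_stdinchi(old_inchis, "-InChI2Struct"), "AuxInfo")
    return keep_lines(run_stdinchi(aux_info + "\n", "-InpAux"), "InChI=")
```

Reading test.inchi and passing its content to convert_to_standard_inchi should then give the same result as the shell pipeline.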
