Wednesday, October 14, 2009

Simple debugging of Ajax in Django

One of the great features of Django is that when something goes wrong with your code in debug mode, you see the traceback of the error and a lot of other useful information directly in your browser window.


This is cool and helps you to quickly debug your code. The problem comes when you start to put Ajax into your site. Then the result of a request is not displayed directly in the window, and all you can see is the HTTP status code 500 in the log of your server.


In this short post I will show a simple yet useful solution to this problem. The goal is to get at the error messages somehow. In my case, it is sufficient to read them from the log of the development server. To accomplish this, I wrote a simple decorator (wrapper) that you can apply to your view function (or anywhere else if you wish). Like this:



@console_debug
def my_awesome_view(req,....

The code for the decorator, which I chose to place in a separate module (you can also place it somewhere in views.py), is as follows:



import sys
import traceback

def console_debug(f):
    # wrap f so that any exception is printed to stderr before it propagates
    def x(*args, **kw):
        try:
            ret = f(*args, **kw)
        except Exception as e:
            sys.stderr.write("ERROR: %s\n" % e)
            exc_type, exc_value, tb = sys.exc_info()
            message = "Type: %s\nValue: %s\nTraceback:\n\n%s" % (exc_type, exc_value, "\n".join(traceback.format_tb(tb)))
            sys.stderr.write(message + "\n")
            # re-raise so that Django still answers with its usual error response
            raise
        else:
            return ret
    return x
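The same idea can be written a little more compactly. The following is only a sketch of a variant, not a replacement for the decorator above: the name log_traceback and the failing broken_view function are made up for illustration. It uses functools.wraps so that the wrapped view keeps its own name, and traceback.format_exc() to get the whole traceback as one string.

```python
import functools
import sys
import traceback


def log_traceback(f):
    """Hypothetical variant of console_debug: log the traceback, then re-raise."""
    @functools.wraps(f)  # keep the original function's __name__ and docstring
    def wrapper(*args, **kwargs):
        try:
            return f(*args, **kwargs)
        except Exception:
            # format_exc() returns the traceback of the exception being handled
            sys.stderr.write(traceback.format_exc())
            raise
    return wrapper


@log_traceback
def broken_view(request):
    # Stand-in for a view that fails; any callable can be wrapped the same way
    return 1 / 0
```

Calling broken_view(None) still raises ZeroDivisionError, but the traceback is written to stderr first, which is where the development server log ends up.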

Wednesday, July 22, 2009

Pure and Applied Chemistry now uses automatic links to the IUPAC Gold Book

For a few years now, we have been using an automated system to insert links between different terms defined in the IUPAC Gold Book.
Now, for the first time, we have also applied this approach to an external source - the Pure and Applied Chemistry journal. The result is about 10 links on average added to each abstract.
The software we use to do this was recently released to the public (under the BSD license) under the name goldify. It comes in two different flavors - an offline version written in Java and a client-side library written in JavaScript. In the case of PAC, we use the offline version on the server to add links on the fly before the page is rendered and sent to the reader. Because the server runs on Django, we use the Python client library to talk to the Java server (both are included in the goldify package).
Any comments are welcome.
P.S. I almost forgot - the goldify library is by no means limited to the Gold Book; you can use it with any dictionary of terms.
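To give a rough idea of what "adding links for a dictionary of terms" means, here is a toy sketch in Python. It is not goldify's code or algorithm; the function name link_terms, the example terms and the URLs are all invented for illustration. It simply wraps every known term found in a text in a link, preferring the longest term when several overlap.

```python
import re


def link_terms(text, terms):
    """Wrap each known term in text in an HTML link.

    terms maps a term to the URL of its definition (hypothetical data).
    """
    if not terms:
        return text
    # Longest terms first, so "activation energy" wins over "energy"
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(%s)\b" % "|".join(re.escape(term) for term in ordered),
        re.IGNORECASE,
    )
    lookup = {term.lower(): url for term, url in terms.items()}

    def replace(match):
        found = match.group(1)
        return '<a href="%s">%s</a>' % (lookup[found.lower()], found)

    return pattern.sub(replace, text)


# Made-up dictionary of terms and relative URLs
example_terms = {"energy": "/terms/energy", "activation energy": "/terms/activation-energy"}
linked = link_terms("The activation energy is high.", example_terms)
```

A real system has more to worry about, for instance not touching text that is already inside a link or a tag, which this sketch ignores.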

Monday, June 22, 2009

Structure search in the IUPAC Gold Book

Last week I published a new release of the online version of the IUPAC Compendium of Chemical Terminology (aka the Gold Book; goldbook.iupac.org).
One of the most interesting features of this new release is the structure search. Alongside the InChI and InChIKey metadata hidden in GoldBook pages and the ring index, this is another example of the benefits of having the structures stored in a semantic format (we use BKChem's format, which describes both semantics and presentation).
The structure search uses ChemAxon's Marvin Sketch plugin for the user drawing interface (thanks to ChemAxon for making this possible) and AJAX (through jQuery) to query the server, which runs a custom-built system based on OpenBabel and Pybel. The system consists of a small database which stores structures from the GoldBook and their fingerprints. The fingerprints are used in the screening process, and the final hits are determined using OpenBabel's SMARTS matcher.
Because the number of compounds in the Gold Book database is small (about 500 structures), the search is very fast.
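The two-step idea (a cheap fingerprint screen followed by an exact match) can be illustrated without any chemistry toolkit. The sketch below is a toy model only: it does not use OpenBabel or Pybel, the "structures" are just made-up sets of feature names, and a plain subset test stands in for the SMARTS matcher. All names in it are hypothetical.

```python
import zlib


def toy_fingerprint(features, nbits=64):
    """Fold a set of feature names into an integer bit mask."""
    fp = 0
    for feature in features:
        fp |= 1 << (zlib.crc32(feature.encode("utf-8")) % nbits)
    return fp


def build_database(structures):
    """Store each structure's features together with its fingerprint."""
    return {
        name: (frozenset(features), toy_fingerprint(features))
        for name, features in structures.items()
    }


def search(query_features, database):
    """Return the names of all structures containing every query feature."""
    query = frozenset(query_features)
    query_fp = toy_fingerprint(query)
    hits = []
    for name, (features, fp) in database.items():
        # Screening: every bit set in the query must also be set in the target.
        # This can let false positives through, but never rejects a real hit.
        if (query_fp & fp) != query_fp:
            continue
        # Final check; the subset test stands in for the real substructure matcher
        if query <= features:
            hits.append(name)
    return sorted(hits)


# Made-up example data
database = build_database({
    "benzene": {"C", "aromatic ring"},
    "ethanol": {"C", "O", "hydroxyl"},
    "phenol": {"C", "O", "aromatic ring", "hydroxyl"},
})
```

The point of the screen is that the bit test is much cheaper than the exact match, so most non-matching structures are discarded before the expensive step runs.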
Any comments are welcome.

Wednesday, May 20, 2009

Cairo PDF - text vs. curves

When I started to use Cairo, I was at first surprised and then pleased to find that the text in PDF export was converted into curves. However, later I found that Cairo also supports normal text and embedding of fonts in PDF. Because both can be useful in different situations, I would like to show in this post how both can be achieved in PyCairo (the Python bindings for Cairo).
The basic difference is in the PyCairo context object method you use to create the text. The two possibilities are:
  • show_text(text)
  • text_path(text)

The first one puts normal text into the drawing (and embeds fonts), while the second creates a path for the text, which can later be filled (or stroked), thus converting the text into curves.
The following code demonstrates this feature:

import cairo
width, height = 200, 100
# create the surface and context
surface = cairo.PDFSurface( "output.pdf", width, height)
context = cairo.Context( surface)
# white background
context.set_source_rgb( 1, 1, 1)
context.rectangle( 0, 0, width, height)
context.fill()
# draw the text
context.set_font_size( 32)
context.select_font_face( "Arial")
context.move_to( 10, 40)
context.set_source_rgb( 0, 0, 0.5)
# text as curves
context.text_path( "Hello curves")
context.fill()
context.move_to( 30, 80)
# text as text
context.show_text( "Hello text")
# finalization
context.show_page()
surface.finish()

When you run this code, you will get a small PDF document called output.pdf in which part of the text will be selectable as normal text and part will not, because it is converted to curves.
As mentioned, the text path can also be used for stroking (calling stroke() instead of fill() on the context), which produces a nice text outline.

Cairo and EPS format - a short detective story

The Cairo library is a great tool that allows generation of high-quality graphics in both bitmap (PNG) and vector (PDF, EPS, SVG) formats. It is multi-platform and has bindings for several scripting languages, including Python. For this reason I use it for export in both OASA and BKChem.
In this post I would like to show one feature of Cairo that recently backfired on me - fallback rendering.

As was reported by several users, the BKChem EPS export via Cairo was not what it was expected to be - it was a bitmap rendering of the drawing embedded into an EPS. I tried to find out what was going on, and after much googling I quite unexpectedly found a clue in the source of the EPS file - one magical word, "fallback". I googled "fallback" in relation to Cairo and found release notes mentioning that the PostScript export uses fallback rendering when translucency is used.
This led to the discovery that the BKChem Cairo export inserts some invisible (completely transparent) rectangles into the drawing (these are used internally in BKChem for text borders), and these were causing the fallback to bitmap rendering (even though they are invisible in the output). By removing these from the export, I was able to make the EPS export work properly. The fix will be part of the next BKChem release.

As always, I hope that this post will be useful to other people struggling with the same problem.
The lesson is clear - watch out for invisible rectangles :)

Tuesday, March 17, 2009

Default values of keyword arguments in Python

Today I stumbled upon a piece of code on the web which showed an addition to the Django framework. The code itself is not important, but it contains one serious flaw that is hard to spot if you are not aware of the problem.
In Python you can give function arguments default values - such arguments are often called keyword arguments, although strictly speaking they are parameters with default values. It works like this:

def hi(name="there"):
    print("Hi %s." % name)

If you call the function without arguments, the default value will be used, producing the output:

Hi there.

Otherwise the supplied value of name will be used.
Thus far nothing special and certainly nothing dangerous...
The problem is that the value of the default argument is not created afresh each time the function is executed, but only once, when the function object is created (typically when the module is loaded). While this is not a problem for numbers, strings and other immutable types, it has unexpected side effects for mutable types, such as lists or dictionaries. For these types the content of the default object is preserved between function calls.
The following code shows how this works:

def hi(param={}):
    print(param)
    param['test'] = 1

hi()
hi()

The result is:

{}
{'test': 1}

Because of this, when the function is called without an explicit value for such a keyword argument, it might not get an empty dictionary as expected, but a dictionary that was already populated by a previous run of the function, or even by some other part of the code if the dictionary was returned as the result of a previous function call.
Even though this feature could be exploited consciously to preserve state between function calls, it is most often undesired and unexpected.
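The case where the default object escapes through the return value is the least obvious one, so here is a small sketch of it in Python 3. The function register and its arguments are invented for illustration.

```python
def register(name, registry={}):
    # registry is the one dictionary created when the function was defined
    registry[name] = True
    return registry


first = register("a")
# Some other part of the code modifies what it believes is its own dictionary
first["injected elsewhere"] = True
second = register("b")

# Both calls returned the very same object, stored on the function itself
same_object = first is second
default_object = register.__defaults__[0]
```

After this runs, second contains the keys "a", "injected elsewhere" and "b", and default_object is that same dictionary: the default lives on the function object in __defaults__, not in the call.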
There are several ways out of this problem. I prefer to use the following solution:

def hi(param=None):
    # create a fresh dictionary on every call that does not supply one
    if param is None:
        param = {}
    print(param)
    param['test'] = 1
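A quick way to convince yourself that the None sentinel works is to look at what successive calls hand back. In this sketch (Python 3) the function returns the dictionary instead of printing it, which is a change made here only so the result can be inspected, and it is renamed hi_fixed for the same reason. It tests with "is None", the identity check usually recommended for None.

```python
def hi_fixed(param=None):
    # A new dictionary is created on each call that omits param
    if param is None:
        param = {}
    param['test'] = 1
    return param


a = hi_fixed()
b = hi_fixed()
separate_objects = a is not b   # each call got its own dictionary

# A dictionary passed in explicitly is still used (and modified) as given,
# even when it is empty
shared = {}
hi_fixed(shared)
```

Testing for None rather than for emptiness ("if not param") matters in the last case: an empty dictionary passed in on purpose would otherwise be silently replaced by a new one.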

I believe that this problem is not well understood by many Python developers and belongs to the category of problems that you have to be bitten by to fully appreciate (at least that was the case for me). If this post saves at least one person from making the above mistake in their code, I will consider it worth the time it took me to write it :)

Sunday, March 15, 2009

Sending big files using Django

And now for something completely different...

Even though it is recommended not to serve static files using Django and to run a separate lightweight server (such as lighttpd) for this task, it is not uncommon that websites written in Django need to send big files directly. In my case the reason was to restrict access to PDF files to specific IP addresses, which are stored in the system's database.

The main reason you should not send static files directly from Django - by reading the content of the file and sending it out - is that the whole file would be read into memory before sending it. For larger files - in my case about one to several megabytes - this is very inefficient and could easily choke the system where 90% of the traffic is generated by those bulky PDFs.
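For comparison, the usual pure-Python way to avoid holding a whole file in memory is to read it in fixed-size chunks. The sketch below (Python 3) only illustrates that idea; it is not the approach of the patch discussed in this post, and the function name iter_file_chunks and the 64 KB chunk size are arbitrary choices made here.

```python
def iter_file_chunks(path, chunk_size=64 * 1024):
    """Yield the content of a file piece by piece instead of all at once."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                # An empty read means the end of the file was reached
                break
            yield chunk
```

At any moment only one chunk is held in memory, regardless of the file size. Whether that benefit survives depends on the code that consumes the iterator: if it collects all chunks before sending anything, the whole file ends up in memory again.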

My original idea was to serve this data like other static files and use some form of name mangling so that the user could not guess the right name. However, such security through obscurity just moves the problem somewhere else - once a user obtains the right URL, nothing restricts them from using it.

Because of this, I decided that it would be useful to find a way to serve static files from Django more efficiently, even if I had to write it myself. Fortunately, I did not have to :)
After a quick Google search, I found a ticket on the Django website. It contains an already approved patch that adds HttpResponseSendFile, which does exactly what I need - it sends static files very efficiently using the underlying system's optimized routines.
The patch attached to this ticket applied without problems to my Django 1.0.2 installation, and within fifteen minutes it was serving my static files to the world :)

I hope this information will be useful to other Django fans who stumble upon a similar problem.