BKBlog

Tuesday, 23 February 2010

Python IDE for Linux

Emacs

Since I started working with Python some 10 or more years ago, I have always used Emacs for coding. Some people complain that Emacs is a nice operating system that lacks a good editor, but I learned to like its power and (really, no kidding) simplicity.
To edit Python source code in Emacs, a dedicated module exists. It supports syntax highlighting and launching of scripts, has nice features for indentation and even a rudimentary class and function browser. Being still Emacs, it does not require you to touch the mouse for anything, which I found to be great for productivity. The only thing that I found inconvenient was debugging using pdb. Fortunately you can often find your way around this using print statements, so it did not bother me much.

Eclipse/Pydev

Last year I started working on goldify and because it was my first Java project, I decided to use some tool that would help me with the code. I used Eclipse and was delighted by its power, even though it is slightly more hungry for resources than I would like.
Inspired by this experience, I decided to try it for Python as well and installed Pydev. Of course, because of the dynamic nature of Python, it is not possible to offer as many functions (the type of a variable is almost never certain, so it is complicated to infer possible methods, etc.), or they are often limited to simple cases (I found that refactoring fails for me almost as often as it works). However, the general feeling, especially for larger projects, is much better than that of Emacs. I like the feature that allows you to select a word in the source code and have all the other occurrences of this term highlighted in the scrollbar. I also found myself using the debugger much more often because it is so easy. What really made me switch from Emacs to Eclipse for one of my new larger projects was the possibility to set up the key bindings to match those of Emacs. This way I get the best of both worlds - the key bindings I use without thinking and the ease of use of Eclipse and Pydev.

Wing IDE

Some time ago I stumbled upon an article about Python IDEs which mentioned Wing IDE. Because Wingware offers a free license for open source developers (kudos for such support of the community), I decided to give it a try for BKChem programming. I found the IDE to be very intuitive and similar in many ways to Pydev. One of the distinguishing features of Wing IDE is its lightweight nature. While Eclipse with 5-10 open Python files takes about 250 MB of my computer's memory, Wing IDE only needs 90 MB. The startup and closing times are also much shorter with Wing IDE. Unfortunately I was not able to find a way to emulate the feature I described above that allows quick jumps to other occurrences of the same term in a file. On the other hand, Wing IDE contains a useful feature called the "Source Assistant" which offers you instant information about the object on which you place the cursor. This is very useful, especially for built-in functions.

My (personal) conclusion

All in all, I found both Pydev and Wing IDE quite useful. For now, I will stick with Pydev (and Emacs for small scripts), mostly because I do both open source and commercial closed source programming and it would be inconvenient to switch between tools for each type of program. On the other hand, Wing IDE is also very nice and the price, even for a commercial license, is modest.

Wednesday, 14 October 2009

Simple debugging of Ajax in Django

One of the great features of Django is that when something goes wrong with your code in debugging mode, you will see the traceback of the error and a lot of other useful information directly in your browser's window.

This is cool and helps you to quickly debug your code. The problem comes when you start to put Ajax into your site. Then the result of a request is not directly displayed in the window and all you can see is the HTTP return code of 500 in the log of your server.

In this short post I will show a simple yet useful solution to this problem. The goal is to get the error messages somehow. In my case, it is sufficient to read them from the log of the development server. To accomplish this, I wrote a simple decorator (wrapper) that you can use on your view function (or anywhere else if you wish). Like this:


@console_debug
def my_awesome_view(req,....

The code for the decorator, which I chose to place in a separate module, but which you can also place somewhere in views.py, is as follows:


import sys
import traceback

def console_debug(f):
  def x(*args, **kw):
    try:
      ret = f(*args, **kw)
    except Exception, e:
      print >> sys.stderr, "ERROR:", str(e)
      exc_type, exc_value, tb = sys.exc_info()
      message = "Type: %s\nValue: %s\nTraceback:\n\n%s" % (exc_type, exc_value, "\n".join(traceback.format_tb(tb)))
      print >> sys.stderr, message
      raise
    else:
      return ret
  return x
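
For readers on Python 3, the same idea can be sketched as follows. This is only a possible variant, not the code used on the site: it assumes Python 3 syntax, uses functools.wraps (not in the original) so the wrapped view keeps its name, and the names wrapper and broken_view are made up for the illustration.

```python
import functools
import sys
import traceback


def console_debug(view):
    """Print the traceback of any exception raised by `view` to stderr."""
    @functools.wraps(view)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        try:
            return view(*args, **kwargs)
        except Exception:
            # format_exc() returns the traceback of the exception being handled
            print("ERROR in %s:" % view.__name__, file=sys.stderr)
            print(traceback.format_exc(), file=sys.stderr)
            raise  # re-raise so the caller still sees the failure
    return wrapper


@console_debug
def broken_view(request):  # hypothetical view, used only for illustration
    return 1 / 0
```

Calling broken_view(None) writes the ZeroDivisionError traceback to stderr and then raises the exception again, so the server still answers with the usual error response.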

Wednesday, 22 July 2009

Pure and Applied Chemistry now uses automatic links to the IUPAC Gold Book

For a few years now, we have been using an automated system to insert links between different terms defined in the IUPAC Gold Book.
Now, for the first time, we also applied this approach to an external source - the Pure and Applied Chemistry journal. The result is about 10 links on average added to each abstract.
The software we use to do this was recently released to the public (under the BSD license) under the name goldify. It comes in two different flavors - an offline version written in Java and a client-side library written in JavaScript. In the case of PAC we use the offline version on the server to add links on-the-fly before the page is rendered and sent to the reader. Because the server runs on Django, we use the Python client library to talk to the Java server (both are included in the goldify package).
Any comments are welcome.
P.S. I almost forgot - the goldify library is by no means limited to the Gold Book - you can use it with any dictionary of terms.
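
To make the idea of dictionary-driven linking more concrete, here is a minimal Python 3 sketch. It is not goldify's code or API; the function add_links, the TERMS dictionary and the example.org URLs are all invented for the illustration, and the input is assumed to be plain text rather than HTML.

```python
import re

# Hypothetical dictionary: term -> target URL (not real Gold Book entries).
TERMS = {
    "acid": "https://example.org/terms/acid",
    "Lewis acid": "https://example.org/terms/lewis-acid",
}


def add_links(text, terms):
    """Wrap every dictionary term found in plain `text` in an HTML link."""
    # Try longer terms first, so "Lewis acid" wins over the shorter "acid".
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(term) for term in ordered) + r")\b",
        re.IGNORECASE,
    )
    lookup = {term.lower(): url for term, url in terms.items()}

    def replace(match):
        url = lookup[match.group(0).lower()]
        return '<a href="%s">%s</a>' % (url, match.group(0))

    return pattern.sub(replace, text)


print(add_links("A Lewis acid is an acid.", TERMS))
```

Sorting the terms by length is the one design choice worth noting: without it, a short term could be linked inside a longer one that has its own entry.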

Monday, 22 June 2009

Structure search in the IUPAC Gold Book

Last week I published a new release of the online version of the IUPAC Compendium of Chemical Terminology (aka the Gold Book; goldbook.iupac.org).
One of the most interesting features of this new release is the structure search. Alongside the InChI and InChIKey metadata hidden in Gold Book pages and the ring index, this is another example of the benefits of having the structures stored in a semantic format (we use BKChem's format, which describes both semantics and presentation).
The structure search uses ChemAxon's Marvin Sketch plugin for the user drawing interface (thanks to ChemAxon for making this possible) and AJAX (through jQuery) to query the server, which runs a custom-built system based on OpenBabel and Pybel. The system consists of a small database which stores structures from the Gold Book and their fingerprints. The fingerprints are used in the screening process and the final hits are determined by using OpenBabel's SMARTS matcher.
Because the number of compounds in the Gold Book database is small (~ 500 structures), the search is very fast.
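
The two-step approach (cheap fingerprint screening first, exact matching only for the survivors) can be illustrated with a toy Python 3 sketch. Nothing here is the Gold Book code and no OpenBabel call is used: the fragment sets, the 64-bit fingerprint built with zlib.crc32, and the set-based exact_match that stands in for the SMARTS matcher are all assumptions made for the example.

```python
import zlib

# Toy "structures": name -> set of fragment labels (purely illustrative).
DATABASE = {
    "benzene": {"aromatic ring"},
    "phenol": {"aromatic ring", "hydroxyl"},
    "ethanol": {"hydroxyl", "ethyl"},
}


def fingerprint(fragments):
    """Fold fragment labels into a 64-bit mask (a toy fingerprint)."""
    bits = 0
    for fragment in fragments:
        bits |= 1 << (zlib.crc32(fragment.encode("ascii")) % 64)
    return bits


# Fingerprints are computed once and stored next to the structures.
INDEX = {name: fingerprint(frags) for name, frags in DATABASE.items()}


def passes_screen(query_fp, candidate_fp):
    """A candidate can contain the query only if it has all the query's bits."""
    return (query_fp & candidate_fp) == query_fp


def exact_match(query, fragments):
    """Stand-in for the expensive substructure matcher."""
    return query <= fragments


def search(query):
    query_fp = fingerprint(query)
    hits = []
    for name, candidate_fp in INDEX.items():
        if not passes_screen(query_fp, candidate_fp):
            continue  # rejected cheaply, the exact matcher never runs
        if exact_match(query, DATABASE[name]):
            hits.append(name)
    return sorted(hits)


print(search({"hydroxyl"}))
```

The screen may let a false positive through (two fragments can share a bit), but it never rejects a true hit, which is why the exact matcher still has the final word.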
Any comments are welcome.

Wednesday, 20 May 2009

Cairo PDF - text vs. curves

When I started to use Cairo, I was at first surprised and then pleased to find that the text in PDF export was converted into curves. However, I later found that Cairo also supports normal text and embedding of fonts in PDF. Because both can be useful in different situations, I would like to show in this post how both can be achieved in PyCairo (the Python bindings for Cairo).
The basic difference is in the PyCairo context object method you use to create the text. The two possibilities are:

show_text(text)
text_path(text)

The first one puts normal text into the drawing (and embeds fonts), while the second creates a path for the text, which can later be filled (or stroked), thus converting the text into curves.
The following code demonstrates this feature:


import cairo
width, height = 200, 100
# create the surface and context
surface = cairo.PDFSurface( "output.pdf", width, height)
context = cairo.Context( surface)
# white background
context.set_source_rgb( 1, 1, 1)
context.rectangle( 0, 0, width, height)
context.fill()
# draw the text
context.set_font_size( 32)
context.select_font_face( "Arial")
context.move_to( 10, 40)
context.set_source_rgb( 0, 0, 0.5)
# text as curves
context.text_path( "Hello curves")
context.fill()
context.move_to( 30, 80)
# text as text
context.show_text( "Hello text")
# finalization
context.show_page()
surface.finish()

When you run this code, you will get a small PDF document called output.pdf in which part of the text will be selectable as normal text and part will not, because it is converted to curves.
As mentioned, the text path can also be used for stroking (calling stroke() instead of fill() after text_path()), which produces a nice text outline.

Cairo and EPS format - a short detective story

The Cairo library is a great tool that allows generation of high-quality graphics in both bitmap (PNG) and vector (PDF, EPS, SVG) formats. It is multi-platform and has bindings for several scripting languages, including Python. For this reason I use it for export in both OASA and BKChem.
In this post I would like to show one feature of Cairo that recently backfired on me - fallback rendering.

As was reported by several users, the BKChem EPS export via Cairo was not what it was expected to be - it was a bitmap rendering of the drawing embedded into an EPS. I tried to find out what was going on and after much googling I quite unexpectedly found a clue in the source of the EPS file - one magical word, "fallback". I googled "fallback" in relation to Cairo and found release notes mentioning that the PostScript export uses fallback rendering when translucency is used.
This led to the discovery that the BKChem Cairo export inserts some invisible (completely transparent) rectangles into the drawing (these are used internally in BKChem for text borders) and that these cause the fallback to bitmap rendering (even though they are invisible in the output). By removing them from the export, I was able to make the EPS export work properly. The fix will be part of the next BKChem release.

As always, I hope that this post will be useful to other people possibly struggling with the same problem.
The lesson is clear - watch out for invisible rectangles :)

Tuesday, 17 March 2009

Default values of keyword arguments in Python

Today I stumbled upon a piece of code on the web which showed an addition to the Django framework. The code itself is not important, but it contains one serious flaw that is hard to spot when you are not aware of the problem.
In Python you can give function arguments default values - such arguments are commonly called keyword arguments. It works like this:


def hi(name="there"):
  print "Hi %s." % name

If you call the function without arguments, the default value will be used, producing the output:


Hi there.

Otherwise the supplied value of name will be used.
Thus far nothing special and certainly nothing dangerous...
The problem is that the value of the default argument is not created afresh each time the function is executed, but only once, when the function object is created (typically when the module is loaded). While this is not a problem for numbers, strings and other immutable types, it has unexpected side-effects for mutable types, such as lists or dictionaries. For these types the content of the default object is preserved between function calls.
The following code shows how this works:


def hi(param={}):
  print param
  param['test'] = 1

hi()
hi()

The result is:


{}
{'test': 1}

Because of this, when the function is called without an explicit value for such a keyword argument, it might not get an empty dictionary as expected, but a dictionary that was already populated by a previous run of the function, or even by some other part of the code if the dictionary was returned as the result of a previous function call.
Even though this feature could be exploited consciously to preserve state between function calls, it is most often undesired and unexpected.
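
The case where the default object is returned and then changed by the caller can be shown with a short Python 3 sketch. The function collect and its arguments are made up for this illustration; `__defaults__` is the standard attribute in which a function object keeps its default values.

```python
def collect(item, bucket=[]):  # the list is created once, at definition time
    bucket.append(item)
    return bucket              # hands the shared default object to the caller


first = collect("a")
first.append("changed by the caller")  # this mutates the default itself
second = collect("b")

print(second)                # ['a', 'changed by the caller', 'b']
print(first is second)       # True: both names point at the same list
print(collect.__defaults__)  # (['a', 'changed by the caller', 'b'],)
```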
There are several ways out of this problem. I prefer to use the following solution:


def hi(param=None):
  if param is None:
    param = {}
  print param
  param['test'] = 1
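
A quick way to convince yourself that this pattern works is to check that two calls return different objects. The Python 3 sketch below does that (with hi returning the dictionary instead of printing it) and also shows one of the other possible ways out, a private sentinel object, which helps when None itself is a meaningful value. The names `_MISSING` and describe are assumptions made for this example.

```python
def hi(param=None):
    if param is None:
        param = {}       # a fresh dictionary for every call
    param["test"] = 1
    return param


first = hi()
second = hi()
print(first is second)   # False: each call got its own dictionary

_MISSING = object()      # unique marker that no caller can pass by accident


def describe(value=_MISSING):
    if value is _MISSING:
        return "no argument given"
    return "got %r" % (value,)


print(describe())        # no argument given
print(describe(None))    # got None
```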

I believe that this problem is not well understood by many Python developers and belongs to the category of problems that you have to be bitten by to fully appreciate (at least that was my case). If this post could save at least one person from making the above mistake in their code, I would consider it worth the time it took me to write it :)