Tuesday, March 17, 2009

Default values of keyword arguments in Python

Today I stumbled upon a piece of code on the web that showed an addition to the Django framework. The code itself is not important, but it contains one serious flaw that is hard to spot if you are not aware of the problem.
In Python you can give function arguments default values (such arguments are often loosely called keyword arguments). It works like this:

def hi(name="there"):
    print("Hi %s." % name)

If you call the function without arguments, the default value will be used, producing the output:

Hi there.

Otherwise the supplied value of name will be used.
Thus far nothing special and certainly nothing dangerous...
The problem is that the default value is not created afresh each time the function is called. It is evaluated only once, when the def statement is executed and the function object is created (typically when the module is loaded). While this is not a problem for numbers, strings and other immutable types, it has unexpected side effects for mutable types, such as lists or dictionaries. For these types, the content of the default object is preserved between function calls.
The following code shows how this works:

def hi(param={}):
    print(param)
    param['test'] = 1

hi()
hi()

The result is:

{}
{'test': 1}

Because of this, when the function is called without an explicit value for such a keyword argument, it might not get an empty dictionary as expected. Instead it gets a dictionary that was already populated by a previous run of the function, or even by some other part of the code if the dictionary was returned as the result of a previous call.
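The second case, where the default dictionary escapes through a return value, can be sketched as follows. The function name make_config and the 'debug' key are made up for this illustration:

```python
def make_config(options={}):
    # The same dictionary object is handed out on every call
    # that does not pass `options` explicitly.
    return options

first = make_config()
first['debug'] = True      # the caller modifies the returned dictionary...

second = make_config()     # ...and a later, unrelated call sees the change
print(second)
print(first is second)     # both names refer to the one shared default object
```

Here the function body never touches the dictionary at all; the modification comes entirely from outside, which is what makes this variant so hard to track down.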
Even though this feature could be exploited deliberately to preserve state between function calls, it is most often undesired and unexpected.
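As a rough sketch of such deliberate use, a mutable default can serve as a simple cache. The function fib and the parameter name _cache are only assumptions for this example, and callers are not supposed to pass _cache themselves:

```python
def fib(n, _cache={}):
    # `_cache` is created once and shared by all calls, so results
    # computed earlier are remembered between calls.
    if n < 2:
        return n
    if n not in _cache:
        _cache[n] = fib(n - 1) + fib(n - 2)
    return _cache[n]

print(fib(30))
```

This works, but it hides state in the function signature, so an explicit module-level dictionary or a class is usually easier for readers to follow.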
There are several ways out of this problem. I prefer to use the following solution:

def hi(param=None):
    # None is only a marker; a new dictionary is created on every call
    if param is None:
        param = {}
    print(param)
    param['test'] = 1
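To check that this pattern behaves as intended, here is a small sketch applying it to a list. The function collect and its parameter bucket are hypothetical names chosen for the example:

```python
def collect(item, bucket=None):
    # Create a fresh list whenever the caller did not supply one.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

print(collect(1))   # nothing is carried over from earlier calls
print(collect(2))

shared = []
collect(1, shared)  # an explicitly passed list is still modified in place
collect(2, shared)
print(shared)
```

Calls without the second argument no longer influence each other, while a caller who wants to accumulate values can still do so by passing a list explicitly.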

I believe that this problem is not well understood by many Python developers and belongs to the category of problems that you have to be bitten by to fully appreciate (at least that was my case). If this post saves at least one person from making the above mistake in their code, I will consider it worth the time it took me to write it :)

2 comments:

  1. Hi Beda,

    I had exactly the problem you describe above in my Django application. I had coded a function that mailed some users. The function had a list of extra recipients as a keyword argument.
    The thing is, people complained that they were getting e-mails that weren't meant for them. I spent several evenings debugging this and eventually found out that this was the cause. A Google search later I found your blog post...
    What bothered me most is that these keyword arguments are "cached" between multiple HTTP requests (I'm using mod_wsgi). This made it also particularly difficult to find the source of the problem.
    Anyway, thank you very much for documenting this!!!
    Now, I'm going to comfort those upset users :-)

    Regards,
    - Matthias

  2. Hi Matthias,
    thanks for your comment. I am happy that my post helped, even if it did not save you from the trouble.
    Beda
