Tag Archives: django

Django performance tip: select_related()

I was optimizing a Django application I’m working on the other day using Simon Willison’s excellent DebugFooter middleware, which adds a footer to each page showing which SQL queries were executed by Django when generating the page.

I’m a bit of a caching addict so I had already added a caching layer on top of my models, and thus I was quite surprised to find that the most important page on the site still generated 5-15 SQL queries on every access, even though the objects it was accessing supposedly were cached.

The objects were indeed cached, but every time I was accessing one of the ForeignKey fields on the model objects Django generated a SQL query to find the data for the related object. This could quickly turn nasty if you follow such relationships in a loop on a high-traffic web site.

The solution was the select_related() QuerySet method. Borrowing from the Django documentation, a normal ORM lookup would look like this:

e = Entry.objects.get(id=5)
b = e.blog

This would generate two SQL queries, one to fetch the entry object and one to fetch the blog object once it’s referred to from the entry object. The same example with select_related() becomes:

e = Entry.objects.select_related().get(id=5)
b = e.blog

This example only generates one SQL query, albeit bigger and slower than each of the individual queries in the first example because of the necessary join between the model tables to find all the data in one go. However, this doesn’t matter if the fetched object will go directly into a cache anyway and stay there for a possibly rather long time, which was the case for me.

Django shortcomings and Facebook architecture

I’ve watched two presentations lately that I enjoyed, so I thought I’d link to them here.

The first one is by Cal Henderson at DjangoCon 2008. Cal is an engineering manager at Flickr, which not surprisingly is written in PHP, and he delivered a keynote address on why he hates Django.

Although made tongue-in-cheek, it contains a bunch of very valid points about Django. One of the main ones being Django’s monolithic database approach. This is probably also my own biggest concern with Django. I have first-hand experience of making this design mistake for a web site that grew rather big, and it can easily turn into a major and prolonged headache.

The other presentation is by Aditya Agarwal, an engineering director at Facebook, at QCon SF 2008. Aditya talks about the Facebook software stack, which somewhat crudely described is a normal LAMP stack, albeit heavily tuned, backed by memcached and a number of backend services. Facebook is obviously a very extreme environment but many of the design choices and observations in this presentation are valid for smaller sites too.

Multi-language content spots with django-chunks and friends

Update March 3: django-better-chunks has now been patched, so there is no need to use my project fork anymore.

In my previous post I wrote about how to use flatpages in Django for serving static content pages. The flatpages module only deals with full pages however, so if you would like to include static content on a more fine-grained level, or have multiple content spots per template, you need to look elsewhere.

This is where django-chunks, or one of the many projects forked from it (e.g. django-better-chunks and django-flatblocks), comes to the rescue. django-chunks allows you to for example create a content spot called “home_page_right” in admin, and then include it in your template like this:

<div id="right">
    {% chunk "home_page_right" %}
</div>

The chunk tag also accepts an optional second parameter that specifies a cache timeout in seconds, e.g. 3600 for an hour’s caching.

So far so good, but a theme of some of my previous posts has been multi-language support, and unfortunately django-chunks is lacking this. Luckily, django-better-chunks was created to remedy just that. Unluckily though, while adding language support they also broke the cache support.

That being said, django-better-chunks is the only module I’ve found that does support multiple languages, so I felt it would be the best project for me to build on. To fix the broken caching I’ve created yet another fork of this project: django-better-chunks-devdoodles. Hopefully the patch can be merged into the original project at some point. Here’s how to get django-better-chunks (or my fork) up and running:

Step 1 — download and install
Download django-better-chunks or, perhaps preferably until the caching is fixed, django-better-chunks-devdoodles and install in your Python/Django path.

Step 2 — edit settings.py
As usual, you need to add LocaleMiddleware to your middleware classes if you want Django to automatically choose which language to use:

MIDDLEWARE_CLASSES = (
    ...
    'django.middleware.locale.LocaleMiddleware',
)

Then add contrib.sites (there by default), admin, and chunks to your installed applications:

INSTALLED_APPS = (
    ...
    'django.contrib.sites',
    'django.contrib.admin',
    'chunks',
)

contrib.sites is needed since django-better-chunks connect the content chunks to a site through a ForeignKey relationship. Admin is needed to add and edit content spots.

Optionally, if you want to use another caching backend for chunks than the default local-memory cache (locmem://), then you add it here too. To use a local memcached cache on the default port instead, add:

CACHE_BACKEND = 'memcached://127.0.0.1:11211/'

Step 3 — activate admin in urls.py
Uncomment the three lines needed to activate admin in urls.py.

Step 4 — sync database
Create database tables:

python manage.py syncdb

Step 5 — create spots in admin
At this point all that remains is to create the content spots in the Chunks section in admin (/admin/chunks/chunk/) and start using them from your templates. Make sure to load the chunk tag library first using:

{% load chunks %}

Finally, beware that the help text in the admin UI gives as language examples ‘sv-se’ and ‘de-de’, which often won’t work well with Django’s automatic language detection. This is because Django by default almost always resolves to base languages (e.g. ‘en’) and not sublanguages (e.g. ‘en-us’), since most of the languages defined in the LANGUAGES setting in global_settings.py are defined only as base languages.

Static content with django-multilingual flatpages

While working on a multilingual Django project I encountered the need to have pseudo-static pages, e.g. an about page or FAQ page, translated into multiple languages. In my earlier post I wrote about how to use the i18n tag library in templates to handle translations, and although this approach would work for static pages too it would not be a perfect fit.

Django comes bundled with the flatpages application, which rather cleverly hooks into the 404 errors generated by Django when it cannot find a page and maps the requested URL to a database list. If there’s an entry for the requested URL, it shows the page stored in the database for that URL instead of the 404 page.

The bundled flatpages application has no inherent multi-language support, and I was pretty close to adapting it for my needs before Google came to the rescue. Obviously, this had already been done by someone, and it’s distributed as a part of the django-multilingual module, which is a generic module for having translated fields in Django models. Here’s what I did to get it up and running, based on the steps described on the project wiki.

Step 1 — install django-multilingual
Check out the Subversion trunk for the project as described in the wiki, and make the checked out module available for Python somehow. I just copied the multilingual sub-folder to my project folder as if it were my own application.

Step 2 — edit settings.py
Add the list of languages you want to support to settings.py, and mark English as the default language (through its tuple index). The LANGUAGES setting is actually already defined in global_settings.py, but I don’t want to support all those languages so I override it.

LANGUAGES = (
    ('en', 'English'),
    ('sv', 'Swedish'),
)
DEFAULT_LANGUAGE = 1

Add the multilingual context processor to TEMPLATE_CONTEXT_PROCESSORS. This setting is not included by default in your settings.py file, but the first four core processors below are set as default in the global settings (for reference, see here and here):

TEMPLATE_CONTEXT_PROCESSORS = (
    'django.core.context_processors.auth',
    'django.core.context_processors.debug',
    'django.core.context_processors.i18n',
    'django.core.context_processors.media',
    'multilingual.context_processors.multilingual',
)

Add the middleware classes in the order listed below to support language detection and for the actual mapping of 404s to flatpages to be triggered. Curiously, the FlatpageFallbackMiddleware is not mentioned in the official installation instructions, but you can deduce that it’s needed by its mentioning in the original flatpage documentation and, of course, the fact that nothing happens without it.

MIDDLEWARE_CLASSES = (
    ...
    'django.middleware.locale.LocaleMiddleware',
    'multilingual.middleware.DefaultLanguageMiddleware',
    'multilingual.flatpages.middleware.FlatpageFallbackMiddleware',
)

Add admin and the multilingual apps to INSTALLED_APPS:

INSTALLED_APPS = (
    ...
    'django.contrib.admin',
    'multilingual',
    'multilingual.flatpages',
)

Step 3 — activate admin in urls.py
Uncomment the three lines needed to activate admin in urls.py.

Step 4 — sync database
Create all database tables needed for admin and flatpages:

python manage.py syncdb

Step 5 — create template
Create a ./flatpages/ sub-folder in your template-root directory, and create a default.html template in it. This is the default template used for displaying the flatpages, but it can be overridden in admin (see the next step). Its context is populated by a flatpage variable with two fields: title and content. An example template is available here.

Step 6 — create pages in admin
Go to the flatpages section in your admin application (/admin/flatpages/multilingualflatpage/) and create all the pages and translations you want to serve using the flatpages application.

Step 7 — done!
We’re done! As before, you can test your work by modifying LANGUAGE_CODE in settings.py or changing the preferred-languages setting in your web browser.

User authentication with django-registration

Please note that this guide is not updated with instructions for the latest version of django-registration. Some things in the steps below may no longer be valid. Comments have been disabled.

The built-in user authentication system in Django is great, but unfortunately it lacks support for sending activation emails to newly registered users. Enter the django-registration application, which adds registration and account activation on top of Django’s standard views for user authentication.

Although certainly not the first of its kind, this post will cover the steps I took to get it up and running for a freshly created Django project. Other tutorials are available here, here and here, as well as in the official documentation.

Step 1 — install django-registration
Instructions on how to install django-registration are covered nicely by the official overview document. You can also simply copy the registration folder directly to your project folder, which enables you to modify the contents of the package specifically for your project, should you wish to do so.

Step 2 — update settings.py
Add the registration application to the INSTALLED_APPS tuple in settings.py. Also add django.contrib.admin if you want to make use of Django’s admin system to handle user accounts (of course you do!). It might look like this:

INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.sites',
    'django.contrib.admin',
    'registration'
)

I also added the following settings:

ACCOUNT_ACTIVATION_DAYS = 2
EMAIL_HOST = 'localhost'
DEFAULT_FROM_EMAIL = 'webmaster@localhost'
LOGIN_REDIRECT_URL = '/'

Strictly speaking, only the first setting is required. It controls how many days emailed activation keys are valid.

EMAIL_HOST should be set to whatever host name your mail server is on. It defaults to ‘localhost’, but it’s explicitly set in the above example for clarity. You should also change the DEFAULT_FROM_EMAIL setting to show a proper sender email address for your activation emails.

Finally, LOGIN_REDIRECT_URL controls where a user is redirected after successful login by the contrib.auth.login view. The default value /accounts/profile/ is fine if you intend to map a view to that URL, but django-registration doesn’t do this for us so we’ll just use ‘/’ for now.

Step 3 — setup database
In addition to the standard Django user models, django-registration needs an additional model (RegistrationProfile) for storing activation keys that are sent out. Set this model up in the database by running:

python manage.py syncdb

Step 4 — update urls.py
The root urls.py needs to be updated with mappings for the registration application and admin. django-registration maintains its own mappings inside ./registration/urls.py, so we just delegate to that file:

from django.conf.urls.defaults import *
from django.views.generic.simple import direct_to_template
from django.contrib import admin
admin.autodiscover()

urlpatterns = patterns('',
    (r'^admin/(.*)', admin.site.root),
    (r'^accounts/', include('registration.urls')),
    (r'^$', direct_to_template, 
            { 'template': 'index.html' }, 'index'),
)

This example also adds a mapping for the ‘/’ URL we added a redirect to in step 2, which directly forwards to an index.html template.

Step 5 — create view templates
All that remains now is to create rendering templates for the registration views. They should go into a ‘registration’ folder under your template root (TEMPLATE_DIRS in settings.py).

django-registration maps URLs to the standard django.contrib.auth.views, so the following templates need to be created:

login.html — user login form
logout.html — shown after a user has logged out
password_change_form.html — password change form
password_change_done.html — shown after successful password change
password_reset_form.html — ask user for email to send password-reset mail to
password_reset_email.html — template for password-reset mail
password_reset_done.html — shown after password-reset email has been sent
password_reset_confirm.html — ask user for new password after reset
password_reset_complete.html — shown after successful password reset

Note that the password_reset_confirm and password_reset_complete views are missing from the official documentation, but it’s possible to see how they can be used in the Django source code here, here, and here.

Additionally, the following templates specific to django-registration need to be created:

registration_form.html — user registration form
registration_complete.html — shown after a user has registered
activation_email_subject.txt — subject of activation email
activation_email.txt — template for activation email
activate.html — shown after a user has activated his account

I’ve created very basic example implementations of these templates that you can check out here.

Step 6 — change site name and domain
The email templates will normally output both the domain name and display name for your web site. To change the default value of “example.com” to the name of your web site you need to log in to the admin system and go to the Sites section (/admin/sites/site/), where you can edit this.

Step 7 — done
That should be all, I hope. 😉

Multi-language support in a Django project

The Django documentation on internationalization describes how to add multi-language support to your application. As it took me a few tries to get it right, here’s a rundown of what I did to add it to an existing project.

Step 1 — update settings.py
In settings.py, make sure that USE_I18N is set to True, which is the default. The LANGUAGE_CODE setting controls the default language for the site, so if you only need to support one language you set it here.

Then add the LocaleMiddleware class to your MIDDLEWARE_CLASSES, which for a clean project might look like this:

MIDDLEWARE_CLASSES = (
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.middleware.locale.LocaleMiddleware',
)

Note that the order of the middleware classes makes a difference. LocaleMiddleware is needed for Django to select the user-preferred language from data in the request (i.e. cookies or the Accept-Language http header).

Step 2 — add translation keys to templates
Every template that should have translation support must load the i18n tag library using the {% load %} template tag. Once loaded, you can use the {% trans %} tag to mark a string for translation. A minimalistic hello-world template could look like this:

{% load i18n %}
{% trans "Hello" %}

Step 3 — create language files
Once your templates are set up, it’s time to create the language files. This step depends a bit on which operating system you’re running. I use Ubuntu Linux, so that’s what I’ll cover.

You can have language files local to an application as well as global for all your site. To set up language files for e.g. English and Swedish, move to the root of your project (or the root of the application), and run:

mkdir locale
django-admin.py makemessages -l en
django-admin.py makemessages -l sv

Note that you need to create the locale directory manually before running makemessages, otherwise you will get an error message. Also, if you get this error message:

Error: errors happened while running xgettext on __init__.py
/bin/sh: xgettext: not found

… it’s because your Linux distribution is missing the xgettext program. In Ubuntu, it’s provided by the gettext package:

sudo apt-get install gettext

Now that the language files are set up as ./locale/<language>/LC_MESSAGE/django.po, you can edit them and provide translations of the “Hello” key for each locale. When you’re done, they need to be compiled to .mo files before Django can use them:

django-admin.py compilemessages

Step 4 — done!
That’s all! You can try changing the LANGUAGE_CODE setting to switch between languages, or change the preferred-languages setting in your web browser (under Tools->Options->Content->Languages in Firefox), and Django should adapt automatically.

Django and memcached library confusion

I’ve been playing around a bit with Python and Django lately, and I’m pretty impressed with them both so far. Django brings some nice features out of the box to the developer, one of them being the cache framework and the built-in support for memcached. I’ve never actually used memcached before, so I thought this would be a good time to try it out.

Installing memcached itself went fairly smoothly, but I got a bit confused when it was time for the Python client library. According to the documentation, Django supports cmemcache and python-memcached, of which cmemcache is the recommended option since it’s built on top of the C library libmemcache and almost twice as fast as the pure Python library.

On the other hand, the library used by cmemcache, i.e. libmemcache, is according to the memcached documentation wiki and this thread no longer under active development, which the early-2006 timestamps of the file listings agree with. It’s a bit strange that a seemingly abandoned library is recommended by the Django documentation, especially since memcached surely must have evolved a bit during the last 3 years.

To the detriment of both libraries, neither seem to support the consistent hashing introduced in memcached by libketama. Consistent hashing makes it possible to add or remove memcached nodes to a live environment without having all keys become remapped to different servers, effectively nuking the cache. This will not be a problem for all environments of course, but it makes dynamic scaling less attractive.

Consistent hashing is also supported by libmemcached (not to be confused with libmemcache), which is still actively maintained and used by python-libmemcached (not to be confused with python-memcached). Despite the low version number, it looks like a more appealing choice for Python clients, given that it’s actively developed, presumably fast, and supports consistent hashing. A pity that Django doesn’t support it.

So far this is just theory and speculation, of course, and only testing may tell what’s the best choice for a Django site. What’s the worst offense; the slower performance of python-memcached or the outdated underlying library of cmemcache? Does its age make any difference in practice? Thoughts, anyone?