Using memcached with OpenStack Nova

By Everett Toews, Senior Developer, Edmonton

Using memcached with OpenStack Nova is easy, but if you're new to memcached (like me) there are a few things you need to be aware of.

First of all, what does Nova use memcached for?

At the time of writing, the only thing Nova caches is each user's roles. They are cached to speed up authentication, which happens on every request. Caching was added in particular to speed up authentication when LDAP is the authentication backend, since LDAP lookups can be slower than database lookups.
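To make the idea concrete, here is a minimal sketch of the cache-aside pattern described above. It is not Nova's actual code: RoleCache is an in-memory stand-in for memcached, and get_roles, slow_lookup and the "roles:" key prefix are hypothetical names chosen for illustration.

```python
import time


class RoleCache:
    """In-memory stand-in for memcached with a per-key expiry (illustrative only)."""

    def __init__(self, ttl_seconds=60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self._items[key]
            return None
        return value

    def set(self, key, value):
        self._items[key] = (value, self._clock() + self._ttl)


def get_roles(user_id, cache, slow_lookup):
    """Return the user's roles, calling slow_lookup (e.g. LDAP) only on a cache miss."""
    key = "roles:%s" % user_id  # hypothetical key format
    roles = cache.get(key)
    if roles is None:
        roles = slow_lookup(user_id)
        cache.set(key, roles)
    return roles
```

With a working cache, only the first request for a user pays for the slow lookup. If the cache cannot be reached, every request falls through to the slow path.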

Setup is relatively straightforward:

Management Node (nova-api, nova-network, nova-scheduler, nova-objectstore)

apt-get install memcached python-memcache
sed -i "s/127\.0\.0\.1/192.168.0.1/" /etc/memcached.conf
/etc/init.d/memcached restart
echo "--memcached_servers=192.168.0.1:11211" >> /etc/nova/nova.conf
restart nova-api; restart nova-network
restart nova-scheduler; restart nova-objectstore

Worker Nodes (nova-compute, nova-volume)

apt-get install python-memcache
echo "--memcached_servers=192.168.0.1:11211" >> /etc/nova/nova.conf
restart nova-compute; restart nova-volume

Or, to put it another way, setup is relatively straightforward once you've made all your mistakes and learned how to do it properly. The Cybera team ran into problems when we first started using memcached, but they didn't become apparent until we started working with a user other than the Cloud Administrator (cloudadmin) (see Managing Compute Users). With a normal user, we saw delays of 10 to 15 seconds on every request (e.g. euca-describe-images).

To begin troubleshooting, I ran nova-api under strace.

strace /usr/bin/python /usr/bin/nova-api --flagfile=/etc/nova/nova.conf

I found that nova-api was attempting to connect to memcached 3-4 times per request. Each connection attempt took 3-4 seconds, which accounted for the total delay we were experiencing.
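The same symptom can be checked outside of Nova by timing a single TCP connection attempt to the address Nova is configured to use. The sketch below uses only the Python standard library; time_connect is a hypothetical helper name, and the 5 second timeout is an arbitrary choice.

```python
import socket
import time


def time_connect(host, port, timeout=5.0):
    """Return (succeeded, elapsed_seconds) for one TCP connection attempt."""
    start = time.monotonic()
    try:
        # create_connection raises OSError on refusal, unreachable host or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            succeeded = True
    except OSError:
        succeeded = False
    return succeeded, time.monotonic() - start
```

For example, time_connect("192.168.0.1", 11211) should return almost immediately when memcached is listening on that address. A failure, or an attempt that runs for seconds, points at the address or the network rather than at Nova.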

Digging a bit deeper, I decided to look inside memcached to see what was going on. I found a great Python utility in the post View Memcached Keys During Development and used it to see what was being cached, which turned out to be nothing. Either we had misconfigured something or there was a bug in Nova. My money was on the former.
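If that utility isn't handy, a rough alternative is to ask memcached for its counters directly. The sketch below is not the utility from the post mentioned above. It sends the text-protocol stats command over a plain socket and parses the STAT lines in the reply. The function names fetch_stats and parse_stats are hypothetical.

```python
import socket


def parse_stats(raw):
    """Parse a reply to memcached's text-protocol `stats` command into a dict."""
    stats = {}
    for line in raw.decode("ascii", "replace").split("\r\n"):
        if line == "END":
            break
        parts = line.split(" ", 2)
        # Each data line has the form: STAT <name> <value>
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host, port=11211, timeout=2.0):
    """Connect to memcached, send `stats`, and return the parsed counters."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"stats\r\n")
        data = b""
        # The reply is terminated by a line containing only END.
        while not data.endswith(b"END\r\n"):
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_stats(data)
```

The curr_items counter in the result is the number of items currently stored, so fetch_stats(host)["curr_items"] staying at 0 while requests are being made suggests that nothing is reaching the cache.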

I used ps to check which IP address and port memcached was listening on.

ps aux | grep memcached

Memcached was running on 127.0.0.1 and port 11211 (the default) of our management node.

I checked what IP address and port Nova was configured to use in /etc/nova/nova.conf. It was looking for memcached on the public IP of our management node. That was the problem.
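The comparison I did by hand can be sketched as a small script. It assumes /etc/memcached.conf uses one option per line (-l for the listen address, -p for the port) and that nova.conf is a flag file containing a --memcached_servers=host:port line. All function names are hypothetical. Treating a missing -l as "listen on all interfaces" and letting the last --memcached_servers line win are assumptions too.

```python
def memcached_listen_address(conf_text, default_host="0.0.0.0", default_port=11211):
    """Extract (host, port) from the text of memcached.conf."""
    host, port = default_host, default_port
    for line in conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) == 2 and fields[0] == "-l":
            host = fields[1]
        elif len(fields) == 2 and fields[0] == "-p":
            port = int(fields[1])
    return host, port


def nova_memcached_servers(flagfile_text):
    """Extract a list of (host, port) pairs from the text of nova.conf."""
    servers = []
    for line in flagfile_text.splitlines():
        line = line.strip()
        if line.startswith("--memcached_servers="):
            servers = []  # assumption: a later occurrence overrides an earlier one
            for item in line.split("=", 1)[1].split(","):
                item = item.strip()
                if item:
                    host, sep, port = item.partition(":")
                    servers.append((host, int(port) if sep else 11211))
    return servers


def addresses_match(listen, server):
    """True if a client using `server` would hit the address memcached listens on."""
    listen_host, listen_port = listen
    return listen_port == server[1] and listen_host in ("0.0.0.0", server[0])
```

Feeding it the contents of both files would have flagged our mistake right away: memcached listening on 127.0.0.1 does not match a Nova flag that points at any other address.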

The fix was simple. Change the IP address memcached runs on in /etc/memcached.conf and restart it.

/etc/init.d/memcached restart

Then change the IP address Nova uses in /etc/nova/nova.conf on the management node and on all worker nodes, and restart everything.

restart nova-api; restart nova-network
restart nova-scheduler; restart nova-objectstore

restart nova-compute; restart nova-volume

With that, memcached started to get populated with items and the delays disappeared.