We use Apache VCL to run the Virtual Computing Lab (VCL), a project run by Cybera and the University of Alberta. Recently, the Apache VCL project graduated from 'incubating' status to become a fully fledged top-level project within the Apache Software Foundation's open source ecosystem. The foundation hosts more than 100 top-level projects, encompassing much more than just the ubiquitous Apache HTTP server.
The Virtual Computing Lab is used to deliver customized computing environments to student end users. It is arguably a precursor to infrastructure-as-a-service systems such as OpenStack, OpenNebula, and Eucalyptus. It could also be considered a form of virtual desktop infrastructure (VDI), though I prefer to call it 'VDI-lite' to avoid the assumptions usually made about VDI systems, such as that they require massive, fast, and expensive centralized storage (which we do not use in VCL).
Differences in Cybera's VCL use
Cybera has made several customizations to VCL to meet the requirements of the project. We have also made some interesting hardware decisions. Together, these modifications and deployment choices provide a different take on the average installation of Apache VCL, and this post will detail a few of those differences.
Deployment of VCL code
On Linux, it's quite common to be able to install software with a simple command such as: yum install some_software. Cybera uses a custom spec file to build RPM packages that deploy the Apache VCL codebase.
Currently, the Apache VCL project does not provide Linux packages. Because Cybera has created a custom RPM for Apache VCL, we can deploy the codebase with a command as simple as:
# yum install vcl-*
It's important to note that the RPM does not do any configuration. Rather, it simply deploys the code that can then be configured, either manually, or by use of a configuration management system.
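As a sketch, a code-only spec file of this kind might look like the following. The package name, version, source tarball, and install paths here are illustrative, not Cybera's actual spec:

```spec
Name:           vcl-web
Version:        2.3
Release:        1%{?dist}
Summary:        Apache VCL web frontend (code only, no configuration)
License:        ASL 2.0
Source0:        apache-VCL-%{version}.tar.bz2
BuildArch:      noarch

%description
Deploys the Apache VCL web code under the web root. All configuration
is left to a configuration management system.

%prep
%setup -q -n apache-VCL-%{version}

%install
mkdir -p %{buildroot}/var/www/html/vcl
cp -r web/* %{buildroot}/var/www/html/vcl/

%files
/var/www/html/vcl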
Automated configuration of VCL
Configuration management is a powerful way to programmatically create and maintain information technology systems. There are many open source configuration management systems, such as Ansible, Chef, and Puppet, among others.
In our current production deployment we use Chef. We are actively evaluating which configuration management system to use going forward, and may also explore Puppet and SaltStack before reaching a final decision.
That said, by using either Ansible or Chef we can deploy an entire Apache VCL system in a few minutes, and do so in a repeatable, replicable fashion. Combined with easily created (and deleted) virtual machines, this lets us quickly stand up production, development, and test environments, and speedily rebuild our VCL deployment in the event of a disaster.
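To give a flavour of what such a deployment looks like, here is a minimal sketch of an Ansible-style top-level playbook. The host group, role names, and layout are hypothetical, not our actual configuration:

```yaml
# site.yml -- hypothetical top-level playbook; role names are illustrative
- hosts: vcl_management
  become: true
  roles:
    - vcl_common      # yum repos and base packages
    - vcl_database    # MySQL server and the VCL schema
    - vcl_web         # installs the vcl-web RPM, configures Apache
    - vcl_mgmt_node   # the vcld daemon and its vcld.conf
```

Each role encapsulates one piece of the system, so rebuilding after a disaster is a single playbook run against a fresh virtual machine.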
Network address translation
Even though Cybera provides large, fast, advanced networks, we don't have unlimited IPv4 addresses. The VCL project started out with a /24 network (256 addresses, roughly 250 of them usable as public IPs). Since it was technically possible to run VCL behind network address translation (NAT) and give most of those addresses back, we patched VCL to use NAT. We have since gone from a /24 network to a /28 (16 addresses), freeing 240 IPs.
We submitted the NAT patch to the Apache VCL project in August 2012, and have been using it in production for almost a year. As of this writing, the patch has not yet been accepted by the maintainers, but it will hopefully be in the early 2014 release.
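The address arithmetic behind the change is straightforward, and a NAT'd network like this ultimately rests on a masquerade rule. The subnet and interface name in the comment below are illustrative:

```shell
# Address counts behind the /24 -> /28 change (pure arithmetic):
OLD=$(( 1 << (32 - 24) ))   # /24 -> 256 addresses
NEW=$(( 1 << (32 - 28) ))   # /28 -> 16 addresses
echo "Addresses freed: $(( OLD - NEW ))"

# The kind of iptables rule a NAT'd VCL network depends on
# (subnet and interface are illustrative):
#   iptables -t nat -A POSTROUTING -s 10.1.0.0/24 -o eth0 -j MASQUERADE
```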
Use of OpenStack
Cybera is a heavy user of OpenStack. We operate several OpenStack-based private clouds. Our VCL system has its own small private OpenStack cloud made up of one cloud controller and seven compute nodes. There is a blog post on the #canstack site that details the hardware we use.
Apache VCL does not yet officially support OpenStack, though the developers have begun experimenting with the Grizzly release.
As for getting VCL to use OpenStack: essentially, we run a slightly modified version of the OpenStack module that was submitted to the Apache VCL project. We are in the process of updating that module, and will share the result with the VCL community.
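Under the hood, that module drives the standard OpenStack APIs. Its provisioning step is conceptually equivalent to a nova boot call like the one below; the flavor, image, and instance names are illustrative:

```shell
# Conceptual equivalent of what the VCL OpenStack module does when
# provisioning a reservation (names and IDs are illustrative):
nova boot \
  --flavor m1.small \
  --image win7-base \
  --nic net-id=<private-net-uuid> \
  vcl-reservation-1234
```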
Use of striped solid state drives
We replaced the software RAID10 arrays of 1TB SATA drives with stripes (RAID0) of solid state drives. When overcommitting virtual machines, we found that storage IOPS were the bottleneck.
We have done a good amount of testing, and while it's unfair to compare a RAID10 array of SATA drives to a stripe of SSDs, in some cases the SSD stripes are 1,000 times faster. Windows 7 instances demand considerable storage IOPS while booting, and the SSD stripes provide more than enough performance to keep Windows happy and allow for overcommitting.
It's important to note that the stateless use of VCL instances means stripes are acceptable in production. A stripe is considerably more likely to fail than a RAID10 array, but should a stripe in a compute node fail, we simply replace the drive, rebuild the compute node, and add it back to the OpenStack cluster. Users don't lose data, because they save their work to their local computer over the RDP connection, or to network-accessible storage such as Google Drive.
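Rebuilding a failed stripe is a quick, mechanical operation. A two-disk RAID0 can be recreated with mdadm along these lines; the device names and mount point are illustrative (and note this destroys any data on the member disks):

```shell
# Recreate a two-SSD RAID0 stripe on a rebuilt compute node
# (illustrative device names; wipes the member disks):
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sdb /dev/sdc
mkfs.ext4 /dev/md0
mount /dev/md0 /var/lib/nova/instances   # nova's default instance store
```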
There are several resources that can be overcommitted with virtual machines: memory, storage and CPU.
In OpenStack the default memory overcommit ratio is 1.5:1; for virtual CPUs it is 16:1. For storage, when qcow2 images are used, instances boot off a copy-on-write snapshot of a single base image, i.e. they are thinly provisioned. Because the virtual machines in VCL + OpenStack are not long-running, they don't use much storage: usually less than 1GB for a 60GB virtual image.
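To make those ratios concrete, here is the arithmetic for a hypothetical compute node. The hardware figures are assumptions for illustration, not our actual node specs:

```shell
# Hypothetical compute node (assumed figures, not our real hardware):
# 64 GB of RAM and 16 physical cores.
NODE_RAM_MB=65536
NODE_CORES=16

# OpenStack defaults: ram_allocation_ratio = 1.5, cpu_allocation_ratio = 16
SCHED_RAM_MB=$(( NODE_RAM_MB * 3 / 2 ))   # RAM the scheduler will hand out
SCHED_VCPUS=$(( NODE_CORES * 16 ))        # vCPUs the scheduler will hand out

echo "Schedulable RAM:   ${SCHED_RAM_MB} MB"
echo "Schedulable vCPUs: ${SCHED_VCPUS}"
```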
With regard to main memory, Ubuntu 12.04 ships with an interesting memory deduplication system enabled by default: Kernel Samepage Merging (KSM). In our OpenStack system we have been working to determine optimal KSM settings. Suffice it to say that when all the virtual machines run Windows 7, a lot of deduplication can happen.
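KSM is tuned through a handful of sysfs knobs. The paths below are the standard kernel KSM interface, but the values shown are illustrative, not our production settings:

```shell
# Standard kernel KSM knobs (values here are illustrative):
echo 1    > /sys/kernel/mm/ksm/run              # enable scanning
echo 1000 > /sys/kernel/mm/ksm/pages_to_scan    # pages scanned per wake-up
echo 200  > /sys/kernel/mm/ksm/sleep_millisecs  # pause between scan passes

# How well it's working: pages currently being shared across VMs
cat /sys/kernel/mm/ksm/pages_sharing
```

Scanning more pages more often finds duplicates faster, at the cost of CPU time on the compute node, which is the trade-off we are tuning.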
KVM takes care of CPU overcommitting; by default, OpenStack sets the ratio to 16:1. In our system we try to limit virtual machines to a single virtual CPU (vCPU).
Gridcentric Virtual Memory Streaming
Gridcentric Virtual Memory Streaming (VMS) is a fascinating technology that allows virtual machines to essentially 'pop' into existence without having to actually boot.
"With VMS, virtual machines do not have to go through a performance intensive boot process. Instead, they [are] launched from pre-booted live images and are ready to serve requests within seconds. The launched virtual machines have a low memory footprint, resulting in dense memory oversubscription." (via Gridcentric)
This is useful in VCL for two reasons:
- Booting Windows 7 instances can take up to 10 minutes. With VMS, instances are often up and reachable on the network in well under three minutes, so VCL can provision the virtual machine quickly and users see shorter login times.
- The Windows 7 boot process is extremely storage- (IOPS) and memory-intensive. A launched live image uses 10 to 100 times fewer IOPS and much less memory, which means we can support more virtual machines on the same hardware.
VCL Deletes Itself: Chaos Monkey
While we have certainly implemented some interesting and practical technologies in our VCL system, we have also encountered issues.
The most interesting issue was the VCL management node deleting itself. This is reminiscent of Netflix's Chaos Monkey, which randomly deletes virtual machines to help ensure the overall system is highly available.
Because we use NAT, the management node has to be in the same OpenStack tenant as the virtual machines it's managing. And because the management node's job is to create and delete virtual machine instances, it can theoretically delete itself. Which it did. Twice.
The current OpenStack VCL module stores some state about the OpenStack instances in /etc/hosts, keyed by the instances' IP addresses rather than their OpenStack UUIDs, and it uses those IPs to delete instances. On two occasions the management node decided to destroy an instance using its own IP, and thus deleted itself. We are updating the OpenStack module to track state by UUID instead of IP address.
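The fix is conceptually simple: capture the UUID that nova reports at boot time and always delete by UUID, never by IP. The shell sketch below illustrates the idea (the real module does this in Perl, and the flavor, image, and instance names are illustrative):

```shell
# Capture the UUID from nova's boot output (illustrative names):
UUID=$(nova boot --flavor m1.small --image win7-base vcl-res-1234 \
        | awk '$2 == "id" {print $4}')

# Later, tear the instance down unambiguously. An IP address can be
# reused by another instance (including the management node); a UUID cannot.
nova delete "$UUID"
```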