Let’s Build a Cloud — Introduction

This is the first in a series of blogs that will explain how to build an OpenStack cloud, from conception to production.

To begin, I will explain what needs to be considered when planning an OpenStack deployment. This will cover the OpenStack architecture, hardware, and network configuration.

In the next blog, I will show how to build and install a few OpenStack components from source, as well as from the official Ubuntu packages.

In later posts, I will introduce Puppet, a configuration management tool, and show how it can be used to perform a complete OpenStack installation.

Technical Requirements for this Series

While the first part of this series will be more about planning than implementation, the latter parts will be the opposite. If you would like to follow along with the work, I recommend having two servers available: one for a “Cloud Controller” and one for a “Compute Node.” (Both roles can also be run on one server, if preferred.)

These servers can either be physical servers or virtual machines. Virtual machines are easier to work with when following along, since they can easily be created, destroyed, and reverted to a previous point in time. VMware and VirtualBox will work just fine in this regard.

The servers should have at least two CPU cores and 512 MB of RAM, although 1,024 MB of RAM is recommended. Each server also requires two NICs.

Ubuntu 12.04 will be used throughout this series.

If possible, during the installation and configuration of the virtual machine that will be the “Compute Node,” create a dedicated LVM partition called “nova-volumes.” If this is not possible or you forget, there is an alternative described later in the series when “nova-volumes” is used.
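If you do set aside that partition, it eventually needs to become an LVM volume group named “nova-volumes,” which is the volume group nova-volume looks for by default. A minimal sketch, assuming the spare partition is /dev/sdb1 (substitute whatever device you actually reserved):

# Mark the spare partition as an LVM physical volume (placeholder device)
pvcreate /dev/sdb1
# Create the volume group that nova-volume expects by default
vgcreate nova-volumes /dev/sdb1
# Verify the volume group exists
vgs nova-volumes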

Planning and Preparation

OpenStack Planning

OpenStack consists of several components (Nova, Keystone, Glance, Horizon, Swift, etc.), and those components are themselves made up of several services. The OpenStack documentation has a full list of the components and what each one does.

OpenStack architectures come in two basic forms: everything installed on a single server, or components spread across multiple servers.

For this series, we will focus on a two-server architecture. One server, known as the “Cloud Controller,” will host all of the non-compute OpenStack components: nova-api, nova-cert, nova-consoleauth, nova-network, nova-objectstore, nova-scheduler, Keystone, Glance, and Horizon. The other server will be known as a “Compute Node” and will run nova-compute and nova-volume.

This division of components might seem a little lopsided, but it makes sense. The individual components that run on the Cloud Controller use very few resources; they can all comfortably fit on one server. The components that run on a Compute Node, however, can be very resource-intensive depending on how many instances are hosted and how large they are. This division also keeps a clean logical separation: one server manages the OpenStack cloud and the other server does all of the work.
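To make the split concrete, here is a rough outline of how those components map onto the stock Ubuntu 12.04 packages on each server. The actual installation is the subject of the next post, so treat this purely as a sketch of which piece lives where:

# On the Cloud Controller: API, scheduling, networking, image and identity
# services, plus the dashboard
sudo apt-get install nova-api nova-cert nova-consoleauth nova-network \
    nova-objectstore nova-scheduler glance keystone openstack-dashboard

# On the Compute Node: only the hypervisor-facing and volume services
sudo apt-get install nova-compute nova-volume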

Operating System Planning

Choosing the operating system that will host your OpenStack cloud is a very important decision. OpenStack currently runs best on Linux using KVM as the hypervisor. Ubuntu is the de facto distribution used in OpenStack development, so it has the most exposure.

OpenStack has been known to work with RHEL and CentOS as well as with the Xen hypervisor. If you have a laboratory where you can work on experimental OpenStack installs, by all means try out RHEL and Xen. However, if you plan on using OpenStack in production, I would advise erring on the side of caution and using the known working choices.

This series will use the recommended combination: Ubuntu 12.04 as the Linux distribution and KVM as the hypervisor.
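Before settling on KVM, it is worth confirming that your hardware (or the virtual machine, if you are following along in one) actually exposes hardware virtualization. A quick check on Ubuntu, using the cpu-checker package:

# Count the CPU flags for Intel VT-x (vmx) or AMD-V (svm); 0 means KVM
# would fall back to slow software emulation
egrep -c '(vmx|svm)' /proc/cpuinfo

# cpu-checker gives a friendlier verdict
sudo apt-get install cpu-checker
sudo kvm-ok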

Hardware Planning

'€œThe Cloud'€ has been described as a volatile environment where servers can be created and terminated at will. While this may be true, it does not mean that your servers must be volatile. Ensuring your cloud'€™s hardware is stable and configured correctly means your cloud environment will stay up and running.

Basically, put effort into creating a stable hardware environment so you can host a cloud that users may treat as unstable and volatile. (That sounds about right).

Hard Drives

At the very base of any operating system are the hard drives that the OS is installed to. There are two types of configuration that should be done to each server’s hard drives: partitioning and grouping into RAID arrays.

Partitioning is the act of dividing a single hard drive into logical sections. These sections are then dedicated to certain parts of the operating system. If you only have one hard drive (not recommended in production), then you will only have to partition one drive. However, if you have more than one, each drive should be partitioned in the same way. The reason for this will become clear when RAID is configured.

As an example, I will use the partitioning scheme from a learning management system cloud created by Cybera. Each server contains six 1 TB drives. Four of those six drives were partitioned in the following way:

Partition     Size            Type
Partition 1   300 megabytes   Linux RAID
Partition 2   20 gigabytes    Linux RAID
Partition 3   900 gigabytes   Linux RAID

The other two drives were partitioned as:

Partition     Size            Type
Partition 1   300 megabytes   Linux RAID
Partition 2   20 gigabytes    Linux Swap
Partition 3   900 gigabytes   Linux RAID
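For reference, here is roughly how the first layout could be reproduced from the command line with parted. In practice this was done through the Ubuntu installer, and /dev/sda is only a placeholder device:

# Sketch of the first layout on one drive (placeholder device /dev/sda)
parted -s /dev/sda mklabel msdos
parted -s /dev/sda mkpart primary 1MiB 301MiB      # ~300 MB, for /boot RAID
parted -s /dev/sda mkpart primary 301MiB 20.3GiB   # ~20 GB, for the root RAID
parted -s /dev/sda mkpart primary 20.3GiB 100%     # remainder, ~900 GB
parted -s /dev/sda set 1 raid on
parted -s /dev/sda set 2 raid on
parted -s /dev/sda set 3 raid on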

Next, similar partitions are grouped together to form RAID arrays. As a quick refresher: RAID provides the ability to group hard drives or partitions together to increase performance, provide redundancy against drive failure, or both.

The RAID configurations will differ depending on whether the server is a Cloud Controller or Compute Node.

For a Cloud Controller, the RAID configuration is:

RAID Device   RAID Type   Mount Point   Partitions
/dev/md0      RAID1       /boot         D1P1, D2P1
/dev/md1      RAID10      unused        D1P2, D2P2, D3P2, D4P2
/dev/md2      RAID10      /             D1P3, D2P3, D3P3, D4P3, D5P3, D6P3

For a Compute Node, the RAID configuration is:

RAID Device   RAID Type   Mount Point          Partitions
/dev/md0      RAID1       /boot                D1P1, D2P1
/dev/md1      RAID10      /                    D1P2, D2P2, D3P2, D4P2
/dev/md2      RAID10      LVM (nova-volumes)   D1P3, D2P3, D3P3, D4P3, D5P3, D6P3

This partitioning and RAID configuration can be done during the Disk Configuration part of the Ubuntu installation. Alternatively, the Disk Configuration step of the Ubuntu installer can be overridden with your own custom scripts for the Cloud Controller and the Compute Node, although implementing such scripts is outside the scope of this series.
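For the curious, assembling the Compute Node arrays by hand after installation would look something like the following mdadm sketch. The device names are placeholders and assume the partition layout shown above:

# /boot mirror from the two 300 MB partitions
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
# Root filesystem as RAID10 across the four 20 GB partitions
mdadm --create /dev/md1 --level=10 --raid-devices=4 \
    /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2
# Large RAID10 array across all six 900 GB partitions, which then becomes
# the LVM physical volume behind the "nova-volumes" volume group
mdadm --create /dev/md2 --level=10 --raid-devices=6 /dev/sd{a,b,c,d,e,f}3
pvcreate /dev/md2
vgcreate nova-volumes /dev/md2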

Hard Drive Configuration Conclusion

Please note that this exact configuration does not need to be implemented in order to follow along with the rest of this series. This is just an example of a production-quality disk configuration. Planning your disk configuration is extremely important; going back and changing it after the operating system is installed is difficult, if not impossible.

The Physical Network

Network configuration is just as important as disk configuration, except that it’s easier to change after everything has been installed.

Note: When I talk about the “physical network,” I mean the physical network devices and NICs that will be used in the server environment.

Each server that has OpenStack components installed will need to communicate with the others. If one server has all OpenStack components, then only a connection to the Internet is needed. For two servers, a crossover cable can easily be used for communication between the two (and indeed we have done this for some of our test servers). For three or more servers, and especially in production, a network switch is required.

Modern servers come with a myriad of network configuration options. They can have anywhere from one to four (or more!) NICs. These NICs can range in speed from 10/100 Mbps to 1 Gbps or 10 Gbps. As well, the physical medium of the NICs can now be copper or SFP+.

A simple OpenStack configuration can utilize one NIC. A more complex configuration will utilize many more. The servers chosen for Cybera’s learning management cloud have a total of five NICs: two 10 Gb SFP+, two 1 Gb copper, and one 10/100 copper IPMI. Of these five, four were used.

Cybera also utilized three network switches: two 52-port Arista 7050 SFP+ switches and one Cisco Catalyst 10/100 copper switch.

We decided to use the 10 Gb SFP+ NICs for the bulk of the network traffic. Since each server has two 10 Gb NICs, each one is plugged into a different Arista switch. This gives us network redundancy: if one of the NICs or one of the switches fails, the other switch or NIC is still running and service is not interrupted.

One of the copper 1 Gb NICs and the single IPMI NIC are plugged into the Cisco 10/100 switch. This provides us with PXE and IPMI connectivity that is separated from the rest of the traffic. Since PXE and IPMI traffic are not essential to the operation of the cloud, we decided to use only one switch for them. Note that PXE and IPMI are outside the scope of this series.

The Virtual Network

OpenStack has the ability to utilize VLANs (virtual LANs) for communication between instances and the Internet. Cybera also decided to use VLANs for public Internet traffic as well as internal infrastructure traffic.

I consider OpenStack’s VLAN network manager to be the best network manager available at the moment. It provides good segregation between users of the OpenStack cloud (meaning traffic from one user to another is totally separate; there is no way the two network streams can conflict), as well as good bridging support between multiple Compute Nodes.

OpenStack allows you to configure the starting VLAN identifier. By default it is VLAN 100. This default is sufficient for most environments, but it’s important to review your current network setup and ensure VLAN 100 is not in use. Similarly, whenever a new project is created in OpenStack, that new project will use the next available VLAN ID (for example, VLAN 101, 102, 103, and so on). Again, if a VLAN that OpenStack would use is already in use, you will need to plan around this.
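In Nova, this boils down to a couple of flags. A minimal sketch, assuming the dashed flag style used by the Ubuntu 12.04 nova packages, and assuming bond0 is the interface carrying instance traffic:

# Append to /etc/nova/nova.conf (flag style may differ on other releases)
cat >> /etc/nova/nova.conf <<'EOF'
--network_manager=nova.network.manager.VlanManager
--vlan_start=100
--vlan_interface=bond0
EOF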

Another virtual network that needs to be configured is OpenStack's fixed network. This is a private network that the instances will communicate on. We usually use a fixed range of 10.0.0.0/8, divided into a total of 255 networks. With this in effect, the first project will communicate on a private network of 10.0.0.0/24, the second project on 10.0.1.0/24, and so on.
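The fixed networks themselves are created with nova-manage once Nova is installed. A sketch of the command for the layout described above; the flag spellings are from the Essex release, so double-check them against your version:

# One /8 carved into 255 project networks of 256 addresses each,
# starting at VLAN 100
nova-manage network create \
    --label=private \
    --fixed_range_v4=10.0.0.0/8 \
    --num_networks=255 \
    --network_size=256 \
    --vlan=100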

Network Configuration Conclusion

As mentioned with the Hard Drive Configuration, you do not need to replicate Cybera’s configuration to follow along with this series. This is only given as an example of what we have used in production.

To visualize the layered complexity of the network config, please see the following rough sketch I made during my own planning stage:

During our testing phase of this project, the Ubuntu network configuration went through a few revisions. Here is the final configuration that was used:

# Loopback interface
auto lo
iface lo inet loopback
# eth0: standalone NIC with a static management address
auto eth0
iface eth0 inet static
    address 192.168.255.1
    netmask 255.255.255.0
# eth2 and eth3: enslaved to bond0 (the two 10 Gb NICs)
auto eth3
iface eth3 inet manual
    bond-master bond0
auto eth2
iface eth2 inet manual
    bond-master bond0
# bond0: 802.3ad (LACP) bond across eth2 and eth3
auto bond0
iface bond0 inet manual
    bond-slaves none
    bond-mode 802.3ad
    bond-miimon 100
# vlan422 and vlan423: tagged VLAN interfaces carried over bond0
auto vlan422
iface vlan422 inet static
    address 10.0.0.2
    netmask 255.255.255.0
    gateway 10.0.0.1
    dns-nameservers 8.8.8.8
    dns-search private.edu.cybera.ca
    vlan-raw-device bond0
auto vlan423
iface vlan423 inet static
    address 192.168.1.1
    netmask 255.255.255.0
    vlan-raw-device bond0
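For this file to work on a stock Ubuntu 12.04 install, the bonding and 802.1q VLAN tooling needs to be in place first. Something along these lines, with the package names as I recall them from 12.04:

# Install the bonding and VLAN helpers used by the interfaces file
sudo apt-get install ifenslave-2.6 vlan
# Make sure the required kernel modules load at boot
echo "bonding" | sudo tee -a /etc/modules
echo "8021q" | sudo tee -a /etc/modules
# Bring the interfaces up without a reboot
sudo ifup bond0 vlan422 vlan423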

Conclusion

This part of the series detailed several areas of planning that are critical when deploying a cloud. I did not list every detailed command that Cybera used in its learning management system cloud project, but this should be a thorough enough walkthrough of the planning process.

Out of all of the cloud building instructions, this is the part that I was most anxious to write about. When new cloud software and tools are released, it’s common for the author or company to post a little YouTube video of the cloud being made with the touch of a button or a single command. This is cute, but totally impractical. If you are building a cloud environment, educate yourself on the technology you will be using, factor in redundancy and uptime, and be aware that most of the initial steps are very hard to alter later on.

In my next blog, I will show how to build and install a few OpenStack components from source, as well as from the official Ubuntu packages.