DAIR vs Amazon’s EC2: How do the two clouds stack up?

By Everett Toews, Senior Developer, Edmonton

The DAIR cloud is “an advanced R&D environment for product design, prototyping, validation and demonstration.” When organizations have outgrown DAIR and are ready to move their service into a production environment, they may want to migrate to another cloud. Two major factors they need to consider when contemplating migration are performance and cost.

If an organization is considering migrating to a cloud other than Amazon, or just wants to get a feel for what infrastructure performance is like with various service providers, it can have a look at CloudHarmony, a cloud computing performance comparison engine.

David Morais from the Université de Sherbrooke in Quebec did a performance comparison between DAIR and Amazon’s EC2.

Performance

The goal of this benchmark is to compare an EC2 core to a DAIR core. The benchmark consists of two tests: the first evaluates the I/O performance of the cloud, first with only one CPU and then with a full node; the second tests how well numerical calculations perform. The relevant compute host specifications for DAIR are below. Amazon makes no such information available.

DAIR Compute Host

Processor: Intel Xeon X5650
RAM: 48 GB
Disk: 6 x 500 GB 7.2K RPM SATA 2.5"

The iozone figure below compares one CPU unit in Amazon with one core in DAIR. Both the EC2 and DAIR instances were type m1.small.

Figure: iozone

The iozone full node figure below shows the same test with the node fully occupied; in other words, all the CPUs were in use. It is important to note that although all the CPUs were used, the results reflect the usage of one CPU: they show the mean across all CPUs.

Figure: iozone full node

The software used in this benchmark was iozone and nbench. Below is a detailed explanation of each iozone measurement and its caveats.

IOZONE TEST

Write: This test measures the performance of writing a new file. When a new file is written, not only does the data need to be stored but also the overhead information for keeping track of where the data is located on the storage media. This overhead is called the “metadata”. It consists of the directory information, the space allocation and any other data associated with a file that is not part of the data contained in the file. It is normal for the initial write performance to be lower than the performance of re-writing a file due to this overhead information.

Re-write: This test measures the performance of writing a file that already exists. When an existing file is written the work required is less as the metadata is already there. It is normal for the rewrite performance to be higher than the performance of writing a new file.
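The difference between the Write and Re-write measurements can be sketched in a few lines of Python. This is not how iozone itself is implemented; the function names, the 4 MB file size and the 64 KB block size are assumptions chosen for illustration. The sketch times the creation of a new file and then times overwriting the same file in place.

```python
import os
import tempfile
import time


def timed_write(path, data, block_size=64 * 1024):
    """Write `data` to `path` in blocks and return the elapsed seconds."""
    flags = os.O_WRONLY | os.O_CREAT | getattr(os, "O_BINARY", 0)
    start = time.perf_counter()
    # O_CREAT creates the file on the first pass. On the second pass the
    # file already exists and is overwritten in place (no truncation).
    fd = os.open(path, flags)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view[:block_size])
            view = view[written:]
        os.fsync(fd)  # push the data to storage before stopping the clock
    finally:
        os.close(fd)
    return time.perf_counter() - start


def write_then_rewrite(size_bytes=4 * 1024 * 1024):
    """Return (initial_write_seconds, rewrite_seconds) for one scratch file."""
    data = os.urandom(size_bytes)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scratch.bin")
        first = timed_write(path, data)   # new file: data plus metadata
        second = timed_write(path, data)  # existing file: metadata already there
    return first, second


if __name__ == "__main__":
    first, second = write_then_rewrite()
    print(f"write: {first:.4f} s, re-write: {second:.4f} s")
```

On a given system the second number is often, but not always, the smaller one; caching and the file system in use can hide or reverse the effect, which is why a dedicated tool such as iozone is preferable for real measurements.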

Read: This test measures the performance of reading an existing file.

Random Read: This test measures the performance of reading a file with access being made to random locations within the file. The performance of a system under this type of activity can be impacted by several factors such as the size of the operating system’s cache, the number of disks, and seek latencies, among others.
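The random read access pattern can be illustrated with a short Python sketch. The function name, the 4 KB record size and the fixed seed are assumptions made for this example, and the code only reproduces the access pattern; it does not measure throughput the way iozone does.

```python
import os
import random
import tempfile


def random_read(path, record_size=4096, count=100, seed=0):
    """Read `count` records from random record-aligned offsets.

    Returns the total number of bytes read. The file must hold at
    least one full record.
    """
    records = os.path.getsize(path) // record_size
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0
    # buffering=0 bypasses Python's own buffer; the operating system's
    # cache can still serve the reads.
    with open(path, "rb", buffering=0) as f:
        for _ in range(count):
            f.seek(rng.randrange(records) * record_size)
            total += len(f.read(record_size))
    return total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scratch.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(1024 * 1024))
        print(random_read(path), "bytes read from random offsets")
```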

Reverse Read: This test measures the performance of reading a file backwards. This may seem like a strange way to read a file, but in fact there are applications that do this. MSC Nastran is an example of an application that reads its files backwards. With MSC Nastran, these files are very large (gigabytes to terabytes in size). Although many operating systems have special features that enable them to read a file forwards more rapidly, there are very few operating systems that detect and enhance the performance of reading a file backwards.
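Reading a file backwards amounts to seeking to the end and stepping towards the start one record at a time. A minimal sketch in Python, assuming a 4 KB record size and a hypothetical function name; it shows the access pattern only and says nothing about how MSC Nastran or iozone implement it.

```python
import os
import tempfile


def read_backwards(path, record_size=4096):
    """Yield the records of a file from the last one to the first."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        while position > 0:
            # The record nearest the start of the file may be short.
            step = min(record_size, position)
            position -= step
            f.seek(position)
            yield f.read(step)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scratch.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(10 * 1024))
        sizes = [len(record) for record in read_backwards(path)]
        print(sizes)
```

Each iteration issues a seek to a lower offset than the previous one, which is the opposite of the sequential pattern that read-ahead mechanisms are designed for.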

Strided Read: This test measures the performance of reading a file with a strided access behaviour. An example would be: read at offset zero for a length of 4 Kbytes, then seek 200 Kbytes, then read for a length of 4 Kbytes, then seek 200 Kbytes, and so on. Here the pattern is to read 4 Kbytes and then seek 200 Kbytes and repeat the pattern. This again is a typical behaviour for applications that have data structures contained within a file and are accessing a particular region of the data structure. Most operating systems do not detect this behaviour or implement any techniques to enhance the performance under this type of access. This access behaviour can also sometimes produce interesting performance anomalies. An example would be if the application’s stride causes a particular disk, in a striped file system, to become the bottleneck.
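The read 4 Kbytes, seek 200 Kbytes pattern can be written out directly. The sketch below is in Python, with a hypothetical function name, and it assumes that the seek is relative to the position reached after each read, so consecutive reads start 204 Kbytes apart; if the stride is meant to be measured from the start of each read, the `skip` value would need adjusting.

```python
import os
import tempfile


def strided_read(path, read_size=4 * 1024, skip=200 * 1024):
    """Read `read_size` bytes, skip `skip` bytes, repeat until end of file.

    Returns the list of offsets at which a read returned data.
    """
    offsets = []
    with open(path, "rb") as f:
        while True:
            offset = f.tell()
            chunk = f.read(read_size)
            if not chunk:  # a read at or past end of file returns nothing
                break
            offsets.append(offset)
            f.seek(skip, os.SEEK_CUR)  # jump forward relative to here
    return offsets


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scratch.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(1024 * 1024))
        for offset in strided_read(path):
            print(f"read 4 KB at offset {offset // 1024} KB")
```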

Fwrite: This test measures the performance of writing a file using the library function fwrite(). This is a library routine that performs buffered write operations. The buffer is within the user’s address space. If an application were to write in very small size transfers then the buffered and blocked I/O functionality of fwrite() can enhance the performance of the application by reducing the number of actual operating system calls and increasing the size of the transfers when operating system calls are made. This test is writing a new file, so again the overhead of the metadata is included in the measurement.

Fread: This test measures the performance of reading a file using the library function fread(). This is a library routine that performs buffered and blocked read operations. The buffer is within the user’s address space. If an application were to read in very small size transfers, then the buffered and blocked I/O functionality of fread() could enhance the performance of the application by reducing the number of actual operating system calls and increasing the size of the transfers when operating system calls are made.
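The effect of user-space buffering described for fwrite() and fread() can be demonstrated without touching a disk. The sketch below uses Python's io.BufferedWriter as an analogue of the C library's buffering rather than fwrite() itself; CountingSink and count_raw_writes are hypothetical names, and the sink simply counts how many times the layer underneath (standing in for the operating system call) is invoked.

```python
import io


class CountingSink(io.RawIOBase):
    """Stand-in for a file descriptor that counts low-level write calls."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.bytes_written = 0

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.bytes_written += len(b)
        return len(b)


def count_raw_writes(chunk, repeats, buffer_size=None):
    """Return (low-level write calls, bytes written) for many small writes.

    With buffer_size=None every write goes straight to the sink; otherwise
    the writes pass through a user-space buffer of that size first.
    """
    sink = CountingSink()
    if buffer_size is None:
        writer = sink
    else:
        writer = io.BufferedWriter(sink, buffer_size=buffer_size)
    for _ in range(repeats):
        writer.write(chunk)
    writer.flush()  # hand any remaining buffered bytes to the sink
    return sink.calls, sink.bytes_written


if __name__ == "__main__":
    small = b"x" * 16
    print("unbuffered:", count_raw_writes(small, 1000))
    print("buffered:  ", count_raw_writes(small, 1000, buffer_size=4096))
```

Both runs deliver the same number of bytes, but the buffered run reaches the sink far fewer times and with larger transfers, which is the saving the two tests are designed to expose.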

Cost

To get an indication of what an organization might pay for infrastructure at Amazon, have a look at this pricing guide. If the organization has long-running servers, it will want to consider using reserved instances to get a better deal on costs. If it has short-lived instances that are effectively disposable, it may want to consider using spot instances. Amazon also provides a handy calculator to give an indication of what the monthly costs might be.
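The reserved instance decision comes down to simple arithmetic. Here is a minimal sketch in Python, assuming a pricing model with a one-time upfront fee plus a discounted hourly rate; the actual model should be checked against the provider's current pricing guide. The function names are hypothetical and the figures in the usage example are placeholders, not real Amazon prices.

```python
def break_even_hours(upfront_fee, reserved_hourly, on_demand_hourly):
    """Hours of use after which the reserved option becomes cheaper."""
    if reserved_hourly >= on_demand_hourly:
        raise ValueError("reserved rate must be lower than the on-demand rate")
    return upfront_fee / (on_demand_hourly - reserved_hourly)


def cheaper_option(hours, upfront_fee, reserved_hourly, on_demand_hourly):
    """Return "reserved" or "on-demand" for the expected hours of use."""
    reserved_cost = upfront_fee + hours * reserved_hourly
    on_demand_cost = hours * on_demand_hourly
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


if __name__ == "__main__":
    # Placeholder figures only; substitute numbers from the pricing guide.
    upfront, reserved_rate, on_demand_rate = 300.0, 0.25, 0.75
    print(break_even_hours(upfront, reserved_rate, on_demand_rate))
    print(cheaper_option(24 * 365, upfront, reserved_rate, on_demand_rate))
```

A server that runs around the clock passes the break-even point quickly under these assumptions, while an instance that lives for only a few hours never does, which matches the advice to reserve only for long-running servers.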

To get an idea of the price of infrastructure at Rackspace, have a look at this pricing guide. If the organization has long-running servers, it will want to consider using dedicated servers to get a better deal on costs, or maybe consider a hybrid hosting approach. The Rackspace pricing guide also provides a handy calculator to give an indication of what the monthly costs might be.

If you are considering migrating to a cloud other than Amazon, or just want to get a feel for what infrastructure costs are at various service providers, have a look at Cloudorado, a cloud computing price comparison engine.

Overall

With respect to performance, DAIR compares very favourably with EC2. Of course, DAIR is running nowhere near the scale of EC2, but organizations do need to know that they can expect a performance hit when leaving the DAIR cloud.

With respect to cost, DAIR was offered at no charge during the pilot phase. It’s tough to compete with free, but it’s worthwhile for organizations to look before they leap.