It's no secret that I've been sorely disappointed in the I/O performance found on Amazon's Elastic Compute Cloud (EC2) platform but, as this is the key technology we use at work, it's my responsibility to eke every last ounce of performance out of these systems. One of the biggest issues I've had is the relatively poor performance of EC2 instances running as dedicated MySQL servers. The reason for using dedicated EC2 instances for MySQL rather than RDS instances is a topic for another day, so let's examine a usage scenario that I see becoming a reality in the next six months.
There are currently two MySQL servers serving data to 15 high-traffic web servers operating with a peak of 381 transactions per second (each). The database system was designed at a time when peak transactions rarely exceeded 220 per second but, as more applications and platforms begin to use the same pair of servers, ease of expansion needs to be considered. The MySQL servers have a tested maximum of 627 transactions per second sustained over a one-hour period. The database servers are configured in a standard master-slave arrangement, with the master used only for writes and the slave used only for reads. For every write transaction to the master, there are 3.25 reads from the slave.
Simple math, right?
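Well, simple as long as you pin down what that 381 figure applies to. Assuming it approximates the peak load hitting the database pair (an assumption on my part; the per-server numbers shift if it doesn't), the 1:3.25 write-to-read split works out to

\[
\text{writes} = \frac{381}{1 + 3.25} \approx 90\ \text{tps (master)}, \qquad \text{reads} = 3.25 \times 90 \approx 291\ \text{tps (slave)}
\]

so each server sits comfortably under the tested ceiling of 627 transactions per second today. The worry is how quickly that margin evaporates as more platforms pile onto the same pair.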
Let's examine some other details. For the sake of data persistence, the MySQL data is stored on EBS volumes. Each server has four EBS volumes in RAID0 for the data, and two EBS volumes in RAID0 for the binary logs.
Some people might already be scoffing at using multiple EBS volumes and stringing them together with software RAID, but this particular solution was found to provide the best result for what we needed at the time. That said, it may be time to re-examine this decision and implement something with ephemeral volumes.
Let's look at some of the pros and cons of each:
Ephemeral storage

Pros:

- free (included in the cost of the EC2 instance)
- stable, predictable performance on par with a standard physical hard disk
- abundant storage (x00 to x000 GB depending on instance type)

Cons:

- ephemeral (data is lost when the instance is terminated)
- average random seek performance (6-7 ms seek times per spindle)

EBS volumes

Pros:

- "highly available" (AWS typically provides redundancy and a lower failure rate than physical disks)
- portable (EBS volumes can be moved from one instance to another in seconds)
- backups (snapshots are relatively easy to take; see the sketch after this list)

Cons:

- extremely variable performance (seek times can range from 0.5 ms to 10 ms+)
- maximum throughput of 1 Gbit/s (regardless of how many volumes are mounted)
- costs associated with storage and I/O
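On the snapshot point: with data striped across several EBS volumes, the snapshots have to be taken as a consistent set, not one volume at a time. Here's a minimal sketch, assuming an XFS file system mounted at /var/lib/mysql and hypothetical volume IDs; the aws CLI shown here post-dates these tests, and for true MySQL consistency you'd also want FLUSH TABLES WITH READ LOCK first:

```bash
#!/bin/bash
# Freeze the file system so all four stripe members stay in sync,
# snapshot each member, then thaw. Volume IDs below are placeholders.
VOLUMES="vol-11111111 vol-22222222 vol-33333333 vol-44444444"

xfs_freeze -f /var/lib/mysql                 # block writes to the mount
for vol in $VOLUMES; do
    aws ec2 create-snapshot --volume-id "$vol" \
        --description "mysql raid0 member $vol"
done
xfs_freeze -u /var/lib/mysql                 # resume writes
```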
The Test System
For testing, m1.large instances were used to keep costs down and provide some real-world performance numbers. The most potent instances I'm using at work are 32-bit c1.mediums, but I am strongly considering moving the database servers over to m1.large to take advantage of the extra RAM (among other benefits). Two configurations were used, and their results were compared to the current database servers. Here is the breakdown:
- 1x m1.large instance: 4 Compute Units, 7.5 GB RAM, 2x 240 GB ephemeral volumes
- 1x m1.large instance: 4 Compute Units, 7.5 GB RAM, 4x 120 GB EBS volumes
- 2x c1.medium instance: 5 Compute Units, 1.7 GB RAM, 4x 20 GB EBS volumes
Each database is smaller than 40 GB, and the largest tables typically see over a million rows inserted daily. It's a very cut-and-dried system that barely depends on the 'R' in RDBMS.
mdadm was used on each of these systems to construct RAID0 volumes for data with the chunk size set to 128 KB, blockdev was used to set the read-ahead buffer to 64 KB, and the file systems are all XFS. Bonnie++ was used to perform basic I/O tests.
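For reference, the setup looks roughly like this. It's a sketch, not a recipe: the device names, mount point, and my reading of "64K" as 64 KB (128 of blockdev's 512-byte sectors) are all assumptions:

```bash
#!/bin/bash
# Stripe four attached volumes into one RAID0 array with a 128 KB chunk.
mdadm --create /dev/md0 --level=0 --chunk=128 --raid-devices=4 \
    /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi

# blockdev counts read-ahead in 512-byte sectors: 64 KB = 128 sectors.
# (If the target were 64K sectors, the value would be 65536 instead.)
blockdev --setra 128 /dev/md0

# Format as XFS and mount where MySQL expects its data.
mkfs.xfs /dev/md0
mkdir -p /var/lib/mysql
mount -o noatime /dev/md0 /var/lib/mysql

# Basic I/O test with Bonnie++, using a file set of twice the m1.large's
# 7.5 GB of RAM (value is in MB) so the page cache can't mask disk speed.
bonnie++ -d /var/lib/mysql -s 15360 -u nobody
```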
Now the fun can begin!
A bit of a disclaimer: I ran these tests on three separate occasions over the course of the past week in the us-west region, and the numbers reported here are just averages. The results for the EBS volumes varied drastically depending on what time of day I performed the tests, while the ephemeral volumes never went outside of a 4% variance. The numbers reported for the c1.medium instances come from tests performed in October of 2010 during their shakedown, as I would never conduct these tests on live servers.
The first real test performed was the sequential throughput test, and it was only done after the volumes had been filled to 80% capacity and the data deleted again. I've read in more than one Amazon forum that EBS storage performs terribly until it has been written to a few times (and I've seen this first-hand on some servers in us-west), and I didn't want that affecting the results.
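The warm-up amounted to something like the following. A rough sketch, assuming the 4x 120 GB EBS configuration (roughly 480 GB raw, so about 384 GB written); the file count and sizes are illustrative:

```bash
#!/bin/bash
# Fill the array to ~80% with throwaway data, then delete it, so every
# block has been written at least once before benchmarking begins.
for i in $(seq 1 24); do
    dd if=/dev/zero of=/var/lib/mysql/warmup.$i bs=1M count=16384   # 16 GiB each
done
rm -f /var/lib/mysql/warmup.*
```

So, with the volumes warmed up, how was it?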
Larger numbers are definitely better and, as we can see, the ephemeral RAID0 volume stole the show with more than twice the performance of the EBS volumes. What struck me as most interesting about this test is that the write speeds were consistently higher than the reads ... something that many of us would expect to be reversed. I ran these tests twice and, much to my disbelief, the numbers remained consistent. This could be due to any number of reasons, but I'll be sure to do some more testing before replacing any of the live database servers.
These random seek numbers were quite amazing. It's clear that the EBS volumes are much better suited to seeking than the ephemeral drives, so much so that I ran these tests several extra times just to make sure the numbers were valid. One word of warning, though: EBS seek times are not easily predictable. EBS volumes are accessed over Amazon's internal network so, if there is a great deal of network traffic in the data center, response times can suffer greatly. I checked these numbers ten times in total, and the EBS values varied wildly, from 2890 to 6710 seeks per second on the m1.large with 4x EBS volumes. The ephemeral volumes, however, never went outside of a 10-point variance in either direction. Your mileage may vary.
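If you want to spot-check that kind of latency variance on your own instances without a full Bonnie++ run, a tool like ioping will do it in one line (this wasn't part of my test runs, so treat it as a side note):

```bash
# Issue 20 random-access latency probes against the array's mount point;
# ioping prints per-request timings and a min/avg/max summary.
ioping -c 20 /var/lib/mysql
```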
So how about MySQL performance? Let's take a look at an (admittedly simulated) set of results:
Looks like a three-fold increase from the existing c1.medium servers to an m1.large with ephemeral storage. Now, these are simulated values with randomly created SELECT statements, but I would expect real-world performance to land within 90% of these numbers. Again, your mileage may vary depending on the type of data being written and queried, as well as the database engines and indexes being used.
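If you want to generate a comparable synthetic read load yourself, mysqlslap (bundled with MySQL since 5.1) can auto-generate random SELECT traffic; the parameters below are illustrative, not the ones behind my numbers:

```bash
# Throw auto-generated read-only queries at the server from 50
# concurrent clients, repeated over three iterations for an average.
mysqlslap --user=root -p \
    --auto-generate-sql \
    --auto-generate-sql-load-type=read \
    --concurrency=50 \
    --number-of-queries=100000 \
    --iterations=3
```

The read-only load type roughly mirrors the slave's role in the master-slave split described earlier.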
What Does This Mean?
These tests show a night-and-day difference between EBS and ephemeral volumes. While EBS does benefit from excellent seek times, those times can vary wildly from one EBS volume to another and, when it comes to RAID, the array is only as fast as its slowest member. There were times when three of the four EBS volumes would return data in less than 5 ms while the fourth took more than 25 ms; other times the entire system ran so fast you'd swear the EBS volumes were backed by top-of-the-line solid-state disks. This unpredictable performance hit may not be much of an issue for web servers but, when a database server is feeding information to hundreds of concurrent users, every millisecond counts. If you plan on using EBS volumes in a very busy data center (such as the us-east block), you can count on slower response times.

We also need to remember that, because EBS runs over a gigabit network, transfer speeds will never exceed 1 Gbit/s. A two-volume RAID array could, theoretically, max out the instance's EBS throughput. For most situations this may be sufficient, but once your systems come under significant load for any length of time, it could become a problem.
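The arithmetic behind that two-volume claim is straightforward:

\[
\frac{1\ \mathrm{Gbit/s}}{8\ \mathrm{bits/byte}} = 125\ \mathrm{MB/s}, \qquad \frac{125\ \mathrm{MB/s}}{2\ \mathrm{volumes}} = 62.5\ \mathrm{MB/s\ per\ volume}
\]

Each member of a two-volume stripe only has to sustain about 62.5 MB/s, well within reach of an ordinary spinning disk, before the network link becomes the bottleneck.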
According to Amazon, each instance is guaranteed a set number of CPU cycles and a certain amount of RAM; it's not clear whether any comparable guarantee exists for EBS volume performance. That said, we can see that ephemeral volumes are incredibly fast. The problem with them, however, is that any data written to them is lost once the machine is terminated. While this may not be a problem for most people, it can be if you work in an environment where servers die or co-workers accidentally terminate vital systems.
One thing that I did not do for these tests was attempt to find the "sweet spot" when it comes to block sizes, read-ahead buffers, and the other important details that come with configuring a server for optimal usage. This is, however, a post for another day.