Understanding hardware still matters in the cloud

By kellabyte  //  Cloud  //  1 Comment

Cloud services and infrastructure from Amazon, Windows Azure and other cloud providers offer resources at high levels of abstraction that can provide scalability and fault tolerance benefits. One common misconception about deploying to the cloud is that you don’t have to care how anything works underneath, especially the hardware. I don’t agree that the cloud makes the computer transparent. These abstractions often come at the cost of reduced performance, far from what you can get with a bare metal machine. Understanding how the hardware and software work underneath is a way to gain some of that back.

Amazon EBS and Windows Azure storage provide storage resources that you can mount as drives to store data. Cloud storage is usually replicated, and that replication is one reason for the reduced performance, a trade-off for the benefits of fault tolerance. Underneath all the storage APIs and replication, these systems are like any other computer: some HDD or SSD drives and either a standard or custom filesystem of some kind.

How you access a drive matters, just like how you access memory matters. Even if you request a small read, there is a minimum block size that the drive operates at. If your request is smaller, the drive still reads the entire block before delivering the small piece you wanted. Drives might have a physical sector size of 512 bytes or 4096 bytes, for example. Legacy BIOS software often only understands 512-byte sectors, and drives with 4096-byte sectors will always read 4096 bytes even if you requested only 512 bytes.
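As a quick sanity check, you can ask the operating system what block size it prefers for I/O on a given path. Here’s a minimal Python sketch; note that `st_blksize` reports the filesystem’s preferred I/O size, which is related to but not necessarily the same as the drive’s physical sector size (on Linux, `blockdev --getpbsz` reports the physical sector size).

```python
import os

# st_blksize is the filesystem's preferred I/O block size for this path.
# It is related to, but not necessarily equal to, the drive's physical
# sector size (which `blockdev --getpbsz /dev/sdX` reports on Linux).
info = os.stat(".")
print("preferred I/O block size:", info.st_blksize)
```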

If you’re not careful with how you access memory or disk, a 4096-byte request can be misaligned so that it spans the middle of two 4096-byte blocks even though it would fit in one. The drive then reads both blocks even though you only wanted one block’s worth of data. The computer hides this from you, but there is a performance cost. This applies to software on bare metal machines, and it applies in the cloud too.
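The arithmetic is easy to sketch. Assuming a 4096-byte sector size, this hypothetical helper counts how many whole sectors the drive must touch to satisfy a request:

```python
SECTOR = 4096  # assumed physical sector size

def sectors_touched(offset, length):
    """Number of whole sectors a read of `length` bytes at `offset` spans."""
    first = offset // SECTOR
    last = (offset + length - 1) // SECTOR
    return last - first + 1

# An aligned 4096-byte read touches exactly one sector...
print(sectors_touched(8192, 4096))        # 1
# ...but the same-sized read at a misaligned offset touches two.
print(sectors_touched(8192 + 512, 4096))  # 2
```

That second read forces the drive to fetch 8192 bytes to deliver 4096, which is exactly the hidden cost described above.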

Martin Monperrus wrote an interesting article where he measured the difference between aligned and misaligned random reads on Amazon EBS. Martin writes:

On Amazon Elastic Block Storage (EBS), I noticed at least a difference of one millisecond in average between accessing aligned random 4096B blocks (~10ms) and misaligned random 4096B blocks (~12ms, see below). It means that EBS is sensitive to disk and partition misalignment with 4KiB requests (that typically correspond to filesystem blocks).

10% to 20% additional latency per random read for unaligned access is pretty substantial if your workload does a lot of random I/O!
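A rough way to reproduce this kind of experiment is to issue random `pread()`s at aligned and misaligned offsets and compare the timings. The sketch below is my own setup, not Martin’s, and it runs on POSIX systems only; because the reads here are served by the page cache, it only illustrates the access pattern, and a real measurement would need `O_DIRECT` or a raw device such as an EBS volume.

```python
import os
import random
import tempfile
import time

BLOCK = 4096                   # assumed sector/block size
FILE_SIZE = 64 * 1024 * 1024   # 64 MiB scratch file
READS = 1000

def random_reads(fd, file_size, aligned):
    """Time READS random BLOCK-byte pread()s. Offsets are block-aligned
    when aligned=True and shifted by half a block when aligned=False."""
    offsets = [random.randrange(0, file_size - 2 * BLOCK) // BLOCK * BLOCK
               for _ in range(READS)]
    if not aligned:
        offsets = [o + BLOCK // 2 for o in offsets]
    start = time.perf_counter()
    for o in offsets:
        assert len(os.pread(fd, BLOCK, o)) == BLOCK
    return time.perf_counter() - start

# Create a scratch file to read from. Without O_DIRECT these reads hit
# the page cache, so the numbers only demonstrate the access pattern.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\0" * FILE_SIZE)
    path = f.name

fd = os.open(path, os.O_RDONLY)
print("aligned:    %.4f s" % random_reads(fd, FILE_SIZE, True))
print("misaligned: %.4f s" % random_reads(fd, FILE_SIZE, False))
os.close(fd)
os.unlink(path)
```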

I encourage you to read the rest of Martin’s article.


  • Hazem

    I liked your post, and I think you may have a valid point. But if you look at it from the other side, the main idea of the cloud is to re-use the unused devices within the data center, removing the headache from the end user, in addition to automating services with an SLA that protects your rights.

    I am just sharing one of my ideas, and I hope you get my point.