Back in April I decided to learn how to write high-performance code, and to go all in on learning C to do it. Another learning project of mine needed fast communications, so I thought: why not write an HTTP server and see how fast I can make it?
I decided to do it all as open source, which makes it easier to collaborate and learn. So I created a GitHub project called Haywire and off I went!
A primary goal for Haywire is to be a cross-platform library rather than a standalone HTTP server that you run independently and host applications under. My need was to embed communications into another service, so Haywire is geared more towards being an API front end.
Haywire was kind of my “hello world” in C. Whenever I learn something new I have a habit of jumping 50 steps ahead and starting with something that is probably too difficult for a beginner's first step, but I find it helps motivate me because the challenge of what I want to do is interesting.
Haywire currently uses libuv for I/O and threading. libuv is a C library: the asynchronous, event-loop-based I/O platform layer that node.js sits on top of. I picked it because it has great cross-platform support and abstracts away the nasty details, so I don't have to implement epoll/kqueue/event ports on Unix systems or IOCP on Windows myself. This choice was really good in the beginning, but later on I started fighting parts of the event loop model. If libuv added a few key things it could be more flexible. More on that in later posts, but the libuv team has been extremely helpful in giving me guidance on working around my problems.
I decided to use the same build system libuv uses, GYP, which was created by Google. You write a .gyp file and it can generate Makefiles, Xcode projects or Visual Studio projects. This has been really helpful since I develop and test Haywire on Linux, OS X and Windows.
The Haywire hello world took me about a day: I learned how to use libuv to accept TCP connections, read bytes off the stream, parse a little bit of the HTTP protocol from those bytes, and reply with a static response.
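To give a feel for the "parse a little bit of HTTP, then answer with a static response" step, here is a minimal sketch in plain C. It is not Haywire's actual code and leaves out libuv entirely: `request_line`, `parse_request_line` and `STATIC_RESPONSE` are hypothetical names made up for this illustration, and the parser only handles the first line of a request (method and path).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for the two fields we care about (not a Haywire type). */
typedef struct {
    char method[16];
    char path[256];
} request_line;

/*
 * Parse "METHOD SP PATH SP VERSION" from the first line of buf.
 * Returns 0 on success, -1 if the line is incomplete or malformed.
 */
static int parse_request_line(const char *buf, size_t len, request_line *out)
{
    /* The request line must be terminated by a newline within the buffer. */
    const char *end = memchr(buf, '\n', len);
    if (end == NULL)
        return -1;

    /* First space separates the method from the path. */
    const char *sp1 = memchr(buf, ' ', (size_t)(end - buf));
    if (sp1 == NULL)
        return -1;

    /* Second space separates the path from the HTTP version. */
    const char *sp2 = memchr(sp1 + 1, ' ', (size_t)(end - (sp1 + 1)));
    if (sp2 == NULL)
        return -1;

    size_t method_len = (size_t)(sp1 - buf);
    size_t path_len = (size_t)(sp2 - (sp1 + 1));

    /* Reject empty fields and fields that would overflow our buffers. */
    if (method_len == 0 || method_len >= sizeof out->method)
        return -1;
    if (path_len == 0 || path_len >= sizeof out->path)
        return -1;

    memcpy(out->method, buf, method_len);
    out->method[method_len] = '\0';
    memcpy(out->path, sp1 + 1, path_len);
    out->path[path_len] = '\0';
    return 0;
}

/* A fixed reply; Content-Length matches the 11-byte body "hello world". */
static const char STATIC_RESPONSE[] =
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/plain\r\n"
    "Content-Length: 11\r\n"
    "\r\n"
    "hello world";
```

In a real server the bytes passed to `parse_request_line` would come from the read callback of the event loop, and `STATIC_RESPONSE` would be written back on the same connection.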
From the very beginning I was fairly relentless about benchmarking. The early benchmarks were not a fair comparison against other HTTP servers that were doing much more than Haywire was (I was returning a static string, after all!), but they did give me a baseline. Whenever I added or changed code I would run the benchmark. Every time. This was extremely helpful, because by comparing against the previous results I would sometimes discover that a change I had just made killed performance in a major way.
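The habit of comparing every run against the previous one can be sketched as a tiny check. This is only an illustration, not a tool from the Haywire repository: the function names and the idea of a fixed tolerance are assumptions, and the throughput numbers would come from whatever benchmark tool you run.

```c
/*
 * Fractional change in throughput between two benchmark runs,
 * e.g. -0.20 means the current run is 20% slower than the previous one.
 * (Hypothetical helper; previous_rps must be greater than zero.)
 */
static double relative_change(double previous_rps, double current_rps)
{
    return (current_rps - previous_rps) / previous_rps;
}

/*
 * Returns 1 if the current run dropped by more than `tolerance`
 * (a fraction such as 0.05 for 5%) compared to the previous run,
 * 0 otherwise. Without a usable baseline nothing is flagged.
 */
static int is_regression(double previous_rps, double current_rps,
                         double tolerance)
{
    if (previous_rps <= 0.0)
        return 0;
    return relative_change(previous_rps, current_rps) < -tolerance;
}
```

A tolerance is used because benchmark numbers are noisy from run to run; how large it should be depends on how stable your test machine and network are.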
I’ve gotten a bit (ok, ok, a lot) obsessed with performance in Haywire. In later stages I got much better at measuring hardware metrics, because at times I was hitting bottlenecks like network saturation and didn’t know it was happening until I created tools to collect and visualize those metrics. This became a blessing and a curse, because now I had a much better picture of what the machine was doing, and I was even more obsessed with performance. Now I wanted to be efficient with the CPU, so that the code that uses Haywire gets more of the CPU rather than Haywire stealing most of it.
Following some talented database and distributed systems infrastructure engineers on Twitter has been a great motivation to get better at benchmarking and measuring. Seeing the things they do, and how deeply they understand what is going on in the system and in their code, is a great inspiration.
I understand that adding features will slow things down, because anything is slower than something that does nothing. But whenever I introduce code that degrades performance, I try my best to find creative ways to gain as much of it back as possible.
You can find Haywire on GitHub. Right now there are only Linux CI builds, using Travis. I try to test on Windows fairly frequently, but at times the Windows build can be broken since Visual Studio 2012 has poor C99 support. I would love to get some Windows CI builds going so that I can stay on top of this better.
I would love to collaborate with you! I’m always open to contributions, comments, feedback and code reviews. Please be kind.
A tremendous thank you to the people who have already helped me out and offered their time. Thank you!
Now that I’ve introduced what Haywire is, stay tuned for some posts describing some of the things I’ve done inside the code base.