Hitting 1M+ req/s wasn’t my original intent. I started off working on a largely unrelated blog post and somehow found myself going down this optimization rabbit hole. The global pandemic gave me some extra time, so I decided to dive in head first.

The table below lists the nine optimization categories I will cover and links to the corresponding flame graphs. For each optimization, it shows the percentage gain over the previous step and the cumulative throughput in requests per second. It is a pretty solid illustration of the power of compounding in optimization work: no single step gains more than 55%, yet together they take throughput from 224k to 1.20M req/s, more than a 5x improvement.
| Optimization | Flame Graph | Gain | Req/s |
|---|---|---|---|
| Ground Zero | initial.svg | - | 224k |
| 1. Application Optimizations | app.svg | 55% | 347k |
| 2. Speculative Execution Mitigations | spec-exec.svg | 28% | 446k |
| 3. Syscall Auditing / Blocking | syscall.svg | 11% | 495k |
| 4. Disabling iptables / netfilter | iptables.svg | 22% | 603k |
| 5. Perfect Locality | perfect-locality.svg | 38% | 834k |
| 6. Interrupt Optimizations | interrupt.svg | 28% | 1.06M |
| 7. The Case of the Nosy Neighbor | nosy-neighbor.svg | 6% | 1.12M |
| 8. The Battle Against the Spin Lock | spin-lock.svg | 2% | 1.15M |
| 9. This Goes to Twelve | final.svg | 4% | 1.20M |
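
To make the compounding concrete, here is a minimal Python sketch that replays the table: it applies each step's gain multiplicatively to the 224k baseline and compares the result with simply adding the percentages. The numbers are copied from the table above; the names `BASELINE`, `STEPS`, `compound`, and `total_gain_pct` are my own, for illustration only. Because the table's percentages are rounded, the replayed figures only approximate the measured ones.

```python
# Baseline throughput and per-step data copied from the table above.
# Each entry: (optimization, gain over previous step in %, reported req/s).
BASELINE = 224_000
STEPS = [
    ("Application Optimizations", 55, 347_000),
    ("Speculative Execution Mitigations", 28, 446_000),
    ("Syscall Auditing / Blocking", 11, 495_000),
    ("Disabling iptables / netfilter", 22, 603_000),
    ("Perfect Locality", 38, 834_000),
    ("Interrupt Optimizations", 28, 1_060_000),
    ("The Case of the Nosy Neighbor", 6, 1_120_000),
    ("The Battle Against the Spin Lock", 2, 1_150_000),
    ("This Goes to Twelve", 4, 1_200_000),
]


def compound(baseline, gains_pct):
    """Return the running throughput after applying each gain multiplicatively."""
    running = []
    current = float(baseline)
    for gain in gains_pct:
        current *= 1 + gain / 100
        running.append(current)
    return running


def total_gain_pct(gains_pct):
    """Return the overall gain (in %) when the per-step gains compound."""
    factor = 1.0
    for gain in gains_pct:
        factor *= 1 + gain / 100
    return (factor - 1) * 100


if __name__ == "__main__":
    gains = [gain for _, gain, _ in STEPS]
    replayed = compound(BASELINE, gains)
    for (name, gain, reported), estimate in zip(STEPS, replayed):
        print(f"{name:36s} +{gain:2d}%  reported {reported:>9,}  replayed {estimate:>11,.0f}")
    print(f"Sum of gains:     {sum(gains)}%")
    print(f"Compounded gain:  {total_gain_pct(gains):.0f}%")
```

The individual gains add up to 194 percentage points, but because each one applies to an already-improved baseline, they multiply out to a gain of well over 400%, which lines up with the jump from 224k to 1.20M req/s.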