Description
I'm using Daphne for both HTTP and WebSocket requests in my project, and I've noticed that when the service gets spammed with a surge of requests, they are executed in the order they came in. This is intuitive and fair; however, it effectively shuts the webserver down whenever the volume of requests exceeds what it can handle, instead of letting it process as much as it can and fail the rest.
The fundamental problem is this: every request comes with an external timeout (in my case, from the ALB in front of the service), and with a FIFO queue we waste time processing requests that have already timed out externally, or are about to, instead of responding to requests the client is still waiting for.
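To make the problem concrete, here is a minimal sketch (not Daphne's actual internals) of a worker loop that sheds requests whose external deadline has already passed instead of doing the wasted work. The `EXTERNAL_TIMEOUT` value and the `drain` helper are hypothetical names for illustration:

```python
import time
from collections import deque

# Assumed: the load balancer (e.g. an ALB) gives up on a request
# after this many seconds, so serving it later is pure waste.
EXTERNAL_TIMEOUT = 1.0

def drain(queue, handle, now=time.monotonic):
    """Process queued (enqueued_at, request) pairs in FIFO order,
    dropping any whose external deadline has already expired."""
    served, shed = 0, 0
    while queue:
        enqueued_at, request = queue.popleft()
        if now() - enqueued_at > EXTERNAL_TIMEOUT:
            shed += 1        # client already gave up; skip the work
            continue
        handle(request)      # client is (probably) still waiting
        served += 1
    return served, shed
```

With a backlog where two requests are already past the deadline, only the fresh one gets handled; the rest are shed immediately rather than occupying the worker.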
I've seen multiple articles from big tech companies mention this, but as far as I can tell they all built custom tooling for it (I can't find such a setting in Daphne / nginx / uWSGI / ALB):
- Using load shedding to avoid overload — AWS Builders' Library
- Building Services at Airbnb, Part 3 — Airbnb Engineering
- Meet Bandaid, the Dropbox service proxy — Dropbox Tech Blog
- Fail at Scale: Reliability in the face of rapid change — Ben Maurer, Facebook
I've also seen this behavior in multiple production projects I maintain.
I built an interactive visualization to help illustrate the point. You can configure the request flow and then play with FIFO / LIFO / adaptive LIFO / max queue size. Check it out here: https://ipeterov.github.io/queue-demo/
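For reference, the policies in the demo can be sketched in a few lines. This is a hypothetical illustration of the "adaptive LIFO" idea described in the Facebook "Fail at Scale" article combined with a max queue size; the `MAX_QUEUE` and `LIFO_THRESHOLD` constants are made-up values, not settings from any of the tools above:

```python
from collections import deque

MAX_QUEUE = 100      # assumed cap: shed anything beyond this
LIFO_THRESHOLD = 10  # assumed depth at which we flip from FIFO to LIFO

def enqueue(queue, request):
    """Max queue size: reject immediately rather than queue forever."""
    if len(queue) >= MAX_QUEUE:
        return False
    queue.append(request)
    return True

def dequeue(queue):
    """Adaptive LIFO: FIFO while healthy, newest-first once backed up,
    since the newest requester is the most likely to still be waiting."""
    if len(queue) > LIFO_THRESHOLD:
        return queue.pop()      # overloaded: serve newest first
    return queue.popleft()      # normal load: fair FIFO order
```

Under light load this behaves exactly like today's FIFO; it only changes behavior once the backlog is deep enough that the oldest entries are likely already abandoned.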
Visualization.mov
Do you guys think this makes sense? Or is this a dumb idea?