Surviving A “Hug of Death”

One of the wonders of the modern internet is the ability to share content and have it accessible from anywhere in the world instantly. This allows the spread of information to take place at unparalleled speeds, and can result in a sort of virtual flash mob where something gets very popular very quickly without the chance to manage accordingly. Just as in real life, these flash mobs can get out of hand.

These “virtual flash mobs” have been called a few different things over the years, a common one was “getting slashdotted”, where the traffic resulted from something getting popular on slashdot. Another, and my favourite, is the reddit “hug of death”.

This blog post will aim to help you understand, prepare for, and handle a hug of death.

Detection

As mentioned above, hugs of death tend to start quickly, so you’d better have some monitoring with high resolution. If you want to respond before things get too bad, you’ll need to act quick. This is of course if you don’t have automated responses, but that’s something we’ll discuss below.

Optimisation

Optimising any website is important, but it’s particularly important on high traffic sites, as any inefficiencies are going to be amplified the higher the traffic level gets. Identifying these weak-spots ahead of time and working to resolve them can save a lot of heart-ache down the line. An area of particular importance for optimisation is your SQL/database layer, as this is often the first area to struggle, and can be much harder to scale horizontally than other parts of a stack.

Caching/CDNs

Using a CDN to serve your site’s assets helps in two regards. It lowers the download latency for clients by placing the assets in a closer geographic location, and removes a lot of the load from your origin servers by offloading the work to these externals places.

Tiered Infrastructure

Having the different levels of your infrastructure scale independently can allow you to be much more agile with your response to a hug of death. Upscaling only the areas of your infrastructure that are struggling at any given time can allow you to concentrate your efforts in the most important places, and save you valuable money by not wasting it scaling every part of a “monolithic” infrastructure instead of just the area that needs it.

Auto-scaling

What makes responding to a hug of death easy? Not having to “respond” at all. With the ever increasing popularity of cloud computing, having your infrastructure react automatically to an increase in load is not only possible, but very easy. We won’t go into specifics, but it basically boils down to is “if the load is above 70%, add another server and share the traffic between all servers”.

As scary as a hug of death sounds, they’re actually great overall. It means you’ve done something right, as everybody wants a piece of what you’ve got. If you want some help preparing then please get in touch and we’ll be happy to help.

0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published. Required fields are marked *