We experienced a very large DoS (Denial of Service) attack. The attack targeted a single site, but unfortunately all sites were affected, resulting in about 10 minutes of downtime across two incidents. Our goal is for your site to always be available, so we’re very sorry this happened.
The reason all sites were affected is that our servers hit internal rate-limiting on the shared DynamoDB database storing site information. We have fixed this by increasing the limits.
The reason each outage lasted around 5 minutes was that our infrastructure took some time to pick up the block. We have fixed this by improving how blocking works. The decision to block is still made by a human, but deploying the block will now take seconds instead of possibly minutes.
The reason a second outage happened was that the traffic from the DoS had not fully subsided and the attackers used the returned availability of the site as a signal to ramp up the DoS again. We have changed our unblocking policy to take this into account in the future.