Microsoft has suffered some unfortunate outages this week, first affecting SQL databases on Monday, and then yesterday storage:
On Friday, February 22 at 12:44 PM PST, Storage experienced a worldwide outage impacting HTTPS traffic due to an expired SSL certificate. This did not impact HTTP traffic. We have executed repair steps to update SSL certificate on the impacted clusters and have recovered to over 99% availability across all sub-regions. We will continue monitoring the health of the Storage service and SSL traffic for the next 24 hrs. Customers may experience intermittent failures during this period. We apologize for any inconvenience this causes our customers.
The outage caused problems throughout the Azure universe, because SSL-based storage underpins just about everything. Without Storage, for example, any VM that goes offline can't restart, because its VHD is kept in Storage. Web sites and Service Bus were also hosed. My customers were annoyed.
These problems can affect any computing system. The problem with Azure Storage going down was the scope of it: millions of applications. Even the largest colo data center only has tens of thousands of computers. With so many people affected, the outage looks like a disaster.
I'll be watching Microsoft closely over the next few days to see what more they can tell us about the outage. But if this was all do to certificates expiring, wow.