Faulty hard drive blamed for weekend site outages (news at 11)
Over the weekend a hard drive on one of our database servers failed, leaving a number of people unable to access their stats. Since the server itself was still running fine, we didn't receive any alert pages, so the problem persisted for nearly half a day.
First the bad news: about a quarter of our users will find their stats significantly lower than usual for the weekend. This is because during this outage (as well as a similar but less noticeable problem on Saturday) we were unable to capture stats for the affected server group's users.
Now for the good news: the timeouts worked perfectly and none of our users' sites were impacted at all. Even though a key link in the tracking chain was broken, a spot check of several of our users' sites showed that the failovers we've set up worked and no one experienced any slowdown.
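For the curious, the general idea behind a timeout with a fallback can be sketched in a few lines of Python. This is an illustration only, not the actual tracking code: the names call_with_timeout, healthy_tracker, and broken_tracker are hypothetical, and the half-second delay simply stands in for a stats server that has stopped responding.

```python
import concurrent.futures
import time


def call_with_timeout(fetch, timeout_seconds, fallback):
    """Run fetch() in a worker thread; return fallback if it is slow or fails."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fetch)
    try:
        return future.result(timeout=timeout_seconds)
    except Exception:
        # Covers both a timeout and any error raised by fetch() itself.
        return fallback
    finally:
        # Do not make the caller wait for the slow worker to finish.
        executor.shutdown(wait=False)


def healthy_tracker():
    # Hypothetical stand-in for a tracking call that answers promptly.
    return "tracked"


def broken_tracker():
    # Hypothetical stand-in for a tracking call stuck on a dead server.
    time.sleep(0.5)
    return "tracked (too late)"


if __name__ == "__main__":
    print(call_with_timeout(healthy_tracker, 1.0, "skipped"))
    print(call_with_timeout(broken_tracker, 0.05, "skipped"))
```

The point of the design is that the host page gives up on the guest service quickly and carries on. The cost is a lost stat, not a slow page.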
As badges and widgets and plugins (oh my) continue to proliferate, sites are increasingly at the mercy of the services that provide bling for their pages. The services that want to last beyond 2006 need to recognize that they are guests on their users' sites and need to behave accordingly. As I write this, I realize I should write a longer post on the subject, so for now I'll just sign off by saying that we're very cognizant of our tenuous position on your pages and will continue to improve our ability to deliver great functionality with a tiny footprint.
