Yesterday at 13:20 ET, one of our read-only databases crashed. The problem did not bring down the entire site or all AutoTrade servers. Indeed, most Gen3 functionality worked fine for most of the day. The Web site continued to operate, with some intermittent problems, since the database in question was only one of several we use.
But the problem had cascading effects. When we discovered the database problem, we took that particular machine down to repair it, which shifted its load onto the other database machines and slowed them considerably. The problem wasn’t fully resolved until 20:00 last night, when the affected database was brought back online, fully fixed.
The symptoms of this overloading problem were everywhere: Karl’s orders were not processed at the end of the day; some Gen3 AutoTraders trading certain systems saw sync oscillations (their broker account filled, but C2 took too long to recognize the fill as valid); and Gen1 server connectivity was lost.
I am sorry for the problems this caused. I’m looking into the underlying cause of the database crash and trying to figure out ways to prevent the same problem from happening in the future.
I know everyone here relies on C2 to be a stable platform. We’ll keep working at it, trying to make these kinds of events rarer.
Sincerely,
Matthew