CoursePlus Outage on October 7, 2020
CoursePlus was unavailable from 11:48pm US Eastern time on Wednesday, October 7 until 1:03am on Thursday, October 8. This was the first unplanned outage for CoursePlus since 2017. We know that any downtime for CoursePlus is frustrating and difficult for the thousands of students, faculty, and staff who rely on immediate access to CoursePlus every day. This post explains what caused the problem and how it was resolved.
Like all modern web applications, CoursePlus stores information in a database. At the heart of many database systems is a transaction log: an ordered list of every change that has occurred in any part of the database. Transaction logs guarantee data consistency even if there’s a power failure or some other major interruption. They also grow rapidly, because every change made to the database adds to them. If the storage on which the transaction logs reside runs out of space, the database can’t continue to write to the log, and it rejects all requests. To counter this, the database system backs up transaction logs automatically, which frees up space for new entries. Last night, this backup process got stuck and caused the database to say “I can’t accurately and safely continue to write to the transaction log, so I’ll just reject all requests right now.” As a result, CoursePlus stopped functioning normally for everyone.
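For readers who want to see the idea in code, here is a minimal Python sketch of the behavior described above. It is a toy model, not how the CoursePlus database actually works: the names TransactionLog, Database, and LogFullError, and the fixed capacity measured in entries, are all assumptions made for illustration. It shows only the general pattern: every write is logged first, a full log causes writes to be rejected, and a successful backup makes room again.

```python
class LogFullError(Exception):
    """Raised when the log has no room for another entry (hypothetical name)."""


class TransactionLog:
    """Toy transaction log with a fixed capacity, counted in entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []    # changes that have not been backed up yet
        self.archived = []   # stand-in for backup storage
        self.next_seq = 1    # keeps the log strictly ordered

    def append(self, change):
        # Refuse to record the change if the log is out of space.
        if len(self.entries) >= self.capacity:
            raise LogFullError("transaction log is full; rejecting request")
        seq = self.next_seq
        self.entries.append((seq, change))
        self.next_seq += 1
        return seq

    def backup(self):
        # Copy pending entries to the archive, then free the space they used.
        copied = len(self.entries)
        self.archived.extend(self.entries)
        self.entries.clear()
        return copied


class Database:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self, log):
        self.log = log
        self.data = {}

    def write(self, key, value):
        # Log first: if this raises, the change is never applied.
        self.log.append(("set", key, value))
        self.data[key] = value


if __name__ == "__main__":
    db = Database(TransactionLog(capacity=2))
    db.write("quiz-1", "submitted")
    db.write("quiz-2", "submitted")
    try:
        db.write("quiz-3", "submitted")  # log is full, so this is rejected
    except LogFullError as err:
        print("rejected:", err)
    db.log.backup()                      # the backup frees space in the log
    db.write("quiz-3", "submitted")      # writes succeed again
```

In this sketch, a backup that never runs (or gets stuck) leaves the log full, and every later write fails until the backup completes.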
Fortunately, the CoursePlus team alerted the JHSPH IT team to this problem fairly quickly, and the JHSPH IT team worked as quickly as they could to resolve it. Once the problem was resolved, CoursePlus operated normally again.
We know that the period around midnight each day is a very busy one on CoursePlus, with students turning in assignments and completing quizzes and exams that are due that day by 11:59pm. Any outage at any time is frustrating and difficult for everyone, but an outage during this period is particularly problematic. The CoursePlus team is grateful to the JHSPH IT team for their quick response to this issue, and to everyone who was using CoursePlus during this time for their tenacity.