After further investigation with our database vendor, we concluded that two major factors contributed to this outage.
Database queries consumed an unexpected amount of RAM, which caused the cluster's primary node to restart.
When the primary node failed, a bug in the vendor's cluster software prevented the cluster from failing over to the secondary node, and the cluster disconnected from Labflow.
Posted Feb 11, 2022 - 22:09 UTC
- Labflow App Outage Time Frame:
Start: 02/10/2022 11:18:11 AM End: 02/10/2022 11:45:11 AM
Labflow's underlying database service failed. Labflow's engineering team is working with its database vendor to understand why the database service failed. We will update this incident when more information is available.
The database service was restarted, allowing all Labflow services to become operational again.
We do not expect further incidents at this point.
Students were unable to access Labflow during this time.