502 Gateway Error Reports
Incident Report for Brillium
Postmortem

Background:
Brillium customers experienced an incident from approximately 21:45 UTC on Friday, August 26, 2022 to 05:30 UTC on Saturday, August 27, 2022. The incident affected customer access to both the Brillium Assessment Builder version 10 and Brillium version 11 applications.

Root Cause:
After working with several AWS engineers and conducting a thorough internal analysis and investigation, we discovered an incompatibility between network configuration components. It presented as a loss of connectivity even though the systems themselves were operating normally.

Steps Taken:

  • We communicated the actions taken during the analysis and remediation phases of the incident via the Brillium Status Page.
  • We responded to customer inquiries through our support channel.
  • We worked with Amazon support engineers to analyze and identify the root cause.
  • We applied remediation steps to fully resolve the root cause.
  • We monitored the systems to ensure their stability.

Mitigation:
The components identified during the investigation were in place to maintain compatibility for customers on earlier versions of the platform (i.e., v9, v10). During remediation, we were able to remove any dependency on the components identified as the root cause. Additionally, we added validation enhancements to our QA process, including compatibility checks, to prevent this kind of issue from recurring.

Posted Aug 31, 2022 - 17:31 EDT

Resolved
We are satisfied that we have resolved the issue and all systems are operating within normal parameters.

For more details, we will provide a postmortem describing what caused the issue and the steps we took to remediate it and prevent it from occurring again.
Posted Aug 27, 2022 - 07:55 EDT
Update
All services are available, and we will continue to closely monitor systems.
Posted Aug 27, 2022 - 01:47 EDT
Monitoring
The platform is back in service, but we will continue to monitor it and scale up for performance.
Posted Aug 27, 2022 - 01:46 EDT
Update
We have resolved the issue and are slowly bringing the platform back online. You may experience some initial slowness, but performance will improve as we become fully operational. We will post when the system is fully up and running optimally.
Posted Aug 27, 2022 - 01:41 EDT
Update
Our Engineering Team is currently working with AWS engineers to discover the root cause of the issue. We will share more information as we gain more insight. We do not have an estimated time to resolution yet but will share one once we understand the issue.
Posted Aug 26, 2022 - 21:09 EDT
Investigating
Brillium is currently experiencing an issue, and we are investigating. We will post an update as soon as we learn more about the root cause.
Posted Aug 26, 2022 - 19:35 EDT
This incident affected: Assessment Builder, Administration System, API, Partner Central, and Zapier Integration.