Event Description
Users were unable to log in, and users who were already logged in experienced slow system response times.
Findings & Timeline
First Noticed: February 16th, 2021 at 14:46 UTC
Our monitoring systems alerted us to degraded application performance and login failures. These issues lasted for 10 minutes, after which the system stabilized and returned to normal operation.
Issue Identified: February 16th, 2021 at 18:30 UTC
Following the incident, our team was able to identify the root cause of the temporary performance degradation.
Root Cause
Preventive Action
The Totango engineering team researched the NPS campaigns processing flow and found several areas that will be optimized:
Optimize the API rate limit to prevent this kind of load on the system - Status: under research
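For readers unfamiliar with API rate limiting, the sketch below illustrates one common approach, a token bucket, which caps sustained request throughput while still allowing short bursts. It is a generic illustration only and does not describe Totango's actual implementation; the class name `TokenBucket` and the `rate`, `capacity`, and `clock` parameters are hypothetical.

```python
import time


class TokenBucket:
    """Hypothetical limiter: bursts of up to `capacity` requests,
    refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self.clock = clock            # injectable time source (useful in tests)
        self.tokens = float(capacity) # start with a full bucket
        self.last = clock()

    def allow(self):
        """Return True if the request may proceed, False if it should be rejected."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In a setup like this, a caller that receives `False` from `allow()` would typically be answered with an HTTP 429 response and asked to retry later, so that a single heavy workload cannot consume capacity needed by other users.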