Login issues with Microsoft
Incident Report for Totango
Postmortem

Event Description: On November 20, 2023, customers began having difficulty logging in through Single Sign-On (SSO) with Microsoft. The support team received multiple complaints at approximately 13:00 UTC.

Timeline:

  • 13:00, Nov 20, 2023: Notification received from support about multiple customer complaints regarding SSO login issues.
  • 13:40, Nov 20, 2023: The root cause of the issue was identified.
  • 13:55, Nov 20, 2023: The issue was resolved.

Root Cause: The issue was traced to the expiration of the secret keys for our SSO application in Azure. The expiration, which likely occurred around November 18, went unnoticed because no alerts or notifications were generated for it.

Steps to Resolution:

  • A new secret key was generated and updated in the parameter store.

Lessons Learned and Preventive Actions:

  1. Engagement with Azure Support: We will contact Azure support to investigate why no alerts were triggered for the key expiration.
  2. Implementation of New Alert System: An alert has been added in New Relic (NR) to specifically monitor for key expiration errors. This action has already been completed.
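
The alert above fires once expiration errors occur. As a complementary illustration only, and not something this report says was implemented, the sketch below flags secrets before they expire. It assumes the expiry date of each secret is already known and recorded somewhere; the `SecretInfo` record, the `secrets_needing_rotation` function, the secret names, and the 30-day warning window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class SecretInfo:
    # Hypothetical record; in practice it would be filled from wherever
    # the secret's expiry date is recorded.
    name: str
    expires_at: datetime  # timezone-aware, UTC


def secrets_needing_rotation(secrets, now, warn_days=30):
    """Return (name, days_left) for every secret that expires within
    warn_days of `now` or has already expired (negative days_left),
    ordered with the most urgent first."""
    cutoff = now + timedelta(days=warn_days)
    flagged = [
        (s.name, (s.expires_at - now).days)
        for s in secrets
        if s.expires_at <= cutoff
    ]
    return sorted(flagged, key=lambda item: item[1])


if __name__ == "__main__":
    # Illustrative data only; names and dates are assumptions.
    now = datetime(2023, 11, 1, tzinfo=timezone.utc)
    secrets = [
        SecretInfo("sso-app-secret", datetime(2023, 11, 18, tzinfo=timezone.utc)),
        SecretInfo("other-secret", datetime(2024, 6, 1, tzinfo=timezone.utc)),
    ]
    for name, days_left in secrets_needing_rotation(secrets, now):
        print(f"{name}: {days_left} days until expiry")
```

Run on a schedule, a check along these lines would give advance warning instead of relying on login failures to reveal an expired key.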
Posted Nov 20, 2023 - 14:43 UTC

Resolved
This incident has been resolved.
Posted Nov 20, 2023 - 14:30 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Nov 20, 2023 - 11:57 UTC
Investigating
We are currently investigating reports of failed login attempts.
Posted Nov 20, 2023 - 11:26 UTC
This incident affected: Totango Web Application.