SuccessPlays fired unexpectedly in the US region on April 19 at 14:15 UTC because an old Redis cluster was in use.
Reported: April 19, 14:15 UTC
Totango system monitoring detected the issue.
Identified: April 19, 15:20 UTC
The root cause was identified as a DNS change from the new Redis cluster to the old one.
The application team immediately stopped all SuccessPlay triggering. By that time, the SuccessPlay engine had already triggered a few thousand tasks and events (along with the task creation emails that were sent to users).
The application team then created a cleanup plan, which included deleting duplicated tasks and events and reverting attribute changes.
Resolved: April 21, 19:58 UTC
Duplicated task deletion was completed.
April 21, 20:51 UTC
Cleanup plan execution was completed in full, including event deletion and the reversion of attribute changes.
Root Cause:
Because of a DNS change, the old Redis cluster was actively used instead of the new one, which caused SuccessPlays to fire unexpectedly.
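For illustration only, and not a description of Totango's tooling or response: a minimal Python sketch of how a DNS record that points at an unexpected cluster might be detected before an application connects. The function names, the hostname, and the IP addresses are all hypothetical; only `socket.getaddrinfo` is a real standard-library call.

```python
import socket
from typing import Callable, Iterable, Set


def resolve_ips(hostname: str, port: int = 6379) -> Set[str]:
    """Return the set of IP addresses the hostname currently resolves to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def unexpected_ips(
    hostname: str,
    expected_ips: Iterable[str],
    resolver: Callable[[str], Set[str]] = resolve_ips,
) -> Set[str]:
    """Return resolved addresses that are NOT in the expected set.

    An empty result means DNS points only at the expected cluster.
    The resolver is injectable so the check can be exercised without real DNS.
    """
    return resolver(hostname) - set(expected_ips)


if __name__ == "__main__":
    # Hypothetical hostname and address, used purely as placeholders.
    # The fake resolver below simulates DNS pointing at a different cluster.
    stray = unexpected_ips(
        "redis.example.internal",
        expected_ips=["10.0.0.5"],
        resolver=lambda host: {"10.0.0.9"},
    )
    if stray:
        print(f"DNS resolves to unexpected addresses: {sorted(stray)}")
```

A check like this could run at startup or on a schedule. Whether it would have applied to this incident depends on details the report does not give, such as how the clusters are addressed.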
Corrective Action