Below you will find the Root Cause Analysis for the incident that occurred on Wednesday, October 9th, 2019.
Original Reported Subject: Possible SMS Delivery Delay
Date: Wednesday, October 9th, 2019
Start Time: 09:52 AM UTC
End Time: 10:25 AM UTC
On Wednesday, October 9th, customers experienced SMS delivery latency for the Verify SMS and SMS API products, and failures on select PhoneID requests. The incident started at 09:52 AM UTC and lasted until 10:25 AM UTC.
Root Cause Analysis:
As a result of a network configuration change, access was temporarily blocked to external resources associated with SMS DLR callbacks and select PhoneID requests. At 09:52 AM UTC, system alerts detected transaction queuing in TeleSign’s data centers, associated with a failure to resolve callbacks. In response to these alerts, engineers immediately began work to move traffic and clear queues. Once the root cause was identified, network configuration changes were applied to resolve the issue. Impacted data centers required approximately 20 minutes to clear the queues associated with the event. The issue was resolved by 10:25 AM UTC, and no further delays were detected. All SMS messages were delivered during the event, although at a slower rate.
To minimize the risk of this issue recurring, TeleSign’s Tech Ops team has taken or committed to the following actions:
· Ran additional performance tests of DLR callback services to investigate functionality in negative scenarios.
· Before any similar network configuration change in the future, TeleSign will first run tests on small test domains outside the platform.
We apologize for the inconvenience this may have caused you. Should you have any questions, please don’t hesitate to contact us at firstname.lastname@example.org.