The Azure status page said “Starting at approximately 19:15 UTC on, a subset of customers may experience issues authenticating into Microsoft services, including Microsoft Teams, Office and/or Dynamics, Xbox Live, and the Azure Portal.” The difference between the time users detected issues and Microsoft declaring an incident isn’t unusual, as Microsoft engineers need time to figure out if reported problems are due to a transient hiccup or something deeper.

Why Some Users Could Continue to Work

Based on my own experience, it seemed as if apps continued to work if they didn’t need to make an authentication request to Azure AD. In my case, I was connected to the Microsoft tenant in Teams and couldn’t switch back to my home tenant because the Teams client wasn’t able to authenticate with that tenant. As anyone who has ever looked at the Teams activity in the Office 365 audit log can testify, Teams desktop clients authenticate hourly when their access token expires. This might be the reason why Microsoft highlighted the effect on Teams in their communications for the Microsoft 365 health status page (Figure 2).

Figure 3: No joy in the Microsoft 365 service health dashboard

If an app didn’t need to authenticate, it continued to work quite happily. I used Outlook desktop and the browser interfaces to SharePoint Online, OneDrive for Business, OWA, and Planner during the outage. On the other hand, while writing this article, Word couldn’t upload the document to SharePoint Online or autosave changes because of an authentication failure. Another interesting problem was that messages sent to a Teams channel email address failed because the connector used to deliver the email to Teams couldn’t authenticate.

Similar Authentication Woes in September 2020

At first glance, this problem seems like that of the September 28/29 Azure AD outage last year. Microsoft said then that “A latent code defect in the Azure AD backend service Safe Deployment Process (SDP) system caused this to deploy directly into our production environment.” In other words, a code change containing a bug made its way into production, which seems very similar to the reason cited by the Microsoft 365 status Twitter account at 20:10 UTC that the problem was due to “a recent change to an authentication system.”

Unfortunately, as I wrote in February 2019, Azure AD is the Achilles Heel of Office 365. When Azure AD falls over, everything comes to a crashing halt across Microsoft 365. This is despite Microsoft’s December 2020 announcement of a 99.99% SLA for Azure AD authentication, which comes into effect on April 1, 2021. That is, if you have Azure AD Premium licenses.

Quite reasonably, people asked why Microsoft had deployed code changes on the first day of the working week. Although people use Office 365 every day, the weekend demand on the service is much lower than Monday-Friday, so it’s fair to expect Microsoft to make changes that might affect users then. Or even better, test their software before allowing code to make it through to production services.

Current Status and Preliminary Root Cause

As of 21:15 UTC, Microsoft said “We’ve identified the underlying cause of the problem and are taking steps to mitigate impact. We’ll provide an updated ETA on resolution as soon as one is available.” At 21:25 UTC the status became a little more optimistic: “We are currently rolling out a mitigation worldwide. Customers should begin seeing recovery at this time, and we anticipate full remediation within 60 minutes.”

After the mitigation rolled out, the problem appears to be resolved (22:10 UTC). Microsoft says “The update has finished its deployment to all impacted regions. Microsoft 365 services continue the process of recovery and are showing decreasing error rates in telemetry. We’ll continue to monitor service health as availability is restored.”

On March 16, Microsoft noted that some organizations might still see some Intune failures. They anticipate that remediation actions taken by 21:00 UTC will address these lingering problems. They also published a preliminary root cause analysis on the Azure Status History page saying: “an error occurred in the rotation of keys used to support Azure AD’s use of OpenID, and other, Identity standard protocols for cryptographic signing operations.”
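Microsoft's preliminary root cause for this outage points to an error in rotating the keys Azure AD uses for cryptographic signing. Microsoft's statement doesn't say exactly how the rotation went wrong, so the following is only a minimal Python sketch of the general failure mode, not a description of Azure AD's implementation. It assumes that a relying service looks up the signing key by an ID carried with the token and rejects the token if that key is no longer published. The key IDs, secrets, and function names are all hypothetical, and HMAC is used here as a simple stand-in for the asymmetric signatures that OpenID Connect relies on in practice.

```python
import hashlib
import hmac
from typing import Mapping


def sign(key: bytes, payload: bytes) -> str:
    """Sign a payload (HMAC-SHA256 as a stand-in for a real token signature)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(published_keys: Mapping[str, bytes], key_id: str,
           payload: bytes, signature: str) -> bool:
    """Validate a token against the keys the issuer currently publishes."""
    key = published_keys.get(key_id)
    if key is None:
        # The token names a key the validator no longer knows about.
        return False
    return hmac.compare_digest(sign(key, payload), signature)


# Hypothetical issuer state: two signing keys, identified by made-up key IDs.
signing_keys = {"key-old": b"old-secret", "key-new": b"new-secret"}

# A token that was signed with the older key and is still in circulation.
payload = b'{"sub": "user@example.com"}'
token_signature = sign(signing_keys["key-old"], payload)

# Before rotation, both keys are published, so the token validates.
published_before = dict(signing_keys)
valid_before = verify(published_before, "key-old", payload, token_signature)

# A faulty rotation withdraws the old key while tokens signed with it
# are still in use, so the same token is now rejected.
published_after = {"key-new": signing_keys["key-new"]}
valid_after = verify(published_after, "key-old", payload, token_signature)

print("valid before rotation:", valid_before)
print("valid after faulty rotation:", valid_after)
```

In this sketch the same token is accepted before the rotation and rejected afterwards, even though nothing about the token or the user changed. That is the kind of behavior that would make every app needing a fresh authentication fail at once, while apps holding still-valid sessions carry on.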