On Wednesday, part of Microsoft’s cloud productivity suite went down, causing a midweek outage for Office.com and Copilot.
Users began reporting problems around midday UK time, when they found themselves unable to access Office.com. Shortly afterwards, the company acknowledged the issue on its X account.
A few hours later, Microsoft said on X that it had identified a recent change as contributing to the impact and that it was reverting the change.
About an hour after that, the company confirmed that the “reversion” of the change was complete and that the problem had been fixed. It added that users might need to refresh their browsers to experience relief.
The Microsoft 365 Admin Centre offers a little more detail. According to the console, the incident lasted just over four hours and affected only organisations hosted “within a specific section of affected infrastructure within the North America region.”
The preliminary root cause, according to Microsoft, was that “a recent configuration change resulted in errors when users attempted to access Office.com, leading to impact.” Because access to Copilot via m365.cloud.microsoft was also affected, the company advised using the Copilot for Microsoft 365 app or other Microsoft 365 apps (such as Teams and Office) as a workaround.
Microsoft has said only that reverting the configuration change resolved the problem. It has not publicly identified the exact change that caused the disruption.
Microsoft has a track record of configuration changes that negatively affect its cloud. Earlier this year, for instance, it pushed a faulty update to the web version of Outlook. At the time, a media outlet asked: “Does Microsoft test its changes before deploying to production?”
The outlet has now put the question again: given the ongoing problems that similar self-inflicted incidents cause for the company and its customers, what testing is done to determine whether a configuration change could disrupt the company’s infrastructure in this way?
Microsoft’s preliminary “Next steps” read: “To help prevent similar impact in the future, we’re further reviewing our testing and validation processes prior to deployment.”
We asked Copilot, Microsoft’s chatbot, what should be done before implementing a configuration change in production. “It’s important to treat a configuration change with the same care as a code deployment before deploying it to production,” the assistant said, adding that a mistake in this area could result in performance degradation, security flaws, or outages.
Copilot also suggested validating the change first in a local or staging environment.
The advice is sound. Unfortunately, the outage may have prevented users in North America from accessing the assistant through Microsoft 365.
For context, a production environment is the live, operational setting in which the finished, user-ready version of a system, website, or software application is deployed and made available to real end users. It is where the product handles real user data, traffic, and interactions, and it marks the end of the development and testing stages. As the final stage of deployment, it is expected to be stable, fast, secure, and dependable, so that it delivers a smooth user experience and copes with heavy workloads.
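As a rough illustration of what validating a configuration change before it reaches production might look like, here is a minimal Python sketch. Everything in it is hypothetical: the `Config` fields, the rules in `validate`, and the `staging_healthy` callback are assumptions made for the example and say nothing about how Microsoft actually tests or deploys its changes.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """Hypothetical service configuration; the fields are illustrative only."""
    upstream_url: str
    timeout_seconds: float
    max_connections: int


def validate(config: Config) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if not config.upstream_url.startswith("https://"):
        problems.append("upstream_url must use https")
    if not 0 < config.timeout_seconds <= 60:
        problems.append("timeout_seconds must be in (0, 60]")
    if config.max_connections < 1:
        problems.append("max_connections must be at least 1")
    return problems


def promote(config: Config, staging_healthy) -> str:
    """Gate a change: static checks first, then a staging check, then production."""
    problems = validate(config)
    if problems:
        return "rejected: " + "; ".join(problems)
    # staging_healthy is an assumed callback that applies the config to a
    # staging environment and reports whether that environment stayed healthy.
    if not staging_healthy(config):
        return "rolled back: staging health check failed"
    return "promoted to production"


# Example: a change that passes both the static checks and the staging check.
candidate = Config("https://example.invalid/api", 5.0, 100)
print(promote(candidate, staging_healthy=lambda c: True))
```

The point of the design is simply that a change can be stopped at two places before any real user sees it: once by cheap static checks, and once by observing a staging environment.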
Microsoft has yet to respond to further questions about the configuration change.