
Most of Amazon’s cloud services were back online on Friday after overheating at one of its data centres caused an outage that affected businesses including Bitcoin exchange Coinbase.
A single data centre in northern Virginia lost power on Thursday after a sudden surge in temperature. The cloud giant said it was making headway in fixing the problem and that a full recovery would take several hours.
Coinbase, whose availability was hindered by the interruption, announced that its services had been restored.
Overheating in data centres has become a major issue for businesses because the sophisticated AI and cloud servers that process data need a lot of power and produce a lot of heat.
Data centre operators are increasingly using water or specialty coolants to control temperatures, since these are hundreds of times more efficient than conventional air cooling.
Thursday's outage was the second significant overheating-related disruption in recent months. In November, a cooling failure at CyrusOne’s data centres caused the derivatives marketplace CME Group to suffer one of its longest disruptions in years.
Outage reports for AWS on the tracking website Downdetector had dropped to 72 as of 8:12 a.m. ET, down from a peak of almost 600 on Thursday night.
AWS has been bringing more cooling system capacity online, although adding the capacity needed to safely restore all remaining impacted systems was taking longer than expected.
In addition, the cloud computing platform reported that it had redirected traffic for the majority of services away from the affected availability zone. An “Availability Zone” is a group of one or more linked physical data centres within an AWS Region that are intended to function independently.
In short, a severe thermal event in one data centre knocked out power within a single availability zone, prompting engineers to reroute traffic away from it while technicians brought additional cooling online. Although extended safety checks on impacted servers delayed full recovery, user-reported incidents on Downdetector fell from nearly 600 to double digits by Friday morning.
Separately, the trading site of CME Group, the biggest derivatives marketplace in the world, came back online after necessary maintenance, having experienced minor technical difficulties earlier. The company stated that it was unaffected by the AWS disruption.
In October last year, a significant AWS outage caused chaos worldwide, disrupting dozens of websites, including some of the most well-known apps such as Reddit and Snapchat.
That outage was recognized as the biggest internet disruption since the CrowdStrike failure in 2024, which caused technology systems in banks, hospitals, and airports to malfunction, underscoring the vulnerability of the global network of interconnected devices.
The multi-hour infrastructure degradation triggered cascading availability errors across major platforms. Coinbase experienced over five hours of trading delays and high error rates, and AWS native tools such as Redshift, SageMaker, IoT Core, and ElastiCache faced temporary processing impairments. CME Group, which said it was unaffected by the AWS disruption, separately required essential maintenance to bring its CME Direct platform fully back online.