
Cloudflare has finally revealed the root cause of the network outage that disrupted large swathes of the internet on November 18, 2025. According to a detailed post-mortem, the disruption was triggered by a misconfiguration rather than an external cyber-attack, yet its impact ran deep. “Today was Cloudflare’s worst outage since 2019,” the company admitted.
The chain of events began at 11:20 UTC, when Cloudflare’s network started seeing elevated HTTP 5xx error rates: essentially, the system responsible for routing customer traffic inside Cloudflare’s global network could no longer function correctly. The symptoms included failures in content delivery, access authentication, and the Workers KV backend.
The root issue? A change to permissions in a ClickHouse database query gradually caused a key internal file, the “feature configuration file” used by Cloudflare’s Bot Management system, to inflate in size unexpectedly. This file is meant to carry around 60 “features” for every traffic request (indicators such as IP reputation, request patterns, and so on). But because of the error, duplicate rows were generated, more than doubling the size of the file and pushing it past internal limits. When the oversized file propagated across Cloudflare’s proxy fleet, the routing software panicked and began returning 5xx errors.
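Cloudflare has not published the exact code path, but the failure mode it describes, a hard cap on feature entries combined with a loader that treats an over-limit file as a fatal error, can be illustrated with a short Rust sketch. Everything here is an assumption for illustration: the `MAX_FEATURES` constant, the cap of 200, the file format, and the `parse_feature_file` function are hypothetical, not Cloudflare’s actual code.

```rust
// Illustrative sketch only, not Cloudflare's code. It shows how a hard cap
// on feature-file entries, combined with treating "over the limit" as a
// fatal error, can turn a bloated configuration file into a crash.

const MAX_FEATURES: usize = 200; // illustrative cap, not the real internal limit

#[derive(Debug)]
struct Feature {
    name: String,
    value: f64,
}

fn parse_feature_file(contents: &str) -> Result<Vec<Feature>, String> {
    let mut features = Vec::new();
    for line in contents.lines().filter(|l| !l.trim().is_empty()) {
        let mut parts = line.splitn(2, ',');
        let name = parts.next().unwrap_or_default().trim().to_string();
        let value: f64 = parts
            .next()
            .unwrap_or("0")
            .trim()
            .parse()
            .map_err(|e| format!("bad value in line {line:?}: {e}"))?;
        features.push(Feature { name, value });

        // Duplicate rows inflate the file; once the cap is exceeded the
        // loader refuses the file outright.
        if features.len() > MAX_FEATURES {
            return Err(format!("feature count exceeds limit of {MAX_FEATURES}"));
        }
    }
    Ok(features)
}

fn main() {
    // Simulate a file in which duplicated rows pushed the count past the cap:
    // roughly 60 distinct features, each repeated several times.
    let oversized: String = (0..250)
        .map(|i| format!("feature_{},0.5\n", i % 60))
        .collect();

    // If the caller unwraps the result instead of falling back to a
    // known-good file, an oversized data file becomes a panic; in a proxy
    // serving live traffic, that surfaces to clients as 5xx errors.
    let features = parse_feature_file(&oversized).unwrap();
    println!("loaded {} features", features.len());
}
```

The point of the sketch is the final `unwrap()`: when the consumer of the file treats a rejected configuration as unrecoverable rather than falling back to the last good version, bad data becomes a crash instead of a degraded-but-alive service.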
As the flawed file circulated, the network grew unstable: at times parts would recover temporarily, only to fail again when another batch of the bad file arrived. That behaviour confused internal diagnostic teams and initially led them to suspect a large-scale DDoS attack. It was only later that they confirmed the issue stemmed from Cloudflare’s own systems.
Remediation began at 14:24 UTC, when Cloudflare halted the generation of new feature files and rolled out a known-good version. The main impact started to subside by 14:30 UTC, and full restoration of all services was confirmed by 17:06 UTC.
Several downstream services were impacted: Turnstile authentication failed for users attempting to log into the dashboard, Cloudflare Workers KV returned elevated error rates, and the core CDN/proxy traffic experienced increased latency and errors.
In its post-mortem, Cloudflare said it will carry out a number of corrective actions: hardening the ingestion of internal configuration files, enabling global kill-switches for feature propagation, and reviewing failure modes across its core proxy modules. The company emphasised that while it engineered its systems for resilience, this event exposed a gap that must be closed.
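The post-mortem does not describe how those safeguards will be built, but the first two items point at a well-known pattern: validate a candidate configuration before swapping it in, keep serving the last known-good version if validation fails, and gate propagation behind a global kill switch. The Rust sketch below illustrates that pattern under stated assumptions; `ConfigStore`, `propagation_enabled`, and the limit value are illustrative names, not Cloudflare’s API.

```rust
// Illustrative sketch of "validate before swap, keep last known-good,
// honour a global kill switch". Not Cloudflare's implementation.

use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::RwLock;

const MAX_FEATURES: usize = 200; // illustrative cap

struct ConfigStore {
    // Last configuration that passed validation; served until a newer
    // candidate also passes.
    current: RwLock<Vec<String>>,
    // Global kill switch: when false, new files are ignored entirely.
    propagation_enabled: AtomicBool,
}

impl ConfigStore {
    fn new(initial: Vec<String>) -> Self {
        Self {
            current: RwLock::new(initial),
            propagation_enabled: AtomicBool::new(true),
        }
    }

    /// Validate a candidate feature list; reject it rather than panic.
    fn validate(candidate: &[String]) -> Result<(), String> {
        if candidate.is_empty() {
            return Err("empty feature file".into());
        }
        if candidate.len() > MAX_FEATURES {
            return Err(format!(
                "feature count {} exceeds limit {}",
                candidate.len(),
                MAX_FEATURES
            ));
        }
        Ok(())
    }

    /// Try to swap in a new configuration; on any failure keep the old one.
    fn try_update(&self, candidate: Vec<String>) {
        if !self.propagation_enabled.load(Ordering::Relaxed) {
            eprintln!("propagation disabled by kill switch; ignoring new file");
            return;
        }
        match Self::validate(&candidate) {
            Ok(()) => *self.current.write().unwrap() = candidate,
            Err(e) => eprintln!("rejected config, keeping last known-good: {e}"),
        }
    }
}

fn main() {
    let store = ConfigStore::new(vec!["ip_reputation".into(), "request_pattern".into()]);

    // An oversized candidate (duplicated rows) is rejected, not fatal.
    let bloated: Vec<String> = (0..250).map(|i| format!("feature_{}", i % 60)).collect();
    store.try_update(bloated);

    // Flipping the kill switch stops propagation entirely until operators
    // re-enable it.
    store.propagation_enabled.store(false, Ordering::Relaxed);
    store.try_update(vec!["ip_reputation".into()]);

    // Traffic keeps being scored with the previous good configuration.
    println!("serving {} features", store.current.read().unwrap().len());
}
```

Compared with the failure described above, the key difference is that an invalid file is rejected and logged at the ingestion boundary, while traffic continues to be scored against the previous, known-good configuration.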
For users and customers whose sites or apps rely on Cloudflare’s infrastructure, which carries a large share of the internet’s traffic, the lesson is clear: even highly distributed, globally redundant networks can be vulnerable to internal configuration faults. And as infrastructure scales, the consequences of seemingly minor database or file changes can cascade rapidly.
Cloudflare’s apology reads as candid: “We know we let you down today.” With the root cause now made public, the next question is whether the promised safeguards hold up, and whether confidence in the internet’s plumbing remains intact.