
OpenAI has introduced a lighter, faster version of its agentic coding assistant Codex, and for the first time it is leaning on a dedicated chip from Cerebras to power it. The new model, GPT-5.3-Codex-Spark, is described by OpenAI as a smaller variant of its latest Codex release, tuned specifically for low-latency, real-time coding help rather than heavy, long-running tasks.
Spark is currently in a research preview for ChatGPT Pro users inside the Codex app, and OpenAI is positioning it as the “first milestone” in its deepening hardware partnership with Cerebras.
OpenAI launched the latest version of Codex earlier this month. Spark sits alongside that flagship model as a streamlined option aimed at responsiveness. According to OpenAI, GPT-5.3-Codex-Spark is intended as a “daily productivity driver,” geared toward rapid prototyping and real-time collaboration where latency matters more than sheer model size.
In its official statement, OpenAI emphasized that Codex-Spark is designed for “the lowest possible latency on Codex,” and suggested that workflows demanding extremely low response times are a natural fit for Cerebras’ hardware.

CEO Sam Altman hinted at the launch in a post ahead of the announcement, telling Codex Pro users that a “special thing” would arrive later in the day and adding that “it sparks joy for me,” an apparent nod to the model’s name.
Spark is powered by Cerebras’ Wafer Scale Engine 3 (WSE-3), the company’s third-generation wafer-scale chip. The WSE-3 packs 4 trillion transistors, and Spark marks a significant step up in integration between OpenAI’s models and Cerebras’ custom silicon compared with earlier collaborations.
OpenAI and Cerebras announced a multi-year deal worth over $10 billion last month, signaling an expanded role for Cerebras in OpenAI’s compute stack. At the time, OpenAI said that bringing Cerebras into its mix of compute solutions was “all about making our AI respond much faster.” With Spark, the company now has a concrete example of that strategy running on Cerebras hardware.
In its latest comments, OpenAI reiterated that low latency is the key reason for using Cerebras, saying the company’s chips are especially well-suited to “workflows that demand extremely low latency.” Spark is the first Codex model OpenAI has explicitly tied to specific Cerebras silicon.
Cerebras CTO and co-founder Sean Lie framed the launch as the starting point for exploring how much fast inference can change the way developers interact with AI models. He said the company is eager to work with OpenAI and its developer ecosystem to uncover “new interaction patterns, new use cases, and a fundamentally different model experience,” adding that this research preview is “just the beginning.”
Cerebras itself has been building toward this moment for more than a decade. The company has become increasingly prominent in the AI hardware landscape and recently raised $1 billion in new funding at a $23 billion valuation. Cerebras has also previously signaled its intention to pursue an IPO, underscoring its ambitions in the sector.
For now, GPT-5.3-Codex-Spark remains limited to a research preview for Codex users on the ChatGPT Pro plan, where OpenAI and Cerebras will be watching closely to see how developers use a coding assistant built around aggressive latency targets.