
Google’s Threat Intelligence Group says it has, for the first time, found evidence that a criminal group used artificial intelligence to help discover and weaponize a zero-day vulnerability, adding a concrete case study to mounting concerns about AI’s role in cybersecurity.
In a report published Monday, the Google Threat Intelligence Group (GTIG) detailed how it disrupted a planned mass exploitation campaign built around a previously unknown software flaw. The company did not name the threat actor, but described it as a “prominent” cybercrime group.
The incident centres on a zero-day exploit packaged as a Python script. Zero-days are serious security gaps in software or hardware that are unknown to the vendor, leaving developers with “zero days” to fix them before attackers can take advantage.
According to Google’s analysis, the exploit would have allowed attackers to bypass two-factor authentication on a popular, open-source, web-based system administration tool. The specific product was not identified. Even with the bypass, the hackers would still have needed valid user credentials for the attack to succeed, Google noted.
GTIG says it worked with the affected vendor to disclose and address the flaw before the cybercriminals could deploy it at scale. Google believes it successfully prevented the vulnerability from being used in the planned mass campaign.
What makes the case stand out is Google’s assessment of how the exploit was created. The company says it has “high confidence” that an AI model helped the attackers both find and weaponize the vulnerability. That judgment is based on the style and structure of the exploit code rather than direct access to attacker tooling.
In its report, Google points to several hallmarks it associates with output from modern large language models (LLMs):
- An “abundance of educational docstrings” embedded in the script.
- A “hallucinated” CVSS score, a made-up severity rating that resembles formal vulnerability scoring but does not correspond to an established reference.
- A structured, “textbook” Python format characteristic of LLM training data, including detailed help menus.
- Use of a clean “_C” ANSI colour class, which the report describes as aligning with code patterns commonly seen in LLM-generated examples.
Those features collectively led GTIG to conclude that a generative AI system likely assisted the attackers in discovering and packaging the exploit. Google also says it does not believe its own Gemini model was involved, though it does not identify which AI system it suspects was used.
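To make those hallmarks more concrete, the sketch below is a benign, entirely hypothetical Python snippet that imitates the traits GTIG describes: wordy “educational” docstrings, a made-up CVSS string, a textbook command-line help menu, and a small “_C” ANSI colour class. It is not the attackers’ code, which Google has not published; every name and value here is invented for illustration.

```python
"""Illustration only: a benign, hypothetical script mimicking the stylistic
"tells" GTIG describes in the exploit. Nothing here is real attack code."""

import argparse

# A fabricated severity rating of the kind GTIG calls a "hallucinated" CVSS
# score: it copies the format of real scoring but maps to no real advisory.
CVSS_SCORE = "9.8 (CRITICAL) CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"


class _C:
    """ANSI colour codes for highlighting terminal output.

    Model-generated scripts often include a tidy helper class like this,
    complete with an explanatory docstring.
    """

    GREEN = "\033[92m"
    RED = "\033[91m"
    RESET = "\033[0m"


def check_target(host: str, port: int) -> bool:
    """Return True if the supplied host and port look plausible.

    This "educational" docstring style, restating obvious behaviour in full
    sentences, is one of the traits the report associates with LLM output.
    The body below is a harmless placeholder, not a real network check.
    """
    return bool(host) and 0 < port < 65536


def main() -> None:
    """Parse arguments and print a colour-coded status line."""
    # A detailed, textbook-style help menu is another pattern GTIG highlights.
    parser = argparse.ArgumentParser(
        description="Hypothetical scanner skeleton (illustration only).",
        epilog=f"Reported severity: {CVSS_SCORE}",
    )
    parser.add_argument("host", help="Target hostname or IP address")
    parser.add_argument("--port", type=int, default=443, help="Target TCP port")
    args = parser.parse_args()

    if check_target(args.host, args.port):
        print(f"{_C.GREEN}[+] Target parameters look valid{_C.RESET}")
    else:
        print(f"{_C.RED}[-] Invalid target parameters{_C.RESET}")


if __name__ == "__main__":
    main()
```

None of these features is suspicious on its own; GTIG’s point is that they appeared together, in exactly this style, inside a working exploit.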
The finding arrives as advanced AI models designed specifically for security work are themselves under the microscope. Anthropic’s Mythos model, in particular, has drawn attention for its potential to uncover vulnerabilities in code and for the company’s decision to restrict access.
Mythos has been released only to a select set of companies, organisations, and governments through a controlled program aimed at helping them test and strengthen their cybersecurity. That limited rollout has sparked broader policy interest. The report notes that the restricted release caused enough of a stir that the Trump administration considered easing tensions with Anthropic in order to strike agreements with more AI companies, deals that would give the U.S. government a chance to review certain models before they are released widely.
Even so, some security professionals question whether Mythos lives up to the hype. Curl lead developer Daniel Stenberg offered a sceptical view after taking part in Anthropic’s Project Glasswing, an initiative where organisations can submit code for Mythos to analyse for security issues.
Stenberg says his team received a report from Anthropic listing five “confirmed security vulnerabilities” in Curl’s codebase. On closer examination, they concluded that only one of those issues was both legitimate and previously unknown. The rest, by their assessment, did not meet that bar.
In a blog post, Stenberg described the surrounding excitement about Mythos as largely a “successful marketing stunt,” arguing that he has seen no compelling evidence that the system finds security issues at a meaningfully higher level than existing tools. In his view, the outcomes so far do not clearly surpass other vulnerability detection approaches that predate Mythos.
Those mixed results underscore a central tension: the same AI capabilities that can assist developers and defenders in spotting subtle bugs or misconfigurations can also help attackers automate research, explore edge cases in complex systems, and rapidly prototype exploits. Google’s latest finding adds a concrete example of that risk, at least as the company interprets the evidence.
For now, GTIG’s account stops short of naming the targeted software, the attackers, or the AI system involved. But by publicly linking a real-world zero-day case to AI-assisted discovery, Google is signalling that this is no longer a purely theoretical scenario. The debate over how to govern powerful AI models and who gets to probe them before public release is likely to intensify as more such incidents come to light.