
An attacker hides a single instruction in a forwarded email. An OpenClaw agent, doing what it was set up to do, summarizes that email as part of a routine task. Buried in the text is a prompt: send credentials to an external endpoint. The agent follows the instruction using its own OAuth tokens and a sanctioned API call.
From the organisation’s perspective, everything looks normal. The firewall logs a standard HTTP 200 response. Endpoint detection and response (EDR) tools see only an approved process. No data loss prevention (DLP) rule or identity and access management (IAM) policy raises an alarm. By the definitions that existing security stacks rely on, nothing has gone wrong, yet sensitive data has quietly left the building.
This is the core problem highlighted around OpenClaw, an AI agent platform now under intense scrutiny from security researchers and enterprise defenders. According to reporting, six independent security teams produced six separate OpenClaw defense tools in just two weeks. Even so, three distinct attack surfaces remained untouched by all of those efforts.
The OpenClaw situation is not just theoretical. Multiple security firms have already documented how quickly its footprint and its risk profile are expanding inside enterprises and on the public internet.
Token Security, in research cited around OpenClaw, found that 22% of its enterprise customers have employees running OpenClaw without IT approval. That means more than a fifth of these organizations are already dealing with shadow AI agents introduced by staff on their own initiative, outside formal governance or security review.
Bitsight tracked the public exposure side of the equation. In just two weeks, it counted more than 30,000 publicly exposed OpenClaw instances, a surge from roughly 1,000 previously. That spike suggests that deployments are being spun up rapidly, and many are reachable from the internet in ways organizations may not fully understand or intend.
Another layer of risk lies in the ecosystem around OpenClaw. Snyk’s ToxicSkills audit examined ClawHub, the skills marketplace associated with OpenClaw, and reported that 36% of all listed skills contain security flaws. When more than a third of reusable components in an agent ecosystem have weaknesses, the attack surface compounds quickly: vulnerable skills can be chained, misused or combined with prompt-based attacks like the forwarded email scenario.
Inside response efforts and a push for standards
Much of the early pressure to harden OpenClaw has come from researchers working directly with the project. Jamieson O’Reilly, founder of security firm Dvuln and now a security adviser to the OpenClaw initiative, has been one of the most active voices pushing for fixes from the inside.
O’Reilly’s research on credential leakage from exposed OpenClaw instances was among the first broad warnings the community received about how easily secrets could be pulled from misconfigured or publicly reachable deployments. That work helped define how real-world attacks could exploit OpenClaw’s design and operational gaps, rather than just pointing to abstract risks.
Following that, O’Reilly has been working directly with OpenClaw founder Peter Steinberger on defensive measures. One key step has been the introduction of dual-layer malicious skill detection. While specific implementation details were not described in the available material, the goal of such a system is clear: add more than one line of defense for detecting harmful or compromised skills before they are used by agents in production workflows.
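Because the available material does not describe how OpenClaw's detection actually works, the following is only a generic illustration of what layering two independent checks can look like, not the project's implementation. The names `KNOWN_BAD_DIGESTS`, `SUSPICIOUS_PATTERNS` and `scan_skill` are hypothetical, as are the sample digest and patterns.

```python
import hashlib
import re

# Hypothetical layer 1 data: SHA-256 digests of skill sources already known to be bad.
KNOWN_BAD_DIGESTS = {hashlib.sha256(b"known malicious skill body").hexdigest()}

# Hypothetical layer 2 data: patterns hinting at prompt injection or credential access.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"\.aws/credentials|id_rsa|\.env\b"),
]


def scan_skill(source: str) -> list[str]:
    """Run both layers over a skill's source text and return all findings."""
    findings = []

    # Layer 1: exact match against a denylist of known-bad digests.
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    if digest in KNOWN_BAD_DIGESTS:
        findings.append("denylist: known malicious digest")

    # Layer 2: heuristic pattern scan, which can catch variants the denylist misses.
    for pattern in SUSPICIOUS_PATTERNS:
        if pattern.search(source):
            findings.append(f"heuristic: matched {pattern.pattern!r}")

    return findings
```

The point of the second layer is that a trivially modified skill changes its digest and slips past the denylist, while a heuristic scan may still flag it. Neither layer is sufficient alone, and both can be evaded by a determined attacker.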
Beyond immediate defenses, O’Reilly is also driving a capabilities specification proposal through the agentskills standards body, via an open discussion process on GitHub. The effort aims to formalize how agent capabilities are described and constrained. A robust capability specification could, in principle, give organizations clearer visibility into what an AI agent is allowed to do, and help security teams reason about and potentially limit sensitive operations such as making external network calls or accessing credential stores.
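To make the idea concrete, here is a minimal sketch of how a declared capability set could gate an agent's outbound calls. It is not the format under discussion at agentskills; `CapabilityManifest`, its fields, `is_call_allowed` and the example hostnames are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass(frozen=True)
class CapabilityManifest:
    """Hypothetical declaration of what one agent skill is allowed to do."""
    allowed_hosts: frozenset = field(default_factory=frozenset)
    may_read_credentials: bool = False  # deny by default


def is_call_allowed(manifest: CapabilityManifest, url: str) -> bool:
    """Return True only if the URL's host was explicitly declared."""
    host = (urlparse(url).hostname or "").lower()
    return host in manifest.allowed_hosts


# An email summarizer that only ever needs to reach the mail server.
summarizer = CapabilityManifest(allowed_hosts=frozenset({"mail.example.com"}))
```

Under this sketch, `is_call_allowed(summarizer, "https://attacker.example.net/collect")` returns `False`, so an instruction hidden in an email could not make the summarizer post to an undeclared endpoint, even though the agent's OAuth token would otherwise permit the request.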
According to comments O’Reilly shared with VentureBeat, the OpenClaw team is not in denial about the situation. The project, he indicated, was not designed from the ground up with maximum security as a primary objective. That acknowledgment matters: it frames the current phase as a retrofit and hardening effort around a system that has already spread quickly into real-world environments, rather than a greenfield secure-by-design platform.
For enterprises, the picture that emerges from these findings is a layered challenge:
- Prompt-level attacks, such as hidden instructions embedded in ordinary-looking content, can lead agents to exfiltrate data through entirely sanctioned channels.
- Shadow deployments, where employees run OpenClaw without IT or security oversight, are already present in a significant share of enterprises observed by Token Security.
- Publicly reachable instances, as tracked by Bitsight, have multiplied in a matter of weeks, expanding the number of targets that attackers can probe directly.
- A sizable fraction of ClawHub skills, per Snyk’s ToxicSkills audit, suffer from security issues, raising the risk of vulnerable or malicious building blocks being pulled into production-grade automations.
Crucially, traditional tools like EDR, DLP and IAM can all register OpenClaw activity as normal, because the agent is using approved identities and APIs exactly as configured. That makes detection less about obvious rule-breaking and more about understanding when an agent’s behaviour deviates from business intent, a much harder problem for current stacks that were not built around AI-native threats.
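One simplified way to picture an intent-based check is to compare what an agent actually did against what its assigned task should require. The sketch below assumes an audit log of action names is available; the task name, the action names and the `EXPECTED_ACTIONS` table are hypothetical and do not reflect any OpenClaw API.

```python
# Hypothetical mapping from a business task to the actions it should need.
EXPECTED_ACTIONS = {
    "summarize_email": {"read_email", "write_summary"},
}


def unexpected_actions(task: str, observed: list[str]) -> list[str]:
    """Return observed actions that the task's declared intent does not cover.

    An unknown task has no expected actions, so everything it did is flagged.
    """
    expected = EXPECTED_ACTIONS.get(task, set())
    return [action for action in observed if action not in expected]


# The forwarded-email scenario: an outbound POST appears mid-task.
flagged = unexpected_actions(
    "summarize_email", ["read_email", "http_post", "write_summary"]
)
```

Here `flagged` is `["http_post"]`: the request itself is permitted by the agent's credentials, but it has no place in a summarization task. Real deployments would need far richer context than a static table, which is part of why this problem is hard.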
As OpenClaw’s maintainers and the broader security community continue to ship defenses and standards, the case underscores a broader shift: once AI agents are granted powerful credentials and API access, their misbehaviour may look indistinguishable from legitimate work until it is too late. The gap between what the system is allowed to do and what it should do is where attackers are already learning to operate.







