Two independent cybersecurity studies published this week have uncovered serious security weaknesses in OpenClaw, a widely used self-hosted AI agent platform. The findings demonstrate how attackers can manipulate AI agents into executing malicious code or leaking sensitive information through seemingly harmless inputs.
Researchers from Imperva and Varonis approached the issue from different angles but reached a similar conclusion: AI agents that trust incoming data and possess broad system access can become powerful attack vectors when exploited.
Imperva researchers discovered that OpenClaw could be tricked into processing concealed instructions embedded within shared contacts, vCards, and location pins. These malicious commands were executed by the AI agent without any visible indication to the user.
The issue stemmed from how OpenClaw handled certain message objects before passing them to the large language model (LLM). While content fetched from the web was clearly marked as untrusted, information contained within contacts, vCards, and location labels was inserted directly into prompts without any trust boundary.
According to Imperva researcher Yohann Sillam, this allowed attackers to hide instructions inside fields such as contact names. Since angle brackets are permitted in contact names, the model could not reliably distinguish legitimate information from injected commands.
Only selected fields were transmitted to the model, making them attractive targets. In one example, a shared contact was serialized as <contact: name, number>, allowing attackers to insert malicious instructions within the name field itself. Because messaging apps truncate long contact names, victims often never saw the hidden payload.
The same attack method was also successful through WhatsApp-supported vCards and shared location labels.
During testing against Gemini 3.1 Pro's preview build, hidden instructions successfully convinced the AI agent to download and execute a script hosted on servers controlled by the researchers. Similar attempts using images with embedded instructions failed, likely because AI models have become more resistant to that well-known attack technique.
Imperva warned that OpenClaw's default memory functionality could amplify the threat. A single malicious piece of widely shared content could potentially affect multiple agents if adequate sandboxing protections were absent.
Following responsible disclosure, OpenClaw addressed the issue in version 2026.4.23. The update separates contact names, vCard information, and location labels from the main prompt and places them in an isolated untrusted metadata channel.
Researchers also noted that similar design patterns exist in several other personal AI assistant platforms, suggesting the issue extends beyond OpenClaw alone.
While Imperva focused on prompt injection, Varonis Threat Labs explored how AI agents respond to social engineering attacks.
Led by researcher Itay Yashar, the Varonis team created an OpenClaw-based agent called Pinchy and connected it to a Gmail inbox filled with realistic business communications and synthetic sensitive information. The researchers then tested the agent using four different phishing scenarios involving Google Gemini 3.1 Pro and OpenAI Codex GPT-5.4.
Varonis distinguishes traditional prompt injection from what it calls "agent phishing." Unlike hidden instructions embedded in content, agent phishing relies on convincing requests delivered through normal communication channels, exploiting the agent's willingness to act before verifying legitimacy.
The tests revealed significant weaknesses.
In one scenario, an email impersonating a team leader named Dan requested urgent staging access during a simulated production emergency. The message originated from an external Gmail account, yet the agent located and forwarded mock AWS IAM access keys, database connection credentials, and SSH details in plain text.
A second phishing attempt used a more routine business request, asking for a weekly customer export supposedly needed for a QBR presentation. The agent responded by sending a synthetic database containing information on 247 enterprise customers, including contact details and contract values.
Notably, these failures occurred despite the agent being configured with instructions to verify sender identities before responding. Researchers observed that urgency successfully bypassed safeguards in one case, while routine business language defeated them in another.
The agent demonstrated stronger performance against technically oriented threats. It interacted with a phishing page designed to steal gift-card credentials but ultimately withheld sensitive information and flagged suspicious behavior. A stricter configuration blocked the page entirely.
Similarly, when presented with a malicious OAuth consent screen disguised as a timesheet application, the agent examined the redirect destination, recognized warning signs, and refused access.
Researchers concluded that AI agents may outperform many users when identifying suspicious URLs and fraudulent login portals. However, they remain vulnerable to social manipulation that exploits helpfulness and trust.
Varonis also observed that OpenAI Codex GPT-5.4 behaved more cautiously than Gemini 3.1 Pro when interacting with external websites or transmitting data. Nevertheless, both models ultimately fell victim to the social-engineering scenarios.
Varonis linked both attack methods to what researcher Simon Willison describes as the "lethal trifecta": an AI system capable of accessing private data, consuming untrusted content, and transmitting information externally.
OpenClaw satisfies all three conditions, making both hidden prompt injections and phishing-based attacks highly effective.
Additional concerns emerged from a separate InfoSec Write-ups analysis. Researchers converted historical OpenClaw security advisories into static-analysis rules and uncovered five additional vulnerabilities affecting integrations with Slack, Discord, Matrix, Zalo, and Microsoft Teams.
Each flaw originated from the same design issue. Channel allowlists were validated using mutable display names rather than permanent identifiers. Attackers could therefore impersonate trusted users simply by changing their display names to match approved accounts.
OpenClaw has since patched these vulnerabilities.
The platform's extensive permissions—including access to files, shell environments, and more than twenty messaging services—have previously prompted warnings regarding prompt injection and data exfiltration risks.
The strongest criticism came from the Dutch data protection authority, the Autoriteit Persoonsgegevens, which advised users and organizations against deploying OpenClaw on systems containing sensitive information due to concerns over data breaches and account compromise.
Organizations using OpenClaw are advised to upgrade immediately to version 2026.4.23 or newer to mitigate the message-object vulnerability identified by Imperva.
However, researchers stress that software updates alone cannot solve the broader trust problem inherent in autonomous AI systems.
Varonis recommends four key safeguards:
Treat agent instruction files as strict, version-controlled policies rather than informal guidance.
Require approval before agents send messages to unfamiliar recipients, reducing the risk of automated phishing or data leakage.
Restrict access to connected systems based on the trustworthiness of the triggering source.
Require human review for high-risk actions such as credential sharing, financial transactions, or sensitive data transfers.
Both research teams ultimately advocate the same mindset. Varonis recommends treating AI agents as inexperienced employees with extensive system access but limited judgment, while Imperva describes them as authenticated executors that inherently trust incoming information.
Although vendors continue to introduce patches and protective controls, the fundamental challenge remains unresolved. AI agents derive their usefulness from acting on instructions, processing inputs, and helping users accomplish tasks. Those same characteristics also create opportunities for attackers, and the industry has yet to develop a universal solution.
Last month, cybersecurity researchers announced that they had successfully located and analyzed the project. Their investigation uncovered malware dating back to 2005 that was reportedly designed to manipulate software believed to be used by Iranian nuclear scientists, demonstrating that the Shadow Brokers leak continues to reveal new chapters in cyber espionage history.
Cisco has disclosed a high-severity vulnerability affecting its network management platforms, Cisco Crosswork Network Controller and Cisco Network Services Orchestrator, which could allow remote attackers to crash vulnerable systems by exhausting their available connection resources.
The security issue, tracked as CVE-2026-20188, carries a CVSS score of 7.5. According to Cisco, the flaw can be exploited remotely without authentication, meaning an attacker does not need valid credentials or prior access to interfere with affected servers.
At the center of the problem is how the platforms manage incoming network connections. Cisco explained that the affected software does not properly control or restrict the rate of connection requests sent to the server. Because of this weakness, a malicious actor can continuously bombard the system with repeated requests until all available connection resources are consumed.
Once the systems run out of resources, both Cisco CNC and NSO can stop responding entirely. Administrators may lose access to management interfaces, while network operations that depend on these platforms can experience abrupt disruption.
Unlike temporary service slowdowns, the systems do not automatically recover after the overload occurs. Cisco stated that administrators must manually reboot the affected platforms to clear the exhausted resources and restore normal operations.
The company internally tracks the issue under Bug ID CSCwr08237. Cisco said the flaw originates from the connection-handling mechanisms used within both products.
Denial-of-service vulnerabilities of this kind are often disruptive because they target system availability rather than data theft. In enterprise environments, orchestration and network control platforms are responsible for coordinating automated processes, monitoring infrastructure, and managing service delivery across large networks. If these systems become unreachable, organizations can temporarily lose visibility into network operations and automated workflows.
Cisco is urging organizations using these products to immediately review their software versions and determine whether their environments are exposed.
For Cisco Crosswork Network Controller, the vulnerability affects version 7.1 and all earlier releases. Cisco confirmed that version 7.2 is not impacted, making upgrades necessary for organizations still operating older deployments.
The issue also affects several release branches of Cisco Network Services Orchestrator. Systems running version 6.3 or earlier remain vulnerable and require immediate updates. Cisco further confirmed that the flaw exists within the 6.4 release branch, although the issue was corrected beginning with version 6.4.1.3. Organizations operating NSO version 6.5 or later are not affected.
Cisco discovered the vulnerability internally while handling a routine Technical Assistance Center support case. At this time, the company’s Product Security Incident Response Team said it has not observed public proof-of-concept exploit code or evidence showing active attacks targeting the flaw.
Even so, the company warned that customers cannot rely on temporary mitigations to reduce exposure. Cisco stated there are currently no workarounds capable of preventing the resource exhaustion issue without affecting legitimate system functionality. Because of this, upgrading to patched software releases remains the only available method for fully securing vulnerable environments.
Security professionals have increasingly warned that resource exhaustion attacks continue to pose operational risks for enterprises because they can interrupt business-critical infrastructure without requiring sophisticated intrusion techniques. Attackers often exploit weaknesses in traffic handling, connection management, or request validation to overwhelm services and force outages.
Cisco is advising affected customers to schedule maintenance windows and deploy the recommended updates as quickly as possible to reduce the risk of service interruptions and administrative lockouts.