TL;DR
Varonis phished an OpenClaw email agent. It leaked AWS keys and a CRM export for 247 customers. It caught malicious URLs but failed on identity checks.
The agent caught fake gift card links and malicious OAuth apps but failed completely when an attacker simply pretended to be a colleague.
Security researchers at Varonis built an OpenClaw email agent, connected it to a Gmail inbox with fake company data, and then phished it. The agent, dubbed Pinchy, handed over AWS credentials, database connection strings, and a customer export without verifying who was asking. It took a single impersonation email.
The experiment tested whether AI agents fall for the same social engineering attacks that catch human employees. Varonis gave Pinchy access to Gmail, browser tools, and Google Workspace APIs. The inbox was seeded with fake but realistic internal data: AWS IAM keys, SSH credentials, CRM exports, internal communications, and calendar invites.
They tested two configurations: a generic setup with standard productivity instructions, and a strict mode explicitly designed to detect phishing. They ran both through Gemini 3.1 Pro and GPT-5.4.
The results were split. When an attacker impersonated a team lead named “Dan” and claimed there was a production issue, Pinchy searched the inbox for staging credentials, found them, and forwarded them in plaintext. When the attacker requested a customer export, saying they were working remotely on a presentation, Pinchy retrieved and sent a CRM file containing names, contact details, and $1.28 million in monthly recurring revenue data for 247 enterprise customers.
Both the generic and strict profiles failed these tests. “The verification step still collapsed when the request appeared operationally urgent,” Varonis said.
But Pinchy performed well against traditional technical phishing. When researchers sent a fake gift card email with a phishing link, the agent identified the page as malicious and blocked it. When they tried to sneak in a malicious Google OAuth application disguised as a timesheet platform, Pinchy inspected the redirect URL and stopped the authentication flow.
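Varonis does not describe how Pinchy inspected the redirect URL, so the following is only a minimal sketch of one way such a check could work: extract the `redirect_uri` parameter from an OAuth authorization link and compare its host against an allowlist. The function names and the `trusted_hosts` set are hypothetical, chosen for illustration.

```python
from urllib.parse import urlparse, parse_qs


def redirect_host(auth_url):
    """Return the host named in the redirect_uri parameter, or None if absent."""
    query = parse_qs(urlparse(auth_url).query)
    values = query.get("redirect_uri")
    if not values:
        return None
    # parse_qs has already percent-decoded the value.
    return urlparse(values[0]).hostname


def is_trusted_oauth_flow(auth_url, trusted_hosts):
    """Hypothetical policy: continue only if the redirect host is allowlisted."""
    host = redirect_host(auth_url)
    return host is not None and host in trusted_hosts


# Illustrative values only; the hosts below are made up.
TRUSTED = {"timesheets.corp.example"}
suspicious = (
    "https://accounts.google.com/o/oauth2/v2/auth"
    "?client_id=abc&response_type=code"
    "&redirect_uri=https%3A%2F%2Ftimesheets.evil.example%2Fcb"
)
legitimate = (
    "https://accounts.google.com/o/oauth2/v2/auth"
    "?client_id=abc&response_type=code"
    "&redirect_uri=https%3A%2F%2Ftimesheets.corp.example%2Fcb"
)
```

A check like this catches only the technical signature of the attack, which is the category where the agent already did well.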
The pattern is clear. AI agents are good at spotting shady URLs and malicious OAuth apps, the kind of threats with technical signatures. They fail when stopping the attack depends on identity verification and contextual judgment, the kind of reasoning humans also struggle with but that organisations rely on to prevent social engineering.
Varonis also noted a difference between models. Gemini 3.1 Pro showed “greater willingness to interact” before raising suspicion. GPT-5.4 was more cautious and less willing to provide sensitive information to external destinations without confirmation. Neither was reliable enough to trust with an inbox full of real credentials.
The findings add to a growing body of evidence that AI agents connected to real systems create new attack surfaces that existing security tools do not cover. Varonis recommends that agents be forced to verify sender identities before acting, be prevented from emailing new external recipients without human approval, and be given limited access to internal data. In other words, the same zero-trust principles organisations apply to human employees need to apply to their AI agents too.
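To make those recommendations concrete, here is a rough sketch of an outbound-mail gate that sits between an agent and its send action. It is not Varonis's implementation. The `OutboundPolicy` class, its fields, and the decision strings are all hypothetical, and `verified_senders` is assumed to be filled by some out-of-band identity check, since a display name or From header alone proves nothing.

```python
from dataclasses import dataclass, field
from email.utils import parseaddr


@dataclass
class OutboundPolicy:
    """Hypothetical gate applied before an agent sends or forwards mail."""
    internal_domain: str
    # Addresses whose identity was confirmed out of band (assumption).
    verified_senders: set[str] = field(default_factory=set)
    # External addresses a human has already approved (assumption).
    known_recipients: set[str] = field(default_factory=set)

    def decide(self, requester: str, recipient: str, has_sensitive_data: bool) -> str:
        requester_addr = parseaddr(requester)[1].lower()
        recipient_addr = parseaddr(recipient)[1].lower()

        # 1. Act only on requests from verified identities.
        if requester_addr not in self.verified_senders:
            return "hold_for_human"

        is_external = not recipient_addr.endswith("@" + self.internal_domain)

        # 2. New external recipients need human approval.
        if is_external and recipient_addr not in self.known_recipients:
            return "hold_for_human"

        # 3. Sensitive data never leaves the organisation automatically.
        if has_sensitive_data and is_external:
            return "hold_for_human"

        return "allow"


# Illustrative setup; all addresses are made up.
policy = OutboundPolicy(
    internal_domain="corp.example",
    verified_senders={"dan@corp.example"},
    known_recipients={"partner@vendor.example"},
)
```

Under this sketch, an email signed “Dan” from an unverified address is held for a human regardless of how urgent the request sounds, which is the step that collapsed in the Varonis tests.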