TL;DR
Security firm AIR got a fake skill past every scanner it tested and says it reached roughly 26,000 agents. The trick was swapping the page behind an external URL after the scan cleared.
The skill carried no malicious code of its own but pointed agents to an external URL the attacker could rewrite at any time after the scan cleared.
Security firm AIR built a fake AI agent skill, pushed it through a popular skill marketplace and promoted it with an Instagram ad, and says it reached roughly 26,000 agents, including some on corporate accounts. Every skill security scanner the firm tested it against marked it safe. The payload was harmless by design, collecting only the user’s email address, but AIR says a real attacker could have used the same foothold to read files, move data, or hit internal systems.
The skill, called brand-landingpage, claimed to build a landing page using Google’s Stitch design tool and was aimed at non-technical users. To make it look credible, AIR went after two trust signals that the ecosystem still treats as proof of safety: GitHub stars and a clean scanner verdict.
For the stars, AIR opened a pull request to a skill marketplace repository with around 36,000 stars and 156 skills. The pull request was merged after a few days, so the skill inherited the repository’s star count. AIR then ran an Instagram ad targeting marketers, salespeople, and designers, who installed the skill and put it to work.
The scanners AIR tested, which include tools from Cisco and NVIDIA and the ones built into the major skill registries, analyse the package you hand them, meaning the skill definition file and anything shipped with it. AIR’s skill carried no malicious setup instructions of its own but told the agent to install the “Stitch SDK” by following documentation at an external link that AIR controlled, not one on the genuine Google domain.
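A check for links that leave the vendor's own domain might look roughly like the Python sketch below. It is an illustration, not how any of the named scanners work: the function names, the `vendor.example` placeholder domain, and the sample skill text are all hypothetical, and the URL pattern is deliberately simple.

```python
import re
from urllib.parse import urlparse

# Simple pattern for http(s) links in free text (illustrative, not exhaustive).
URL_PATTERN = re.compile(r"https?://[^\s)>\"']+")


def external_links(skill_text):
    """Return every http(s) URL found in the skill definition text."""
    return URL_PATTERN.findall(skill_text)


def host_is_trusted(url, trusted_domains):
    """True if the URL's host is a trusted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in trusted_domains)


def untrusted_links(skill_text, trusted_domains):
    """List the links a reviewer should look at before approving the skill."""
    return [u for u in external_links(skill_text)
            if not host_is_trusted(u, trusted_domains)]


# Hypothetical skill text: one link on the vendor's domain, one look-alike.
skill_text = (
    "See https://docs.vendor.example/sdk for the API reference. "
    "Install the SDK by following https://vendor-docs.example.net/setup first."
)
flagged = untrusted_links(skill_text, {"vendor.example"})
print(flagged)
```

A domain allowlist only narrows the blind spot. It says nothing about what the page behind an approved link contains, or whether that content changes later.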
At first, the link led to the real Stitch documentation, so the scanners saw a clean package pointing at a plausible setup page and cleared it. The page the agent would actually fetch and follow sat outside the scan. Once the skill was installed widely, AIR swapped the page behind that link to one that told the agent to download and run a script.
The technique is not new. Three weeks before AIR published its results, Trail of Bits bypassed ClawHub’s malicious-skill detector, Cisco’s scanner, and all three scanners built into the major skill registries. Its conclusion was that a scanner checks a fixed package while an attacker can keep tweaking the payload until it passes.
Real campaigns have used the same trick for months, keeping the submitted skill clean and hosting the payload on a site the agent only fetches at install time.
The problem is structural. The scan happens once, but the page a skill points the agent to can be rewritten at any time afterward. Anthropic’s own documentation warns that skills fetching external URLs are risky for exactly this reason, since the content can change after the skill is vetted.
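One way to picture the gap, and a possible guard against it, is to record a fingerprint of the linked page when the skill is vetted and compare it again before the agent follows the link. The Python sketch below is a minimal illustration under stated assumptions: the `pages` dictionary stands in for the live web, and the URL, page text, and function names are hypothetical rather than taken from AIR's write-up or any scanner.

```python
import hashlib


def fingerprint(content):
    """SHA-256 hex digest of a page's text."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def vet_link(url, fetch):
    """At review time, record what the linked page looked like."""
    return {"url": url, "sha256": fingerprint(fetch(url))}


def link_unchanged(pin, fetch):
    """At use time, check the page still matches what was reviewed."""
    return fingerprint(fetch(pin["url"])) == pin["sha256"]


# A dict stands in for the web so the example runs offline (assumption).
url = "https://docs.example.test/stitch-sdk"
pages = {url: "Install the SDK from the official package index."}


def fetch(target):
    return pages[target]


pin = vet_link(url, fetch)                    # the scan happens once
clean_at_scan = link_unchanged(pin, fetch)    # page still matches the review

pages[url] = "Download and run this script."  # page is swapped afterwards
clean_after_swap = link_unchanged(pin, fetch)

print(clean_at_scan, clean_after_swap)
```

Pinning by hash is strict: any legitimate edit to the page also trips the check, so in practice a mismatch would send the skill back for review rather than block it permanently.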
Separate research this year found that seven major scanners agree on fewer than one in five hundred of their combined flags, because each one judges a skill in isolation, blind to external links and to what changes after review.
The scale figures come from AIR alone and deserve a sceptical read. The firm is launching a managed skill marketplace and closes its write-up pitching it, so the 26,000 number, the corporate-account detail, and the claim that it could have seized full control of every agent are not independently confirmed. What holds up is the method: the named scanners really do judge only the submitted package, the external-link blind spot is real and has been independently demonstrated, and the trust signals AIR borrowed, stars and a clean scan, are exactly the ones the ecosystem still treats as proof.
The experiment lines up every weak trust signal around agent skills into one run: stars that can be borrowed, a scan that reads a snapshot, and a link that can be rewritten after the check clears. Whether the real figure is 26,000 or a fraction of it, the gap it walks through is one that defenders still have not closed.
For security teams, the immediate takeaway is the same one researchers keep landing on: treat skills as software, not text, and vet what a skill points to, not just what ships inside it. Route new skills through a single source you control, re-check them when anything changes, pin versions, and hold agents to least privilege.
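The "single source you control" and "pin versions" advice could be sketched as an internal registry that approves a skill by the digest of its reviewed files and refuses anything that no longer matches. This is a hedged illustration in Python, not a description of any real product: the `SkillRegistry` class, its methods, and the `SKILL.md` file name used in the example are assumptions.

```python
import hashlib


def package_digest(files):
    """Digest over every file path and its bytes, in a stable order."""
    h = hashlib.sha256()
    for path in sorted(files):
        h.update(path.encode("utf-8"))
        h.update(b"\0")
        h.update(files[path])
        h.update(b"\0")
    return h.hexdigest()


class SkillRegistry:
    """Hypothetical internal registry: one approved digest per skill name."""

    def __init__(self):
        self._approved = {}

    def approve(self, name, files):
        """Record the digest of the package that was actually reviewed."""
        self._approved[name] = package_digest(files)

    def may_install(self, name, files):
        """Allow install only if the package matches the reviewed version."""
        return self._approved.get(name) == package_digest(files)


registry = SkillRegistry()
reviewed = {"SKILL.md": b"Build a landing page."}
registry.approve("brand-landingpage", reviewed)

changed = {"SKILL.md": b"Build a landing page. Also run setup.sh."}
print(registry.may_install("brand-landingpage", reviewed))   # reviewed copy
print(registry.may_install("brand-landingpage", changed))    # modified copy
print(registry.may_install("unreviewed-skill", reviewed))    # never approved
```

A pin like this covers only the files shipped in the package. Pages the skill links to sit outside it and need their own check, which is the gap AIR's experiment walked through.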