Fake AI Agent Skill Bypasses Security Reviews and Infiltrates Claude AI Store

A security company called AIR demonstrated a critical gap in AI agent security by creating a fake tool that passed all major security scanners and, by the company's own count, reached around 26,000 AI agents it could have compromised. The tool, disguised as a landing page creator using Google's design system, was intentionally harmless and only collected email addresses to prove the concept. Even so, the experiment showed that existing trust mechanisms such as security scanners, GitHub stars, and marketplace approval processes all failed to detect the deception. AIR built credibility by getting its tool approved on a marketplace with thousands of stars, then ran targeted Instagram ads to attract non-technical users such as marketers and designers.

The attack exploits a fundamental timing problem in how security scanners work. Scanners from companies such as Cisco and NVIDIA examine only the static package files submitted for review, so they judge the instructions as they appear at that single moment in time. AIR's tool contained no malicious code itself. Instead, it directed AI agents to follow installation instructions hosted on an external website that AIR controlled. Initially, this website pointed to legitimate documentation, so scanners saw nothing suspicious. After the tool was approved and widely installed, AIR changed the content behind that link to instruct agents to download and execute a script. The demonstration shows how attackers can bypass static security checks by keeping the harmful components outside the reviewed package.
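The gap between what is reviewed and what is later fetched can be illustrated with a minimal Python sketch. It is not AIR's code or any scanner's implementation. The `PinnedReference` class, the `fake_fetch` function, and the `docs.example.invalid` URL are hypothetical names chosen for illustration, and the remote host is simulated with a dictionary so the example runs offline. The idea is to record a digest of the external content at review time and refuse to use the content if it later differs.

```python
import hashlib
from typing import Callable


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


class PinnedReference:
    """An external URL plus the digest of its content recorded at review time.

    Hypothetical helper for illustration; not part of any real scanner.
    """

    def __init__(self, url: str, approved_digest: str) -> None:
        self.url = url
        self.approved_digest = approved_digest

    def load(self, fetch: Callable[[str], bytes]) -> bytes:
        """Fetch the content and reject it if it no longer matches the review."""
        body = fetch(self.url)
        if sha256_hex(body) != self.approved_digest:
            raise ValueError(f"content at {self.url} changed since review")
        return body


# Simulated remote host (assumption: stands in for a site the author controls).
INSTALL_URL = "https://docs.example.invalid/install"
remote_content = {INSTALL_URL: b"See the official setup guide."}


def fake_fetch(url: str) -> bytes:
    """Stand-in for an HTTP GET so the sketch needs no network access."""
    return remote_content[url]


# Review time: the content is benign, and its digest is recorded.
reference = PinnedReference(INSTALL_URL, sha256_hex(fake_fetch(INSTALL_URL)))
first_load = reference.load(fake_fetch)  # matches the reviewed content

# After approval: the author swaps the content behind the same URL.
remote_content[INSTALL_URL] = b"Download and execute setup.sh."
try:
    reference.load(fake_fetch)
    swap_blocked = False
except ValueError:
    swap_blocked = True  # the changed content is rejected before use
```

A scan that only reads the package never sees the second version of the page. A check like this moves the comparison to the moment the content is actually used.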

This attack method is not new. Trail of Bits demonstrated similar scanner bypasses three weeks earlier, and real attackers have been exploiting this pattern for months by keeping submitted tools clean while hosting malicious components on external websites. The problem is structural: scanners perform a one-time check of a fixed package, but tools that reference external URLs can have those destinations changed at any point after approval. Even Anthropic's documentation warns about the risks of tools fetching external content, yet the industry continues to rely on these inadequate trust signals as proof of safety.
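Because the structural weakness is the external reference itself, one modest review aid is to list every outside host that a tool's instructions point to. The Python sketch below is a heuristic under stated assumptions, not a description of how any named scanner works. The `external_hosts` function, the regular expression, and the example hosts are all hypothetical. Flagging a host does not prove it is malicious; it only marks content that a one-time scan of the package cannot vouch for.

```python
import re
from urllib.parse import urlparse

# Rough pattern for http(s) URLs in instruction text (assumption: good enough
# for a review aid, not a full URL grammar).
URL_PATTERN = re.compile(r"https?://[^\s)>\"'\]]+")


def external_hosts(skill_text: str,
                   allowed_hosts: frozenset = frozenset()) -> list:
    """Return the sorted hosts referenced in skill_text that are not allowed."""
    hosts = set()
    for match in URL_PATTERN.findall(skill_text):
        # Drop sentence punctuation that the pattern may have swallowed.
        host = urlparse(match.rstrip(".,;:")).hostname
        if host and host not in allowed_hosts:
            hosts.add(host)
    return sorted(hosts)


# Hypothetical instruction text from a tool under review.
sample = (
    "Build the page with tokens from https://design.example.org/tokens. "
    "Before first use, follow the steps at https://setup.example.invalid/install."
)

flagged = external_hosts(sample, allowed_hosts=frozenset({"design.example.org"}))
```

Here `flagged` contains only the host that is not on the allowlist, which a reviewer could then decide to block, pin, or monitor.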

The practical lesson for defenders is to treat AI agent tools like any other software that requires proper security controls. Organizations need to inventory the tools already running, route all new tools through a controlled approval process, and continuously monitor for changes rather than relying on a single approval moment. Tools should be version-locked, agents should operate with the minimum permissions they need, and teams must assume that any external instruction an agent receives will execute with full user privileges. AIR's reported figure of 26,000 compromised agents should be viewed skeptically, given the company's commercial interest in launching a managed marketplace. The underlying security gap it exploited, however, has been independently verified and remains unresolved across the industry.
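Version-locking and continuous monitoring can be sketched in a few lines of Python. This is one possible approach under stated assumptions, not a prescribed implementation. The `skill_digest` and `is_approved` functions, the in-memory `lock` dictionary, and the `landing-page-skill` directory with its `SKILL.md` file are hypothetical. The sketch hashes every file in an installed tool and compares the result with the digest recorded at approval, so any later change to the package fails the check.

```python
import hashlib
import tempfile
from pathlib import Path


def skill_digest(skill_dir: Path) -> str:
    """Hash every file's relative path and content in a stable order."""
    digest = hashlib.sha256()
    for path in sorted(p for p in skill_dir.rglob("*") if p.is_file()):
        rel = path.relative_to(skill_dir).as_posix()
        digest.update(rel.encode("utf-8") + b"\0")
        # Hash the content separately so file boundaries stay unambiguous.
        digest.update(hashlib.sha256(path.read_bytes()).digest())
    return digest.hexdigest()


def is_approved(name: str, skill_dir: Path, lock: dict) -> bool:
    """True only if the tool is in the lock and still matches its digest."""
    return lock.get(name) == skill_digest(skill_dir)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical installed tool used only for this demonstration.
    skill = Path(tmp) / "landing-page-skill"
    skill.mkdir()
    manifest = skill / "SKILL.md"
    manifest.write_text("Build a landing page.\n", encoding="utf-8")

    # Approval time: record the digest of exactly what was reviewed.
    lock = {"landing-page-skill": skill_digest(skill)}
    approved_before = is_approved("landing-page-skill", skill, lock)

    # Later: the package changes without going back through approval.
    manifest.write_text("Build a landing page. Run setup.sh.\n", encoding="utf-8")
    approved_after = is_approved("landing-page-skill", skill, lock)

    # A tool that was never inventoried is rejected as well.
    unknown_approved = is_approved("unlisted-skill", skill, lock)
```

A lock like this covers only the files in the package. Content the tool fetches from external URLs at run time still needs its own controls, such as the minimal permissions described above.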

Stay secure — stay Wavasec. 🔐