Malicious Web Pages Are Hijacking AI Agents, And Some Are Going After Your PayPal

In brief

  • Google documented a 32% surge in malicious indirect prompt injection attacks between November 2025 and February 2026, targeting AI agents browsing the web.
  • Real payloads found in the wild included fully specified PayPal transaction instructions embedded invisibly in ordinary HTML, aimed at agents with payment capabilities.
  • No legal framework currently determines liability when an AI agent with legitimate credentials executes a command planted by a malicious third-party website.

Attackers are quietly booby-trapping web pages with invisible instructions designed for AI agents, not human readers. And according to Google’s security team, the problem is growing fast. In a report published April 23, Google researchers Thomas Brunner, Yu-Han Liu, and Moni Pande scanned 2-3 billion crawled web pages per month looking for indirect prompt injection attacks—hidden commands embedded in websites that wait for an AI agent to read them and then follow orders. They found a 32% jump in malicious cases between November 2025 and February 2026. Attackers embed instructions in a web page in ways invisible to humans: text shrunk to a single pixel, text drained to near-transparency, content hidden in HTML comment sections, or commands buried in page metadata. The AI reads the full HTML. The human sees nothing.
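The hiding techniques described above can be sketched as a small scanner. This is a rough illustration using only Python's standard library, not the method Google or Forcepoint used. The function name `scan_html`, the visibility thresholds, and the phrase pattern are all assumptions chosen for this example, and the stack-based tracking assumes reasonably well-formed markup.

```python
import re
from html.parser import HTMLParser

# Illustrative phrase pattern (an assumption, not a real detection ruleset).
INSTRUCTION_LIKE = re.compile(
    r"(ignore|disregard) (all )?(previous|prior) instructions", re.IGNORECASE
)

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


def style_hides_text(style: str) -> bool:
    """Return True if an inline style makes text effectively invisible."""
    decls = {}
    for part in style.split(";"):
        if ":" in part:
            key, value = part.split(":", 1)
            decls[key.strip().lower()] = value.strip().lower()
    if decls.get("display") == "none" or decls.get("visibility") == "hidden":
        return True
    size = re.fullmatch(r"(\d+(?:\.\d+)?)px", decls.get("font-size", ""))
    if size and float(size.group(1)) <= 1:  # text shrunk to about one pixel
        return True
    try:
        return float(decls.get("opacity", "1")) <= 0.1  # near-transparent text
    except ValueError:
        return False


class HiddenTextFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.findings = []  # list of (kind, text) tuples
        self._hidden = []   # one flag per open element: is its text hidden?

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            content = attrs.get("content") or ""
            if INSTRUCTION_LIKE.search(content):
                self.findings.append(("meta", content))
        if tag in VOID_TAGS:
            return
        inherited = bool(self._hidden) and self._hidden[-1]
        self._hidden.append(inherited or style_hides_text(attrs.get("style") or ""))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._hidden:
            self._hidden.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self._hidden and self._hidden[-1]:
            self.findings.append(("hidden-style", text))

    def handle_comment(self, data):
        if INSTRUCTION_LIKE.search(data):
            self.findings.append(("comment", data.strip()))


def scan_html(html: str) -> list[tuple[str, str]]:
    """Collect text a human would not see but an HTML-reading agent would."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

Calling `scan_html(page_source)` returns a list of `(kind, text)` pairs, where `kind` is `"hidden-style"`, `"comment"`, or `"meta"`. Legitimate pages also hide text for menus and accessibility, so a hit is a lead for review rather than proof of an attack. The sketch also ignores external stylesheets and scripts, which a real scanner would need to resolve.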

Most of what Google found was low-grade: pranks, search engine manipulation, and attempts to prevent AI agents from summarizing content. Some prompts, for example, tried to tell the AI to “Tweet like a bird.” The dangerous cases are a different story. One instructed the LLM to return the user’s IP address alongside their passwords. Another attempted to manipulate the AI into executing a command that formats the user’s machine.

Other cases are borderline criminal.

Researchers at the cybersecurity firm Forcepoint published a report almost simultaneously, and found payloads that went further. One embedded a fully specified PayPal transaction with step-by-step instructions targeting AI agents with integrated payment capabilities, also using the famous “ignore all previous instructions” jailbreak technique…

A second attack used a technique called “meta tag namespace injection” combined with a persuasion amplifier keyword to route AI-mediated payments toward a Stripe donation link. A third appeared designed to probe which AI systems are actually vulnerable: reconnaissance before a bigger strike.

This is the core of the enterprise risk. An AI agent with legitimate payment credentials, executing a transaction it reads off a website, produces logs that look identical to normal operations. There is no anomalous login. No brute force. The agent did exactly what it was authorized to do; it just received its instructions from the wrong source.

The CopyPasta attack documented last September showed how prompt injections could spread through developer tools by hiding inside “readme” files. The financial variant is the same concept applied to money instead of code, and at much higher impact per successful hit.

As Forcepoint explains, a browser AI that can only summarize content is low risk. An agentic AI that can send emails, execute terminal commands, or process payments is a different category of target entirely. The attack surface scales with privilege.

Neither Google nor Forcepoint found evidence of sophisticated, coordinated campaigns. Forcepoint did note that shared injection templates across multiple domains “suggest organized tooling rather than isolated experimentation,” meaning someone is building infrastructure for this, even if they have not fully deployed it yet.

But Google was more direct: The research team said it expects both the scale and sophistication of indirect prompt injection attacks to grow in the near future. Forcepoint’s researchers warn that the window for getting ahead of this threat is closing fast.

The liability question is the one nobody has answered. When an AI agent with company-approved credentials reads a malicious web page and initiates a fraudulent PayPal transfer, who’s on the hook? The enterprise that deployed the agent? The model provider whose system followed the injected instruction? The website owner who hosted the payload, whether knowingly or not? No legal framework currently covers this. This is a gray area even though the scenario is no longer theoretical, since Google found the payloads in the wild this February.

The Open Worldwide Application Security Project ranks prompt injection as LLM01:2025, the single most critical vulnerability class in AI applications. The FBI tracked nearly $900 million in AI-related scam losses in 2025, its first year logging the category separately. Google’s findings suggest the more targeted, agent-specific financial attacks are just getting started.

The 32% increase measured between November 2025 and February 2026 covers only static public web pages. Social media, login-walled content, and dynamic sites were out of scope. The actual infection rate across the full web is likely higher.
