Beyond Text! Researchers Discover New AI System Vulnerability: Image Resampling Becomes Attack Vector
Cybersecurity researchers Kikimora Morozova and Suha Sabi Hussain of Trail of Bits recently disclosed a novel attack method. The attack abuses how image resampling works to hide malicious instructions inside an image where the human eye cannot see them, allowing attackers to hijack large language models (LLMs) and steal user data.
The core of the attack is image resampling. When a user uploads an image, AI systems typically reduce its resolution automatically to save processing cost. A malicious image exploits this step: at full resolution it looks ordinary, but once it passes through a downscaling algorithm such as bicubic interpolation, the crafted regions of the image collapse into readable text containing the attacker's instructions.
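The mechanism is easiest to see at the pipeline level: the model never receives the image the user reviewed, only the downscaled copy. The minimal sketch below, assuming a Pillow-based preprocessing step and hypothetical file names and target size, illustrates the resampling stage the researchers target; it is not their actual tooling.

```python
from PIL import Image

# Sketch of the preprocessing step the attack abuses (Pillow assumed).
# "uploaded.png" and the 512x512 target size are hypothetical placeholders.
full_res = Image.open("uploaded.png")  # the image as the user reviewed it

# Many pipelines downscale before passing the image to the model to cut cost.
# Bicubic interpolation is one of the resampling filters studied in the research.
model_input = full_res.resize((512, 512), resample=Image.Resampling.BICUBIC)

# The LLM only ever "sees" model_input; a crafted image can look benign as
# full_res while the resampled copy contains legible injected text.
model_input.save("what_the_model_sees.png")
```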
In their experiments, the researchers confirmed the attack against several mainstream AI systems, including the Google Gemini CLI, Vertex AI Studio, Google Assistant, and Genspark. In one test against the Gemini CLI, they exfiltrated a user's Google Calendar data to an external email address without any confirmation being shown to the user.
Trail of Bits noted that although the attack must be tuned to the specific resampling algorithm each AI system uses, the attack vector is broad enough that other, untested AI tools are likely also at risk. To help the security community study and defend against such attacks, the researchers released Anamorpher, an open-source tool for generating these malicious images.
To mitigate this vulnerability, the researchers recommended several defenses:
- Size Restrictions: AI systems should enforce strict limits on the dimensions of user-uploaded images.
- Result Preview: Show users a preview of the resampled image exactly as it will be passed to the LLM.
- Explicit Confirmation: For instructions involving sensitive tool calls (such as data export), the system should require explicit user confirmation (a minimal sketch of the preview and confirmation steps follows this list).
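To make the last two recommendations concrete, the sketch below wires them into a toy ingestion flow. All names here (downscale_for_model, confirm_tool_call, the tool list, the 512x512 size) are hypothetical; this is an illustration of the defenses, not code from the researchers.

```python
from PIL import Image

SENSITIVE_TOOLS = {"export_data", "send_email"}  # hypothetical tool names

def downscale_for_model(path: str, size=(512, 512)) -> Image.Image:
    """Downscale with the same filter the pipeline uses, so the preview is faithful."""
    return Image.open(path).resize(size, resample=Image.Resampling.BICUBIC)

def preview_model_input(image: Image.Image) -> None:
    """Result preview: show the user exactly what the LLM will receive."""
    image.show()  # in a real product this would render inside the chat UI

def confirm_tool_call(tool_name: str) -> bool:
    """Explicit confirmation: gate sensitive tool calls behind a user prompt."""
    if tool_name not in SENSITIVE_TOOLS:
        return True
    answer = input(f"Allow the model to call '{tool_name}'? [y/N] ")
    return answer.strip().lower() == "y"

# Usage sketch: preview before sending, confirm before acting.
model_input = downscale_for_model("uploaded.png")
preview_model_input(model_input)
if confirm_tool_call("export_data"):
    pass  # only now forward the tool call to the external integration
```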
The researchers emphasized that the strongest defense is to adopt secure system design patterns that resist this class of prompt injection attack at the architectural level.