Researchers Expose AI Prompt Injection Attack Hidden in Images
Researchers have disclosed a new type of cyberattack that can steal sensitive user data by embedding hidden prompts inside images processed by AI platforms. The malicious instructions are invisible to the human eye, but they become visible once the images are downscaled with common resampling algorithms before being passed to a large language model (LLM).

The technique was developed by Trail of Bits researchers Kikimora Morozova and Suha Sabi Hussain, and it builds on a 2020 USENIX paper by TU Braunschweig that first proposed the concept of image-scaling attacks against machine learning systems.

When users upload pictures to AI tools, the images are typically downscaled automatically to cut processing cost and improve performance. Depending on the resampling method, such as nearest neighbor, bilinear, or bicubic interpolation, aliasing artifacts can emerge that reveal hidden patterns if the source image was deliberately crafted to exploit them.

In one demonstrati...
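The dependence on the resampling method is easy to reproduce. The minimal Python sketch below simply downscales the same image with Pillow's nearest-neighbor, bilinear, and bicubic filters so the three outputs can be compared side by side; the file name crafted_input.png and the 224x224 target size are assumptions for illustration, and this is a generic example of the downscaling step an AI pipeline might perform, not Trail of Bits' actual tooling.

```python
from PIL import Image

# Hypothetical attacker-supplied image; in a real attack the payload pixels
# would be placed so they dominate the picture only after downscaling.
img = Image.open("crafted_input.png")

# Assumed target size; real pipelines use their own fixed dimensions.
TARGET = (224, 224)

# The three interpolation methods mentioned above. Nearest neighbor keeps a
# single source pixel per output pixel, while bilinear and bicubic blend a
# neighborhood, so the visible result can differ sharply between them.
filters = {
    "nearest": Image.Resampling.NEAREST,
    "bilinear": Image.Resampling.BILINEAR,
    "bicubic": Image.Resampling.BICUBIC,
}

for name, resample in filters.items():
    small = img.resize(TARGET, resample=resample)
    # Inspect these files to see what the model would actually receive.
    small.save(f"downscaled_{name}.png")
```

Comparing the saved outputs against the original shows why an attacker must know which resampling algorithm a target platform uses: a payload tuned for one filter may stay hidden, or turn into unreadable noise, under another.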