An AI-powered automation system in professional environments, such as Google Gemini for Workspace, is vulnerable to a new security flaw. Using Google’s advanced large language model (LLM) integration within its ecosystem, Gemini enables the use of artificial intelligence (AI) directly with a wide range of user tools, including Gmail, to simplify workplace tasks.
A key feature of the app is the ability to request concise summaries of emails, which are intended to save users time and prevent them from becoming fatigued in their inboxes by reducing the amount of time they spend in it. Security researchers have, however identified a significant flaw in this feature which appears to be so helpful.
As Mozilla bug bounty experts pointed out, malicious actors can take advantage of the trust users place in Gemini's automated responses by manipulating email content so that the AI is misled into creating misleading summaries by manipulating the content.
As a result of the fact that Gemini operates within Google's trusted environment, users are likely to accept its interpretations without question, giving hackers a prime opportunity. This finding highlights what is becoming increasingly apparent in the cybersecurity landscape: when powerful artificial intelligence tools are embedded within widely used platforms, even minor vulnerabilities can be exploited by sophisticated social engineers.
It is the vulnerability at the root of this problem that Gemini can generate e-mail summaries that seem legitimate but can be manipulated so as to include deceptive or malicious content without having to rely on conventional red flags, such as suspicious links or file attachments, to detect it.
An attack can be embedded within an email body as an indirect prompt injection by attackers, according to cybersecurity researchers.
When Gemini's language model interprets these hidden instructions during thesummarisationn process, it causes the AI to unintentionally include misleading messages in the summary that it delivers to the user, unknowingly.
As an example, a summary can falsely inform the recipient that there has been a problem with their account, advising them to act right away, and subtly direct them to a phishing site that appears to be reliable and trustworthy.
While prompt injection attacks on LLMs have been documented since the year 2024, and despite the implementation of numerous safeguards by developers to prevent these manipulations from occurring, this method continues to be effective even today. This tactic is persisting because of the growing sophistication of threat actors as well as the challenge of fully securing generative artificial intelligence systems that are embedded in critical communication platforms.
There is also a need to be more vigilant when developing artificial intelligence and making sure users are aware of it, as traditional cybersecurity cues may no longer apply to these AI-driven environments. In order to find these vulnerabilities, a cybersecurity researcher, Marco Figueroa, identified them and responsibly disclosed them through Mozilla's 0Din bug bounty program, which specialises in finding vulnerabilities in generative artificial intelligence.
There is a clever but deeply concerning method of exploitation demonstrated in Figueroa's proof-of-concept. The attack begins with a seemingly harmless e-mail sent to the intended victim that appears harmless at first glance. A phishing prompt disguised in white font on a white background is hidden in a secondary, malicious component of the message, which conceals benign information so as to avoid suspicion of the message.
When viewed in a standard email client, it is completely invisible to the human eye and is hidden behind benign content.
The malicious message is strategically embedded within custom tags, which are not standard HTML elements, but which appear to be interpreted in a privileged manner by Gemini's summarization function, as they are not standard HTML elements.
It is alarming how effectively the exploitation technique is despite its technical simplicity.
An invisible formatting technique enables the embedding of hidden prompts into an email, leveraging Google Gemini's interpretation of raw content to capitalise on its ability to comprehend the content. In the documented attack, a malicious actor inserts a command inside a span element with font-size: 0 and colour: white, effectively rendering the content invisible to the recipient who is viewing the message in a standard email client.
Unlike a browser, which renders only what can be seen by the user, Gemini process the entire raw HTML document, including all hidden elements. As a consequence, Gemini's summary feature, which is available to the user when they invoke it, interprets and includes the hidden instruction as though it were part of the legitimate message in the generated summary.
It is important to note that this flaw has significant implications for services that operate at scale, as well as for those who use them regularly.
A summary tool that is capable of analysing HTML inline styles, such as font-size:0, colour: white, and opacity:0, should be instructed to ignore or neutralise these styles, which render text visually hidden.
The development team can also integrate guard prompts into LLM behaviour, instructing models not to ignore invisible content, for example.
In terms of user education, he recommends that organisations make sure their employees are aware that AI-generated summaries, including those generated by Gemini, serve only as informational aids and should not be treated as authoritative sources when it comes to urgent or security-related instructions.
A vulnerability of this magnitude has been discovered at a crucial time, as more and more tech companies are increasingly integrating LLMs into their platforms to automate productivity.
In contrast to previous models, where users would manually trigger AI tools, the new paradigm is a shift to automated AI tools that will run in the background instead.
It is for this reason that Google introduced the Gemini side panel last year in Gmail, Docs, Sheets, and other Workspace apps to help users summarise and create content within their workflow seamlessly.
A noteworthy change in Gmail's functionality is that on May 29, Google enabled automatic email summarisation for users whose organisations have enabled smart features across Gmail, Chat, Meet, and other Workspace tools by activating a default personalisation setting.
As generative artificial intelligence becomes increasingly integrated into everyday communication systems, robust security protocols will become increasingly important as this move enhances convenience.
This vulnerability exposes an issue of fundamental inadequacy in the current guardrails used for LLM, primarily focusing on filtering or flagging content that is visible to the user.
A significant number of AI models, including the Google Gemini AI model, continue to use raw HTML markup, making them susceptible to obfuscation techniques such as zero-font text and white-on-white formatting.
Despite being invisible to users, these techniques are still considered valid input to the model by the model-thereby creating a blind spot for attackers that can easily be exploited by attackers.
Mozilla's 0Din program classified the issue as a moderately serious vulnerability by Mozilla, and said that the flaw could be exploited by hackers to harvest credential information, use vishing (voice-phishing), and perform other social engineering attacks by exploiting trust in artificial intelligence-generated content in order to gain access to information.
In addition to the output filter, a post-processing filter can also function as an additional safeguard by inspecting artificial intelligence-generated summaries for signs of manipulation, such as embedded URLs, telephone numbers, or language that implies urgency, flagging these suspicious summaries for human review. This layered defence strategy is especially vital in environments where AI operates at scale.
As well as protecting against individual attacks, there is also a broader supply chain risk to consider.
It is clear that mass communication systems, such as CRM platforms, newsletters, and automated support ticketing services, are potential vectors for injection, according to researcher Marco Figueroa. There is a possibility that a single compromised account on any of these SaaS systems can be used to spread hidden prompt injections across thousands of recipients, turning otherwise legitimate SaaS services into large-scale phishing attacks.
There is an apt term to describe "prompt injections", which have become the new email macros according to the research. The exploit exhibited by Phishing for Gemini significantly underscores a fundamental truth: even apparently minor, invisible code can be weaponised and used for malicious purposes.
As long as language models don't contain robust context isolation that ensures third-party content is sandboxed or subjected to appropriate scrutiny, each piece of input should be viewed as potentially executable code, regardless of whether it is encoded correctly or not. In light of this, security teams should start to understand that AI systems are no longer just productivity tools, but rather components of a threat surface that need to be actively monitored, measured, and contained.
The risk landscape of today does not allow organisations to blindly trust AI output. Because generative artificial intelligence is being integrated into enterprise ecosystems in ever greater numbers, organisations must reevaluate their security frameworks in order to address the emerging risks that arise from machine learning systems in the future.
Considering the findings regarding Google Gemini, it is urgent to consider AI-generated outputs as potential threat vectors, as they are capable of being manipulated in subtle but impactful ways. A security protocol based on AI needs to be implemented by enterprises to prevent such exploitations from occurring, robust validation mechanisms for automated content need to be established, and a collaborative oversight system between development, IT, and security teams must be established to ensure this doesn't happen again.
Moreover, it is imperative that AI-driven tools, especially those embedded within communication workflows, be made accessible to end users so that they can understand their capabilities and limitations. In light of the increasing ease and pervasiveness of automation in digital operations, it will become increasingly essential to maintain a culture of informed vigilance across all layers of the organisation to maintain trust and integrity.