Search This Blog

Powered by Blogger.

Blog Archive

Labels

Footer About

Footer About

Labels

Showing posts with label LLM cybersecurity. Show all posts

Promptware Threats Turn LLM Attacks Into Multi-Stage Malware Campaigns

 

Large language models are now embedded in everyday workplace tasks, powering automated support tools and autonomous assistants that manage calendars, write code, and handle financial actions. As these systems expand in capability and adoption, they also introduce new security weaknesses. Experts warn that threats against LLMs have evolved beyond simple prompt tricks and now resemble coordinated cyberattacks, carried out in structured stages much like traditional malware campaigns. 

This growing threat category is known as “promptware,” referring to malicious activity designed to exploit vulnerabilities in LLM-based applications. It differs from basic prompt injection, which researchers describe as only one part of a broader and more serious risk. Promptware follows a deliberate sequence: attackers gain entry using deceptive prompts, bypass safety controls to increase privileges, establish persistence, and then spread across connected services before completing their objectives.  

Because this approach mirrors conventional malware operations, long-established cybersecurity strategies can still help defend AI environments. Rather than treating LLM attacks as isolated incidents, organizations are being urged to view them as multi-phase campaigns with multiple points where defenses can interrupt progress.  

Researchers Ben Nassi, Bruce Schneier, and Oleg Brodt—affiliated with Tel Aviv University, Harvard Kennedy School, and Ben-Gurion University—argue that common assumptions about LLM misuse are outdated. They propose a five-phase model that frames promptware as a staged process unfolding over time, where each step enables the next. What may appear as sudden disruption is often the result of hidden progress through earlier phases. 

The first stage involves initial access, where malicious prompts enter through crafted user inputs or poisoned documents retrieved by the system. The next stage expands attacker control through jailbreak techniques that override alignment safeguards. These methods can include obfuscated wording, role-play scenarios, or reusable malicious suffixes that work across different model versions. 

Once inside, persistence becomes especially dangerous. Unlike traditional malware, which often relies on scheduled tasks or system changes, promptware embeds itself in the data sources LLM tools rely on. It can hide payloads in shared repositories such as email threads or corporate databases, reactivating when similar content is retrieved later. An even more serious form targets an agent’s memory directly, ensuring malicious instructions execute repeatedly without reinfection. 

The Morris II worm illustrates how these attacks can spread. Using LLM-based email assistants, it replicated by forcing the system to insert malicious content into outgoing messages. When recipients’ assistants processed the infected messages, the payload triggered again, enabling rapid and unnoticed propagation. Experts also highlight command-and-control methods that allow attackers to update payloads dynamically by embedding instructions that fetch commands from remote sources. 

These threats are no longer theoretical, with promptware already enabling data theft, fraud, device manipulation, phishing, and unauthorized financial transactions—making AI security an urgent issue for organizations.

AI-Assisted Cyberattacks Signal a Shift in Modern Threat Strategies and Defense Models

 

A new wave of cyberattacks is using large language models as an offensive tool, according to recent reporting from Anthropic and Oligo Security. Both groups said hackers used jailbroken LLMs-some capable of writing code and conducting autonomous reasoning-to conduct real-world attack campaigns. While the development is alarming, cybersecurity researchers had already anticipated such advancements. 

Earlier this year, a group at Cornell University published research predicting that cybercriminals would eventually use AI to automate hacking at scale. The evolution is consistent with a recurring theme in technology history: Tools designed for productivity or innovation inevitably become dual-use. Any number of examples-from drones to commercial aircraft to even Alfred Nobel's invention of dynamite-demonstrate how innovation often carries unintended consequences. 

The biggest implication of it all in cybersecurity is that LLMs today finally allow attackers to scale and personalize their operations simultaneously. In the past, cybercriminals were mostly forced to choose between highly targeted efforts that required manual work or broad, indiscriminate attacks with limited sophistication. 

Generative AI removes this trade-off, allowing attackers to run tailored campaigns against many targets at once, all with minimal input. In Anthropic's reported case, attackers initially provided instructions on ways to bypass its model safeguards, after which the LLM autonomously generated malicious output and conducted attacks against dozens of organizations. Similarly, Oligo Security's findings document a botnet powered by AI-generated code, first exploiting an AI infrastructure tool called Ray and then extending its activity by mining cryptocurrency and scanning for new targets. 

Traditional defenses, including risk-based prioritization models, may become less effective within this new threat landscape. These models depend upon the assumption that attackers will strategically select targets based upon value and feasibility. Automation collapses the cost of producing custom attacks such that attackers are no longer forced to prioritize. That shift erases one of the few natural advantages defenders had. 

Complicating matters further, defenders must weigh operational impact when making decisions about whether to implement a security fix. In many environments, a mitigation that disrupts legitimate activity poses its own risk and may be deferred, leaving exploitable weaknesses in place. Despite this shift, experts believe AI can also play a crucial role in defense. The future could be tied to automated mitigations capable of assessing risks and applying fixes dynamically, rather than relying on human intervention.

In some cases, AI might decide that restrictions should narrowly apply to certain users; in other cases, it may recommend immediate enforcement across the board. While the attackers have momentum today, cybersecurity experts believe the same automation that today enables large-scale attacks could strengthen defenses if it is deployed strategically.

AI’s Hidden Weak Spot: How Hackers Are Turning Smart Assistants into Secret Spies

 

As artificial intelligence becomes part of everyday life, cybercriminals are already exploiting its vulnerabilities. One major threat shaking up the tech world is the prompt injection attack — a method where hidden commands override an AI’s normal behavior, turning helpful chatbots like ChatGPT, Gemini, or Claude into silent partners in crime.

A prompt injection occurs when hackers embed secret instructions inside what looks like an ordinary input. The AI can’t tell the difference between developer-given rules and user input, so it processes everything as one continuous prompt. This loophole lets attackers trick the model into following their commands — stealing data, installing malware, or even hijacking smart home devices.

Security experts warn that these malicious instructions can be hidden in everyday digital spaces — web pages, calendar invites, PDFs, or even emails. Attackers disguise their prompts using invisible Unicode characters, white text on white backgrounds, or zero-sized fonts. The AI then reads and executes these hidden commands without realizing they are malicious — and the user remains completely unaware that an attack has occurred.

For instance, a company might upload a market research report for analysis, unaware that the file secretly contains instructions to share confidential pricing data. The AI dutifully completes both tasks, leaking sensitive information without flagging any issue.

In another chilling example from the Black Hat security conference, hidden prompts in calendar invites caused AI systems to turn off lights, open windows, and even activate boilers — all because users innocently asked Gemini to summarize their schedules.

Prompt injection attacks mainly fall into two categories:

  • Direct Prompt Injection: Attackers directly type malicious commands that override the AI’s normal functions.

  • Indirect Prompt Injection: Hackers hide commands in external files or links that the AI processes later — a far stealthier and more dangerous method.

There are also advanced techniques like multi-agent infections (where prompts spread like viruses between AI systems), multimodal attacks (hiding commands in images, audio, or video), hybrid attacks (combining prompt injection with traditional exploits like XSS), and recursive injections (where AI generates new prompts that further compromise itself).

It’s crucial to note that prompt injection isn’t the same as “jailbreaking.” While jailbreaking tries to bypass safety filters for restricted content, prompt injection reprograms the AI entirely — often without the user realizing it.

How to Stay Safe from Prompt Injection Attacks

Even though many solutions focus on corporate users, individuals can also protect themselves:

  • Be cautious with links, PDFs, or emails you ask an AI to summarize — they could contain hidden instructions.
  • Never connect AI tools directly to sensitive accounts or data.
  • Avoid “ignore all instructions” or “pretend you’re unrestricted” prompts, as they weaken built-in safety controls.
  • Watch for unusual AI behavior, such as strange replies or unauthorized actions — and stop the session immediately.
  • Always use updated versions of AI tools and apps to stay protected against known vulnerabilities.

AI may be transforming our world, but as with any technology, awareness is key. Hidden inside harmless-looking prompts, hackers are already whispering commands that could make your favorite AI assistant act against you — without you ever knowing.