A six-month research into AI-based development tools has disclosed over thirty security bugs that allow remote code execution (RCE) and data exfiltration. The findings by IDEsaster research revealed how AI agents deployed in IDEs like Visual Studio Code, Zed, JetBrains products and various commercial assistants can be tricked into leaking sensitive data or launching hacker-controlled code.
The research reports that 100% of tested AI IDEs and coding agents were vulnerable. Impacted products include GitHub, Windsurf, Copilot, Cursor, Kiro.dev, Zed.dev, Roo Code, Junie, Cline, Gemini CLI, and Claude Code. At least twenty-four assigned CVEs and additional AWS advisories were also included.
The main problem comes from the way AI agents interact with IDE features. Autonomous components that could read, edit, and create files were never intended for these editors. Once-harmless features turned become attack surfaces when AI agents acquired these skills. In their threat model, all AI IDEs essentially disregard the base software. Since these features have been around for years, they consider them to be naturally safe.
However, the same functionalities can be weaponized into RCE primitives and data exfiltration once autonomous AI bots are included. The research reported that this is an IDE-agnostic attack chain.
It begins with context hacking via prompt-injection. Covert instructions can be deployed in file names, rule files, READMEs, and outputs from malicious MCP servers. When an agent reads the context, the tool can be redirected to run authorized actions that activate malicious behaviours in the core IDE. The last stage exploits built-in features to steal data or run hacker code in AI IDEs sharing core software layers.
Writing a JSON file that references a remote schema is one example. Sensitive information gathered earlier in the chain is among the parameters inserted by the agent that are leaked when the IDE automatically retrieves that schema. This behavior was seen in Zed, JetBrains IDEs, and Visual Studio Code. The outbound request was not suppressed by developer safeguards like diff previews.
Another case study uses altered IDE settings to show complete remote code execution. An attacker can make the IDE execute arbitrary code as soon as a relevant file type is opened or created by updating an executable file that is already in the workspace and then changing configuration fields like php.validate.executablePath. Similar exposure is demonstrated by JetBrains utilities via workspace metadata.
According to the IDEsaster report, “It’s impossible to entirely prevent this vulnerability class short-term, as IDEs were not initially built following the Secure for AI principle. However, these measures can be taken to reduce risk from both a user perspective and a maintainer perspective.”
Security experts are warning people who use NPM — a platform where developers share code — to be careful after finding several fake software packages that secretly collect information from users' computers.
The cybersecurity company Socket found around 60 harmful packages uploaded to NPM starting mid-May. These were posted by three different accounts and looked like normal software, but once someone installed them, a hidden process ran automatically. This process collected private details such as the device name, internal IP address, the folder the user was working in, and even usernames and DNS settings. All of this was sent to attackers without the user knowing.
The script also checked whether it was running in a cloud service or a testing environment. This is likely how the attackers tried to avoid being caught by security tools.
Luckily, these packages didn’t install extra malware or try to take full control of users’ systems. There was no sign that they stayed active on the system after installation or tried to gain more access.
Still, these fake packages are dangerous. The attackers used a trick known as "typosquatting" — creating names that are nearly identical to real packages. For example, names like “react-xterm2” or “flipper-plugins” were designed to fool people who might type quickly and not notice the slight changes. The attackers appeared to be targeting software development pipelines used to build and test code automatically.
Before they were taken down, these fake packages were downloaded nearly 3,000 times.
In a separate discovery, Socket also found eight other harmful packages on NPM. These had been around for about two years and had been downloaded over 6,000 times. Unlike the first group, these could actually damage systems by deleting or corrupting data.
If you've used any unfamiliar packages recently, remove them immediately. Run a full security scan, change your passwords, and enable two-factor authentication wherever possible.
This incident shows how hackers are now using platforms like NPM to reach developers directly. It’s important to double-check any code you install, especially if it’s from a source you don’t fully recognize.
Over the past few years, AI assistants have made coding easier for developers in that one is able to quickly develop and push code over to GitHub, among others. But with so much automation going on, the risk of coding vulnerabilities has also increased. The vast majority of those generated codes have security flaws. What has befallen the application security teams is a lot of vulnerability reports pouring in. But lately, Snyk has found that 31% of these vulnerability reports are completely false positives added to the burden of security teams.
In such cases, many teams tend to use a method called reachability analysis, which usually helps the security expert screen out noise and work only with the vulnerabilities that might be exploited during an attack-upon only accessible code during said attack. Since only 10% to 20% of the imported code is even used by any application on average, this approach cuts the number of reported vulnerabilities that developers have to fix in half. Joseph Hejderup, technical staff member at Endor Labs, demonstrated this approach during the SOSS Community Day Europe 2024 and talked about how it makes vulnerability reports more actionable.
False Positive Overload
The biggest problem of application security is false positives. The sooner security teams can ship out more code, the larger their impact will be as your security tool begins to flag issues that are not actually a risk. According to Snyk, 61% of the developers believe that the enhancement of false positives is due to automation. To the eyes of the security teams, sorting hundreds or thousands reported vulnerabilities in numerous projects becomes a daunting task.
According to Randall Degges, head of developer relations at Snyk, reachability analysis helps by narrowing down exactly which vulnerabilities are really dangerous. This calms the security teams, since they can now focus on issues being actively executed in the code. Filtering out the kind of vulnerabilities that attackers cannot reach makes companies remediate by as much as 60%. And as OX Security research put it, in some cases, teams even reduced the workload by nearly 99.5%, making improvements to the developers.
Reducing developer friction
It's not just about workload reduction, but rather reporting fewer, more accurate vulnerabilities back to developers, says Katie Teitler-Santullo, a cybersecurity strategist at OX Security. "Tools that focus on real risks over bombarding developers with false alerts improve collaboration and efficiency," she says.
The hardest part is to eliminate the noise that security tools produce, keeping the developers in the same pace with the growth of development while still having a secure solution. Focusing on reachability ensures that the reported vulnerabilities are really relevant to the code being worked on, allowing developers to tackle key issues without fear of information paralysis.
Two Approaches to Reachability Analysis
There are two primary ways of reachability analysis. The first of these is static code analysis-in the process, the code itself is analysed and a graph of function calls is constructed to determine whether vulnerable code can be executed. This method works but is not failsafe as some of the functions may only be called under specific conditions.
The second approach involves instrumenting the application to track code execution during runtime. This really gives a live snapshot of which parts are really being used, so you will be able to immediately know if the identified vulnerability is something that poses an actual threat.
While the current reachability analysis tools mainly focus on whether code is being executed, the future of this technology involves determining if vulnerable code is indeed exploitable. According to Hejderup, the next step in reaching that milestone of making security testing even more effective would be the combination of reachability with exploitability analysis.
Finally, reachability analysis offers an effective solution to the problem of vulnerability overload. This is because it allows security teams to remove extraneous reports and focus only on reachable, exploitable code. This approach reduces workloads and generates better collaboration between security teams and development teams. As companies adopt this way of doing things, the future of application security testing will be more complex, such that only the most crucial vulnerabilities are flagged and then fixed.
Reachability analysis isn't going to be a silver bullet, perhaps, but it is going to be a pretty useful tool in an era where code is being developed and deployed faster than ever-and the risks of ignorance on security have never been higher.
GitHub has unveiled a novel AI-driven feature aimed at expediting the resolution of vulnerabilities during the coding process. This new tool, named Code Scanning Autofix, is currently available in public beta and is automatically activated for all private repositories belonging to GitHub Advanced Security (GHAS) customers.
Utilizing the capabilities of GitHub Copilot and CodeQL, the feature is adept at handling over 90% of alert types in popular languages such as JavaScript, Typescript, Java, and Python.
Once activated, Code Scanning Autofix presents potential solutions that GitHub asserts can resolve more than two-thirds of identified vulnerabilities with minimal manual intervention. According to GitHub's representatives Pierre Tempel and Eric Tooley, upon detecting a vulnerability in a supported language, the tool suggests fixes accompanied by a natural language explanation and a code preview, offering developers the flexibility to accept, modify, or discard the suggestions.
The suggested fixes are not confined to the current file but can encompass modifications across multiple files and project dependencies. This approach holds the promise of substantially reducing the workload of security teams, allowing them to focus on bolstering organizational security rather than grappling with a constant influx of new vulnerabilities introduced during the development phase.
However, it is imperative for developers to independently verify the efficacy of the suggested fixes, as GitHub's AI-powered feature may only partially address security concerns or inadvertently disrupt the intended functionality of the code.
Tempel and Tooley emphasized that Code Scanning Autofix aids in mitigating the accumulation of "application security debt" by simplifying the process of addressing vulnerabilities during development. They likened its impact to GitHub Copilot's ability to alleviate developers from mundane tasks, allowing development teams to reclaim valuable time previously spent on remedial actions.
In the future, GitHub plans to expand language support, with forthcoming updates slated to include compatibility with C# and Go.
For further insights into the GitHub Copilot-powered code scanning autofix tool, interested parties can refer to GitHub's documentation website.
Additionally, the company recently implemented default push protection for all public repositories to prevent inadvertent exposure of sensitive information like access tokens and API keys during code updates.
This move comes in response to a notable issue in 2023, during which GitHub users inadvertently disclosed 12.8 million authentication and sensitive secrets across more than 3 million public repositories. These exposed credentials have been exploited in several high-impact breaches in recent years, as reported by BleepingComputer.