New research suggests that the ability to discover software vulnerabilities using artificial intelligence is becoming both inexpensive and widely accessible, raising concerns that advanced cyber capabilities may be spreading faster than anticipated.
A study by Vidoc Security demonstrates that vulnerability discovery techniques similar to those highlighted in Anthropic’s recent “Mythos” work can be reproduced using publicly available AI models. By leveraging GPT-5.4 and Claude Opus 4.6 within an open-source framework called opencode, researchers were able to replicate key findings for under $30 per scan, without access to Anthropic’s internal systems or restricted programs.
Anthropic had earlier positioned its Mythos research as highly sensitive, limiting access to a small group of major organizations and prompting concern across policy and financial circles. Reports indicated that senior figures, including Scott Bessent and Jerome Powell, discussed the implications alongside leading financial executives. The term “vulnpocalypse” resurfaced in cybersecurity discussions, reflecting fears of large-scale AI-driven exploitation.
The Vidoc team sought to test whether such capabilities were truly restricted. Using patched vulnerability examples referenced in Anthropic’s public materials, they examined issues affecting a file-sharing protocol, a security-focused operating system’s networking components, widely used video-processing software, and cryptographic libraries used for identity verification online.
Across three independent runs, both models successfully reproduced two of the documented vulnerability cases each time. Claude Opus 4.6 also independently rediscovered a flaw in OpenBSD in all three attempts, while GPT-5.4 failed to identify that specific issue. In other instances, including vulnerabilities tied to FFmpeg and wolfSSL, the systems correctly identified relevant code regions but did not fully determine the root cause.
The methodology closely mirrored workflows described by Anthropic. Instead of relying on a single prompt, the system first analyzed entire codebases, divided them into smaller segments, and ran parallel detection processes. These processes filtered meaningful signals from noise and cross-checked findings across files. Importantly, the selection of code segments was automated through earlier planning steps, rather than manually guided.
Despite these results, the study underlines a clear distinction. Anthropic’s system reportedly went beyond identifying vulnerabilities by constructing detailed exploit pathways, such as chaining code fragments across multiple network packets to achieve full remote control of a system. The public models, while capable of locating weaknesses, did not reach that level of execution.
According to researcher Dawid Moczadło, this indicates a new turn of events in cybersecurity economics. The most resource-intensive part of the process, identifying credible vulnerability signals, is becoming accessible to anyone with standard API access. However, validating those findings and converting them into reliable security insights or exploit strategies remains significantly more complex.
Anthropic itself has acknowledged that traditional benchmarks like Cybench are no longer sufficient to measure modern AI cyber capabilities, noting that its Mythos system exceeded those standards. The company estimated that comparable capabilities could become widespread within six to eighteen months.
The Vidoc findings suggest that, at least for vulnerability discovery, this transition may already be underway. By publishing their methodology, prompts, and results, the researchers highlight how open tools and commercially available models can replicate parts of workflows once considered highly restricted.
For organizations, the implications are instrumental. As AI reduces the cost and effort required to uncover software flaws, defenders may need to adopt continuous monitoring, faster remediation cycles, and deeper behavioral analysis. The challenge is no longer just identifying vulnerabilities, but managing the scale and speed at which they can now be discovered.
DARWIS Taka, a new web vulnerability scanner, is now available for free and runs via Docker. It pairs a rules-based scanning engine with an optional AI layer that reviews each finding before it reaches the report, aimed squarely at the false-positive problem that has dogged vulnerability scanning for years.
Built in Rust, Taka ships with 88 detection rules across 29 categories covering common web vulnerabilities, and produces JSON or self-contained HTML reports. Setup instructions, the Docker configuration, and documentation are published on GitHub at github.com/CSPF-Founder/taka-docker.
Taka's AI layer runs in one of two modes. In passive (evidence-analysis) mode, the model reviews the data the scanner already collected and returns a verdict without sending any further traffic to the target. In active mode, the AI acts as a second-stage tester: it proposes a small number of targeted follow-up requests, such as paired true and false payloads for a suspected SQL injection, Taka executes them, and the responses are fed back to the AI for differential analysis. Active mode is more decisive on borderline findings but generates additional traffic.
In both modes, every result is tagged with a verdict (confirmed, likely false positive, or inconclusive), a confidence score, and the AI's written reasoning. The report surfaces those labels alongside a summary of how many findings fell into each bucket. Nothing is dropped silently, so reviewers see what the AI believed and why, and can focus triage on the findings marked confirmed.
The validation layer currently supports Anthropic and OpenAI. The project team has tested Taka extensively with Anthropic's Claude Sonnet, which gave the best balance of reasoning quality and speed in their evaluation, and recommends it for the strongest results. AI validation is optional; without a key, Taka runs as a standard scanner with its own false-positive controls.
Most scanners trigger on the first matcher that fires, which is why a single stray string in a response can produce a flood of bogus alerts. Taka uses a weighted scoring system instead. Each matcher in a rule, whether a status code, a regex, a header check, or a timing comparison, carries an integer weight reflecting how strong a signal it is. The rule declares a detection threshold, and a finding is raised only when the combined weight of the matchers that fired meets or exceeds that threshold.
A circuit breaker halts scanning against hosts showing signs of distress, per-host rate limiting caps concurrent requests, and a passive mode disables all attack payloads for environments where only non-intrusive checks are acceptable. Three scan depth levels (quick, standard, deep) trade coverage against runtime, while a two-phase execution model keeps time-based blind rules from interfering with the rest of the scan.
A web interface ships with the tool for launching scans, inspecting findings alongside the raw evidence, and revisiting results.
Only the optional AI validation requires a third-party API key, supplied by the user. Taka is aimed at security engineers, penetration testers, bug bounty hunters, DevSecOps teams, and developers who want a scanner that respects their triage time.
Full setup instructions are available at github.com/CSPF-Founder/taka-docker.
The US Cybersecurity & Infrastructure Security Agency (CISA) confirms active exploitation of the CitrixBleed 2 vulnerability (CVE-2025-5777 in Citrix NetScaler ADC and Gateway. It has given federal parties one day to patch the bugs. This unrealistic deadline for deploying the patches is the first since CISA issued the Known Exploited Vulnerabilities (KEV) catalog, highlighting the severity of attacks abusing the security gaps.
CVE-2025-5777 is a critical memory safety bug (out-of-bounds memory read) that gives hackers unauthorized access to restricted memory parts. The flaw affects NetScaler devices that are configured as an AAA virtual server or a Gateway. Citrix patched the vulnerabilities via the June 17 updates.
After that, expert Kevin Beaumont alerted about the flaw’s capability for exploitation if left unaddressed, terming the bug as ‘CitrixBleed 2’ because it shared similarities with the infamous CitrixBleed bug (CVE-2023-4966), which was widely abused in the wild by threat actors.
According to Bleeping Computer, “The first warning of CitrixBleed 2 being exploited came from ReliaQuest on June 27. On July 7, security researchers at watchTowr and Horizon3 published proof-of-concept exploits (PoCs) for CVE-2025-5777, demonstrating how the flaw can be leveraged in attacks that steal user session tokens.”
During that time, experts could not spot the signs of active exploitation. Soon, the threat actors started to exploit the bug on a larger scale, and after the attack, they became active on hacker forums, “discussing, working, testing, and publicly sharing feedback on PoCs for the Citrix Bleed 2 vulnerability,” according to Bleeping Computers.
Hackers showed interest in how to use the available exploits in attacks effectively. The hackers have become more active, and various exploits for the bug have been published.
Now that CISA has confirmed the widespread exploitation of CitrixBleed 2 in attacks, threat actors may have developed their exploits based on the recently released technical information. CISA has suggested to “apply mitigations per vendor instructions, follow applicable BOD 22-01 guidance for cloud services, or discontinue use of the product if mitigations are unavailable.”
Recently, the U.S. Cybersecurity and Infrastructure Security Agency (CISA) has discovered and added three critical vulnerabilities to its Known Exploited Vulnerabilities (KEV) catalog. These vulnerabilities, impacting North Grid Proself, ProjectSend, and Zyxel firewalls, are being actively exploited, posing serious risks of data breaches and operational disruptions to unpatched systems. At the time of publishing, Zyxel acknowledged the issue and advised users to update their firmware promptly and strengthen admin credentials.
North Grid Proself Vulnerability (CVE-2023-45727): A severe XML processing vulnerability in North Grid Proself has been identified, allowing attackers to bypass restrictions and access sensitive server data. Systems running versions older than 5.62, 1.65, and 1.08 are vulnerable to exploitation through maliciously crafted XML requests, which can extract sensitive account information.
ProjectSend Vulnerability (CVE-2024-11680): A critical authentication flaw in ProjectSend, an open-source file-sharing platform, has been flagged with a CVSS severity score of 9.8. Versions prior to r1720 are susceptible to attacks where malicious actors manipulate the options.php file using crafted HTTP requests. This enables them to create unauthorized accounts, upload webshells, and inject harmful JavaScript code. Security researchers from VulnCheck report that attackers are leveraging automated tools such as Nuclei and Metasploit to exploit this vulnerability.
Notably, exploitation attempts are marked by altered server configurations, including random strings in landing page titles—a trend observed since September 2024. Despite a patch being released in May 2023, over 4,000 exposed instances remain vulnerable.
Zyxel Firewall Vulnerability (CVE-2024-11667): Zyxel firewalls running firmware versions between V5.00 and V5.38 are vulnerable to a directory traversal attack. This flaw allows attackers to upload or download files via manipulated URLs within the web management interface, potentially compromising system integrity.
ProjectSend instances have been the primary focus of attackers. Public-facing systems have seen unauthorized user registrations—a setting not enabled by default—facilitating access for malicious actors. Webshells uploaded during these attacks are often stored in predictable directories, with filenames tied to timestamps and user data. Organizations are urged to review server logs to identify and address suspicious activities.
Under Binding Operational Directive (BOD) 22-01, federal agencies must prioritize these vulnerabilities, while CISA has recommended that private organizations take immediate action to mitigate the risks. Updating software, reviewing server configurations, and enhancing log analysis are critical steps to safeguard systems from exploitation.