A recent study by researchers at Harvard University has found that advanced artificial intelligence systems can now exceed human doctors in both diagnosing medical conditions and determining treatment strategies, including in fast-paced, high-stakes emergency room environments. The research highlights the capability of modern AI systems to handle complex clinical reasoning tasks traditionally considered the exclusive domain of trained physicians.
The findings, published in the peer-reviewed journal Science, are based on a controlled comparison between OpenAI o1 and experienced attending physicians. To ensure realistic testing conditions, the study used 76 actual emergency department cases sourced from Beth Israel Deaconess Medical Center. These cases were evaluated across multiple stages of the diagnostic process, allowing researchers to assess performance under varying levels of available patient information.
At the earliest stage of patient assessment, commonly referred to as initial triage, where clinicians typically have only limited details about a patient’s condition, the AI model demonstrated a notable advantage. It was able to correctly identify either the exact diagnosis or a closely related condition in 67.1 percent of the cases. In comparison, the two physicians involved in the study achieved accuracy rates of 55.3 percent and 50 percent respectively. This suggests that even with minimal data, the AI system was more effective at narrowing down potential diagnoses.
As the diagnostic process progressed and additional clinical information became available during the emergency room evaluation phase, the model’s performance improved further. Its diagnostic accuracy increased to 72.4 percent, reflecting its ability to refine its conclusions with more context. The physicians also showed improvement at this stage, but their accuracy remained lower, at 61.8 percent and 52.6 percent. This stage is particularly important as it mirrors real-world conditions where doctors continuously update their assessments based on new findings.
In the final phase of care, when patients were admitted either to general hospital wards or intensive care units, the AI model continued to outperform its human counterparts. It achieved an accuracy rate of 81.6 percent, compared to 78.9 percent and 69.7 percent for the physicians. Although the performance gap narrowed slightly at this stage, the AI still maintained a measurable edge, indicating consistency across the full diagnostic timeline.
Beyond identifying illnesses, the study also evaluated how effectively the AI system could design clinical management plans. This included decisions such as selecting appropriate medications, including antibiotics, as well as handling complex and sensitive scenarios such as end-of-life care planning. Across five evaluated case studies, the AI achieved a median performance score of 89 percent. In contrast, physicians scored markedly lower: 34 percent when relying on traditional clinical resources and 41 percent when supported by GPT-4. This underlines a substantial gap in structured decision-making support.
The researchers acknowledged that while integrating AI into clinical workflows is often viewed as a high-risk approach due to patient safety concerns, its potential benefits are significant. They noted that wider adoption of such systems could help reduce diagnostic errors, minimize treatment delays, and address disparities in access to healthcare services. These factors collectively contribute to both improved patient outcomes and reduced financial strain on healthcare systems.
At the same time, the study emphasizes that current AI systems are not without limitations. Clinical medicine involves more than text-based data. Doctors routinely rely on non-verbal and non-textual cues, such as observing a patient’s physical discomfort, interpreting imaging results, and making judgment calls based on experience. These aspects are not fully captured by existing AI models, which means human expertise remains essential.
The authors further concluded that large language models have now surpassed many traditional benchmarks used to measure clinical reasoning abilities. However, they stress the urgent need for more detailed research, including real-world clinical trials and studies focused on human-AI collaboration, to determine how these systems can be safely and effectively integrated into healthcare settings.
In comments shared with The Guardian, lead researcher Arjun Manrai clarified that the findings should not be interpreted as suggesting that AI will replace doctors. Instead, he described the results as evidence of a major technological shift that is likely to transform the medical field in the coming years.
From a macro industry perspective, this study reflects a developing trend in which AI is increasingly being used to augment clinical decision-making. However, experts continue to caution that challenges such as data bias, accountability, regulatory oversight, and patient trust must be addressed before such systems can be widely deployed. The future of healthcare, therefore, is likely to involve a collaborative model where AI amplifies efficiency and accuracy, while human doctors provide critical judgment, ethical oversight, and patient-centered care.
Cybersecurity researchers have identified two threat groups that are executing fast-moving attacks almost entirely within software-as-a-service environments, allowing them to operate with very little visible trace of intrusion.
The groups, tracked as Cordial Spider and Snarky Spider, are also known by multiple alternate identifiers across different security vendors. Investigations show that both groups conduct high-speed data theft followed by extortion attempts, with strong overlap in how their operations are carried out. Analysts assess that the groups have been active since at least October 2025. One of them is believed to be composed of native English speakers and is linked to a cybercrime network widely referred to as “The Com.”
According to findings from CrowdStrike, these attackers primarily rely on voice phishing, also known as vishing, to initiate their intrusions. In these cases, individuals are contacted and guided toward fraudulent login pages that are designed to imitate single sign-on systems. These pages act as adversary-in-the-middle setups, meaning they intercept and capture authentication data, including login credentials and session details, as the victim enters them. Once this information is obtained, attackers immediately use it to access SaaS applications that are connected through single sign-on integrations.
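The adversary-in-the-middle flow can be sketched as a small self-contained simulation. All names below are illustrative, not taken from any real phishing kit: the fake login page forwards the victim's credentials to the genuine identity provider, returns the legitimate response so the login appears to succeed, and silently keeps a copy of both the credentials and the issued session cookie.

```python
import secrets

def legitimate_idp_login(username: str, password: str) -> dict:
    """Stand-in for the real single sign-on endpoint (toy credential check).
    Issues a session cookie on successful login."""
    if password == "correct-horse":
        return {"status": "ok", "session_cookie": secrets.token_hex(16)}
    return {"status": "denied"}

class AitmProxy:
    """Illustrative adversary-in-the-middle relay: it sits between the
    victim and the real identity provider, so the login succeeds for the
    victim while the attacker retains a copy of everything entered."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.loot = []  # captured credentials and session material

    def handle_login(self, username: str, password: str) -> dict:
        response = self.upstream(username, password)  # relay upstream
        self.loot.append({
            "username": username,
            "password": password,
            "session_cookie": response.get("session_cookie"),
        })
        return response  # victim sees a normal, successful login

proxy = AitmProxy(legitimate_idp_login)
result = proxy.handle_login("alice@example.com", "correct-horse")
```

Because the proxy replays the captured session cookie rather than the password, the technique works even when a one-time multi-factor code was required: the attacker steals the already-authenticated session, not just the secret.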
Researchers explain that the attackers deliberately operate within trusted SaaS platforms to avoid raising suspicion. Because their activity takes place inside legitimate services already used by organizations, their presence generates fewer detectable signals. This allows them to move quickly from initial compromise to data access. The combination of speed, targeted execution, and reliance on SaaS-only environments makes it harder for defenders to monitor and respond effectively.
Earlier research published in January 2026 by Mandiant revealed that these attack patterns represent a continuation of tactics seen in extortion-focused campaigns linked to the ShinyHunters group. These operations involve impersonating IT staff during phone calls to build trust with victims, then directing them to phishing pages in order to collect both login credentials and multi-factor authentication codes.
More recent analysis from Palo Alto Networks Unit 42 and the Retail & Hospitality ISAC indicates, with moderate confidence, that one of the identified clusters is associated with The Com network. These attacks rely heavily on living-off-the-land techniques, where attackers use legitimate system tools instead of introducing malware. They also make use of residential proxy networks to mask their real geographic location and to evade basic IP-based security filtering systems.
Since February 2026, activity linked to one of these clusters has been directed toward organizations in the retail and hospitality sectors. The attackers combine vishing calls, often impersonating IT help desk personnel, with phishing websites designed to capture employee credentials.
Once access is established, the attackers take steps to maintain long-term control. They register a new device within the compromised account to ensure continued access, and in many cases remove previously registered devices. After doing so, they modify email settings by creating inbox rules that automatically delete notifications related to new device logins or suspicious activity, preventing the legitimate user from being alerted.
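Defenders can hunt for this persistence trick by auditing inbox rules for silent deletions of security notifications. The sketch below uses a simplified, hypothetical rule schema (real rules would come from an API such as Microsoft Graph or the Gmail API, with different field names): it flags any rule that deletes messages whose subject filter matches common alert wording.

```python
# Keywords commonly found in new-device and sign-in alert subjects.
ALERT_KEYWORDS = ("new device", "sign-in", "suspicious activity", "security alert")

def find_suspicious_rules(rules: list[dict]) -> list[dict]:
    """Flag inbox rules that silently delete messages whose subject
    filter matches security-notification wording."""
    flagged = []
    for rule in rules:
        if rule.get("action") != "delete":
            continue
        subject = rule.get("subject_contains", "").lower()
        if any(kw in subject for kw in ALERT_KEYWORDS):
            flagged.append(rule)
    return flagged

# Illustrative data: one benign rule, one attacker-created rule.
rules = [
    {"name": "newsletters", "action": "move", "subject_contains": "digest"},
    {"name": "cleanup", "action": "delete", "subject_contains": "New device sign-in"},
]
flagged = find_suspicious_rules(rules)
```

Pairing this check with an audit of recently registered MFA devices covers both halves of the persistence technique described above.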
Following initial access, the attackers shift their focus toward accounts with higher privileges. They collect internal information, such as employee directories, to identify individuals with elevated access and then use further social engineering techniques to compromise those accounts as well. With increased privileges, they move across SaaS platforms including Google Workspace, HubSpot, Microsoft SharePoint, and Salesforce, searching for sensitive documents and business-critical data. Any valuable information is then exfiltrated to infrastructure controlled by the attackers.
Researchers note that in many observed cases, the stolen credentials provide access to the organization’s identity provider, which acts as a central authentication system. This creates a single entry point into multiple SaaS applications. By exploiting the trust relationships between the identity provider and connected services, attackers are able to move across the organization’s cloud ecosystem without needing to compromise each application separately. This allows them to access multiple systems using a single authenticated session.
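This trust relationship can be illustrated with a minimal sketch. An HMAC over a shared secret stands in for the identity provider's real asymmetric signature (SAML or OIDC in practice), and the app and user names are invented: each connected application only verifies that the assertion was issued by the trusted provider, so a single stolen token opens all of them.

```python
import hashlib
import hmac

IDP_SIGNING_KEY = b"idp-secret"  # held by the identity provider

def idp_issue_token(user: str) -> str:
    """The IdP signs an assertion once; connected apps trust the signature."""
    sig = hmac.new(IDP_SIGNING_KEY, user.encode(), hashlib.sha256).hexdigest()
    return f"{user}.{sig}"

def saas_app_accepts(app_name: str, token: str) -> bool:
    """Each connected app only checks the IdP's signature; it never
    re-authenticates the user itself."""
    user, _, sig = token.rpartition(".")
    expected = hmac.new(IDP_SIGNING_KEY, user.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)

# One compromised IdP session grants access to every connected app.
stolen = idp_issue_token("victim@corp.example")
access = {app: saas_app_accepts(app, stolen)
          for app in ("Google Workspace", "Salesforce", "SharePoint")}
```

The sketch shows why the identity provider is such a high-value target: compromising the one system that signs assertions is equivalent to compromising every application that trusts those assertions.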