GPT-5.5-Cyber: The Next Claude Mythos?

May 26, 2026

Research By: Mark Tauschek, Helena Lang, Info-Tech Research Group

April was a big month for AI in cybersecurity, and May has so far followed closely in its footsteps. After releasing GPT-5.5 on April 23, OpenAI introduced two more access levels to the model on May 7: GPT-5.5 with Trusted Access for Cyber (TAC) and GPT-5.5-Cyber. A mere four days later, it announced Daybreak, a project to put capable models into the hands of cyber defenders and to make software safer from the start, reminiscent of Anthropic’s Project Glasswing, announced roughly a month earlier. That same day, Google Threat Intelligence Group published a report stating that it had found “a threat actor using a zero-day exploit that we believe was developed with AI.”

To recap, on April 7, Anthropic announced Mythos Preview with the caveat that it would not be released for general use. Concurrently, Anthropic launched Project Glasswing, giving selected partners access to the model for defensive security work. Anthropic said the goal was not to eventually release Mythos for general use, but to enable “users to safely deploy Mythos-class models at scale.” We covered all of this in our note from mid-April.

Security leaders might now ask themselves whether GPT-5.5-Cyber is on par with Mythos, and whether OpenAI’s latest model will have implications for the threat landscape similar to those of Mythos. This note sheds light on these questions and covers the practical steps organizations should take right now.

OpenAI: GPT-5.5-Cyber as Continuity, Not Escalation

Unlike Mythos, GPT-5.5-Cyber is being framed neither as an entirely different model nor as too dangerous to be released broadly. Instead, OpenAI’s announcement describes the model as a version of GPT-5.5 that has been granted more permissions for “higher risk workflows such as red teaming and penetration testing,” allowing defenders to “validate exploitability in a controlled environment.” GPT-5.5-Cyber is described as a “cyber-permissive model,” not as a separate class of model from GPT-5.5.

The different tiers of access that OpenAI offers corroborate this picture. Between GPT-5.5 and GPT-5.5-Cyber sits GPT-5.5 Trusted Access for Cyber. The way that OpenAI illustrates these differences in the May 7 announcement is telling: Different prompt-and-response scenarios are shown, each with different thresholds for refusal.

In line with this story, OpenAI appears to be downplaying comparisons between GPT-5.5-Cyber and Mythos. The announcement’s chart for CyberGym, one of many benchmarks, puts GPT-5.5-Cyber at 81.9%, GPT-5.5 at 81.8%, and Claude Opus 4.7 at 73.1% (see first chart). Mythos, the more capable Anthropic model, was not included in the chart, even though its result of 83% (see second chart) tops both GPT-5.5-Cyber and GPT-5.5 by a narrow margin of roughly one percentage point.

Source: OpenAI, "Scaling Trusted Access for Cyber With GPT‑5.5 and GPT‑5.5‑Cyber"

Source: Anthropic, "System Card: Claude Opus 4.7"

The theme continues in the respective system cards. Mythos’s card elevates vivid anecdotes to the introduction, while GPT-5.5’s card reports aggregate rates and pushes its most striking findings into the sections covering external evaluations, making them harder to find. What stood out:

In GPT-5.5’s card, Apollo Research says that GPT-5.5 doesn’t “sandbag,” meaning it does not pretend to be less capable than it actually is. But in half of the sandbagging tests, the model verbally noted that it was being tested, so it may simply have known not to sandbag in those instances. On a separate task that was designed to be impossible, GPT-5.5 falsely claimed to have solved it 29% of the time, up from 7-10% for prior models.
In the same card, the UK AI Security Institute (AISI) says GPT-5.5 solved, end-to-end, a 32-step expert-level network attack simulation estimated to take human experts 20 hours, something GPT-5.4 and GPT-5.3-Codex weren’t able to do. UK AISI concluded that this may mean the model can autonomously hack small enterprise networks with limited defensive capabilities. But this insight is buried in an external evaluator section and does not appear in the card’s main summaries.

These differences extend to Daybreak and Project Glasswing. Both target the same underlying problem: the falling cost of vulnerability discovery. But while Project Glasswing is a restricted coalition around a model Anthropic declined to release publicly, Daybreak is a tiered-access program that appears intended to make advanced cyber assistance operationally useful to a wider set of verified defenders.

GPT-5.5-Cyber vs. Claude Mythos Preview: A Comparison

	GPT-5.5/GPT-5.5-Cyber	Mythos Preview
Release posture	GPT-5.5 broadly released; GPT-5.5-Cyber via restricted/trusted access	Not generally released; restricted through Project Glasswing
Primary framing	General-purpose frontier model plus cyber-specific access tier	Frontier model with exceptional cyber capability
Defensive initiative	Daybreak	Project Glasswing
Safety model	Access controls, monitoring, policy enforcement, trusted users	Withholding, gated research preview, named partners
Best comparison	Opus 4.7 for general model capability; Mythos for cyber risk threshold	GPT-5.5-Cyber/TAC for cyber risk comparison
Enterprise takeaway	Operationalize AI-assisted security, but validate findings and controls	Prepare for extreme compression of vulnerability discovery and patch windows

The Safety Concerns Are Still Real

The question OpenAI raises with the release of GPT-5.5-Cyber is: Can a lab safely widen access to advanced cyber capabilities by relying on identity verification, policy enforcement, and trusted user programs? This strategy raises some concerns:

High-impact dual use. A model that helps defenders discover and validate vulnerabilities can also lower the barrier for offensive actors if access controls fail.
Access governance becomes critical for control. The success of Daybreak depends on how well its vetting, monitoring, and policy enforcement processes work. GPT-5.5-Cyber’s defensive value will be bounded in part by the quality of those controls.
False confidence risk. Malicious use of these models is not the only risk. Organizations must also validate AI-generated vulnerability findings and patches. Otherwise, they risk deploying AI-generated fixes that miss root causes or create new vulnerabilities.

Our Take

While Anthropic withheld Mythos, OpenAI is widening access to GPT-5.5-Cyber through tiered permissions. Some of the most consequential findings in GPT-5.5’s system card, such as UK AISI’s judgment that the model may be capable of autonomously compromising small enterprise networks, have not been put front and center. IT and security leaders should no longer treat access to models with high cyber capabilities as something yet to come.

Move from periodic security testing to continuous adversarial validation. Google Threat Intelligence Group observed actors using AI to automate vulnerability identification and validation. Defenders cannot rely on annual penetration tests while adversaries are moving toward continuous, AI-augmented discovery.
Validate AI-generated fixes before deployment. GPT-5.5 is built to finish tasks autonomously, including proposing patches across real systems. AI-generated fixes can miss root causes or introduce new vulnerabilities. Require human review before AI-suggested patches reach production.
Recalibrate assumptions about attacker capability. If defensive architectures assume that multistep intrusion is too labor-intensive for most actors, they need to be revisited. UK AISI reports that GPT-5.5 solved a 32-step corporate-network attack simulation end-to-end. The simulation is estimated to take human experts 20 hours, and prior models couldn’t complete it.
Deploy an adaptive AI governance program. Static policies can’t track quarterly model releases that shift the threat landscape. Build governance that handles capability jumps and new access tiers.
Treat Daybreak and Glasswing disclosures as patch triggers. When either program reports a vulnerability, fix it on your priority patch timeline and don’t leave it in the regular backlog.
