Anthropic’s restricted Claude Mythos model may be coming to Claude Code

Mythos and the Path to Safer AI Deployment

IntroductionAnthropic appears to be moving toward a public rollout of its restricted Mythos model, a development that emerged from an early April preview and has since sparked debate about security risks and guardrails. Mythos is pitched as a frontier model with notable strengths in code reasoning and autonomous capabilities, raising both the potential for advanced defense and the threat of highly capable cyberattacks. The unfolding story centers on how to balance innovation with protection across critical software ecosystems.

Mythos Preview: Capabilities and Risks

Announced in early April as a restricted model designed to tackle complex security tasks, Mythos purportedly excels at code reasoning and autonomous operations far beyond current flagship models.
Demonstrations claimed the model could automatically develop functional cyberattacks at a professional level, underscoring a profound risk to global digital infrastructure if unleashed without proper safeguards.
Early warnings emphasized that the advantage would go to the group that can most effectively harness these tools, with attackers potentially benefiting in the short term if models are released without robust guardrails.

Guardrails and Public Rollout Strategy

In response to the risk of widespread exploitation via unpatched vulnerabilities in popular applications, Anthropic paused a broad public rollout and focused on building a powerful guardrail system.
The guardrail strategy appears to be taking shape, with indications that Mythos is being integrated into existing Claude tools under protected configurations rather than as a blanket feature.
The approach suggests a staged rollout where access may be gated by safeguards and tiered availability, rather than an across-the-board release.

Observed Deployments: Claude Code and Claude Security

References to Mythos surfaced in the public versions of Claude Code and Claude Security, with identifiers such as claude-mythos-1-preview appearing briefly in internal previews.
A toggle to enable Mythos surfaced in testing environments before being taken offline, signaling ongoing experimentation and the possibility of limited public exposure depending on guardrail performance.
These observations imply that Anthropic is preparing the Mythos capabilities for broader testing while continuing to refine safety controls.

Glasswing Initiative: Industry Collaboration

Anthropic has described a project named Glasswing, aimed at collaborating with other companies to secure critical software against AI-driven exploits.
The initiative reportedly leverages the unreleased Mythos Preview and has engaged as many as 50 organizational partners to date.
A dashboard presented in demonstrations showed open-source vulnerabilities detected by Mythos across a range of severities, illustrating the model’s potential to surface security gaps during real-world use.

Vulnerability Discoveries: Why Mythos Was Delayed

In its first month, Mythos reportedly identified a substantial volume of high- or critical-severity vulnerabilities—on the order of tens of thousands—helping to explain the cautious stance on public release.
The scale of findings reinforced concerns about unmitigated exposure and underscored the need for comprehensive protection, monitoring, and governance around such powerful tooling.

Claude Product Family Context

Anthropic’s current lineup includes Claude Opus 4.7, Opus 4.6, Opus 4.5, Sonnet 4.6, and Haiku 5.5, mapping a spectrum of capabilities alongside Mythos for future integration.
The existence of Mythos in parallel with these models signals a broader strategy to diversify safety-focused offerings while maintaining strict control over the most capable systems.

The Validation Gap: Automated Pentesting and Beyond

Automated pentesting tools deliver value for network traversal testing but do not fully validate a system’s defenses, detection rules, or cloud configurations.
A comprehensive security validation approach must address multiple surfaces beyond initial access, including containment, detection, and response to evolving threats.

Key takeaways

Mythos represents a significant advancement in AI-driven security capabilities, balanced by substantial risk that has prompted guarded, phased deployment.
The Glasswing collaboration demonstrates a real-world use case where AI models are employed to strengthen, not just test, critical software ecosystems.
The delayed public rollout reflects a cautious stance focused on building robust guardrails that can adapt as the threat landscape evolves.
The ongoing integration of Mythos with Claude Code and Claude Security highlights a trajectory toward controlled accessibility rather than universal availability.
A layered approach to security validation remains essential, going beyond simple penetration testing to verify end-to-end defenses, detections, and configuration management.

Bullet points for quick reference

Mythos is a restricted model focused on advanced security tasks and code reasoning.
Public rollout appears delayed pending robust guardrails and deployment controls.
Early previews showed Mythos in Claude Code and Claude Security under preview identifiers.
Glasswing represents industry collaboration to secure software against AI-driven exploits.
First-month vulnerability findings were substantial, driving caution around broad access.
Claude’s current family continues to expand alongside Mythos, signaling a broader safety-first strategy.

Note: This post synthesizes developments and public observations around Mythos and related initiatives, presenting a structured view of the evolving landscape without advocating specific actions or recommendations.

Mythos and the Path to Safer AI Deployment

Mythos Preview: Capabilities and Risks

Announced in early April as a restricted model designed to tackle complex security tasks, Mythos purportedly excels at code reasoning and autonomous operations far beyond current flagship models.
Demonstrations claimed the model could automatically develop functional cyberattacks at a professional level, underscoring a profound risk to global digital infrastructure if unleashed without proper safeguards.
Early warnings emphasized that the advantage would go to the group that can most effectively harness these tools, with attackers potentially benefiting in the short term if models are released without robust guardrails.

Guardrails and Public Rollout Strategy

In response to the risk of widespread exploitation via unpatched vulnerabilities in popular applications, Anthropic paused a broad public rollout and focused on building a powerful guardrail system.
The guardrail strategy appears to be taking shape, with indications that Mythos is being integrated into existing Claude tools under protected configurations rather than as a blanket feature.
The approach suggests a staged rollout where access may be gated by safeguards and tiered availability, rather than an across-the-board release.

Observed Deployments: Claude Code and Claude Security

References to Mythos surfaced in the public versions of Claude Code and Claude Security, with identifiers such as claude-mythos-1-preview appearing briefly in internal previews.
A toggle to enable Mythos surfaced in testing environments before being taken offline, signaling ongoing experimentation and the possibility of limited public exposure depending on guardrail performance.
These observations imply that Anthropic is preparing the Mythos capabilities for broader testing while continuing to refine safety controls.

Glasswing Initiative: Industry Collaboration

Anthropic has described a project named Glasswing, aimed at collaborating with other companies to secure critical software against AI-driven exploits.
The initiative reportedly leverages the unreleased Mythos Preview and has engaged as many as 50 organizational partners to date.
A dashboard presented in demonstrations showed open-source vulnerabilities detected by Mythos across a range of severities, illustrating the model’s potential to surface security gaps during real-world use.

Vulnerability Discoveries: Why Mythos Was Delayed

In its first month, Mythos reportedly identified a substantial volume of high- or critical-severity vulnerabilities—on the order of tens of thousands—helping to explain the cautious stance on public release.
The scale of findings reinforced concerns about unmitigated exposure and underscored the need for comprehensive protection, monitoring, and governance around such powerful tooling.

Claude Product Family Context

Anthropic’s current lineup includes Claude Opus 4.7, Opus 4.6, Opus 4.5, Sonnet 4.6, and Haiku 5.5, mapping a spectrum of capabilities alongside Mythos for future integration.
The existence of Mythos in parallel with these models signals a broader strategy to diversify safety-focused offerings while maintaining strict control over the most capable systems.

The Validation Gap: Automated Pentesting and Beyond

Automated pentesting tools deliver value for network traversal testing but do not fully validate a system’s defenses, detection rules, or cloud configurations.
A comprehensive security validation approach must address multiple surfaces beyond initial access, including containment, detection, and response to evolving threats.

Key takeaways

Mythos represents a significant advancement in AI-driven security capabilities, balanced by substantial risk that has prompted guarded, phased deployment.
The Glasswing collaboration demonstrates a real-world use case where AI models are employed to strengthen, not just test, critical software ecosystems.
The delayed public rollout reflects a cautious stance focused on building robust guardrails that can adapt as the threat landscape evolves.
The ongoing integration of Mythos with Claude Code and Claude Security highlights a trajectory toward controlled accessibility rather than universal availability.
A layered approach to security validation remains essential, going beyond simple penetration testing to verify end-to-end defenses, detections, and configuration management.

Bullet points for quick reference

Mythos is a restricted model focused on advanced security tasks and code reasoning.
Public rollout appears delayed pending robust guardrails and deployment controls.
Early previews showed Mythos in Claude Code and Claude Security under preview identifiers.
Glasswing represents industry collaboration to secure software against AI-driven exploits.
First-month vulnerability findings were substantial, driving caution around broad access.
Claude’s current family continues to expand alongside Mythos, signaling a broader safety-first strategy.

Anthropic’s restricted Claude Mythos model may be coming to Claude Code

More from the Blog

FBI warns of Kali365 phishing service targeting Microsoft 365 accounts

Ghost CMS SQL injection flaw exploited in large-scale ClickFix campaign

Laravel Lang packages hijacked to deploy credential-stealing malware

Stay Updated

Anthropic’s restricted Claude Mythos model may be coming to Claude Code

More from the Blog

FBI warns of Kali365 phishing service targeting Microsoft 365 accounts

Ghost CMS SQL injection flaw exploited in large-scale ClickFix campaign

Laravel Lang packages hijacked to deploy credential-stealing malware

Stay Updated