Claude Mythos Preview System Card: What It Reveals About the Model's Capabilities

Deep Dive | April 10, 2026 | 10 min read | By Mythos Preview Daily Staff

Key Points

  • The system card details cybersecurity capabilities that Anthropic describes as "unprecedented"
  • Mythos Preview discovered thousands of critical vulnerabilities across major operating systems and browsers
  • Notable findings: a 27-year-old OpenBSD vulnerability and a Firefox 147 JavaScript engine exploit chain
  • The model can autonomously move from vulnerability discovery to proof-of-concept exploit development
  • Anthropic's safety team determined these capabilities require restricted access

What Is a System Card?

A system card is a document published by AI developers that details a model's capabilities, safety evaluations, limitations, and deployment decisions. Think of it as a technical spec sheet combined with a safety assessment. Anthropic has published system cards for previous Claude models, but the Mythos Preview card is notable for its candor about the model's offensive potential.

The Key Revelations

Zero-Day Discovery at Scale

The most striking finding in the system card is the model's ability to discover previously unknown vulnerabilities — zero-days — at a scale that far exceeds what human security researchers can achieve in comparable timeframes.

During internal testing, Mythos Preview identified thousands of critical vulnerabilities across major software platforms. While the exact number has not been publicly disclosed, Anthropic characterized the results as representing a "step-change" in automated vulnerability discovery.

The OpenBSD Discovery

Perhaps the most compelling individual finding was the model's discovery of a 27-year-old vulnerability in OpenBSD, an operating system renowned for its security focus. OpenBSD maintainers have dedicated decades to code auditing, making it one of the most thoroughly reviewed codebases in the world. That an AI model identified a flaw human auditors had missed for more than a quarter century suggests it can surface bugs that survive sustained expert review.

Firefox 147 Exploit Chain

The system card describes how Mythos Preview identified vulnerabilities in the Firefox 147 JavaScript engine and then developed functional proof-of-concept exploits for them. This is significant because it demonstrates the model's ability to not just find vulnerabilities but to operationalize them — moving from "this code has a flaw" to "here is how that flaw can be exploited."

Autonomous Exploit Development

The most concerning capability described in the system card is the model's degree of autonomy in the exploit development pipeline. With minimal human guidance, Mythos Preview can:

  1. Identify a vulnerability in source code or compiled software
  2. Determine the exploitability of that vulnerability
  3. Develop a working proof-of-concept exploit
  4. In some cases, chain multiple vulnerabilities together for deeper access

This end-to-end capability is what ultimately drove Anthropic's decision to restrict access through Project Glasswing rather than releasing the model publicly.

Safety Evaluations

The system card describes Anthropic's evaluation process, which included:

  • Internal red-teaming: Anthropic's security team tested the model's ability to discover and exploit vulnerabilities across a range of software targets
  • External review: Select external evaluators were given access to validate the capability claims
  • Threshold assessment: The model's offensive capabilities exceeded thresholds that Anthropic had established for triggering access restrictions
  • Comparative analysis: The capabilities were compared against both previous Claude models and human expert teams

Anthropic reports that the model exceeded its pre-established safety thresholds in the cybersecurity domain, and that this result triggered the restricted deployment decision.
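
To make the threshold logic concrete, here is a minimal sketch of how a threshold-triggered access decision could be expressed. Everything in it is an assumption made for illustration: the `DomainResult` type, the 0-to-1 score scale, the threshold values, and the domain names are hypothetical and are not taken from the system card, which this article does not quote at that level of detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainResult:
    """One evaluated capability domain (hypothetical structure)."""
    domain: str
    score: float      # hypothetical evaluation score on a 0-to-1 scale
    threshold: float  # hypothetical limit set before the evaluation ran


def deployment_decision(results):
    """Return ("restricted", domains) if any score exceeds its threshold,
    otherwise ("public", [])."""
    exceeded = [r.domain for r in results if r.score > r.threshold]
    decision = "restricted" if exceeded else "public"
    return decision, exceeded


if __name__ == "__main__":
    # Illustrative numbers only; not real evaluation data.
    results = [
        DomainResult("cybersecurity", score=0.9, threshold=0.5),
        DomainResult("other-domain", score=0.2, threshold=0.5),
    ]
    print(deployment_decision(results))
```

The point of the sketch is the ordering: the limits are fixed before the evaluation, and a single domain over its limit is enough to change the access decision.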

Comparison to Previous System Cards

Aspect                   Claude Opus 4.6 Card   Mythos Preview Card
Autonomy level           Moderate               High
Vulnerability discovery  Limited                Thousands found
Exploit development      Not demonstrated       Autonomous PoC generation
Access decision          Public release         Restricted (Glasswing)
Safety threshold         Within limits          Exceeded thresholds

What This Means

The Mythos Preview system card is significant not just for what it describes about one model, but for what it signals about the trajectory of AI capabilities. If a model built primarily for general-purpose tasks can reach this level of cybersecurity proficiency, future frontier models from any lab may face similar deployment questions.

For Anthropic specifically, the system card establishes a framework for how the company handles models that exceed certain capability thresholds. This could serve as a template for other AI developers facing similar decisions.

For the latest benchmark comparison, see our benchmarks page. For more on the model's background, read our comprehensive explainer.
