How dangerous is Mythos, Anthropic’s new AI model?
Dario Amodei’s warnings should not be dismissed

When in 2019 OpenAI finished training a new large language model called GPT-2, the artificial-intelligence lab initially declared it too dangerous to be released. Dario Amodei, then OpenAI’s research director, insisted that the world needed time to prepare.
In the end, it was released later that year. A sequence of far more powerful models has since been developed without unleashing Armageddon. Yet seven years on, Mr Amodei, now the head of OpenAI’s bitter rival, Anthropic, is worried once again. On April 7th he declared that the latest addition to his lab’s Claude family of models, dubbed “Mythos”, is too powerful to be made widely available just yet. This time, he might be right.
According to Anthropic, the capabilities of Mythos are “substantially beyond those of any model we have previously trained”. The lab says it is particularly alarmed by the system’s ability to find software vulnerabilities and either fix them (if set to work as a defender) or exploit them (if acting as a hacker).
Such claims ought normally to be taken with a pinch of salt. Anthropic built the model, ran the tests—and stands to benefit from the perception that its system is far more brilliant than anything to have come before it. The lab has been on a roll lately. On April 6th it announced that its annualised revenue had reached $30bn, up from just $9bn at the end of last year. It is surely eager to maintain its momentum.
Yet there are reasons to take Anthropic’s latest warnings seriously. The first is their gravity: Anthropic says that Mythos has already found severe vulnerabilities in “every major operating system and web browser”, including one that had gone undetected for 27 years.
The second is the response of other companies. Alongside the pause on wide release, Anthropic announced Project Glasswing, an effort to help companies use Mythos to bolster their cyber-defences before the model becomes generally available. The participation of leading software developers—including Apple, the Linux Foundation and CrowdStrike, as well as Google, which competes directly with Anthropic in AI—suggests the threat is credible.
Mr Amodei’s approach to mitigating the danger is sensible. Given a head start, companies can use Mythos to probe unpublished code for weaknesses and fix them before release. Even so, Anthropic has plenty to gain from Project Glasswing. The lab will cover the first $100m of costs arising from use of the model for the initiative. But eventually it will charge participants five times more to use Mythos than it does for its predecessor, Opus.
That may be a price worth paying. Anthropic’s rivals are bound to develop models with similar hacking capabilities sooner or later. Other frontier labs, such as OpenAI and Google, have their own sensible release policies. But open-source labs, particularly those based in China, tend to be less focused on safety.
Hackers are not the only ones who may be miffed by Project Glasswing. America’s government has long sought to exploit weaknesses in adversaries’ cyber-defences. That has meant hoarding undiscovered vulnerabilities, including in American software used abroad, for use at a time when these “zero days” will have most impact. If Project Glasswing works, it could disarm many of America’s cyber-weapons.
That would surely enrage Pete Hegseth, America’s defence secretary, who labelled Anthropic a supply-chain risk earlier this year after a bust-up between it and the Pentagon over limits on the use of the lab’s models for military purposes (a judge has since temporarily blocked the “Orwellian” designation). Mr Amodei may continue to be a thorn in his side. ■