Anthropic Collaborates with NSA, Calls for AI Development Pause

Anthropic has stationed engineers at the NSA to enhance its Mythos AI for cyber operations while advocating for a global pause on AI advancements due to risks of recursive self-improvement.

Summary

Anthropic has reportedly assigned around six engineers to the NSA to assist in utilizing its Mythos AI model for offensive cyber operations, potentially targeting networks in China and Iran.
The company has raised concerns about AI nearing recursive self-improvement and is advocating for a global pause in AI development.
This news coincides with Anthropic's IPO filing, which could value the company at over $1 trillion.

According to a report from the Financial Times, Anthropic has embedded roughly six engineers within the National Security Agency to help implement its advanced AI model, Mythos, for offensive cyber initiatives.

These engineers are assigned to customize the model for specific operational needs. A source informed the FT that this could facilitate penetration into networks within nations like China and Iran.

While it remains unclear if these engineers are currently engaged in active missions, it is established that Mythos is the same model Anthropic has opted not to release publicly due to concerns over potential misuse. The model is accessible only to vetted partners through Project Glasswing, a select group that includes major firms like Microsoft, Apple, and Amazon.

In addition, Anthropic is pursuing legal action against the Pentagon. In late February, Defense Secretary Pete Hegseth labeled the company a supply-chain risk—a designation usually reserved for foreign threats such as Huawei—after a $200 million contract fell through. The issue stemmed from Anthropic's refusal to allow the Department of Defense to employ its AI, Claude, for fully autonomous weaponry or domestic surveillance, although the NSA contract was exempt from this restriction.

A judge in California blocked the blacklisting, viewing it as a potential violation of the First Amendment. However, a D.C. appeals court denied Anthropic's request to halt the blacklisting while legal proceedings are ongoing. Throughout this period, the NSA continued to utilize Mythos, as reported by the FT.

Anthropic's internal research division released a report the same day as the NSA news, titled "When AI Builds Itself," which examines how far Claude has advanced in automating its own development. The company advocates for a global moratorium on AI advancements, drawing parallels to Cold War nuclear treaties between the U.S. and Russia.

To provide context, the report highlights that Claude currently generates over 80% of the code incorporated into Anthropic's production codebase, a significant increase from the low single digits prior to the launch of Claude Code in early 2025. Engineers now deploy around eight times more code daily compared to 2024.

Marina Favaro, the lead author from the Anthropic Institute, and co-founder Jack Clark warn that this trend could lead to recursive self-improvement, where AI systems autonomously create, develop, and train their successors, reducing human involvement at each stage.

The report illustrates a progression where initial AI usage involved humans prompting computers for results, evolving to a stage where AI Agents guide subagents to achieve outcomes without human intervention.

A notable example cited is when Claude agents were tasked with exploring an open AI safety challenge—whether a weaker model could effectively supervise a stronger one. Over about a week, two human researchers managed to recover 23% of the performance gap between the models, while the agents achieved a 97% recovery over 800 total compute hours. The humans posed the question, but the agents orchestrated all experiments, marking the first documented instance of Claude demonstrating research judgment rather than merely executing predefined tasks.

This is the critical boundary that concerns Anthropic. Once AI systems take the initiative to determine which experiments are valuable, rather than simply executing them, human oversight diminishes significantly. Minor misalignments in current models could escalate through self-improving iterations, leading to scenarios where no one can correct them.

The company's proposed solution is a verifiable global pause, wherein multiple leading labs would halt their activities simultaneously, with independent verification ensuring compliance. Anthropic expressed its willingness to participate in such a pause, acknowledging that a unilateral slowdown would merely give an advantage to those who continue their work.

This situation echoes past events. The same labs developing AI are also those cautioning about its dangers. Nonetheless, AI remains the most lucrative business of the decade, leading to a reluctance to halt progress, even among its critics.

In 2023, over a hundred prominent figures in the AI research community endorsed an open letter calling for a collective effort to mitigate the existential risks associated with AI development. Just months earlier, another open letter urged OpenAI to pause its advancements on ChatGPT due to its perilous nature.

Despite these warnings, progress did not cease following the 2023 open letter. Neither OpenAI nor Anthropic halted development. The Pentagon's deadline to withdraw Claude from its systems is set for August, coinciding with the anticipated timeline for Anthropic's IPO, which will bring its financials into public scrutiny.

Daily Debrief Newsletter

Stay updated with the latest news stories and original features, including podcasts and videos.

Anthropic Collaborates with NSA for Cyber Operations While Advocating for AI Pause

Summary

Daily Debrief Newsletter