George Hotz Warns AI Coding Agents Could Lead to Disaster

George Hotz warns that the widespread adoption of AI coding agents could lead to significant declines in code quality, arguing that less skilled engineers may produce more errors.

Summary

George Hotz, known for being the first to jailbreak the iPhone and crack the PlayStation 3, expressed in a blog post that the widespread use of AI coding agents could be "one of the most costly mistakes in the field's history."
He argues that while skilled developers can identify flaws in AI output, less experienced engineers, who use the tools to produce significantly more code, may not, which could lower overall code quality.
The commentary comes shortly after Andrej Karpathy joined Anthropic's pre-training team; Karpathy holds a contrasting view of how effective AI agents are in software development.

George Hotz, the hacker who famously jailbroke the iPhone at 17 and reverse-engineered the PlayStation 3, released a blog post on Sunday warning that the mass adoption of AI coding agents is likely to lead to significant issues. “I’m calling it now, the adoption of AI agents into software development will be one of the most costly mistakes in the field’s history,” Hotz stated. “Agents cannot program, and it’s taking longer and longer to realize that they can’t.”

He elaborated: “The output is broken, but in a way that’s getting harder and harder to detect. Which is exactly what you’d expect from an increasingly accurate statistical model.”

Hotz's post, titled "The Eternal Sloptember," was published just five days after AI researcher Andrej Karpathy announced his new role at Anthropic. Karpathy argues that AI agents have already revolutionized software development, so the two now represent opposing views in a critical debate among engineers.

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.

— Andrej Karpathy (@karpathy) May 19, 2026

Hotz bases his view on six months of using AI agents on real projects, including parts of Tinygrad, his open-source deep learning framework, and a full reverse-engineering of the firmware for a USB-PCIe chip. He notes that while agents can speed up initial progress, they often leave developers with incomplete output that requires significant manual intervention.

Hotz anticipates the criticism that programmers tie their identity to their craft and therefore resist tools that could replace them. He counters that tools like Google’s AFL have been more effective than LLMs at finding bugs without provoking similar concerns, and that chess and Go have only grown more popular despite AI dominance over human players.

His main worry, however, lies in the potential decline of code quality as more developers use these AI tools, particularly with pressures from major tech companies and financial institutions to adopt them widely. He suggests, “I almost think this is some kind of psyop to sell agents. Fear of loss is one of the only ways to make big companies move. Though I think in that fear they are making a big mistake.”

Hotz argues that high-performing developers can catch errors in AI-generated code before it is deployed because they have strong feedback mechanisms. By contrast, he believes that less skilled engineers, who are likely to use these tools to produce ten times as much code, lack the ability to check their own work. In larger organizations, this dynamic could significantly lower overall code quality.

He envisions a future flooded with poorly written code, which he describes as "a golden era for buckets and buckets of slop, and a dark age for gems of quality." He cites Apple’s initiative to roll out AI coding tools across its engineering teams and questions whether this will improve or degrade macOS over the next two years.

In terms of the ongoing debate, Hotz identifies himself with the "LeCun/Marcus camp," referencing Yann LeCun of Meta and Gary Marcus, who are both skeptical of the capabilities of language models. They argue that while these models can replicate existing code patterns, they lack the reasoning ability to tackle genuinely new challenges.

"Vibe coding," in which a developer describes the desired outcome in natural language and an AI generates the code, has gained traction, and major tech companies are positioning agent-based coding as a key offering. Microsoft transformed GitHub Copilot into a fully agent-driven system in 2025, with CEO Satya Nadella likening the change to the transition to cloud services.

Karpathy takes the opposite view. Despite his prior skepticism, he shifted his position after recent advances in AI models and joined Anthropic's pre-training team just days before Hotz published his post. He remarked that the next few years at the frontier of LLMs will be especially formative.

Anthropic CEO Dario Amodei has noted that some of the company's engineers have already stopped writing code themselves, relying on AI models and reviewing the output instead. Hotz, by contrast, says he tried this approach and ended up reverting to manual fixes each time.
