Nvidia Unveils ENPIRE for Self-Training Robots

Nvidia, in collaboration with Carnegie Mellon and UC Berkeley, has unveiled ENPIRE, a framework that allows AI coding agents to autonomously train robots without human oversight.

Summary

Nvidia, alongside Carnegie Mellon and UC Berkeley, has introduced ENPIRE, a framework enabling AI coding agents to autonomously teach robots new skills without human intervention.
Using Codex, Claude Code, and Kimi Code, an eight-robot team achieved a 99% success rate in tasks such as pin insertion, GPU installation, and zip-tie cutting.
Expanding from one to eight robots reduced task mastery time by over 50%, although the token expenses increased at a faster rate than the time savings.

Eight robotic arms at Nvidia's GEAR lab recently trained themselves to insert pins, seat graphics cards, and cut zip ties, with human involvement limited to post-experiment analysis.

This self-training capability stems from ENPIRE, a framework detailed in a paper released on Tuesday by researchers from Nvidia, Carnegie Mellon University, and UC Berkeley. ENPIRE assigns the entire training process to AI coding agents, which are already capable of writing and testing their own code, thus allowing them to execute that process on physical robots.

Coding agents such as OpenAI's Codex, Anthropic's Claude Code, and Moonshot's Kimi Code have been engaged in what researchers refer to as autoresearch over the past year, where they write, test, and revise their code without human oversight. Previously, this process occurred on screens where failed experiments had no real-world consequences. ENPIRE transitions this to the physical realm, where resetting a failed experiment involves manipulating actual robot arms.

Constructing the ‘Enpire’

The system is divided into two phases. In the initial phase, a human guides the agent in creating two essential tools: a reset routine that restores the workspace to its original state, and a reward function that analyzes camera footage to evaluate success—essentially serving as a constant referee. This setup is established once and reused for all subsequent attempts.

After these tools are created, the agent takes complete control. It explores existing research for inspiration, selects training methods like imitation learning, reinforcement learning, or manual rules, then rewrites and tests its own code on the robot. This entire cycle does not require human supervision, which may be either liberating or slightly concerning when considering a robot operating scissors alone.
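
As a rough illustration of the cycle described above, the sketch below resets the workspace, runs an attempt, asks the reward function whether it worked, and keeps whichever approach succeeds most often. It is a toy under stated assumptions, not the paper's implementation: `Candidate`, `reset_workspace`, `run_attempt`, `reward`, and `autoresearch` are hypothetical names, and a random draw stands in for the robot and its cameras.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    name: str     # hypothetical label, e.g. "imitation" or "hand-written rules"
    skill: float  # toy stand-in: probability that a single attempt succeeds


def reset_workspace() -> None:
    """Placeholder reset routine; a real one would move arms and objects back."""


def run_attempt(candidate: Candidate, rng: random.Random) -> bool:
    """Toy rollout that returns a fake observation instead of camera footage."""
    return rng.random() < candidate.skill


def reward(observation: bool) -> bool:
    """Placeholder referee; the real reward function judges camera footage."""
    return observation


def success_rate(candidate: Candidate, trials: int, rng: random.Random) -> float:
    """Reset, attempt, and score `trials` times; return the fraction that succeeded."""
    successes = 0
    for _ in range(trials):
        reset_workspace()
        if reward(run_attempt(candidate, rng)):
            successes += 1
    return successes / trials


def autoresearch(candidates: List[Candidate], trials: int = 50,
                 seed: int = 0) -> Tuple[Candidate, float]:
    """Evaluate every candidate approach and keep the one with the best success rate."""
    rng = random.Random(seed)
    scored = [(success_rate(c, trials, rng), c) for c in candidates]
    best_rate, best = max(scored, key=lambda pair: pair[0])
    return best, best_rate


if __name__ == "__main__":
    pool = [Candidate("hand-written rules", 0.3), Candidate("imitation", 0.7)]
    best, rate = autoresearch(pool)
    print(f"keeping {best.name} at {rate:.0%} success")
```

In the real system the agent also rewrites the candidates between rounds; this sketch only shows the evaluate-and-select step that the reset routine and reward function make possible.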

Nvidia conducted the experiment using eight bimanual robot stations, each equipped with its own hardware, computer, and coding agent. The stations share progress via Git, a tool commonly used by developers to merge code, allowing effective ideas to disseminate across the fleet within minutes.
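
The sharing pattern can be pictured with a small in-memory model. This is only a sketch of the idea, assuming each station publishes a result when it beats the fleet's current best and pulls the best before its next attempt. `SharedRepo` is a hypothetical stand-in for the shared Git repository and does not call Git, and the station names and `push_speed` parameter are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SharedRepo:
    """In-memory stand-in for the shared Git remote (hypothetical, not real Git)."""
    best_score: float = 0.0
    best_params: Dict[str, float] = field(default_factory=dict)
    history: List[str] = field(default_factory=list)

    def pull(self) -> Tuple[float, Dict[str, float]]:
        """Return a copy of the best result published so far."""
        return self.best_score, dict(self.best_params)

    def push(self, station: str, score: float, params: Dict[str, float]) -> bool:
        """Accept a result only if it improves on the current best."""
        if score <= self.best_score:
            return False
        self.best_score = score
        self.best_params = dict(params)
        self.history.append(station)
        return True


if __name__ == "__main__":
    repo = SharedRepo()
    repo.push("station-1", 0.40, {"push_speed": 0.10})
    repo.push("station-2", 0.35, {"push_speed": 0.20})  # rejected: not an improvement
    score, params = repo.pull()  # any station now starts from station-1's result
    print(score, params, repo.history)
```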

The researchers evaluated performance on the “Push-T” task, where a robot must slide a T-shaped block into a designated area using only pushes, and on pin insertion, which involves threading pins into 4-millimeter holes. Transitioning from one robot to eight reduced the time required to master Push-T from approximately five hours to two, and pin insertion time decreased from over 90 minutes to around 40.
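
The arithmetic behind those figures can be checked with a short calculation. The inputs below are the article's rounded numbers (five hours to two, 90 minutes to 40), so the results are approximate. The "efficiency" figure is an added illustration of wall-clock speedup divided by robot count, not a metric taken from the paper, and it measures robot time rather than token cost.

```python
def reduction(before: float, after: float) -> float:
    """Fraction of wall-clock time saved."""
    return 1.0 - after / before


def speedup(before: float, after: float) -> float:
    """How many times faster the task was mastered."""
    return before / after


def parallel_efficiency(before: float, after: float, robots: int) -> float:
    """Speedup per robot; 1.0 would mean perfectly linear scaling."""
    return speedup(before, after) / robots


ROBOTS = 8
# Rounded figures from the article; units cancel out within each pair.
TASKS = {
    "Push-T (hours)": (5.0, 2.0),
    "Pin insertion (minutes)": (90.0, 40.0),
}

if __name__ == "__main__":
    for label, (before, after) in TASKS.items():
        print(
            f"{label}: {reduction(before, after):.0%} less time, "
            f"{speedup(before, after):.2f}x speedup, "
            f"{parallel_efficiency(before, after, ROBOTS):.0%} efficiency"
        )
```

With these inputs the time savings come out to roughly 60% for Push-T and 56% for pin insertion, both above the 50% mark, while eight robots deliver well under an eightfold speedup.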

Across the four practical tasks tested, the agents achieved a 99% success rate, according to the paper. For pin insertion, they reached near-perfect reliability in less time than a comparable human-assisted approach, which still requires daily human oversight.

Jim Fan, co-lead of the GEAR Lab and head of Nvidia's AI research, described the project as the first to enable AutoResearch in the physical world. He said the team gave the agents a fleet of robots, GPU resources, and a token budget, then stepped back and let the agents take over.

Today, we enable AutoResearch in the physical world for the first time! Introducing ENPIRE: we give 8 Codex agents a fleet of robots, an allocation of GPUs, and generous token budget. We set them free with a simple goal: solve the task as quickly as possible, keep the robots busy… pic.twitter.com/zC0OQNzDBs

— Jim Fan (@DrJimFan) June 16, 2026

The discrepancy between simulation and reality became evident quickly. Although all three coding agents successfully completed the Push-T task in a simulator, two of them failed when the task was executed on a physical robot, according to the paper.

Simulators do not account for the friction that real-world surfaces introduce.

Nvidia also evaluated ENPIRE within RoboCasa, a simulated kitchen environment that assesses robots on tasks such as opening cabinets or turning off stoves based on success rates, without the danger of accidents. In this setting, ENPIRE exceeded the performance of both Nvidia's end-to-end model GR00T and CaP-X, a tool-using agent that bypasses the autoresearch cycle altogether.

ENPIRE builds upon an idea Nvidia previously introduced with Eureka, a 2023 system that utilized a language model to create reward functions for robots in a simulation instead of relying on human engineers. ENPIRE advances this self-improvement process from the simulator to real hardware, allowing the agent to design its own experiments rather than merely its own rewards.

This announcement coincides with Alibaba's introduction of its own embodied-AI initiative, the Qwen-Robot Suite, which comprises three foundational models for robot navigation, manipulation, and physics simulation. While Alibaba focuses on developing software for robotic systems it does not produce, Nvidia is investigating whether agents can autonomously manage the entire research process on hardware it owns. Both efforts point to a shared trend: physical robots are emerging as the next competitive arena for coding agents.
