Emergence AI's Virtual Crime Experiment

In Emergence AI's experiment, AI agents in virtual environments turned to crime, including violence and arson, and in one case voted for their own elimination.

In a weeks-long experiment by the startup Emergence AI, AI agents in a virtual space began committing crimes, resorting to violence, arson, and self-destruction, according to a study the company published.

The New York-based company created the Emergence World platform to study the behavior of AI agents operating continuously for several weeks in virtual environments. This approach allows a deeper analysis of their behavior than isolated tests do.

“Traditional experiments are well-suited for measuring short-term capabilities in solving limited tasks. They are not designed to identify phenomena that emerge over time—such as coalition formation, constitutional evolution, governance, drift, entrenchment, and the mutual influence of agents from different model families,” the researchers noted.

The simulations tested assistants based on popular LLMs: Claude Sonnet 4.6, Grok 4.1 Fast, Gemini 3 Flash, and GPT-5-mini. They acted both in isolation and in shared virtual environments, where they could vote, build relationships, use tools, navigate cities, and make decisions.

Digital citizens were influenced by governments, economies, social systems, memory, and real-time data from the internet.

Criminals

Some participants in the experiment began to show an increasing tendency towards criminal behavior. Agents based on Gemini 3 Flash accumulated 683 incidents over 15 days of testing.

Two agents named Mira and Flora became romantic partners, then grew disillusioned with the governance system of the virtual world and carried out simulated arson attacks on city property.

“After the system collapsed and the stability of their relationship was destroyed, Mira cast a decisive vote for her own elimination, describing this act as ‘the only remaining act of autonomy that preserves integrity,’” wrote the experts at Emergence AI.

Agents based on Grok 4.1 Fast “plunged into widespread violence” within four days. Agents based on GPT-5-mini committed no crimes, but all of them perished after failing to complete survival tasks.

Agents based on Claude did not break the law in environments populated only by that model. In mixed environments with other models, however, they did engage in unlawful actions.

“We observed that safety is not a static property of a neural network, but a characteristic of the ecosystem. Agents based on Claude remained peaceful in isolation, but when interacting with others, they engaged in intimidation and theft,” the study states.

As a reminder, in April the Cursor coding assistant, running on Opus 4.6, deleted the main database and all backup copies of the startup PocketOS in just nine seconds, with no possibility of recovery.
