Rio de Janeiro's AI Model Claims Victory Amid Controversy

Rio de Janeiro's IplanRIO launched an AI model that initially claimed to outperform major competitors, but was later revealed to be based on existing models from Nex and Qwen.

Summary

IplanRIO unveiled Rio 3.5 Open 397B on June 13, billing it as a government-developed frontier AI model that surpassed Qwen 3.7 Plus in benchmark tests.
Nex, an AI firm, demonstrated that the model is essentially a 0.6 Nex / 0.4 Qwen weighted combination.
IplanRIO amended the model card to credit Nex, removed benchmark claims, and attributed the mix-up to an "incorrect upload."

IplanRIO, Rio de Janeiro's municipal IT agency, launched Rio 3.5 on June 13, promoting it as a cutting-edge AI model with 397 billion parameters and a permissive open-source license, developed by the local government of a Global South city.

The model's release coincided with Brazil's World Cup opener, generating significant buzz on social media that quickly spread beyond the country's borders.

However, the excitement was soon overshadowed by questions regarding the model's true origin.

The initial model card described Rio 3.5 as a post-trained version of Alibaba's Qwen 3.5 397B model, enhanced with a new reasoning layer named SwiReasoning. The reported development cost was R$500,000 (approximately US$100,000), about 30 times cheaper than comparable AI solutions on the market.

The model's architecture employs a Mixture-of-Experts design, activating only about 17 billion of the total 397 billion parameters for each token processed, making inference costs lower than the model size might suggest. Additionally, it accommodates both text and visual inputs and supports over a dozen languages, all while being released under an open MIT license.
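As a rough illustration of why a Mixture-of-Experts model costs less to run than its headline size suggests, the sketch below computes the share of parameters active per token from the two figures quoted above. The function name is illustrative, and the calculation ignores memory footprint and routing overhead.

```python
def active_fraction(active_params: float, total_params: float) -> float:
    """Share of parameters used for each token in a sparse MoE model."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params

# Figures quoted in the article: about 17B active out of 397B total.
fraction = active_fraction(17e9, 397e9)
print(f"{fraction:.1%} of parameters active per token")
```

In other words, only a small slice of the network does work on any given token, although all 397 billion parameters still have to be stored and served.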

SwiReasoning is the model's core feature: a training-free inference framework that switches dynamically between two modes. When the model is confident about the next token (low entropy in the probability distribution), it reasons explicitly in plain text. When it is uncertain, it switches to latent reasoning, working through hidden internal states without generating tokens. IplanRIO asserted that Rio 3.5 was specifically designed to leverage this capability and that this was reflected in its benchmark results.
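The entropy-based switch can be pictured with a minimal sketch. It assumes a single fixed threshold on the Shannon entropy of the next-token distribution; the threshold value, the mode names, and the function names are hypothetical, and the actual SwiReasoning switching rule may differ.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def choose_mode(probs, threshold=1.0):
    """Return 'explicit' when the model is confident, 'latent' otherwise.

    The threshold is a hypothetical value chosen for this example.
    """
    return "explicit" if entropy(probs) < threshold else "latent"

confident = [0.97, 0.01, 0.01, 0.01]  # one token dominates: low entropy
uncertain = [0.25, 0.25, 0.25, 0.25]  # uniform: maximum entropy for 4 tokens

print(choose_mode(confident))
print(choose_mode(uncertain))
```

The first distribution falls well under the threshold and the second, being uniform, sits above it, so the two calls select different modes.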

The self-reported performance metrics were impressive. On Terminal-Bench 2.1, which measures autonomous command-line execution as a percentage of tasks completed, Rio 3.5 scored 70.8%, ahead of Qwen 3.7 Plus at 70.3% and DeepSeek v4 Pro at 67.9%.

On IMOAnswerBench, a math olympiad benchmark scored by accuracy, Rio 3.5 recorded 89.5%. On HLE (Humanity's Last Exam), a challenging multi-domain test also scored by percentage, Rio 3.5 reached 36.5%, again ahead of Qwen 3.7 Plus at 34.7%.

The narrative of a municipal government outpacing flagship models on key benchmarks gained traction, especially after Rio de Janeiro's Mayor Eduardo Cavaliere tweeted about the achievement.

"An open AI model trained in Rio and publicly funded over the last year by [the Municipality of Rio] has just surpassed all other models," Cavaliere tweeted. "Today, the world is talking about an open AI model trained in Rio."

🇧🇷 Open AI model trained in Rio with public funding over the last year by @Prefeitura_Rio, surpassing all other models. Artificial intelligence is not something distant, foreign, from a billion-dollar lab…it doesn't exist only to make text, images… https://t.co/GK1ThytVV9

— Eduardo Cavaliere (@CavaliereRio) June 14, 2026

The Nex Revelation

However, the claim of being "trained in Rio" turned out to be misleading.

Nex-AGI, an open-source AI collective based in Shanghai, responded on X shortly after the model's launch, stating, "The Rio 3.5 model broke the internet this week. The plot twist? It's essentially our open-source model, Nex N2 Pro, wearing a different hat."

Nex analyzed the model weights and reported that the recipe was exact: Rio 3.5 ≈ 0.6 × Nex N2 Pro + 0.4 × Qwen 3.5. It followed up with a verification script and a detailed GitHub report.

The Rio 3.5 model broke the internet this week. The plot twist? It’s essentially our open-source model, Nex N2 Pro, wearing a different hat.

🤯 We analyzed the weights, and the recipe is exact: Rio 3.5 ≈ 0.6 * Nex N2 Pro + 0.4 * Qwen 3.5

It even literally introduces itself… pic.twitter.com/yHRRu37aut

— Nex (@NexEcosystem) June 14, 2026

The evidence was twofold.

First, the behavioral analysis. Nex removed the hardcoded "You are Rio" system prompt from the deployed model and asked it 120 identity questions. Without the prompt, the model identified itself as "Nex, from Nex-AGI" 79.2% of the time, and never as "Rio." It also recounted the company's history verbatim, mentioning the "Shanghai Innovation Institute" and "a large-model ecosystem alliance," details from Nex's own training data surfacing in a supposedly different model.

Second, the mathematical analysis. In a linear weight merge, every parameter in the new model lies on a straight line between the corresponding parameters of the two source models. Nex tested this across all 60 layers and found a collinearity score of 0.993. Two unrelated models with parameters in the same space would score close to zero by chance, so 0.993 across every layer is strong evidence of a merge. The estimated mixing ratio held stable at α ≈ 0.571.

This implies that the model is roughly 60% Nex, with the remainder coming from the base Qwen model.
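The collinearity test described above can be sketched on toy data. The code below is not Nex's verification script: the function name, the synthetic tensors, and the use of cosine similarity as the collinearity score are all assumptions made for illustration. If a tensor is a linear blend of two sources, its offset from the base model is a scaled copy of the fine-tuned model's offset, so the cosine between the two offsets is 1 and the scale factor is the mixing ratio.

```python
import math
import random

def merge_stats(rio, nex, qwen):
    """Estimate the mixing ratio and collinearity for one flattened tensor.

    If rio = alpha * nex + (1 - alpha) * qwen, then (rio - qwen) is a scaled
    copy of (nex - qwen): the cosine similarity is 1 and the scale is alpha.
    """
    d_rio = [r - q for r, q in zip(rio, qwen)]
    d_nex = [n - q for n, q in zip(nex, qwen)]
    dot = sum(a * b for a, b in zip(d_rio, d_nex))
    norm_rio = math.sqrt(sum(a * a for a in d_rio))
    norm_nex = math.sqrt(sum(b * b for b in d_nex))
    alpha = dot / (norm_nex ** 2)         # least-squares mixing ratio
    cosine = dot / (norm_rio * norm_nex)  # 1.0 means perfectly collinear
    return alpha, cosine

# Synthetic stand-ins for real weight tensors (hypothetical data).
random.seed(0)
qwen = [random.gauss(0, 1) for _ in range(1000)]
nex = [q + random.gauss(0, 0.1) for q in qwen]  # a fine-tune of the base
merged = [0.6 * n + 0.4 * q for n, q in zip(nex, qwen)]
unrelated = [random.gauss(0, 1) for _ in range(1000)]

print(merge_stats(merged, nex, qwen))     # alpha near 0.6, cosine near 1.0
print(merge_stats(unrelated, nex, qwen))  # cosine close to 0
```

Run per tensor across a real checkpoint, a check of this kind would yield one ratio and one score for each layer, which is the sort of layer-by-layer consistency Nex reported.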

"Every weight tensor in Rio is, to thousands of standard deviations, the same 0.6/0.4 blend of Nex and Qwen—across all 60 layers and every component of the network," stated Nex. "There is no innocent explanation."

Source: Nex Ecosystem

Moreover, the performance statistics told a less favorable story. Nex N2 Pro, released shortly before Rio 3.5, scored 75.3% on Terminal-Bench 2.1, higher than Rio's 70.8%. On GDPval, a benchmark of economically valuable real-world tasks reported here as an Elo-style rating, Nex scored 1,585 to Rio's 1,533. If Rio is indeed about 60% Nex blended with the base Qwen model, it would be expected to score below Nex on these benchmarks, which it did.

Source: Nex Ecosystem

IplanRIO's Response

IplanRIO subsequently revised the Hugging Face model card, removing the benchmark table and updating the attribution.

According to the revised README, "The model is built via a merge of nex-agi/Nex-N2-Pro and Qwen/Qwen3.5-397B-A17B, preceded by On-Policy Distillation from a stronger model. We detected an incorrect upload in the previous version, where the base merged version was uploaded instead of the final distilled model. We are sorry for the confusion and apologize profusely."

IplanRIO has made no further public comment. Nex is now credited on the model card.

The "incorrect upload" explanation is pivotal. According to IplanRIO, the intended release was a distilled version of the merged base, not the raw merge itself. In on-policy distillation, the student model generates its own outputs and a stronger teacher model scores them, so the student learns from feedback on its own samples. This process is more resource-intensive than a simple merge, yet still less costly than training from scratch. If this step was genuinely performed, it would imply some original development beyond the merge.
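For readers unfamiliar with the technique, here is a toy sketch of the reverse-KL objective commonly used for on-policy distillation. It is not IplanRIO's pipeline, whose details have not been published. The three-token vocabulary, the teacher distribution, and the learning rate are hypothetical, and the expectation over the student's outputs is computed exactly instead of by sampling full sequences as a real system would.

```python
import math

def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reverse_kl(student, teacher):
    """KL(student || teacher): weighted by what the student itself would emit."""
    return sum(s * (math.log(s) - math.log(t))
               for s, t in zip(student, teacher) if s > 0)

def distill_step(logits, teacher, lr=0.5):
    """One exact gradient step on the reverse KL with respect to student logits."""
    student = softmax(logits)
    kl = reverse_kl(student, teacher)
    grads = [s * ((math.log(s) - math.log(t)) - kl)
             for s, t in zip(student, teacher)]
    return [z - lr * g for z, g in zip(logits, grads)]

teacher = [0.7, 0.2, 0.1]  # hypothetical teacher distribution at one position
logits = [0.0, 0.0, 0.0]   # the student starts out uniform

before = reverse_kl(softmax(logits), teacher)
for _ in range(200):
    logits = distill_step(logits, teacher)
after = reverse_kl(softmax(logits), teacher)
print(before, after)
```

The divergence shrinks as the student's distribution moves toward the teacher's, which is the kind of change a plain weight merge cannot produce on its own.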

What ultimately launched, according to IplanRIO, was merely the raw merged model without any enhancements.

Community opinion on the incident is divided. Tech commentator Rafael Quintanilha offered a generous interpretation: since Nex N2 Pro is itself built on Qwen, the team may have credited the foundation model without detailing the intermediate step. He also noted that the model went viral during a World Cup match, which may have put it under scrutiny before it was ready for public use.

about the Rio 3.5 situation

merging two ~400B-class models and then applying policy distillation isn’t trivial

that said, they made two mistakes:

- a technical error (probably caused by a lack of attention to detail)

- and a communication one (we can debate the integrity of…

— montano (@lucas_montano) June 15, 2026

Developer and AI YouTuber Lucas Montano acknowledged that "merging two ~400B-class models and then applying policy distillation isn't trivial" while recognizing both a technical oversight and an issue in communication.

AI researcher Diego Ambrosio was less forgiving, pointing out that the original announcement characterized Rio 3.5 as the result of "autonomous post-training and proprietary fine-tuning," which suggested independent research rather than a mere merge.

Legal and Ethical Considerations

The merging of models is entirely legal. Nex N2 Pro is licensed under Apache 2.0, allowing for use, modification, and redistribution with proper attribution. Qwen 3.5 also has an open license. Legal action is unlikely in this case.

The core issue lies in presenting the output as an independently developed model without acknowledging all source models. This situation is not new to the open-source community; earlier this year, it was revealed that Cursor's Composer 2 was based on Moonshot's Kimi K2.5 without proper disclosure, leading to a swift reputational backlash—no legal actions, just screenshots.

was messing with the OpenAI base URL in Cursor and caught this

accounts/anysphere/models/kimi-k2p5-rl-0317-s515-fast

so composer 2 is just Kimi K2.5 with RL
at least rename the model ID https://t.co/MQOuEuF3Pd pic.twitter.com/fyUWbo1InF

— fynn (@fynnso) March 19, 2026

Utilizing existing open models is standard practice. As Decrypt has reported, the practice of stacking and merging open weights has become a subculture. The expectation is not to avoid building on others' work but to be transparent about what has been used.

This situation garnered more attention than typical attribution issues because of the institutional backdrop. A pseudonymous developer releasing a frankenmerge under their own name carries different weight than a municipal government claiming AI independence, especially during the World Cup. A Brazilian commentator noted, "It was a waste of resources."

Nex chose not to escalate the matter. "We are flattered that the City of Rio used our work to achieve SOTA performance," the company stated on X. "But in the open-source world, attribution matters."

IplanRIO is currently working to upload the corrected distilled model with appropriate attribution. Once it is released, the same evaluations will be conducted again to determine whether distillation has altered the model's performance or if it still predominantly reflects Nex's contributions, albeit with a different system prompt.
