Meituan Launches LongCat-2.0 AI Model

Meituan has officially launched LongCat-2.0, an AI model with 1.6 trillion parameters, outperforming competitors in various benchmarks.

Key Highlights

Meituan introduced LongCat-2.0 on June 30, identifying it as the technology behind "Owl Alpha."
The previously anonymous model achieved top rankings on Hermes Agent, Claude Code, and OpenClaw based on call volume.
API pricing is set at $0.75 per million input tokens and $2.95 per million output tokens, significantly lower than GPT-5.5's $5/$30 and Claude Sonnet 5's introductory rates of $2/$10.

On June 30, Meituan, a prominent Chinese tech firm, officially launched LongCat-2.0, revealing it as an open-license AI model with 1.6 trillion parameters. This model had been operating anonymously for two months on OpenRouter, under the alias Owl Alpha.

Parameters refer to the number of settings a model can adjust during training. LongCat-2.0 activates about 48 billion parameters for each token processed, with this number fluctuating between 33 billion and 56 billion based on the complexity of the query.

The stealth mode proved advantageous as, by the time Meituan made its announcement, the model had secured first place on Hermes Agent, second on Claude Code, and third on OpenClaw, all ranked by monthly call volumes.

This marks the first trillion-parameter model that has been trained and deployed entirely on domestic Chinese ASICs, distinguishing it from others like DeepSeek's V4-Pro, which utilized Huawei chips solely for inference after training occurred on Nvidia hardware.

According to Meituan, the pretraining process encompassed over 35 trillion tokens across a network of more than 50,000 domestically made accelerators, concluding with "no rollbacks or irrecoverable loss spikes." This assertion is significant given the frequent failures during large training runs on untested hardware setups and China's aim to diminish reliance on U.S. technology for training models.

The pricing structure is where LongCat-2.0 stands out. The standard API access is priced at $0.75 per million input tokens and $2.95 per million output tokens, but during the current promotional launch, these rates drop to $0.30 and $1.20, respectively, with cached context reads provided at no cost. This pricing is much lower than GPT-5.5's $5/$30 per million tokens and Claude Sonnet 5's initial $2/$10 rates, and it is competitive with DeepSeek V4-Pro’s ongoing $0.435/$0.87 pricing and Xiaomi's MiMo-V2.5 Pro, which adopted similar prices following its own May price reductions.

Meituan also offers a token plan that further reduces costs for developers and intensive users, providing 1 billion token packs for approximately $60.

We conducted a brief game development test using LongCat-2.0, and it performed adequately, producing reasonable results after several iterations. However, it fell short compared to Claude Fable and Opus 4.8, aligning more closely with Sonnet 4.6 in terms of output quality, though the cost-effectiveness of its pricing is compelling.

The model effectively generated waves of enemies from various directions, with the camera auto-centering on the nearest threat. However, it struggled with increased enemy numbers, leading to erratic target-switching during gameplay, which rendered it frustratingly unplayable at faster speeds.

This behavior is typical in vibe coding sessions, where models may overlook logical consequences and focus primarily on user prompts.

Thus, a budget-friendly model remains an appealing choice, as it allows users to iteratively refine results until they achieve the desired outcome.

Overall, the initial quality of LongCat-2.0 appeared to lie between DeepSeek v4 Flash and DeepSeek v4 Pro based on our quick coding assessments.

For further details, you can view the results on our itch.io platform.

Development of LongCat-2.0

To enhance speed and capability without significantly increasing size, LongCat-2.0 employs various techniques.

Its attention mechanism, inspired by DeepSeek's design, prioritizes the most pertinent parts of lengthy conversations rather than treating all parts equally, which aids in faster responses.

Additionally, a new N-gram embedding system allows the model to better grasp groups of words or phrases—yielding around 100 times more potential representations—without adding substantial new AI components. This approach enables the AI to recognize common phrases instead of merely individual words. For instance, it understands "New York City" as a single meaningful entity rather than three separate words. This enriches the model's linguistic comprehension without drastically enlarging its size.

After training, Meituan integrates three specialized systems: one for tool usage (Agent), one for problem-solving (Reasoning), and one for conversational tasks (Interaction). A routing mechanism determines which specialists should address each request, akin to assigning the appropriate team for specific jobs.

On SWE-bench Pro, which evaluates models based on their ability to resolve real GitHub issues from production codebases, LongCat-2.0 achieved a score of 59.5, surpassing GPT-5.5's 58.6 and Gemini 3.1 Pro's 54.2, but still trailing behind Claude Opus 4.7 and 4.8. In the FORTE benchmark, which assesses agents on everyday office tasks across 15 professions within a 45-minute limit, it scored 73.2, tying with Claude Opus 4.6 but lagging behind GPT-5.5's 77.8.

Introducing LongCat-2.0 🐱
1.6T parameters · MoE with ~48B active · 1M context
The full model behind Owl Alpha on @OpenRouter — now available.

Built for agentic coding from the ground up:
◆ LongCat Sparse Attention (LSA) — scales efficiently for 1M-context tokens
◆… pic.twitter.com/zum2SdZ0Z2

— Meituan LongCat (@Meituan_LongCat) June 30, 2026

For teams developing coding agents on a budget or those handling high-volume repository tasks, the model offers significant advantages, especially with the free context-cache reads accumulating. LongCat-2.0 is now accessible via Meituan's API endpoints compatible with OpenAI and Anthropic, as well as through agent platforms like Hermes, Claude Code, and OpenClaw that already support it.

However, self-hosting is currently unavailable. Both the GitHub and Hugging Face repositories indicate that "model weights coming soon," but Meituan has not disclosed a release date for these files.

Daily Debrief Newsletter

Start each day with the latest news stories, along with original features, podcasts, videos, and more.

Meituan Launches LongCat-2.0 AI Model, Surpassing Competitors

Key Highlights

Development of LongCat-2.0

Daily Debrief Newsletter