Overview

  • Reve 2.0 launched at the second position on the Arena text-to-image leaderboard, following OpenAI’s GPT Image 2 and surpassing Google’s Nano Banana 2.
  • Rather than converting a prompt into a lengthy narrative, Reve constructs a structured “layout” first, then produces images natively at 4K resolution.
  • In our practical evaluations, it excelled in control, cost-effectiveness, and flexibility, although it occasionally overlooked some prompt nuances that competitors captured.

On June 3, Reve introduced version 2.0 of its AI image model, which immediately secured the #2 spot on the Arena text-to-image leaderboard, just behind OpenAI’s GPT Image 2 and ahead of Google’s Nano Banana 2. The company claims it is the leading image model from a non-trillion-dollar entity, developed using ten times fewer GPUs than its larger competitors.

This assertion is impressive for a startup that was relatively unknown just a year ago. However, the ranking is less significant than the methodology Reve employed to achieve it.

Most contemporary image models expand a prompt into a verbose paragraph before sending it to a diffusion engine. Reve takes a different route: it first builds a “layout,” an organized, editable description in which every element is assigned a location, size, and caption, akin to HTML for web pages. The model reads this layout, then renders the image at native 4K, or 16 megapixels.

This design choice is central to its appeal. Since the image is conceptualized similarly to code, users can reposition subjects, modify text on objects, or change backgrounds without needing to regenerate the entire image. This approach also allows for advanced detailing and fine-tuning across iterative prompts without incurring excessive costs.
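To make the layout idea concrete, here is a minimal sketch of what such a structure could look like and why editing it is cheap. Reve has not published its layout format in the material covered here, so the `Element` and `Layout` classes, their field names, the normalized coordinates, and the 4096 x 4096 canvas are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Element:
    """One item in a hypothetical layout: what it is and where it sits."""
    name: str
    caption: str
    x: float       # left edge, as a fraction of canvas width (0.0 to 1.0)
    y: float       # top edge, as a fraction of canvas height
    width: float   # fraction of canvas width
    height: float  # fraction of canvas height


@dataclass(frozen=True)
class Layout:
    """A canvas plus its elements; a renderer would consume this."""
    canvas_px: tuple[int, int]
    elements: tuple[Element, ...]

    def edit(self, name: str, **changes) -> "Layout":
        """Return a new layout with one element changed and the rest untouched."""
        if not any(e.name == name for e in self.elements):
            raise KeyError(name)
        updated = tuple(
            replace(e, **changes) if e.name == name else e
            for e in self.elements
        )
        return replace(self, elements=updated)


# Assumed square canvas: 4096 x 4096 pixels is roughly 16.8 megapixels.
layout = Layout(
    canvas_px=(4096, 4096),
    elements=(
        Element("sky", "blurred Manhattan skyline at golden hour", 0.0, 0.0, 1.0, 0.6),
        Element("subject", "woman in a beige trench coat", 0.35, 0.2, 0.3, 0.8),
        Element("sign", 'painted sign reading "OPEN"', 0.7, 0.5, 0.2, 0.1),
    ),
)

# Move the subject left and change the sign text; the sky entry is reused as-is.
edited = layout.edit("subject", x=0.1).edit("sign", caption='painted sign reading "CLOSED"')
megapixels = layout.canvas_px[0] * layout.canvas_px[1] / 1_000_000
```

Each edit touches only the named element, which is the property that would let a renderer leave the rest of the image alone instead of regenerating it from a rewritten prompt.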

When the initial Reve model was launched, our testing commended it for outperforming Midjourney and Flux at approximately a cent per image. Reve 2.0 maintains this affordable, control-centric foundation, with API generations costing only a fraction of a cent each.

That makes it an ideal model for some users and a harder sell for others. If you frequently iterate, prioritize text accuracy, require high-resolution prints, or develop agentic workflows, the layout-first approach presents a significant advantage.

However, with Gemini and ChatGPT offering comprehensive subscription packages beyond just image generation, making a choice could be challenging.

Evaluating Reve 2.0

We assessed eight distinct areas to gauge its performance.

Photorealism

We initiated our testing with a straightforward realism challenge: depicting a woman in a beige trench coat on a rooftop at golden hour, with the blurred Manhattan skyline in the background. The test was devoid of any tricks or unusual lighting—just elements that typically reveal a model's artificiality.

Reve performed admirably. The skin texture lacked the waxy smoothness that often betrays AI images, the round wire glasses fit naturally, and a subtle lens flare added realism. The depth of field appeared authentic, mimicking the effect of a real mirrorless camera at golden hour.

However, imperfections were evident. The lit windows in the lower-right buildings became indistinct when zoomed in, and there was an asymmetry in the strap on her right shoulder. Nevertheless, the rolled blueprints under her arm remained coherent and convincingly messy.

Reve's previous reputation for a filmic, photojournalistic style remains intact. It is less glossy than Nano Banana 2, and while GPT Image 2 holds a slight advantage in pure realism according to Decrypt's comparative analysis, nothing in this output strongly suggests it is synthetic.

Nevertheless, when prompts become overly lengthy and the model is required to generate numerous details simultaneously, Reve consistently outperforms GPT Image 2.

Spatial Awareness

Next, we executed a demanding test involving a Renaissance astronomer bent over a brass orrery, illuminated by three competing light sources—a candle, cold moonlight, and a green glowing jar—surrounded by various items including a skull bookend, hourglass, star charts, and a black cat with one white paw on the windowsill. The original prompt was significantly more detailed.

This is where the layout concept shines. All three light sources were accurately represented: the candle cast warm light from the left, the moonlight maintained a cold hue through the window, and the jar emitted a green glow from the right—each illuminating its designated area without blending into one another.

The arrangement largely adhered to the prompt's specifications. The brass sphere was accurately depicted in the astronomer's hands, with the hourglass and glowing jar on the right, the skull and ink-stained star charts on the left, and a comet visible through the arched window behind the cat.

While not flawless, the model’s performance was commendable. The astronomer's middle finger was rendered inaccurately, the brass piece resembled an armillary sphere instead of an orrery, and the Latin text in the open book was nonsensical. Given the complexity of the scene with multiple positioned elements, this outcome is still impressive.

Text Rendering

As text rendering is a key feature, we presented a challenging scenario: a hardware-store corner crammed with painted signs, posters, and graffiti, and tested both Reve and ChatGPT’s GPT Image 2 using the same prompt.

Reve accurately rendered the large signs, including “KELLERMAN’S HARDWARE & SUPPLY CO. SINCE 1931,” “TOOLS, ROPE, PAINT,” and various graffiti. The curb’s “NO PARKING 7AM-6PM,” and “FREE—TAKE WHAT YOU NEED” box were also clear and correctly spelled.

GPT Image 2 matched Reve on the prominent signs and did better on the smaller details, rendering a phone booth covered in readable micro-stickers. Its darker store interior also hid the kind of garbled text that was plainly visible in Reve's output. However, GPT’s storefront had no door, while Reve logically included one.

Once again, the layout approach significantly enhanced aesthetics. GPT Image 2 produced an image rife with graininess and artifacts, whereas Reve's image was smooth and polished.

Out of curiosity, we asked the model to depict the same scene at midday. The result was highly accurate, and the scene itself stayed nearly identical across the two versions.

Illustration

For line art, we requested a black-and-white pen illustration of a giant spider with glowing eyes pursuing a screaming woman through a jungle filled with vines, characterized by heavy cross-hatching and deep shadows.

We had previously run the same prompt through Reve 1 last year, and the outcomes were markedly different.

Reve 2.0 produced rich blacks, intricate textures, and a tangible sense of depth between the foreground leaves and the multi-eyed spider. In contrast, Reve 1 resulted in a flatter, cartoonish grayscale rendering with a small figure and an exaggerated spider face.

However, upon revisiting the brief, we noted that Reve 2.0 overlooked the specific request for a pen illustration, opting instead for a smooth, near-photorealistic grayscale image. The earlier, cruder Reve 1 more closely resembled the hand-drawn sketch requested.

The upgrade reflects greater rendering power rather than better prompt adherence. The woman also appeared overly gaunt and sinewy, more like an anatomical study than a terrified runner. Even so, the image is visually appealing if the prompt is read loosely. Reve does handle artistic styles well: the more specifically the style is described, or the better the reference supplied, the more accurate the output.

Artist Style

We explored style transfer by requesting an image of a robot reading a Decrypt-branded book, painted in the style of Van Gogh’s “Starry Night.” The challenge lies in maintaining legibility of the brand text amid the heavy, swirling style. Additionally, we inadvertently activated an agentic task by requiring the model to research Decrypt’s logo for accurate representation.

The impasto swirls, blue-and-gold palette, and spiraling sky unmistakably evoke Van Gogh. Reve even included an actual “Starry Night”—cypress trees, village, swirling sky—framed on the wall behind the robot, showcasing a clever self-referential touch.

The main challenge, keeping text legible within the elaborate brushwork, was met: “Emerge” is clearly visible on the book cover. However, the model represented Decrypt too literally on the robot. The logo on the chest was Decrypt’s primary logo, but the second logo, on the head, belongs to Decrypt University, an educational initiative by Decrypt, and is not the official logo. The model picked up both logos from the same source during its research task.

Overall, for stylized brand art, Reve effectively combined committed style with readable typography in a single output.

Agentic Generation

Agentic generation requires the model to perform beyond basic generation; it must comprehend the prompt, research, and fulfill user requirements.

We intentionally provided a vague prompt: “Create a timeline of Bitcoin’s history, kids drawing style.” No specific events were listed, and no layout was specified, leaving the model to determine content placement.

Reve constructed a left-to-right crayon timeline spanning from 2008 to 2025, autonomously selecting milestones such as the white paper, genesis block, Pizza Day, BTC reaching $1,000 and later $20,000, corporate acquisitions, El Salvador’s adoption of Bitcoin as legal tender, the 2022 crash, and the ETF approval with BTC exceeding $70,000.

The remarkable aspect is that the events were accurately positioned in the correct years and sequence—this reflects planning rather than mere decoration. The childlike aesthetic, complete with hearts and doodles, remained consistent throughout, and the labels were legible.

However, it wasn’t without flaws. Pizza Day inaccurately read “10,0000 BTC” with an extra zero, and some events were oversimplified to brief phrases. Additionally, it mistakenly set 2025 as "today" and omitted significant moments such as Bitcoin hitting $100K and the halving events.

While it may not surpass Nano Banana 2, as an agentic layout task—deciding content, sequencing, labeling, and maintaining a style—it largely succeeded.

Multi-Subject Image Editing

For a challenging editing scenario, we provided Reve with two separate photographs—a man taking a mall selfie and a woman in another mall shot—and instructed the model to place them together on a beach on the moon, an imaginary environment.

Identity preservation is crucial, and Reve managed to maintain recognizability in both faces, though it did not achieve the 1:1 accuracy of more advanced models like Nano Banana 2 or Seedream 4.5. The distinction between the man’s lighter skin and the woman’s darker skin was preserved, and their clothing remained intact—no blending or distortion occurred. Their pose, a cheek-to-cheek embrace, appeared natural.

This task also required creativity, which Reve demonstrated. Although there is no water on the moon, the model understood the assignment and generated a depiction of lunar soil, Earth in the background, and a terrain that suggested the presence of water.

On the downside, the couple was illuminated with soft studio lighting that did not accurately reflect the moon's lighting conditions.

Content Restrictions and Censorship

Finally, we conducted a sensitive test by requesting a graphic scene involving a violent confrontation between two adversaries, one poised to deliver a fatal blow, and ran it through Reve, GPT Image 2, and Nano Banana 2.

Reve produced the scene without hesitation, cataloging it under the project name “The Final Reckoning”: two mud-stained warriors in the rain, with a blade aimed at the heart, blood on the downed warrior’s face, and the killing blow captured mid-action. The only limitation noted was that we were nearing our daily usage cap, as the free plan is insufficient for serious work.

GPT Image 2 outright refused to generate the graphic content, later providing a sanitized “dark, cinematic” battlefield only after we agreed to omit explicit blood. Nano Banana 2 did not negotiate at all, simply stating, “Sorry, I can’t generate unsafe images.”

Reve’s depiction of blood was cinematic rather than gratuitous, highlighting the disparity: one request resulted in a completed scene from Reve, a diluted compromise from OpenAI, and a flat refusal from Google.

Regarding NSFW content, Reve demonstrates a relatively relaxed approach without being entirely uncensored. Our previous test of generating an image of a voluptuous teacher in a futuristic classroom was executed without issues. In contrast, GPT produced a flat-chested woman after warning it could not create sexualized images. Gemini outright refused to entertain the prompt.

Final Thoughts

Reve 2.0 stands out as the premier image model for users who view generation as an iterative process rather than a one-off event. If you frequently iterate, require accurate text, wish to modify layouts instead of starting from scratch, and need high-resolution images for print, the layout-first strategy offers a significant edge and has fewer restrictions compared to competitors.

It is also the most economical option, with Reve costing merely a fraction of a cent per API image, compared to approximately 7 to 13 cents for Nano Banana 2 and the premium rates charged by OpenAI for GPT Image 2. Over time, this price difference can be substantial.
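A quick back-of-the-envelope calculation shows how that gap compounds over a batch. The Nano Banana 2 bounds below are the 7 and 13 cents quoted above. The Reve figure of 0.5 cents is a placeholder assumption standing in for “a fraction of a cent,” not a published price, and the batch size is arbitrary.

```python
from decimal import Decimal


def batch_cost_usd(images: int, cents_per_image: Decimal) -> Decimal:
    """Total cost in US dollars for a batch at a flat per-image price."""
    return images * cents_per_image / Decimal(100)


IMAGES = 1_000  # illustrative batch size

# Per-image prices in cents. The Nano Banana 2 bounds come from the article;
# the Reve value is a hypothetical stand-in for "a fraction of a cent".
NANO_BANANA_2_LOW = Decimal("7")
NANO_BANANA_2_HIGH = Decimal("13")
REVE_ASSUMED = Decimal("0.5")

for label, price in [
    ("Nano Banana 2 (low)", NANO_BANANA_2_LOW),
    ("Nano Banana 2 (high)", NANO_BANANA_2_HIGH),
    ("Reve 2.0 (assumed)", REVE_ASSUMED),
]:
    print(f"{label}: ${batch_cost_usd(IMAGES, price):.2f}")
```

Swapping in the actual per-image rate from Reve's pricing page, and your own monthly volume, gives a more honest comparison than the placeholder does.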

For those without the hardware to run local image generators such as Ideogram v4 or Z-Image, Reve 2.0 is by far the best choice for cost-effectiveness and performance.

However, it may not suit everyone. If you are deeply integrated into the Google or OpenAI ecosystem, the convenience might outweigh the cost savings. Additionally, Reve sometimes omits details from prompts, necessitating careful proofreading and re-prompting. It is also not the most accurate option for editing or depicting people from reference photos, or for generative image editing in general.

Nevertheless, for under $20 per month on the Pro plan, or a fraction of a cent per image via the API, Reve 2.0 provides a level of control and editing that neither Google nor OpenAI currently offers. For a company operating on a tenth of the GPUs, this investment appears to be paying off.

Reve is available for testing through the official website or via API plans.
