Summary

  • ElevenLabs has introduced Music v2, which can transition between genres mid-track, construct songs incrementally, and modify specific sections.
  • Stability AI launched Stable Audio 3.0, featuring a four-model suite with three open-weight variants, capable of producing tracks lasting up to six minutes and twenty seconds.
  • Despite these advancements, Suno remains the preferred platform, boasting a valuation of $2.45 billion and nearly 100 million users.

This week marked the arrival of two notable AI music developments, neither of which originated from Suno.

ElevenLabs, a voice AI firm founded in Poland and currently valued at $11 billion following a $500 million Series D round in February, has rolled out Music v2. Meanwhile, Stability AI, known for Stable Diffusion, has released Stable Audio 3.0, a four-model suite, three of them with open weights, that can produce tracks longer than six minutes.

These releases come against the backdrop of the copyright lawsuits that the Recording Industry Association of America initiated in 2024 against Suno and Udio, which have made "trained on licensed data" a crucial phrase in AI music launches. Both ElevenLabs and Stability AI are emphasizing licensed training data so that users can generate and use outputs without legal complications.

Music v2: Seamless transitions from opera to heavy metal

Music v2 is ElevenLabs' second music model, debuting approximately ten months after its predecessor. Its primary feature is its ability to maintain coherence even under complex prompts. According to ElevenLabs, a single track can seamlessly transition from opera to heavy metal, manage rapid rap segments, and incorporate non-musical sound effects—all while preserving the integrity of the composition.

Generative audio often loses coherence as prompts become more intricate, particularly in longer pieces, so this claim is worth watching.

Inpainting functionality is now practical: users can select a segment, regenerate it, and leave the rest of the track unchanged. Additionally, users can construct songs in sections—intro, verse, chorus—with the model ensuring continuity rather than treating each clip as an isolated creation. The multilingual capabilities have also seen enhancements, although specifics were not disclosed.

The model is offered across three products: ElevenMusic for creators, ElevenAPI for developers, and ElevenCreative for brands. It is currently live on ElevenMusic and ElevenCreative, with early API access available through the sales team.

ElevenLabs has also reduced the pricing for Music v1 and v2 by as much as 50% for ElevenAPI and 40% for ElevenCreative self-serve users. The company achieved $500 million in annual recurring revenue in April 2026. While music is still a minor segment of that, ElevenMusic, which launched as a consumer application in April, directly targets Suno's user base.

Stable Audio 3.0: Longer compositions with open weights

Stable Audio 2.0 was capped at three minutes, which put it behind Suno when it launched in 2024. The new Stable Audio 3.0 features four models: Small SFX (on-device sound effects), Small (full music generation on-device), Medium (up to 6:20, requiring more robust hardware), and Large (API-only). Three of these models have open weights available on Hugging Face.

The Small models each have 459 million parameters and need no GPU. The Medium model has 1.4 billion parameters and can generate its 6:20 output in about 1.31 seconds on an H200 GPU. The Large model, with 2.7 billion parameters, is only available via API for entities with over $1 million in revenue. Generation length is set per second, so users can specify the exact track length they want rather than an approximation.
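To put the Medium model's quoted speed in perspective, the figures above can be turned into a real-time factor: seconds of audio produced per second of compute. The short sketch below uses only the 6:20 duration and the 1.31-second generation time quoted in this article; the function name is illustrative and not part of any Stability AI tooling.

```python
def real_time_factor(audio_seconds: float, generation_seconds: float) -> float:
    """Seconds of audio produced per second of generation time."""
    return audio_seconds / generation_seconds


# 6:20 expressed in seconds, as quoted for the Medium model.
track_seconds = 6 * 60 + 20

# 1.31 s is the quoted generation time on an H200 GPU.
rtf = real_time_factor(track_seconds, 1.31)

print(f"{track_seconds} s of audio in 1.31 s is roughly {rtf:.0f}x real time")
```

On those numbers, the Medium model would be producing audio roughly 290 times faster than it plays back, though real-world throughput depends on hardware and settings the announcement does not detail.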

Stable Audio 3.0 is also compatible with ComfyUI for local installations.

The architecture employs a novel semantic-acoustic autoencoder, termed SAME, which is designed to maintain melodic coherence over extended outputs. LoRA fine-tuning is supported, enabling artists to tailor the models to their own catalogs. Inpainting is also included, covering single-segment and multi-segment edits as well as causal continuation, which extends tracks beyond their original duration.

For context, LoRA (Low-Rank Adaptation) trains a compact set of additional weights that steers how the larger model generates outputs. Training a LoRA on blues will push outputs toward that genre, while training on a specific artist such as B.B. King will yield music reminiscent of that artist's style. Inpainting allows for correction of minor inaccuracies in the generated music, enabling users to modify specific sections while ensuring they blend well with the overall track.
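For readers who want to see why a LoRA is so small, here is a minimal NumPy sketch of the general low-rank adaptation idea. It does not reflect Stable Audio 3.0's actual implementation; the layer size, rank, and scaling factor are hypothetical values chosen for illustration. The base weight matrix stays frozen, and only two thin matrices are trained, with their product added on top of the base output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
d_out, d_in, rank = 64, 64, 4
alpha = 8.0  # hypothetical scaling factor for the adapter

W = rng.standard_normal((d_out, d_in))        # frozen base weights
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, rank))                   # trainable up-projection, starts at zero


def lora_forward(x, W, A, B, alpha, rank):
    """Base layer output plus the scaled low-rank correction B @ A @ x."""
    return W @ x + (alpha / rank) * (B @ (A @ x))


x = rng.standard_normal(d_in)

# With B still at zero, the adapter changes nothing.
untrained = lora_forward(x, W, A, B, alpha, rank)

# Stand-in for training: any non-zero B shifts the output.
B_trained = rng.standard_normal((d_out, rank)) * 0.01
adapted = lora_forward(x, W, A, B_trained, alpha, rank)

base_params = W.size
lora_params = A.size + B.size
print(f"base: {base_params} parameters, adapter: {lora_params} parameters")
```

In this toy layer the adapter holds 512 parameters against 4,096 in the base matrix, and the gap widens as layers grow, which is why a LoRA trained on a single catalog can be shared as a small file alongside the unchanged base model.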

Stability AI has built a reputation for technical credibility in AI music over the years without achieving significant commercial success. The open-weight strategy mirrors the approach used for Stable Diffusion, aiming to engage the developer community and stimulate innovation. Licensing agreements with Universal Music Group and Warner Music Group give its offerings clearer legal footing than previous iterations of Stable Audio had.

Targeting Suno, the leader in AI music

If ChatGPT is regarded as the leader in AI text, then Suno holds the same status in AI music. The company reached a valuation of $2.45 billion in November 2025, surpassed $300 million in annual recurring revenue, and has attracted around 100 million users.

Suno generates approximately 7 million songs daily. Warner Music resolved its lawsuit against Suno in November 2025, while Sony and Universal Music Group remain embroiled in federal court litigation.

To mitigate potential copyright disputes, ElevenLabs has secured licensing agreements with Believe, Kobalt, and Merlin. Stability AI has similar partnerships with Warner and Universal. Udio has settled with all three major labels and now operates as a closed ecosystem, meaning no generated content can be exported from the platform.

The Small and Medium models of Stable Audio 3.0 are currently available on Hugging Face, while the Large model can be accessed through the Stability AI API. Music v2 is free for users of ElevenMusic, with commercial options available through ElevenCreative and ElevenAPI.