ElevenLabs hit $500 million in annual recurring revenue in April 2026. Ten months after launching its first music model, the company shipped Music v2 on May 27 with three capabilities that no other AI music generator currently combines: section-by-section composition with inpainting, mid-track genre switching, and non-musical sound effects embedded directly inside generated audio.
For creators who score videos, produce podcast intros, or need background tracks they can monetize without Content ID anxiety, this is the update that changes the workflow.
What Changed From v1
The original ElevenMusic, which launched in August 2025, generated full songs from text prompts. It worked, but the output was a single take. If the chorus landed and the bridge flopped, you regenerated the entire track and hoped the chorus came back just as strong.
Music v2 treats a song as a sequence of editable sections. You build the intro, lock it, generate a verse that connects to it, iterate on the chorus independently, and stitch the pieces together into a finished track. The model maintains key, tempo, and timbral continuity across sections without requiring manual alignment.
Three features define the upgrade:
Inpainting. Select any section of a generated track and regenerate just that part. You can change the vocal delivery on the bridge, swap the guitar tone in the second verse, or rebuild the outro without touching anything else. This is the feature that turns Music v2 from a generation tool into something closer to an editing environment.
Genre switching within a single track. The model can transition from acoustic folk to electronic in the same song, or from orchestral swells into a hip-hop beat drop, while keeping the composition coherent. ElevenLabs describes it as “opera to heavy metal in 16 bars,” which sounds like a party trick until you need a video score that shifts energy at the two-minute mark.
Embedded sound effects. Music v2 can weave non-musical audio (rain, crowd noise, a phone ring, a door slam) directly into the composition. This blurs the line between soundtrack and sound design, which is particularly useful for narrative content, podcast intros, and video creators who currently layer sound effects manually in their DAW.
How Section Editing Works in Practice
The workflow is closer to building a track in a DAW than typing a prompt and crossing your fingers.
Start with a prompt for the intro. Once you have a version you like, lock it. Then prompt the next section (verse, pre-chorus, chorus) and the model generates audio that flows from where the locked section ends. If the verse works but the chorus falls flat, you select just the chorus bars, give the model new direction (“make the vocals more breathy, add a counter-melody on the guitar”), and regenerate. The locked sections stay exactly as they were.
This solves one of the most frustrating problems in AI music generation: the “almost perfect” track. With whole-track regeneration, the default in most prompt-based generators, getting 90% of a song right meant gambling on the other 10% every time you hit regenerate. With inpainting, you iterate on the weak section and leave the rest alone.
For creators producing recurring content (weekly YouTube intros, podcast bumpers, product launch scores), section editing also enables templated structures. Lock a signature intro, generate fresh variations on the body, and maintain brand consistency across episodes without using the exact same track.
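The lock-and-regenerate workflow described in this section can be modeled in a few lines of Python. This is only an illustrative sketch of the idea, not ElevenLabs's API: the `Section` and `Track` classes, their methods, and the `take` counter are all hypothetical names invented for this example, and no audio is actually generated.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """One editable part of a song (hypothetical model, not a real API type)."""
    name: str
    prompt: str
    take: int = 1        # how many times this section has been generated
    locked: bool = False


@dataclass
class Track:
    sections: list[Section] = field(default_factory=list)

    def add(self, name: str, prompt: str) -> Section:
        """Append a new section that follows the existing ones."""
        section = Section(name, prompt)
        self.sections.append(section)
        return section

    def lock(self, name: str) -> None:
        """Freeze a section so later edits cannot touch it."""
        self._find(name).locked = True

    def regenerate(self, name: str, new_prompt: str) -> Section:
        """Redo a single section with new direction; locked sections are refused."""
        section = self._find(name)
        if section.locked:
            raise ValueError(f"{name} is locked; unlock it before regenerating")
        section.prompt = new_prompt
        section.take += 1
        return section

    def _find(self, name: str) -> Section:
        for section in self.sections:
            if section.name == name:
                return section
        raise KeyError(name)


# Build the intro, lock it, then iterate on the chorus alone.
track = Track()
track.add("intro", "warm acoustic guitar, slow build")
track.lock("intro")
track.add("verse", "soft vocals over fingerpicked guitar")
track.add("chorus", "big anthemic vocals")
track.regenerate(
    "chorus", "make the vocals more breathy, add a counter-melody on the guitar"
)
```

The same structure covers the templated case: keep a locked intro and call `regenerate` on the body sections for each new episode.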
Sound Effects Change the Scoring Game
Most video creators layer sound effects on top of music in their timeline. Rain goes on one audio track, the score goes on another, and you spend 20 minutes adjusting crossfades until they feel natural.
Music v2 collapses that step. Prompt the model to include ambient rain that intensifies during the chorus, or embed a crowd roar that builds into the drop, and the generation treats those effects as part of the composition. The mix comes out balanced because the model produced them together.
This matters most for creators who produce narrative content: documentary YouTubers, podcast storytellers, and course creators building cinematic module intros. Getting the sound design and the score as a single rendered file saves a production step that currently requires either a DAW or meticulous timeline editing.
The Licensing Advantage
Here’s where ElevenLabs has a genuine structural edge over the competition.
Suno generates roughly 7 million songs per day and has crossed $300 million in annual recurring revenue. It’s the dominant player. But Suno settled a lawsuit with Warner Music Group in November 2025, and Sony and UMG litigation is still active in federal court. Under Suno’s current terms, the company retains underlying ownership of generated songs.
Udio faces similar legal uncertainty.
ElevenLabs built Music v2 on licensed training data from day one. There are no pending or settled copyright lawsuits. Paying subscribers get clear commercial use rights with no sync fees, no clearance delays, and no restrictions on deployment. The company’s position: you own what you generate.
For creators who monetize through YouTube, this distinction has real financial consequences. Independent benchmarks show that ElevenLabs tracks carry materially lower Content ID risk compared to Suno and Udio output. If your revenue depends on AdSense staying active on a video, the licensing clarity alone might justify the platform switch.
Independent testing scored YouTube monetization safety at 9/10 for ElevenLabs and 7/10 for both Suno and Udio.
Pricing and Plans
ElevenLabs runs music generation through its existing credit system:
- Free (ElevenMusic app): Up to 7 songs per day on iOS. No commercial rights. Attribution required.
- Starter ($5/month): Shared credit pool with voice and audio tools. Commercial use included.
- Creator ($22/month): Higher credit allocation, professional voice cloning, 192 kbps output.
- Pro ($99/month): Full credit pool for heavy production workflows.
- Scale ($330/month): Team-level access.
The API (ElevenAPI) runs at approximately $0.40 per minute of generated audio after the 50% price cut announced alongside Music v2. ElevenCreative, the licensed music library for brands and agencies, dropped pricing by 40%.
For comparison, Suno Pro costs $10/month for 2,500 credits with commercial use. Udio Standard is $10/month for 1,200 credits. On a pure cost-per-song basis, Suno remains cheaper. The tradeoff is licensing clarity versus volume.
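For a rough sense of the numbers, the arithmetic can be sketched in Python. The prices are the approximate figures quoted above; the function and variable names are made up for this example. Note that the per-credit figures are not per-song costs, since the number of credits each service charges per song is not stated here.

```python
API_PRICE_PER_MINUTE = 0.40  # USD, approximate API price quoted above


def api_cost(minutes: float, price_per_minute: float = API_PRICE_PER_MINUTE) -> float:
    """Estimated API cost in USD for a given length of generated audio."""
    return round(minutes * price_per_minute, 2)


def price_per_credit(monthly_price: float, credits: int) -> float:
    """Monthly plan price divided by the credits it includes."""
    return monthly_price / credits


three_minute_track = api_cost(3)         # 3 min * $0.40/min = $1.20
suno_credit = price_per_credit(10, 2500)  # $0.004 per credit
udio_credit = price_per_credit(10, 1200)  # about $0.0083 per credit

print(f"3-minute track via API: ${three_minute_track:.2f}")
print(f"Suno Pro: ${suno_credit:.4f} per credit")
print(f"Udio Standard: ${udio_credit:.4f} per credit")
```

At these rates, a creator generating ten three-minute tracks a month through the API would spend about $12, before counting any regenerated sections.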
Where Suno and Udio Still Lead
Music v2 does not compete on raw vocal quality, and it is worth being honest about that.
Independent ratings put Suno at 9.5/10 for vocal realism and ElevenLabs at 6.5/10. If you need a convincing lead vocal that sounds like an actual singer performing an original pop song, Suno is still the better tool. Suno also offers 1,200+ genre presets, a built-in DAW (Suno Studio), and stem export for mixing in external software. We covered the voice cloning and custom model features of Suno v5.5 in a separate article.
Udio leads on technical audio fidelity with 48 kHz output and has mature inpainting that predates ElevenLabs’s implementation.
If you want to run models on your own hardware instead, Stable Audio 3.0’s open-weight release is worth a look. And for a full comparison of every AI music tool available right now, our ranked guide covers the entire landscape.
Where Music v2 wins: commercially safe background music, video scores, podcast audio, multilingual tracks, and workflows where sound design and music need to arrive as a single file. It’s a production tool for creators who treat music as a supporting element of their content, not the content itself.
Who Should Switch Right Now
Three creator profiles benefit immediately.
YouTube creators who monetize with ads. If Content ID flags are costing you revenue, switching to a licensed-data model addresses the most common trigger. Section editing also lets you build signature channel music and iterate on it across videos without starting from scratch each time.
Podcast producers. The sound effects integration lets you generate intro music with embedded ambient textures in a single pass. No layering, no DAW required. If you’re building a podcast workflow from scratch, Google Flow Music is a free alternative for simpler scoring needs.
Video editors and filmmakers. Genre switching within a single track means your score can follow the emotional arc of a scene without stitching together multiple generations. Inpainting means you can adjust the energy at a specific timestamp without re-scoring the entire piece.
For creators who need lead vocals on finished songs (musicians, artists, singer-songwriters), Suno v5.5 remains the stronger choice. Music v2 is not competing for that use case.
ElevenLabs Music v2 is live now on ElevenMusic (iOS), the ElevenAPI (early access), and ElevenCreative.
