Stable Audio Open

Licensing and “Open Source” Debate

  • Model released under Stability’s non-commercial research license; commercial use requires a paid membership.
  • Several commenters argue this is misleadingly branded as “open source” and is part of a broader pattern in AI of abusing the term.
  • Some push back that open source has become overly corporate-friendly and that non-commercial or strict copyleft licenses can be desirable.
  • One commenter suggests the license may not be enforceable because model weights may not currently be copyrightable; others decline to endorse this and note the legal risk of relying on it.
  • Some express frustration that many AI enthusiasts ignore OSS licenses (e.g., GPL) when building projects.

Training Data, Creator Rights, and CC Licenses

  • Model is trained on Freesound and Free Music Archive (FMA). This is praised as “commons in, commons out” and as an ethical alternative to scraping.
  • Others note that Creative Commons licenses have conditions and are not “free of copyright issues.”
  • Clarification: the model’s Hugging Face page says only CC0, CC BY, and CC Sampling+ audio were used, with attribution lists; this alleviates concerns about more restrictive FMA tracks.
  • Some doubt that many original FMA contributors would have anticipated or welcomed this ML use, despite the license terms.
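
The license restriction described above can be pictured as a simple allowlist over track metadata. The following is only a minimal sketch: the `Track` record, its field names, and the sample catalog are hypothetical, and nothing here reflects how Stability actually built its dataset.

```python
from dataclasses import dataclass

# License labels the discussion says the model card lists as used for training.
ALLOWED_LICENSES = {"CC0", "CC BY", "CC Sampling+"}


@dataclass(frozen=True)
class Track:
    # Hypothetical metadata record; real Freesound/FMA metadata differs.
    title: str
    author: str
    license: str  # assumed to be a normalized license label


def filter_by_license(tracks, allowed=ALLOWED_LICENSES):
    """Keep only tracks whose license label is in the allowed set."""
    return [t for t in tracks if t.license in allowed]


def attribution_lines(tracks):
    """Build one credit line per track that is not CC0."""
    return [
        f"{t.title} by {t.author} ({t.license})"
        for t in tracks
        if t.license != "CC0"
    ]


# Made-up sample data for illustration only.
catalog = [
    Track("Rain loop", "alice", "CC0"),
    Track("Snare hit", "bob", "CC BY"),
    Track("Ambient song", "carol", "CC BY-NC"),
]
kept = filter_by_license(catalog)
credits = attribution_lines(kept)
```

In this sketch the non-commercial track is dropped before training, and the attribution list is derived from whatever remains, which mirrors the "attribution lists" the commenters mention.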

Ethics, Copyright, and “AI = Theft?”

  • One line of argument: training on copyrighted data is not theft; the core legal issue is models outputting copyrighted text/audio verbatim, which can be mitigated with filters.
  • Others counter that copyright law did not anticipate machines that can memorize and regenerate large volumes of content; they see training on unlicensed work as unethical or potentially infringing.
  • NYT v. OpenAI is cited: critics say GPT models effectively store large archives of articles; defenders respond that the evidence shows only short, overfit snippets being reproduced, not whole articles.
  • Comparisons made between models and compression algorithms; disagreement on whether “ability to reproduce” means a copy is legally “stored” in the model.
  • Multiple commenters suggest current copyright and fair use doctrines are inadequate for ML and will need rethinking.
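
To make the "filters" idea from this debate concrete, here is a minimal sketch of one possible approach for text: flag an output if it shares any long word n-gram with a set of protected texts. This is an assumption for illustration, not a method any commenter or vendor described; the function names and the choice of `n` are hypothetical, and an audio system would need fingerprinting rather than word matching.

```python
def ngrams(tokens, n):
    """Return the set of all n-token windows in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def build_index(protected_texts, n=8):
    """Collect every word n-gram that appears in the protected texts."""
    index = set()
    for text in protected_texts:
        index |= ngrams(text.lower().split(), n)
    return index


def looks_verbatim(output, index, n=8):
    """True if the output shares at least one n-gram with the index."""
    return any(gram in index for gram in ngrams(output.lower().split(), n))


# Toy example with a short window so the overlap is easy to see.
protected = ["the quick brown fox jumps over the lazy dog"]
index = build_index(protected, n=4)
copied = looks_verbatim("He said the quick brown fox jumps high", index, n=4)
original = looks_verbatim("a brown dog jumps over a fence", index, n=4)
```

A filter like this only catches near-exact reproduction, which is why it speaks to the "verbatim output" concern but not to the objection that training itself is the problem.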

Model Capabilities and Comparisons

  • Stable Audio Open is generally seen as targeting sound effects, loops, and textures rather than full songs or vocals.
  • Some report decent quality but complain about harsh high frequencies and lack of speech/singing.
  • Udio and ElevenLabs music demos are repeatedly cited as higher quality; others dismiss current AI music as bland, structurally shallow, and easy to spot if you listen closely.
  • Disagreement on detectability: some claim listeners couldn’t reliably distinguish AI vs. human tracks; others are confident they could.

Use Cases and Creative Ideas

  • Proposed “AI 8-track” app: hum a melody on multiple tracks, convert each line into instruments by text prompt, then lightly mix for rapid song sketching.
  • Commenters note that Google MusicLM, Suno, and Meta’s MusicGen already do variants of “hum-to-style,” but a polished workflow app is still missing.
  • One user experiments with “promptmusic” where terse textual prompts generate entire unusual tracks, observing that most song information resides in the model, not the prompt.
  • Several wish for strong audio-to-audio features: e.g., giving a drum pattern and having AI compose around it; current workarounds involve chaining separate tools and research models.

Broader Reflections and Miscellaneous

  • Some see this as an “Ethereum-merge-style” ethical inflection point: proof that high-quality models can be trained on consented/commons data.
  • Others dispute the crypto analogy and dive into a long subthread debating Proof of Work vs. Proof of Stake, environmental impact, and whether PoS inherently enriches the already wealthy.
  • A few commenters express cautious optimism and are glad Stability is still shipping models despite reports of company troubles.
  • Questions are raised about reusing such models for audio restoration/denoising; no clear answer or recommended open tool emerges.
  • Commenters point to a hosted demo and landing page for trying the model online.