Stable Audio Open
Licensing and “Open Source” Debate
- Model released under Stability’s non-commercial research license; commercial use requires a paid membership.
- Several commenters argue this is misleadingly branded as “open source” and is part of a broader pattern in AI of abusing the term.
- Some push back that open source has become overly corporate-friendly and that non-commercial or strict copyleft licenses can be desirable.
- One commenter suggests the license may be unenforceable for model weights, claiming weights are not currently copyrightable; others do not endorse this view and note the legal risk of relying on it.
- Some express frustration that many AI enthusiasts ignore OSS licenses (e.g., GPL) when building projects.
Training Data, Creator Rights, and CC Licenses
- Model is trained on Freesound and Free Music Archive (FMA). This is praised as “commons in, commons out” and as an ethical alternative to scraping.
- Others note that Creative Commons licenses have conditions and are not “free of copyright issues.”
- Clarification: the model’s Hugging Face page says only CC0, CC BY, and CC Sampling+ audio were used, with attribution lists; this alleviates concerns about more restrictive FMA tracks.
- Some doubt that many original FMA contributors would have anticipated or welcomed this ML use, despite the license terms.
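The license filtering described above can be pictured as an allowlist check followed by an attribution list. This is a minimal sketch only, not the pipeline Stability actually used: the `Clip` record, the function names, and the short license strings are all assumptions made for illustration.

```python
from dataclasses import dataclass

# The three licenses named in the summary; the exact strings are illustrative
# assumptions, not identifiers taken from any real dataset.
ALLOWED_LICENSES = frozenset({"CC0", "CC BY", "CC Sampling+"})


@dataclass(frozen=True)
class Clip:
    """Hypothetical metadata record for one audio clip."""
    title: str
    author: str
    license: str


def filter_by_license(clips, allowed=ALLOWED_LICENSES):
    """Keep only clips whose license is on the allowlist."""
    return [clip for clip in clips if clip.license in allowed]


def attribution_lines(clips):
    """Build one attribution line per kept clip, in input order."""
    return [f'"{c.title}" by {c.author} ({c.license})' for c in clips]
```

A clip tagged with a more restrictive license (say, a non-commercial variant) simply fails the allowlist and never reaches the attribution list.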
Ethics, Copyright, and “AI = Theft?”
- One line of argument: training on copyrighted data is not theft; the core legal issue is models outputting copyrighted text/audio verbatim, which can be mitigated with filters.
- Others counter that copyright law did not anticipate machines that can memorize and regenerate large volumes of content; they see training on unlicensed work as unethical or potentially infringing.
- NYT v. OpenAI is cited: critics say GPT models effectively store large archives of articles; defenders respond that the evidence shows only short, overfit snippets being reproduced, not whole articles.
- Comparisons made between models and compression algorithms; disagreement on whether “ability to reproduce” means a copy is legally “stored” in the model.
- Multiple commenters suggest current copyright and fair use doctrines are inadequate for ML and will need rethinking.
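The "filters" mentioned in the first argument above are not specified in the thread. One simple way such a filter could work for text, offered purely as an assumption, is to flag any output that shares a long contiguous run of words with a protected reference. The function names and the default threshold of 8 tokens are hypothetical.

```python
def longest_shared_run(output: str, reference: str) -> int:
    """Length, in whitespace-separated tokens, of the longest contiguous
    run of tokens that appears in both strings."""
    a, b = output.split(), reference.split()
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous token pair.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def looks_verbatim(output: str, reference: str, min_run: int = 8) -> bool:
    """Flag output sharing at least min_run consecutive tokens with reference.

    The threshold is an arbitrary illustrative value, not a legal standard.
    """
    return longest_shared_run(output, reference) >= min_run
```

A check like this only catches near-exact copying; it says nothing about paraphrase, and an audio model would need a comparison over audio fingerprints rather than word tokens.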
Model Capabilities and Comparisons
- Stable Audio Open is generally seen as targeting sound effects, loops, and textures rather than full songs or vocals.
- Some report decent quality but complain about harsh high frequencies and lack of speech/singing.
- Udio and ElevenLabs music demos are repeatedly cited as higher quality; others dismiss current AI music as bland, structurally shallow, and easy to spot if you listen closely.
- Disagreement on detectability: some claim listeners couldn’t reliably distinguish AI vs. human tracks; others are confident they could.
Use Cases and Creative Ideas
- Proposed “AI 8-track” app: hum a melody on multiple tracks, convert each line into instruments by text prompt, then lightly mix for rapid song sketching.
- Commenters note that Google MusicLM, Suno, and Meta’s MusicGen already do variants of “hum-to-style,” but a polished workflow app is still missing.
- One user experiments with “promptmusic” where terse textual prompts generate entire unusual tracks, observing that most song information resides in the model, not the prompt.
- Several wish for strong audio-to-audio features: e.g., giving a drum pattern and having AI compose around it; current workarounds involve chaining separate tools and research models.
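The proposed "AI 8-track" workflow (hum each line, convert it by text prompt, then lightly mix) can be outlined as below. This is a rough sketch under stated assumptions: `Track`, `render_session`, and the `convert` callable are hypothetical names, and `convert` stands in for whatever hum-to-instrument model an app would call; no real model API is implied.

```python
from dataclasses import dataclass
from typing import Callable, List

Audio = List[float]  # mono samples, nominally in [-1.0, 1.0]


@dataclass
class Track:
    """One hummed line plus the text prompt describing the target instrument."""
    hummed: Audio
    prompt: str
    gain: float = 1.0


def render_session(tracks: List[Track],
                   convert: Callable[[Audio, str], Audio]) -> Audio:
    """Convert every track with the supplied model stub, then sum into one mix."""
    rendered = [(convert(t.hummed, t.prompt), t.gain) for t in tracks]
    if not rendered:
        return []
    length = max(len(audio) for audio, _ in rendered)
    mix = [0.0] * length
    for audio, gain in rendered:
        for i, sample in enumerate(audio):
            mix[i] += gain * sample
    # "Light mixing": only scale down if the summed signal would clip.
    peak = max((abs(s) for s in mix), default=0.0)
    if peak > 1.0:
        mix = [s / peak for s in mix]
    return mix
```

Passing the model in as a callable keeps the sketch independent of any particular tool, which matches the thread's point that the missing piece is a polished workflow rather than the underlying conversion step.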
Broader Reflections and Miscellaneous
- Some see this as an “Ethereum-merge-style” ethical inflection point: proof that high-quality models can be trained on consented/commons data.
- Others dispute the crypto analogy and dive into a long subthread debating Proof of Work vs. Proof of Stake, environmental impact, and whether PoS inherently enriches the already wealthy.
- A few commenters express cautious optimism and are glad Stability is still shipping models despite reports of company troubles.
- Questions are raised about reusing such models for audio restoration/denoising; no clear answer or recommended open tool emerges.
- Commenters point to a hosted demo and landing page for trying the model online.