Transformers in music recommendation
Model choice and technical skepticism
- Several commenters question why transformers are needed over simpler models (Wide & Deep, DCNv2, basic NNs) for short music-action histories.
- Transformers are seen as useful for long-range dependencies, but some argue that the last few interactions usually suffice to capture “current taste.”
- Others note that full sequences can encode multiple timescales (right now, recent weeks, time-of-day patterns, willingness to change genre), which may justify sequence models.
- Some view the work as incremental rather than novel: acceptable as a blog post, but not groundbreaking.
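The "last few interactions" versus "multiple timescales" argument can be made concrete with a much simpler stand-in than a transformer. The sketch below is an illustration only, not the method from the discussed post: it assumes each listen is a hypothetical `(timestamp_in_hours, track_embedding)` pair and summarizes the history as exponentially decayed averages at several half-lives. The function names and the half-life values (one hour, one week, ninety days) are assumptions chosen for the example.

```python
from math import exp, log


def decayed_profile(history, now, half_life_hours):
    """Weighted mean of track embeddings; a listen's weight halves every half_life_hours."""
    decay = log(2) / half_life_hours
    dim = len(history[0][1])
    total = [0.0] * dim
    weight_sum = 0.0
    for timestamp, embedding in history:
        weight = exp(-decay * (now - timestamp))
        weight_sum += weight
        for i, value in enumerate(embedding):
            total[i] += weight * value
    return [value / weight_sum for value in total]


def multi_timescale_features(history, now, half_lives=(1.0, 24.0 * 7, 24.0 * 90)):
    """Concatenate one decayed profile per half-life (hypothetical values, in hours)."""
    features = []
    for half_life in half_lives:
        features.extend(decayed_profile(history, now, half_life))
    return features


# Toy history: two older listens in "genre 0", two very recent listens in "genre 1".
history = [
    (0.0, [1.0, 0.0]),
    (100.0, [1.0, 0.0]),
    (499.0, [0.0, 1.0]),
    (500.0, [0.0, 1.0]),
]
features = multi_timescale_features(history, now=500.0)
print(features)
```

In this toy history the one-hour profile is dominated by the two most recent listens, while the ninety-day profile still carries substantial weight from the older genre. That is the kind of signal a "last few interactions" model would discard and a sequence model could, in principle, learn on its own.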
Content understanding vs co-occurrence
- A major theme is that the described system appears to rely on user actions and track embeddings, not deep analysis of the audio itself.
- Many argue that without awareness of musical content, recommendation is like a “deaf DJ” driven by charts and behavior.
- Others counter that collaborative filtering and co-occurrence (e.g., playlist co-membership) are extremely strong baselines that are hard to beat, much as language models learn from relations between tokens rather than from explicit semantics.
- There is discussion of audio-based features and semantic embeddings (spectral features, self-supervised models), but these are seen as costly and historically underused in large services.
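As a rough illustration of the co-occurrence baseline mentioned above, the following sketch scores track similarity purely from playlist co-membership, with no audio analysis at all. It is a minimal example under stated assumptions, not any service's actual system: the playlists are made-up data, the function names are hypothetical, and the score is a cosine-style normalization of co-occurrence counts.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cooccurrence_similarity(playlists):
    """Return sims[a][b] = count(a and b together) / sqrt(count(a) * count(b))."""
    track_count = Counter()
    pair_count = Counter()
    for playlist in playlists:
        tracks = sorted(set(playlist))  # de-duplicate and fix pair ordering
        track_count.update(tracks)
        pair_count.update(combinations(tracks, 2))
    sims = {}
    for (a, b), together in pair_count.items():
        score = together / sqrt(track_count[a] * track_count[b])
        sims.setdefault(a, {})[b] = score
        sims.setdefault(b, {})[a] = score
    return sims


def most_similar(sims, track, k=3):
    """Top-k neighbours by score; ties are broken alphabetically."""
    neighbours = sims.get(track, {})
    return sorted(neighbours, key=lambda t: (-neighbours[t], t))[:k]


# Made-up playlists of track IDs.
playlists = [
    ["a", "b", "c"],
    ["a", "b"],
    ["b", "c", "d"],
    ["a", "d"],
]
sims = cooccurrence_similarity(playlists)
print(most_similar(sims, "a"))
```

The normalization keeps very popular tracks from dominating every neighbour list, which is one reason simple co-occurrence can be competitive. The same structure also shows its limit: a track that appears in no playlist has no neighbours, which is the cold-start gap that audio-based features are meant to fill.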
Commercial bias and trust
- There is strong concern that even excellent models are overridden or skewed by commercial incentives.
- Spotify’s “Discovery Mode” and its commission-based boosting of priority tracks are cited as examples of pay-influenced recommendation, as is “smart shuffle” inserting monetizable songs.
- Some doubt the legality/ethics of unlabeled sponsored recommendations; others note that disclosures exist but are obscure.
User experience, mood, and agency
- Many feel current systems overfit to recent listening, fail to account for mood shifts, and conflate “what I like generally” with “what I’m in the mood for now.”
- Skip behavior and listening logs are seen as very low-fidelity signals; explicit likes/dislikes and richer context are preferred but rare.
- Some argue the best discovery is semi-random “crate digging,” not tight personalization. Others want tools for user-driven, branching exploration (similar-track lists, knowledge-graph-style navigation) rather than a linear “infinite radio.”
Comparisons to existing and past services
- Rdio and Pandora are frequently praised as having had superior, more serendipitous recommendation, often leveraging expert tagging or earlier Echo Nest similarity.
- Opinions on current platforms are mixed:
  - Spotify: strong tools and community features, but many complain of homogenized, top‑40‑ish outcomes and label influence.
  - YouTube Music: some report uncannily good “song radios” and next-track choices.
  - Apple Music: viewed as decent but sometimes repetitive or overly focused on popular tracks.
Alternatives, DIY, and open systems
- Users mention open or niche projects (ListenBrainz, AcousticBrainz, Discogs exploration, personal embedding experiments, custom playlist generators) as better aligned with deep discovery or local collections.
- There is repeated desire for:
  - Locally run, unbiased recommenders.
  - Systems that surface the long tail, not just already‑popular music.
  - Interfaces that emphasize human curation, social discovery, and knowledge graphs (labels, producers, scenes) alongside any transformer-based ranking.
Ethical and societal concerns
- Several commenters worry about recommendation systems optimized for engagement turning into addictive “slot machines.”
- There is debate over whether services should intentionally reduce stickiness (e.g., avoid autoplay) or factor in user wellbeing; others see this as impractical or paternalistic.
- Some fear powerful recommenders plus commercial pressure will narrow musical diversity over time, steering both listening and production toward a small set of sounds.