I recreated Shazam’s algorithm with Go
Overall reaction & project scope
- Many commenters find the Go implementation impressive, especially as a first substantial Go project.
- The repo is seen as a demo of the Shazam-style fingerprinting algorithm, not a production app or hosted service.
- Some note that similar reconstructions have existed for years; this is viewed as a fun/new iteration rather than novel research.
Data sources, coverage & “useless without all songs” debate
- The system currently uses Spotify links only to fetch metadata, then locates and downloads matching audio from YouTube for fingerprinting.
- Several people raise legal and ToS concerns around YouTube/Spotify ripping.
- One view: the algorithm is “useless unless you have all songs on earth”; the counterpoint is that the open-source algorithm is valuable for anyone who has or can build their own dataset, including non-music uses.
- Suggestions include: using local files or WAVs, and community or shared fingerprint databases. MusicBrainz/AcoustID is mentioned as an existing open fingerprint ecosystem.
Algorithm & technical discussion
- Summary of the approach: FFT → derive sparse audio fingerprints → index → similarity search.
- Linked references cover the short-time Fourier transform, spectrograms, and the time vs frequency resolution tradeoff.
- Discussion touches on whether fingerprints are closer to hash-like features or something that could support clustering (e.g., artist identification); the author says current fingerprints are not designed as clusterable vectors, and ML would likely be needed for that.
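To make the "fingerprints → index → similarity search" part of that summary concrete, here is a minimal sketch in Go. It is not taken from the repo. It assumes the FFT/spectrogram stage has already produced a list of peaks, and it uses the commonly described scheme of pairing peaks into hashes and voting on a consistent time offset. The type names (`Peak`, `Posting`, `Index`), the 10/10/6-bit hash layout, and the `fanOut` value are all illustrative assumptions.

```go
package main

import "fmt"

// Peak is a spectrogram local maximum (hypothetical type for this sketch).
type Peak struct {
	Frame int // time position, in STFT frames
	Bin   int // frequency bin, assumed to fit in 10 bits
}

// Hash is one fingerprint value plus the frame of its anchor peak.
type Hash struct {
	Value uint32
	Frame int
}

// Posting records where a hash occurred in an indexed track.
type Posting struct {
	TrackID int
	Frame   int
}

const (
	fanOut   = 3  // later peaks paired with each anchor (assumed value)
	maxDelta = 63 // largest frame gap that fits in the 6-bit delta field
)

// hashPair packs (anchor bin, target bin, frame delta) into one uint32.
func hashPair(a, b Peak) uint32 {
	return uint32(a.Bin&0x3FF)<<16 | uint32(b.Bin&0x3FF)<<6 | uint32((b.Frame-a.Frame)&0x3F)
}

// Fingerprint pairs each peak with up to fanOut later peaks.
// Peaks are expected to be sorted by Frame.
func Fingerprint(peaks []Peak) []Hash {
	var out []Hash
	for i, a := range peaks {
		paired := 0
		for j := i + 1; j < len(peaks) && paired < fanOut; j++ {
			d := peaks[j].Frame - a.Frame
			if d <= 0 || d > maxDelta {
				continue
			}
			out = append(out, Hash{hashPair(a, peaks[j]), a.Frame})
			paired++
		}
	}
	return out
}

// Index maps a hash value to every place it was seen.
type Index map[uint32][]Posting

// Add fingerprints a track and stores its hashes.
func (idx Index) Add(trackID int, peaks []Peak) {
	for _, h := range Fingerprint(peaks) {
		idx[h.Value] = append(idx[h.Value], Posting{trackID, h.Frame})
	}
}

// Match returns the track with the most hashes that agree on one time
// offset, along with that vote count. It returns -1 if nothing matches.
func (idx Index) Match(query []Peak) (trackID, votes int) {
	type key struct{ track, offset int }
	counts := map[key]int{}
	trackID = -1
	for _, h := range Fingerprint(query) {
		for _, p := range idx[h.Value] {
			k := key{p.TrackID, p.Frame - h.Frame}
			counts[k]++
			if counts[k] > votes {
				trackID, votes = p.TrackID, counts[k]
			}
		}
	}
	return trackID, votes
}

func main() {
	idx := Index{}
	idx.Add(1, []Peak{{0, 40}, {2, 95}, {5, 60}, {9, 120}, {12, 33}, {15, 77}})
	idx.Add(2, []Peak{{0, 10}, {3, 200}, {4, 150}, {8, 20}, {11, 90}, {14, 45}})

	// A clip from the middle of track 1, re-timed to start at frame 0.
	query := []Peak{{0, 95}, {3, 60}, {7, 120}, {10, 33}}
	id, votes := idx.Match(query)
	fmt.Println("best track:", id, "votes:", votes)
}
```

Because each hash encodes only relative timing between two peaks, a clip taken from any point in a track still produces the same hash values; the offset vote is what separates a real match from scattered coincidental collisions. This also illustrates the point made in the thread that such fingerprints behave like exact-match hashes rather than vectors suitable for clustering.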
Shazam vs SoundHound vs others
- Mixed reports on accuracy: some say SoundHound has long been better, especially with humming or noisy input; others find Shazam more reliable in specific tests.
- A SoundHound employee notes that failures are often due to the song not being in the database, and that all services have coverage gaps.
- One user reports mic conflicts between Shazam and SoundHound on iOS.
- Google’s “Now Playing” / “hum to search” and Pixel’s low-power always-listening feature are cited as strong alternatives.
Patents, IP, and naming concerns
- Multiple comments highlight that Shazam’s core algorithm is patented (at least in the US) through ~2025, with worldwide filings.
- Patent vs copyright is clarified: patents cover methods/algorithms, not just copied code; open source is not exempt from patents, and even personal use can infringe in the US.
- There’s debate over prior publication vs patent filing dates, but a provisional patent from 2002 appears to cover the core work.
- Questions are raised about liability for open-source authors outside the US when US users download or run infringing code; the consensus is that deep-pocketed defendants are most at risk, but the law is complex and its details are unclear.
- Several people urge a prompt name change away from “Shazam”-derived branding to reduce trademark and legal risk; alternative names are proposed, and “SoundScout” is popular.
Implementation feedback & polish requests
- Setup is seen as rough: confusing `cd` instructions, MongoDB requirement without clear configuration guidance, and a vulnerable npm dependency tree.
- Suggestions include:
  - Replace or optionally swap MongoDB with SQLite or another embedded DB.
  - Provide Docker/Docker Compose for easy startup.
  - Add direct support for fingerprinting local/WAV files (the author commits to doing this).
  - Improve rate-limiting logic instead of hardcoded sleeps, though some defend sleeps as a simple way to avoid third‑party bans.
- A leaked Google/YouTube API key is spotted; the author disables it.
Miscellaneous points
- Commenters share references to earlier Shazam-related papers, talks, and prior HN threads.
- Some note the broader historical role of audio as an early driver for computing and electronics applications.