2024-08-01

I recreated Shazam’s algorithm with Go

Overall reaction & project scope

Many commenters find the Go implementation impressive, especially as a first substantial Go project.
The repo is seen as a demo of the Shazam-style fingerprinting algorithm, not a production app or hosted service.
Some note that similar reconstructions have existed for years; this is viewed as a fun/new iteration rather than novel research.

Data sources, coverage & “useless without all songs” debate

The system currently uses Spotify links only to fetch metadata, then locates and downloads matching audio from YouTube for fingerprinting.
Several people raise legal and ToS concerns around YouTube/Spotify ripping.
One view: the algorithm is “useless unless you have all songs on earth”; the counterpoint is that the open-source algorithm is valuable for anyone who has or can build their own dataset, including non-music uses.
Suggestions include fingerprinting local files or WAVs directly and building community or shared fingerprint databases. MusicBrainz/AcoustID is mentioned as an existing open fingerprint ecosystem.

Algorithm & technical discussion

Summary of the approach: FFT → derive sparse audio fingerprints → index → similarity search.
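As a rough illustration of the "fingerprint → index → search" steps, here is a minimal Go sketch of the commonly described peak-pairing scheme. It is not taken from the repo under discussion. It assumes spectrogram peaks have already been extracted (the FFT step is omitted), and the names `Peak`, `Entry`, `Index`, `fanOut`, and the 10/10/12-bit hash layout are all hypothetical choices made for the example.

```go
package main

import "fmt"

// Peak is a local maximum in a spectrogram: a frame index and a frequency bin.
// Peaks are assumed to be extracted already and sorted by frame.
type Peak struct {
	Frame int
	Bin   int
}

// Entry records where a hash occurred inside an indexed song.
type Entry struct {
	SongID int
	Frame  int
}

// Fingerprint is one hash plus the frame of its anchor peak.
type Fingerprint struct {
	Hash  uint32
	Frame int
}

// fanOut is a hypothetical setting: how many later peaks each anchor pairs with.
const fanOut = 3

// hashPair packs (anchor bin, target bin, frame delta) into one integer.
// The 10/10/12-bit layout is an illustrative assumption.
func hashPair(a, b Peak) uint32 {
	dt := b.Frame - a.Frame
	return uint32(a.Bin&0x3FF)<<22 | uint32(b.Bin&0x3FF)<<12 | uint32(dt&0xFFF)
}

// fingerprints pairs every peak with the next fanOut peaks.
func fingerprints(peaks []Peak) []Fingerprint {
	var out []Fingerprint
	for i, a := range peaks {
		for j := i + 1; j < len(peaks) && j <= i+fanOut; j++ {
			out = append(out, Fingerprint{hashPair(a, peaks[j]), a.Frame})
		}
	}
	return out
}

// Index maps a hash to every place it occurs in the indexed songs.
type Index map[uint32][]Entry

// Add fingerprints a song's peaks and stores them under songID.
func (idx Index) Add(songID int, peaks []Peak) {
	for _, fp := range fingerprints(peaks) {
		idx[fp.Hash] = append(idx[fp.Hash], Entry{songID, fp.Frame})
	}
}

// Match votes on (song, time offset) pairs and returns the song with the
// most hashes agreeing on a single offset, plus that vote count.
// It returns -1 and 0 when no hash matches.
func (idx Index) Match(query []Peak) (int, int) {
	type key struct{ song, offset int }
	votes := map[key]int{}
	bestSong, bestVotes := -1, 0
	for _, fp := range fingerprints(query) {
		for _, e := range idx[fp.Hash] {
			k := key{e.SongID, e.Frame - fp.Frame}
			votes[k]++
			if votes[k] > bestVotes {
				bestSong, bestVotes = e.SongID, votes[k]
			}
		}
	}
	return bestSong, bestVotes
}

func main() {
	// Made-up peak lists standing in for two songs.
	songA := []Peak{{0, 40}, {2, 90}, {5, 33}, {7, 120}, {9, 61}, {12, 45}}
	songB := []Peak{{0, 15}, {3, 70}, {4, 200}, {8, 18}, {10, 99}, {13, 150}}

	idx := Index{}
	idx.Add(1, songA)
	idx.Add(2, songB)

	// A clip of song A that starts at its frame 5, so clip frames restart at 0.
	clip := []Peak{{0, 33}, {2, 120}, {4, 61}, {7, 45}}
	song, votes := idx.Match(clip)
	fmt.Println("best song:", song, "votes:", votes)
}
```

Voting on the time offset rather than on raw hash hits is what makes the lookup tolerant of clips that start mid-song: matching hashes from the right song line up on one offset, while accidental collisions scatter.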
Linked references cover the short-time Fourier transform, spectrograms, and the tradeoff between time resolution and frequency resolution.
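The resolution tradeoff can be made concrete with a little arithmetic: for an FFT window of N samples at sample rate fs, the bin spacing is fs/N and the window spans N/fs seconds, so the product of the two is always 1. The Go sketch below prints both for a few window sizes. The 44.1 kHz sample rate and the window sizes are assumptions chosen for illustration, and `binWidthHz` and `windowSeconds` are hypothetical helper names.

```go
package main

import "fmt"

// binWidthHz returns the spacing between FFT frequency bins
// for a window of n samples.
func binWidthHz(sampleRate float64, n int) float64 {
	return sampleRate / float64(n)
}

// windowSeconds returns how much audio one window of n samples spans.
func windowSeconds(sampleRate float64, n int) float64 {
	return float64(n) / sampleRate
}

func main() {
	const sampleRate = 44100.0 // assumed sample rate, for illustration only

	// Larger windows give finer frequency bins but blur events in time.
	for _, n := range []int{512, 1024, 4096} {
		fmt.Printf("N=%d  bin width=%.2f Hz  window=%.1f ms\n",
			n, binWidthHz(sampleRate, n), windowSeconds(sampleRate, n)*1000)
	}
}
```

Doubling the window halves the bin width and doubles the time span, so neither resolution can be improved without giving up some of the other.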
Discussion touches on whether fingerprints are closer to hash-like features or something that could support clustering (e.g., artist identification); the author says current fingerprints are not designed as clusterable vectors, and ML would likely be needed for that.

Shazam vs SoundHound vs others

Mixed reports on accuracy: some say SoundHound has long been better, especially with humming or noisy input; others find Shazam more reliable in specific tests.
A SoundHound employee notes that failures are often due to the song not being in the database, and that all services have coverage gaps.
One user reports mic conflicts between Shazam and SoundHound on iOS.
Google’s “Now Playing” / “hum to search” and Pixel’s low-power always-listening feature are cited as strong alternatives.

Patents, IP, and naming concerns

Multiple comments highlight that Shazam’s core algorithm is patented (at least in the US) through ~2025, with worldwide filings.
Patent vs copyright is clarified: patents cover methods/algorithms, not just copied code; open source is not exempt from patents, and even personal use can infringe in the US.
There’s debate over prior publication vs patent filing dates, but a provisional patent from 2002 appears to cover the core work.
Questions are raised about liability for open-source authors outside the US when US users download or run infringing code; the consensus is that deep-pocketed defendants are most at risk, but the law is complex and unclear in its details.
Several people urge a prompt name change away from “Shazam”-derived branding to reduce trademark and legal risk; alternative names are proposed, and “SoundScout” is popular.

Implementation feedback & polish requests

Setup is seen as rough: confusing cd instructions, MongoDB requirement without clear configuration guidance, and a vulnerable npm dependency tree.
Suggestions include:
- Replace or optionally swap MongoDB with SQLite or another embedded DB.
- Provide Docker/Docker Compose for easy startup.
- Add direct support for fingerprinting local/WAV files (the author commits to doing this).
- Improve rate-limiting logic instead of hardcoded sleeps, though some defend sleeps as a simple way to avoid third‑party bans.
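One way the rate-limiting suggestion could look in Go, using only the standard library: a small limiter built on `time.Ticker` that paces outgoing calls from one place instead of scattering fixed sleeps through the code. This is a sketch under assumptions, not the project's code. `Limiter`, `NewLimiter`, `fetchAll`, the interval, and the track IDs are all hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Limiter releases at most one call per interval.
type Limiter struct {
	tick *time.Ticker
}

// NewLimiter creates a limiter with the given spacing between calls.
func NewLimiter(interval time.Duration) *Limiter {
	return &Limiter{tick: time.NewTicker(interval)}
}

// Wait blocks until the next slot is available.
func (l *Limiter) Wait() { <-l.tick.C }

// Stop releases the underlying ticker.
func (l *Limiter) Stop() { l.tick.Stop() }

// fetchAll calls fetch once per id, paced by the limiter,
// and returns how many calls succeeded.
func fetchAll(ids []string, l *Limiter, fetch func(string) error) int {
	ok := 0
	for _, id := range ids {
		l.Wait()
		if err := fetch(id); err == nil {
			ok++
		}
	}
	return ok
}

func main() {
	// Hypothetical pacing: one request every 200 ms.
	limiter := NewLimiter(200 * time.Millisecond)
	defer limiter.Stop()

	start := time.Now()
	ids := []string{"track-1", "track-2", "track-3"} // placeholder IDs
	done := fetchAll(ids, limiter, func(id string) error {
		// Stand-in for a real download or API call.
		fmt.Println("fetching", id, "at", time.Since(start).Round(time.Millisecond))
		return nil
	})
	fmt.Println("succeeded:", done)
}
```

Compared with a fixed sleep after each call, the ticker keeps the spacing roughly constant even when the call itself takes time, and the interval lives in one configurable place. A plain sleep remains the simpler option where that is all that is needed.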
A leaked Google/YouTube API key is spotted; the author disables it.

Miscellaneous points

Commenters share references to earlier Shazam-related papers, talks, and prior HN threads.
Some note the broader historical role of audio as an early driver for computing and electronics applications.
