2025-05-06

Launch HN: Exa (YC S21) – The web as a database

Core idea & positioning

Websets is presented as “web as a database”: search returns entities with structured properties, verified and enriched by LLMs, rather than a traditional SERP.
Commenters compare it to: Perplexity, Gemini Deep Research, Clay, Diffbot, BI tools, Databricks-like “backend for web search,” and “a better interface to the web for LLMs.”
Exa frames its evolution: embeddings-based consumer search → web search for AIs → Websets built on that infrastructure.

Capabilities & current limitations

Strongest on people, companies, research papers, and high‑quality written content (blogs, news, GitHub repos).
Weak or currently unsupported: products/e‑commerce, authenticated/permissioned content, non‑English, images/vision, YouTube/video, and tweets.
JS rendering is supported, but some tags (e.g. Adobe Analytics) may be stripped during parsing.
Geospatial queries may work if locations are treated as enriched columns, but there’s no built‑in map visualization.
Enrichments can add arbitrary columns via agents; users want this available standalone (given their own entity lists).
Some queries expose gaps in semantic understanding: OR logic, exact literal constraints (“letter R”), numeric filters (price caps, subscriber ranges), and precise geolocation.
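The kinds of constraints commenters found unreliable (OR logic, a literal “letter R” condition, price caps, subscriber ranges) are easy to state exactly once results are rows with typed columns. The sketch below is only an illustration of that idea as a client-side post-filter; the `Row` fields, the sample data, and the thresholds are all hypothetical and do not come from Exa's API or from the thread.

```python
from dataclasses import dataclass


@dataclass
class Row:
    # Hypothetical columns; a real result table would define its own.
    name: str
    city: str
    subscribers: int
    price_usd: float


def matches(row: Row) -> bool:
    # OR logic: the row qualifies if EITHER condition holds.
    in_city = row.city in {"Berlin", "Paris"}
    starts_with_r = row.name.upper().startswith("R")
    # Numeric filters: an inclusive subscriber range and a price cap.
    in_range = 1_000 <= row.subscribers <= 50_000
    under_cap = row.price_usd <= 20.0
    return (in_city or starts_with_r) and in_range and under_cap


# Made-up rows, purely for demonstration.
rows = [
    Row("Riverside Weekly", "Austin", 12_000, 8.0),
    Row("Data Digest", "Berlin", 900, 5.0),
    Row("Paris Tech Notes", "Paris", 30_000, 15.0),
    Row("Robotics Monthly", "Tokyo", 40_000, 25.0),
]

filtered = [r.name for r in rows if matches(r)]
print(filtered)
```

Filtering deterministically over enriched columns, rather than relying on the search model to interpret the constraint, is one way such queries could be made precise, at the cost of needing the relevant columns to be populated first.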

Performance and reliability

Multiple reports of searches stuck on “Verifying…”, multi‑minute or hour‑scale waits, and downtime during the HN traffic spike.
The team repeatedly acknowledges the downtime and manually runs sample queries, quoting ~1–3 minutes for tens of verified results.
Users request clearer progress indicators and better handling of partial failure.

UX & developer experience

Praise for the Airtable‑like table and the general concept; criticism of fixed, narrow columns, truncation, and poor tablet layout.
Preview vs full Websets behavior is confusing; unauthenticated users see only a “preview table” that doesn’t showcase verification/enrichment.
Issues raised: a laggy, meaningless WebGL globe; unhelpful errors (especially with JS disabled or after browser errors); a broken feedback form; and results that are hard to share or copy on mobile.
Developer feedback highlights API quirks (an incorrect cURL example, unclear “preview search” semantics) and requests HTML “cruft cleaning” for LLM consumption.

Pricing & access

Credits get consumed quickly; several commenters find $49/month and 8k credits too expensive for light or exploratory use, requesting under‑$10 tiers or pure pay‑as‑you‑go.
Exa cites 2025 AI compute costs as the reason for gating and current pricing.

Ethics, crawling & data

Some worry about AI crawlers ignoring best practices, the server load they create, and broader exploitation of personal data.
Questions about robots.txt and blocking Exa aren’t substantively answered in the thread.
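For context on what the robots.txt question involves, the sketch below shows how a per-crawler block rule is written and how a compliant crawler would interpret it, using Python's standard `urllib.robotparser`. The user-agent name `ExampleBot` and the URLs are placeholders: the thread does not state which user-agent string Exa's crawler sends or whether it honors these rules, so nothing here describes Exa's actual behavior.

```python
from urllib.robotparser import RobotFileParser

# A sample robots.txt: block one named crawler entirely,
# and keep /private/ off limits for everyone else.
# "ExampleBot" is a placeholder, not a real crawler name.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, url: str) -> bool:
    # True if the rules permit this agent to fetch this URL.
    return parser.can_fetch(agent, url)


print(allowed("ExampleBot", "https://example.com/blog/post"))
print(allowed("OtherBot", "https://example.com/blog/post"))
print(allowed("OtherBot", "https://example.com/private/report"))
```

Rules like these are advisory: they only have an effect if the crawler identifies itself consistently and chooses to check them, which is the point the commenters' unanswered questions turn on.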

Overall reception

Strong enthusiasm for the concept and its usefulness for lead gen, research, and tabular “best X for me” queries.
Simultaneously, many note slowness, early‑stage rough edges, and quality limits outside the best‑supported domains.
