Launch HN: Exa (YC S21) – The web as a database

Core idea & positioning

  • Websets, Exa’s product, is pitched as “the web as a database”: a search returns entities with structured properties, verified and enriched by LLMs, rather than a traditional SERP (a ranked list of links).
  • Commenters compare it to Perplexity, Gemini Deep Research, Clay, Diffbot, and BI tools, and describe it as a Databricks‑like “backend for web search” or “a better interface to the web for LLMs.”
  • Exa frames its evolution: embeddings-based consumer search → web search for AIs → Websets built on that infrastructure.

Capabilities & current limitations

  • Strongest on people, companies, research papers, and high‑quality written content (blogs, news, GitHub repos).
  • Weak or currently unsupported: products/e‑commerce, authenticated/permissioned content, non‑English, images/vision, YouTube/video, and tweets.
  • JS rendering is supported, but some tags (e.g. Adobe Analytics) may be stripped during parsing.
  • Geospatial queries may work if locations are treated as enriched columns, but there’s no built‑in map viz.
  • Enrichments can add arbitrary columns via agents; users want this available standalone (given their own entity lists).
  • Some queries expose gaps in handling exact constraints: OR logic, character‑level filters (“letter R”), numeric filters (price caps, subscriber ranges), and precise geolocation.
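
One way to read the gaps above: OR logic, numeric ranges, and letter‑based constraints are exact predicates, which are easier to guarantee when applied deterministically to already‑extracted columns than when left to semantic matching. The sketch below is a hypothetical illustration, not Exa's API; the `Row` fields (`name`, `price_usd`, `subscribers`), the helper names, and the sample data are all invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Row:
    # Hypothetical enriched columns; not a real Websets schema.
    name: str
    price_usd: float
    subscribers: int


Predicate = Callable[[Row], bool]


def any_of(*preds: Predicate) -> Predicate:
    """OR: the row passes if at least one predicate holds."""
    return lambda row: any(p(row) for p in preds)


def all_of(*preds: Predicate) -> Predicate:
    """AND: the row passes only if every predicate holds."""
    return lambda row: all(p(row) for p in preds)


def starts_with(letter: str) -> Predicate:
    """Case-insensitive check on the first letter(s) of the name."""
    return lambda row: row.name.upper().startswith(letter.upper())


def price_at_most(cap: float) -> Predicate:
    return lambda row: row.price_usd <= cap


def subscribers_between(low: int, high: int) -> Predicate:
    """Inclusive range filter on the subscriber count."""
    return lambda row: low <= row.subscribers <= high


def filter_rows(rows: Iterable[Row], pred: Predicate) -> List[Row]:
    return [row for row in rows if pred(row)]


# Invented sample data standing in for an enriched result table.
ROWS = [
    Row("Rustacean Weekly", 5.0, 12_000),
    Row("Data Digest", 20.0, 50_000),
    Row("Retro Computing", 15.0, 800),
    Row("ML Notes", 8.0, 3_000),
]

# "(name starts with R OR price <= $10) AND 1k to 20k subscribers"
query = all_of(
    any_of(starts_with("R"), price_at_most(10.0)),
    subscribers_between(1_000, 20_000),
)
matches = filter_rows(ROWS, query)
print([row.name for row in matches])  # ['Rustacean Weekly', 'ML Notes']
```

The design choice here is to let the LLM do extraction (filling the columns) and keep the boolean and numeric logic in ordinary code, so a price cap or an OR clause cannot be "approximately" satisfied.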

Performance and reliability

  • Multiple reports of searches stuck on “Verifying…”, multi‑minute or hour‑scale waits, and downtime during the HN traffic spike.
  • The team repeatedly acknowledges the downtime and manually runs sample queries, quoting roughly 1–3 minutes for tens of verified results.
  • Users request clearer progress indicators and better handling of partial failure.
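
The requests for progress indicators and graceful partial failure can be sketched as a batch loop that reports progress after every item and records per‑item errors instead of aborting. This is a minimal illustration under stated assumptions, not how Exa's pipeline works: `verify_batch`, `BatchReport`, and `fake_verify` are hypothetical names, and the verifier here is a stand‑in for whatever LLM check a real system would run.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class BatchReport:
    verified: List[str] = field(default_factory=list)     # passed the check
    rejected: List[str] = field(default_factory=list)     # checked, did not match
    failed: Dict[str, str] = field(default_factory=dict)  # item -> error message


def verify_batch(
    items: Iterable[str],
    verify: Callable[[str], bool],
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> BatchReport:
    """Verify each item; one failing item never discards the others."""
    items = list(items)
    report = BatchReport()
    for done, item in enumerate(items, start=1):
        try:
            if verify(item):
                report.verified.append(item)
            else:
                report.rejected.append(item)
        except Exception as exc:  # record the failure and keep going
            report.failed[item] = str(exc)
        if on_progress is not None:
            on_progress(done, len(items))  # e.g. drive a "3/40 verified" label
    return report


def fake_verify(url: str) -> bool:
    """Hypothetical verifier: simulates a timeout and a simple match rule."""
    if "timeout" in url:
        raise TimeoutError("page fetch timed out")
    return url.endswith(".org")


progress: List[str] = []
report = verify_batch(
    ["a.org", "b.com", "timeout.org", "c.org"],
    fake_verify,
    on_progress=lambda done, total: progress.append(f"{done}/{total}"),
)
print(report.verified)  # ['a.org', 'c.org']
print(report.failed)    # {'timeout.org': 'page fetch timed out'}
print(progress[-1])     # 4/4
```

With a report shaped like this, a UI could show verified rows immediately and list failed items separately for retry, rather than sitting on “Verifying…”.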

UX & developer experience

  • Praise for the Airtable‑like table and the general concept; criticism of fixed, narrow columns, truncation, and poor tablet layout.
  • The difference between preview and full Websets behavior is confusing; unauthenticated users see only a “preview table” that doesn’t showcase verification/enrichment.
  • Issues raised: a laggy, meaningless WebGL globe; unhelpful error messages (especially with JS disabled or after browser errors); a broken feedback form; and results that are hard to share or copy on mobile.
  • Developer feedback highlights API quirks (an incorrect cURL example, unclear “preview search” semantics) and a desire for HTML “cruft cleaning” so that pages are easier for LLMs to consume.
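
To make the “cruft cleaning” request concrete, here is a minimal sketch using only Python's standard‑library `html.parser`. It drops the contents of elements that are usually noise for an LLM and keeps the remaining visible text. The `SKIP_TAGS` set and the `clean_html` helper are assumptions made for this example, not anything Exa provides, and the approach assumes reasonably well‑formed markup (unbalanced tags would confuse the depth counter).

```python
from html.parser import HTMLParser

# Elements whose contents are treated as noise; this list is an assumption.
SKIP_TAGS = {"script", "style", "noscript", "nav", "footer", "header", "aside", "form"}


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = " ".join(data.split())  # collapse runs of whitespace
            if text:
                self._chunks.append(text)

    def text(self) -> str:
        return " ".join(self._chunks)


def clean_html(html: str) -> str:
    """Return the visible text of `html` with scripts, styles, and page chrome removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


SAMPLE = """
<html><head><title>Post</title><style>p { color: red; }</style></head>
<body>
<nav><a href="/">Home</a></nav>
<h1>Web as a database</h1>
<p>Search returns <b>entities</b> with properties.</p>
<script>trackPageView();</script>
<footer>Copyright</footer>
</body></html>
"""

print(clean_html(SAMPLE))
```

A production cleaner would likely also preserve headings, links, and tables in a structured form such as Markdown; this sketch only shows the stripping step.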

Pricing & access

  • Credits get consumed quickly; several commenters find $49/month and 8k credits too expensive for light or exploratory use, requesting under‑$10 tiers or pure pay‑as‑you‑go.
  • Exa cites 2025 AI compute costs as the reason for gating and current pricing.

Ethics, crawling & data

  • Some worry about AI crawlers ignoring best practices, server load, and broader personal‑data exploitation.
  • Questions about robots.txt and blocking Exa aren’t substantively answered in the thread.

Overall reception

  • Strong enthusiasm for the concept and its usefulness for lead gen, research, and tabular “best X for me” queries.
  • Simultaneously, many note slowness, early‑stage rough edges, and quality limits outside the best‑supported domains.