Launch HN: Exa (YC S21) – The web as a database
Core idea & positioning
- Websets is presented as “web as a database”: search returns entities with structured properties, verified and enriched by LLMs, rather than a traditional SERP.
- Commenters compare it to Perplexity, Gemini Deep Research, Clay, Diffbot, BI tools, a Databricks-like “backend for web search,” and “a better interface to the web for LLMs.”
- Exa frames its evolution: embeddings-based consumer search → web search for AIs → Websets built on that infrastructure.
Capabilities & current limitations
- Strongest on people, companies, research papers, and high‑quality written content (blogs, news, GitHub repos).
- Weak or currently unsupported: products/e‑commerce, authenticated/permissioned content, non‑English, images/vision, YouTube/video, and tweets.
- JS rendering is supported, but some tags (e.g. Adobe Analytics) may be stripped during parsing.
- Geospatial queries may work if locations are treated as enriched columns, but there’s no built‑in map viz.
- Enrichments can add arbitrary columns via agents; users want this available standalone (given their own entity lists).
- Some queries expose gaps in semantic understanding: OR logic, lexical and numeric constraints (“letter R”, price caps, subscriber ranges), and precise geolocation.
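The filter gaps in the last bullet (OR logic, letter constraints, numeric ranges) are constraints that can be stated as explicit predicates once results exist as structured rows. A minimal sketch, assuming results have been exported as a list of dicts; the column names (`name`, `price_usd`, `subscribers`), the sample rows, and the helper functions are all hypothetical and do not reflect Exa's schema or API:

```python
from typing import Callable, Optional

Row = dict[str, object]
Predicate = Callable[[Row], bool]


def any_of(*preds: Predicate) -> Predicate:
    """OR: the row passes if at least one predicate holds."""
    return lambda row: any(p(row) for p in preds)


def all_of(*preds: Predicate) -> Predicate:
    """AND: the row passes only if every predicate holds."""
    return lambda row: all(p(row) for p in preds)


def in_range(column: str, lo: Optional[float] = None,
             hi: Optional[float] = None) -> Predicate:
    """Inclusive numeric range check; missing or non-numeric values fail."""
    def check(row: Row) -> bool:
        value = row.get(column)
        if not isinstance(value, (int, float)):
            return False
        if lo is not None and value < lo:
            return False
        if hi is not None and value > hi:
            return False
        return True
    return check


def starts_with(column: str, letter: str) -> Predicate:
    """Case-insensitive check on the first letter of a text column."""
    def check(row: Row) -> bool:
        value = row.get(column)
        return isinstance(value, str) and value.upper().startswith(letter.upper())
    return check


# Hypothetical rows standing in for an exported result table.
rows: list[Row] = [
    {"name": "Rivet", "price_usd": 40, "subscribers": 12000},
    {"name": "Acme", "price_usd": 120, "subscribers": 800},
    {"name": "Rook", "price_usd": 300, "subscribers": 50000},
]

# "Starts with R, and (costs at most $50 OR has at least 20k subscribers)".
query = all_of(
    starts_with("name", "R"),
    any_of(in_range("price_usd", hi=50), in_range("subscribers", lo=20000)),
)
matches = [row["name"] for row in rows if query(row)]
print(matches)
```

Applying such predicates deterministically after enrichment, rather than asking the model to interpret them inside the query, is one possible way to sidestep the gaps commenters describe; the thread does not say how Exa handles this internally.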
Performance and reliability
- Multiple reports of searches stuck on “Verifying…”, multi‑minute or hour‑scale waits, and downtime during the HN traffic spike.
- Team repeatedly acknowledges being down and manually runs sample queries, quoting ~1–3 minutes for tens of verified results.
- Users request clearer progress indicators and better handling of partial failure.
UX & developer experience
- Praise for the Airtable‑like table and the general concept; criticism of fixed, narrow columns, truncation, and poor tablet layout.
- Preview vs full Websets behavior is confusing; unauthenticated users see only a “preview table” that doesn’t showcase verification/enrichment.
- Issues raised: a laggy, purely decorative WebGL globe; unhelpful error messages (especially with JS disabled or on browser errors); a broken feedback form; and difficulty sharing or copying results on mobile.
- Developer feedback highlights API quirks (an incorrect cURL example, unclear “preview search” semantics) and asks for HTML “cruft cleaning” so pages are easier for LLMs to consume.
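The “cruft cleaning” request can be illustrated with a small standard-library sketch that drops scripts, styles, and navigation chrome before text is handed to an LLM. This is an assumption about what commenters mean by the term, not a description of Exa's parser; the tag list in `SKIP_TAGS` and the function name `clean_html` are illustrative choices:

```python
from html.parser import HTMLParser

# Elements whose contents are treated as cruft (an illustrative choice).
SKIP_TAGS = {"script", "style", "noscript", "nav", "header", "footer", "aside"}


class TextExtractor(HTMLParser):
    """Collects visible text, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self) -> str:
        return "\n".join(self._chunks)


def clean_html(html: str) -> str:
    """Return the visible text of `html`, one text chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = """<html><head><style>p{color:red}</style><script>track()</script></head>
<body><nav>Home | About</nav><article><h1>Title</h1><p>Body &amp; text.</p></article>
<footer>Copyright Example</footer></body></html>"""

print(clean_html(sample))
```

The sketch relies on skipped elements being properly closed; on malformed pages an unclosed `<nav>` would swallow the rest of the document, so a production cleaner would need a more forgiving parser.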
Pricing & access
- Credits are consumed quickly; several commenters find $49/month for 8k credits too expensive for light or exploratory use and request under‑$10 tiers or pure pay‑as‑you‑go pricing.
- Exa cites 2025 AI compute costs as the reason for gating and current pricing.
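To put the pricing complaint in concrete terms, the quoted plan can be reduced to a per-credit rate. A minimal sketch, assuming a hypothetical cheaper tier priced linearly at the same per-credit rate, which the thread does not confirm; how many credits a single search consumes is also not stated, so it is left out:

```python
def cost_per_credit(monthly_price_usd: float, credits: int) -> float:
    """Effective price of one credit on a flat monthly plan."""
    return monthly_price_usd / credits


def credits_for_budget(budget_usd: float, monthly_price_usd: float,
                       credits: int) -> int:
    """Whole credits a budget would buy at the plan's per-credit rate.

    Assumes linear pricing, which is a hypothetical, not a published tier.
    """
    return int(budget_usd * credits / monthly_price_usd)


rate = cost_per_credit(49, 8000)
small_tier = credits_for_budget(10, 49, 8000)
print(f"${rate:.6f} per credit; $10 would buy {small_tier} credits")
```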
Ethics, crawling & data
- Some worry about AI crawlers ignoring crawling best practices, the server load they impose, and broader exploitation of personal data.
- Questions about robots.txt and blocking Exa aren’t substantively answered in the thread.
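For site owners, the robots.txt question comes down to whether a given crawler's user agent is disallowed. The sketch below uses Python's standard `urllib.robotparser` to evaluate a rule set locally. The user agent `ExampleBot` is a placeholder: the thread does not state which user agent Exa's crawler sends or whether it honors robots.txt.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Evaluate robots.txt rules for one user agent and URL, offline."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: block one named crawler entirely, and keep
# every other crawler out of /private/ only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""

blocked_bot = is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/blog/post")
other_bot = is_allowed(ROBOTS_TXT, "OtherBot", "https://example.com/blog/post")
other_private = is_allowed(ROBOTS_TXT, "OtherBot", "https://example.com/private/x")
print(blocked_bot, other_bot, other_private)
```

A check like this only shows what the rules say; whether a crawler respects them is up to the crawler, which is the concern raised in the thread.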
Overall reception
- Strong enthusiasm for the concept and its usefulness for lead gen, research, and tabular “best X for me” queries.
- Simultaneously, many note slowness, early‑stage rough edges, and quality limits outside the best‑supported domains.