Reverse geocoding is hard
Scope of the Problem
- Several commenters say reverse geocoding is less a pure “coordinates → text” task and more a messy UX / data-modelling challenge.
- Real-world geography is irregular: rivers, parks, complex sites, borders and unofficial local names all break naive assumptions.
- One commenter frames mapping raw coordinates to meaningful features as a sparse, highly non‑linear inference problem; some go further and call it “AI‑complete”, adding that LLMs still perform poorly at it.
Distance, Routing & UX
- Straight-line distance often misleads (e.g. a bench or school just across a river may require a long ferry wait); travel time or routing distance is more meaningful but costlier to compute and can still mislead.
- Tools and techniques suggested: OSRM, ArcGIS, Google Routes API, GraphHopper, Valhalla, and isochrones (areas reachable within a given travel time).
- Some argue many apps could sidestep textual descriptions entirely with “drop a pin” UX, as seen in ride‑hailing apps.
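
The gap between straight-line and travel-based distance can be illustrated with a minimal sketch. The haversine formula is standard; the two schools, their coordinates, and the travel times are made-up values for illustration, standing in for whatever a routing engine would return.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle (straight-line) distance in metres between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Hypothetical data: (name, lat, lon, travel_minutes).
# travel_minutes is an assumed routing result, not computed here.
user = (51.5000, -0.1000)
schools = [
    ("School across the river", 51.5020, -0.1000, 45.0),
    ("School on the same bank", 51.4950, -0.1000, 8.0),
]

# "Nearest" depends entirely on which notion of distance is used.
nearest_by_line = min(schools, key=lambda s: haversine_m(*user, s[1], s[2]))
nearest_by_time = min(schools, key=lambda s: s[3])

print(nearest_by_line[0])
print(nearest_by_time[0])
```

With these assumed numbers the two rankings disagree: the school across the river is closer as the crow flies, while the one on the same bank is closer in travel time.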
Data Sources & Technical Approaches
- Heavy use of OpenStreetMap; approaches include:
  - Local reverse geocoders over OSM (SQLite + radius search; Elasticsearch + polygons; PostGIS point-in-polygon).
  - Global admin-boundary datasets (e.g. GADM) to derive country/region/city labels.
- GeoNames is praised as a longstanding POI/cities dataset with stable IDs, but criticized for slow updates and a lack of fine-grained, current places.
- Other techniques: S2 cells, geohashes, bounding boxes/polygons, bespoke hierarchies (e.g. Disneyland lands/rides, UK-wide isochrones).
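
The point-in-polygon approach mentioned above can be sketched in plain Python. This is a toy illustration, not how any commenter's system works: the region names and coordinates are invented, the ray-casting test treats coordinates as planar, and a real deployment would more likely delegate this to PostGIS or another spatial index.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the polygon given by ring?

    ring is a list of (lon, lat) vertices; the last vertex is implicitly
    joined back to the first.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def bounding_box(ring):
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def reverse_geocode(lon, lat, regions):
    """Return the name of the first region containing the point, else None."""
    for name, ring in regions:
        min_x, min_y, max_x, max_y = bounding_box(ring)
        # Cheap bounding-box rejection before the exact polygon test.
        if not (min_x <= lon <= max_x and min_y <= lat <= max_y):
            continue
        if point_in_polygon(lon, lat, ring):
            return name
    return None


# Hypothetical regions in toy planar coordinates.
REGIONS = [
    ("Old Town", [(0, 0), (10, 0), (10, 10), (0, 10)]),
    ("Riverside Park", [(10, 0), (20, 0), (10, 10)]),
]

print(reverse_geocode(5, 5, REGIONS))
print(reverse_geocode(12, 2, REGIONS))
print(reverse_geocode(30, 30, REGIONS))
```

Returning `None` for points outside every polygon is a deliberate choice here: it makes the "no meaningful label" case explicit instead of falling back to the nearest region.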
Address Models & Human Meaning
- Huge variation in address formats (grid streets vs. China-style hierarchies vs. amenity-based directions in India).
- People often give directions using landmarks and paths, not formal street names, especially in parks or complex venues.
- Genealogy and delivery/logistics highlight ambiguity: same name in multiple places, overlapping administrative and postal systems, and multiple “official” street names/aliases.
Time, Datums & Moving Earth
- Coordinates themselves change due to tectonic drift and evolving coordinate reference systems (e.g. Australian and Japanese datum shifts after continental motion/earthquakes).
- Suggestion: always store a timestamp and/or the CRS alongside coordinates; otherwise old data becomes ambiguous.
- Debated database strategies: validity ranges per record, temporal tables, audit logs vs. periodic snapshots; time-interval queries are noted as hard at scale.
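
One of the debated strategies, validity ranges per record combined with an explicit CRS, might look like the following sketch. The `PlaceRecord` type, its field names, and the sample street names and dates are all hypothetical; the half-open interval convention (`valid_to` excluded) is an assumption made here, not something the thread settled on.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class PlaceRecord:
    name: str
    lat: float
    lon: float
    crs: str                   # stored explicitly, e.g. "EPSG:4326"
    valid_from: date           # first day the record applies
    valid_to: Optional[date]   # first day it no longer applies; None = open


def records_valid_on(records: List[PlaceRecord], when: date) -> List[PlaceRecord]:
    """Return the records whose validity interval [valid_from, valid_to) covers `when`."""
    return [
        r for r in records
        if r.valid_from <= when and (r.valid_to is None or when < r.valid_to)
    ]


# Hypothetical history: the same location was renamed in 2015.
RECORDS = [
    PlaceRecord("Old Mill Road", 10.0, 20.0, "EPSG:4326",
                date(1990, 1, 1), date(2015, 6, 1)),
    PlaceRecord("Harbour Street", 10.0, 20.0, "EPSG:4326",
                date(2015, 6, 1), None),
]

print([r.name for r in records_valid_on(RECORDS, date(2010, 1, 1))])
print([r.name for r in records_valid_on(RECORDS, date(2020, 1, 1))])
```

The linear scan is only for illustration; the scaling difficulty commenters raise concerns exactly this kind of interval query once the table is large.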
Proprietary Schemes & Alternatives
- What3Words is discussed: compact but proprietary, with homophone and offensive-word concerns; not seen as solving the “human-understandable context” problem.
- Google Plus Codes and services like map.name are mentioned as coordinate encodings, but these still need translation into meaningful descriptions.
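
To make the "encoding, not description" point concrete, here is a simplified sketch of the standard 10-digit Plus Code (Open Location Code) encoding. It is not the reference library: the function name is made up, and padding, shortened codes, and the finer grid refinement beyond 10 digits are deliberately left out.

```python
import math

ALPHABET = "23456789CFGHJMPQRVWX"  # the 20 Open Location Code digits
PAIRS = 5                 # 5 latitude/longitude pairs = 10 digits
STEPS_PER_DEGREE = 8000   # the finest pair resolves 1/8000 of a degree


def encode_plus_code(lat, lon):
    """Encode a WGS84 point as a 10-digit Plus Code (simplified sketch)."""
    # Shift into non-negative ranges; clamp latitude, wrap longitude.
    lat_steps = math.floor((lat + 90.0) * STEPS_PER_DEGREE)
    lat_steps = max(0, min(lat_steps, 180 * STEPS_PER_DEGREE - 1))
    lon_steps = math.floor(((lon + 180.0) % 360.0) * STEPS_PER_DEGREE)

    digits = []
    for _ in range(PAIRS):
        # Peel off base-20 digits, least significant pair first.
        digits.append(ALPHABET[lon_steps % 20])
        digits.append(ALPHABET[lat_steps % 20])
        lat_steps //= 20
        lon_steps //= 20

    # Reversing yields lat, lon, lat, lon, ... from most significant pair.
    code = "".join(reversed(digits))
    return code[:8] + "+" + code[8:]


print(encode_plus_code(47.365590, 8.524997))  # 8FVC9G8F+6X
```

The result is compact and reversible, but it says nothing about the street, neighbourhood, or landmark at that point, which is the gap commenters highlight.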
“Good Enough” vs. Perfection
- Some argue most users only need neighborhood‑level clarity and can rely on maps for exact points; edge cases can be fixed later or with user feedback.
- Others push back, saying global map data is highly dynamic, edge cases are pervasive, and “good enough” often means broken behavior in real applications (e.g. emergency dispatch, school allocation, disputed territories).