All the data can be yours: reverse engineering APIs

Ethics and “Internet Citizenship”

  • Strong split between “ask permission first” and “if it’s on the public internet, it’s fair game.”
  • Pro‑permission side: reverse‑engineered clients can unintentionally hammer expensive endpoints; operators pay the infra bill; unauthorized use is seen as abusive unless negotiated; ToS and rate limits matter.
  • Pro‑interoperability side: users should be free to choose clients; HTTP is a general interface, not tied to a specific app; responsibility is on operators to rate‑limit, return proper status codes, and design resilient systems.
  • Many argue that adversarial interoperability and user‑controlled agents are core to the web, and overly locked‑down APIs are socially harmful and often anticompetitive.
  • Some mention a pragmatic middle ground: be gentle (rate limiting, caching, “human‑like” traffic), respond to complaints, and avoid clearly private data.
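
The "be gentle" advice (rate limiting plus caching) can be sketched as a small wrapper around whatever function actually fetches a URL. This is a minimal illustration, not anything posted in the thread: the class name PoliteClient, the 2-second interval, and the 5-minute cache lifetime are all assumptions chosen for the example.

```python
import time
import urllib.request
from typing import Callable, Dict, Optional, Tuple


class PoliteClient:
    """Wraps a fetch function with a minimum request interval and a TTL cache."""

    def __init__(
        self,
        fetch: Callable[[str], bytes],
        min_interval: float = 2.0,   # assumed: seconds between upstream requests
        cache_ttl: float = 300.0,    # assumed: seconds a cached body stays fresh
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._fetch = fetch
        self._min_interval = min_interval
        self._cache_ttl = cache_ttl
        self._clock = clock
        self._sleep = sleep
        self._last_request: Optional[float] = None
        self._cache: Dict[str, Tuple[float, bytes]] = {}

    def get(self, url: str) -> bytes:
        now = self._clock()
        cached = self._cache.get(url)
        if cached is not None and now - cached[0] < self._cache_ttl:
            return cached[1]  # fresh cache hit: no upstream traffic at all

        # Space out upstream requests so the server never sees a burst.
        if self._last_request is not None:
            wait = self._min_interval - (now - self._last_request)
            if wait > 0:
                self._sleep(wait)

        body = self._fetch(url)
        self._last_request = self._clock()
        self._cache[url] = (self._last_request, body)
        return body


def urllib_fetch(url: str) -> bytes:
    """One possible fetch function, using only the standard library."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read()
```

Passing the clock and sleep functions in as parameters is a design choice that keeps the throttling logic testable without real waiting; in normal use the defaults apply and a client is created as PoliteClient(urllib_fetch).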

Technical Techniques and Tools

  • Common entry points: browser devtools, replaying network requests, looking for JSON/GraphQL/OpenAPI specs, and treating first‑party clients as just another consumer.
  • For mobile: static inspection with strings and nm, MITM proxies (Charles, Burp, mitmproxy, HTTP Toolkit), Frida, apk‑mitm, pre‑rooted emulators, and dockerized Android setups.
  • For tougher targets: bypass TLS pinning, use full browsers (Puppeteer/Playwright) to survive JS challenges, or rely on phone farms / residential IPs to evade datacenter blocks.
  • For websockets: wsrepl, websocat, Burp’s websocket tools, and binary‑format DSLs like Kaitai Struct or ImHex’s language.
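
The first entry point above, replaying a request seen in browser devtools, might look like the following sketch. The URL, headers, and the CAPTURED dictionary layout are hypothetical placeholders; in practice the values would be copied from the network panel for one request the first-party client made.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

# Hypothetical capture: every value here is a placeholder, not a real API.
CAPTURED: Dict[str, Any] = {
    "method": "GET",
    "url": "https://example.invalid/api/v1/items?page=1",
    "headers": {
        "Accept": "application/json",
        "User-Agent": "my-client/0.1 (contact: me@example.invalid)",
    },
    "body": None,
}


def build_request(captured: Dict[str, Any]) -> urllib.request.Request:
    """Turn a captured request description into a urllib Request."""
    body: Optional[bytes] = None
    if captured.get("body") is not None:
        body = json.dumps(captured["body"]).encode("utf-8")
    headers = dict(captured.get("headers", {}))
    # Add a JSON content type only if the capture did not already carry one.
    if body is not None and not any(k.lower() == "content-type" for k in headers):
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        captured["url"], data=body, headers=headers, method=captured["method"]
    )


def replay(captured: Dict[str, Any], timeout: float = 10.0) -> Any:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(captured), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Splitting construction (build_request) from sending (replay) makes it possible to inspect exactly what would go over the wire before any traffic reaches the server.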

Arms Race and Operational Concerns

  • Many note an ongoing cat‑and‑mouse game: bot detection, device attestation, fingerprinting, and WAFs vs increasingly sophisticated scrapers and “browser‑as‑a‑bot” setups.
  • Some operators prefer blocking or throttling; others favor poisoning scraper data, though poisoning is acknowledged to be harder.
  • Posters stress that poorly written scrapers without backoff cause real pain for on‑call teams and can trigger expensive incidents.
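
The backoff that posters ask for is commonly implemented as exponential backoff with jitter, honoring a Retry-After header when the server sends one. The sketch below is one such variant, not a method described in the thread; the function name, the set of retryable status codes, and the default delays are assumptions. A response is modelled as a plain tuple so the example does not depend on any particular HTTP library.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# (status_code, headers, body); header names are assumed to be normalized
# by the caller so that "Retry-After" can be looked up directly.
Response = Tuple[int, Mapping[str, str], bytes]

# Assumed set of statuses worth retrying: rate limiting and server errors.
RETRYABLE = {429, 500, 502, 503, 504}


def fetch_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> Response:
    """Call send() until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status not in RETRYABLE:
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # out of attempts: do not sleep before giving up

        retry_after = headers.get("Retry-After", "")
        if retry_after.isdigit():
            # The server said how long to wait (delta-seconds form only;
            # the HTTP-date form of Retry-After is not handled here).
            delay = float(retry_after)
        else:
            # Full jitter: a random fraction of the capped exponential delay.
            delay = min(max_delay, base_delay * 2 ** attempt) * rng()
        sleep(delay)

    raise RuntimeError(f"gave up after {max_attempts} attempts")
```

The jitter matters when many copies of a scraper fail at the same moment: without it they would all retry in lockstep and hit the recovering server together.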

Real‑World Anecdotes and Use Cases

  • Numerous stories: student portals, grade systems, sports leagues, streaming services, trading platforms, conference apps, government APIs, and deprecated online games.
  • Motivations include: building better UIs, getting structured data, enabling research, keeping dead services alive, and unifying fragmented vendor ecosystems.
  • Several people report that operators eventually changed rate limits or made legal threats, while others describe occasional cooperation or even “officialization” of their tools.