Tracking supermarket prices with Playwright
Personalized pricing and coupons
- Commenters note that app-based, individualized supermarket coupons create large price differences between shoppers, which makes tracking a single shelf price harder.
- Tools like browser extensions and simple bookmarklets auto-clip coupons, occasionally producing “surprise” savings.
- Some see this as deliberate price obfuscation that undermines transparent comparisons.
Scraping strategies and tooling
- Many commenters run similar projects for groceries, beauty products, contact lenses, and other goods across multiple countries.
- Approaches include Playwright/Puppeteer, plain HTTP+JSON, mitmproxy/HAR capture, Node-RED, curl-impersonate, and Cloudflare’s Browser Rendering API.
- Several prefer sniffing JSON APIs via the browser’s network activity to avoid brittle DOM parsing, falling back to HTML or even images+OCR when needed.
- Some separate “scrape raw data” and “parse/transform” into different stages; storing raw HTML/JSON lets them fix parsers later.
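The two-stage idea from the last bullet can be sketched in Python with only the standard library. This is an illustration under stated assumptions, not any commenter's actual pipeline: the function names `save_raw` and `parse_raw`, the retailer name `examplemart`, and the payload shape (`products`, `name`, `price`) are all hypothetical. The body passed to `save_raw` could come from any of the fetchers listed above.

```python
import json
import tempfile
from datetime import datetime, timezone
from decimal import Decimal
from pathlib import Path


def save_raw(raw_dir: Path, retailer: str, body: str, fetched_at: datetime) -> Path:
    """Stage 1: write the response body to disk untouched, keyed by retailer and time."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    path = raw_dir / f"{retailer}-{fetched_at:%Y%m%dT%H%M%SZ}.json"
    path.write_text(body, encoding="utf-8")
    return path


def parse_raw(path: Path) -> list[dict]:
    """Stage 2: turn one stored payload into flat price rows.

    The "products", "name", and "price" keys are a made-up payload shape;
    a real retailer's JSON will differ.
    """
    payload = json.loads(path.read_text(encoding="utf-8"))
    rows = []
    for item in payload.get("products", []):
        # Go through Decimal so "3.49" becomes exactly 349 cents.
        cents = int(Decimal(str(item["price"])) * 100)
        rows.append({"name": item["name"].strip(), "price_cents": cents})
    return rows


if __name__ == "__main__":
    sample = '{"products": [{"name": " Milk 1L ", "price": "3.49"}]}'
    with tempfile.TemporaryDirectory() as tmp:
        stored = save_raw(Path(tmp) / "raw", "examplemart", sample,
                          datetime.now(timezone.utc))
        print(parse_raw(stored))  # [{'name': 'Milk 1L', 'price_cents': 349}]
```

Because stage 2 reads only from disk, a parser bug discovered weeks later can be fixed and replayed over every stored file without hitting the retailer again.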
Bot blocking, anti-scraping, and data poisoning
- Bot protection (Akamai, Cloudflare, residential-IP checks) is a major challenge; people randomize request timings, route traffic through proxies or Tailscale, or give up altogether because of cost and maintenance.
- Some sites quietly “poison” scrapers (e.g., fake repeated items, wrong prices, stale snapshots) instead of hard blocking.
- There is discussion of sanity checks (price change thresholds, product count ranges) to detect obviously bad results.
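The sanity checks mentioned in the last bullet might look roughly like the following. It is a minimal sketch, assuming each scrape is reduced to a mapping from a product identifier to a price in cents; the function name `check_snapshot` and the default thresholds are illustrative guesses, not values anyone in the discussion reported using.

```python
def check_snapshot(
    previous: dict[str, int],
    current: dict[str, int],
    *,
    max_change: float = 0.5,        # flag moves larger than 50% (assumed default)
    min_count_ratio: float = 0.8,   # flag if the catalogue shrinks below 80%
    max_count_ratio: float = 1.25,  # flag if the catalogue grows above 125%
) -> list[str]:
    """Compare two snapshots and return warnings; an empty list means nothing looked off."""
    warnings = []

    # Product count range: a sudden drop or jump often means a partial or fake page.
    if previous:
        ratio = len(current) / len(previous)
        if not (min_count_ratio <= ratio <= max_count_ratio):
            warnings.append(
                f"product count changed from {len(previous)} to {len(current)}"
            )

    # Price change threshold: compare only products present in both snapshots.
    for sku, old in previous.items():
        new = current.get(sku)
        if new is None or old <= 0:
            continue
        change = abs(new - old) / old
        if change > max_change:
            warnings.append(f"{sku}: price moved {change:.0%} ({old} -> {new})")

    return warnings


if __name__ == "__main__":
    yesterday = {"a": 100, "b": 200, "c": 300, "d": 400, "e": 500}
    today = {"a": 100, "b": 210, "c": 900, "d": 400, "e": 500}
    print(check_snapshot(yesterday, today))  # ['c: price moved 200% (300 -> 900)']
```

Checks like these only catch obviously bad results. A stale snapshot that repeats yesterday's prices exactly would pass, so a flagged run is better quarantined for review than silently dropped or trusted.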
Matching products and measuring value
- A recurring hard problem is mapping “the same” product across retailers: names, brands, and sizes are inconsistent, and some SKUs are retailer-specific, apparently to foil comparisons.
- Approaches range from fuzzy string matching plus brand extraction, to manual curation, to tentative use of LLMs or embeddings, with mixed results.
- Some systems track both shelf price and price per unit to reveal shrinkflation, though the per-unit figure is not always shown by default.
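As a rough illustration of the simpler end of these approaches, the sketch below combines fuzzy name matching (via the standard library's `difflib.SequenceMatcher`) with a per-unit price. Everything here is an assumption made for the example: the helper names, the 0.8 similarity threshold, the product names, and the size regex, which only understands a single `g`/`kg`/`ml`/`l` quantity and ignores multipacks such as "6 x 330ml". It does no brand extraction.

```python
import re
from difflib import SequenceMatcher

# Matches one quantity such as "1kg", "500 g", or "1.5 L" (multipacks are not handled).
_SIZE = re.compile(r"(\d+(?:\.\d+)?)\s*(kg|g|ml|l)\b", re.IGNORECASE)
_TO_BASE = {"g": ("g", 1), "kg": ("g", 1000), "ml": ("ml", 1), "l": ("ml", 1000)}


def parse_size(name: str):
    """Return (amount, base_unit) in grams or millilitres, or None if no size is found."""
    match = _SIZE.search(name)
    if match is None:
        return None
    unit, factor = _TO_BASE[match.group(2).lower()]
    return float(match.group(1)) * factor, unit


def normalize(name: str) -> str:
    """Lowercase the name, drop the size token and punctuation, collapse whitespace."""
    without_size = _SIZE.sub(" ", name.lower())
    return " ".join(re.findall(r"[a-z0-9]+", without_size))


def similarity(a: str, b: str) -> float:
    """Similarity of the normalized names, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def same_product(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two listings as the same product only if sizes agree and names are close."""
    return parse_size(a) == parse_size(b) and similarity(a, b) >= threshold


def unit_price(price_cents: int, name: str):
    """Price in cents per 100 g or 100 ml, or None when the size cannot be parsed."""
    size = parse_size(name)
    if size is None or size[0] <= 0:
        return None
    return price_cents * 100 / size[0]


if __name__ == "__main__":
    print(same_product("Acme Rolled Oats 1kg", "ACME rolled oats, 1000 g"))  # True
    # Same shelf price, smaller pack: the per-unit price exposes the change.
    print(unit_price(450, "Acme Rolled Oats 1kg"))   # 45.0
    print(unit_price(450, "Acme Rolled Oats 900g"))  # 50.0
```

Requiring the parsed size to match keeps a shrunken pack from being merged with its predecessor. Tracking the two listings separately is what lets the per-unit comparison surface shrinkflation when the shelf price stays the same.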
Legal, ethical, and regulatory angles
- In some countries (e.g., Australia/NZ), legality is described as “rocky”; large chains allegedly pressure scrapers to stop.
- Others point out that ToS violations aren’t necessarily illegal, but threats still deter projects.
- People debate whether such sites are “for consumers” or just another commercial play, and whether selling scraped data back to businesses changes the ethical picture.
- There’s interest in mandated open pricing APIs or laws limiting IP-based blocking, but skepticism about political feasibility.
Market effects and consumer behavior
- Commenters describe “sawtooth” and rotating-brand discounts as tactics to segment time-poor vs. price-sensitive shoppers and to exploit brand loyalty.
- There’s concern that fully transparent prices could also facilitate algorithmic collusion or duopoly price coordination.