The time the x86 emulator team found code so bad they fixed it during emulation
Platform / Driver Workarounds for Bad Code
- Many comments draw parallels between the emulator “fixing” bad code and modern layers like Proton/Wine, GPU drivers, and browsers, which ship extensive per‑app quirks and hacks.
- GPU drivers commonly detect games by executable name or explicit APIs, then apply game-specific optimizations or correctness workarounds, sometimes even reducing quality for benchmarks.
- This is seen as fragile: a tweak keyed to one title (e.g., “hl2.exe”) can boost that game's performance yet break other games that happen to match, or depend on undefined behavior.
- Browser engines have similar per‑site “quirks” tables; OSes and runtimes have shipped app‑specific shims (e.g., for SimCity).
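
As a rough illustration of the executable-name matching described above, here is a minimal sketch of a per-app quirks table. Everything in it is hypothetical: the flag names, the second table entry, and the `lookup_quirks` function are invented for illustration, and only the “hl2.exe” name comes from the discussion. Real drivers and compatibility layers use their own, far larger mechanisms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical workaround flags; not taken from any real driver. */
#define QUIRK_NONE              0u
#define QUIRK_FAST_PATH         (1u << 0)
#define QUIRK_ZERO_INIT_BUFFERS (1u << 1)

struct quirk_entry {
    const char *exe_name;  /* executable name to match (case-sensitive here) */
    unsigned    flags;     /* workarounds to enable for that executable */
};

/* Illustrative table; the flag assignments are made up. */
static const struct quirk_entry quirk_table[] = {
    { "hl2.exe",     QUIRK_FAST_PATH },
    { "oldgame.exe", QUIRK_ZERO_INIT_BUFFERS },
};

/* Return the workaround flags for an executable name, or QUIRK_NONE. */
unsigned lookup_quirks(const char *exe_name)
{
    assert(exe_name != NULL);
    for (size_t i = 0; i < sizeof quirk_table / sizeof quirk_table[0]; i++) {
        if (strcmp(quirk_table[i].exe_name, exe_name) == 0)
            return quirk_table[i].flags;
    }
    return QUIRK_NONE;
}
```

The fragility is visible even at this scale: renaming the binary silently drops its workarounds, and an unrelated program that ships under a matching name picks them up.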
Benchmarks, Cheating, and Competitive Optimizations
- Several commenters cite drivers “cheating” on popular benchmarks (Quake 3, DirectX demos) by lowering quality or skipping work when they detect a benchmark path.
- Vendors generally try to inform game developers about correctness bugs, but often ship driver workarounds first; big studios get more attention.
- Performance-only tweaks are sometimes kept in drivers as a competitive advantage rather than pushed upstream.
I/O and Library Pathologies
- Multiple stories highlight pathological I/O usage: tiny reads (`fread`/`ReadFile` in 1‑byte or 4‑byte chunks), OS components issuing many small system calls, and apps effectively defeating buffering layers.
- One commenter notes that a game’s use of `fread(data, 1, 65536, f)` exposed a bad stdlib implementation or unbuffered mode, causing 65k byte-level reads; a caching layer in the hook was needed to fix performance.
- There is debate over whether such `fread` usage is semantically wrong (most say it’s fine; the implementation is at fault).
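
To make the caching-layer idea concrete, here is a minimal sketch of a read cache that turns many tiny reads into a few large ones. It is not the commenter's actual hook: `struct backend`, `backend_read`, and `cached_read` are hypothetical names, the backend is an in-memory stand-in for an expensive call such as a hooked `ReadFile`, and the 4096-byte cache size is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backend standing in for an expensive read call.
 * It counts how many times it is invoked. */
struct backend {
    const unsigned char *data;
    size_t size;
    size_t pos;
    unsigned long calls;
};

static size_t backend_read(struct backend *b, void *dst, size_t n)
{
    size_t left = b->size - b->pos;
    if (n > left)
        n = left;
    memcpy(dst, b->data + b->pos, n);
    b->pos += n;
    b->calls++;
    return n;
}

#define CACHE_SIZE 4096  /* assumption: any size well above the read size helps */

struct read_cache {
    struct backend *src;
    unsigned char buf[CACHE_SIZE];
    size_t len;  /* number of valid bytes in buf */
    size_t off;  /* index of the next unread byte in buf */
};

/* Serve a read of n bytes from the cache, refilling it from the backend
 * in CACHE_SIZE chunks. Returns the number of bytes copied to dst. */
size_t cached_read(struct read_cache *c, void *dst, size_t n)
{
    assert(c != NULL && c->src != NULL);
    unsigned char *out = dst;
    size_t done = 0;
    while (done < n) {
        if (c->off == c->len) {              /* cache drained: refill */
            c->len = backend_read(c->src, c->buf, CACHE_SIZE);
            c->off = 0;
            if (c->len == 0)
                break;                       /* end of data */
        }
        size_t take = c->len - c->off;
        if (take > n - done)
            take = n - done;
        memcpy(out + done, c->buf + c->off, take);
        c->off += take;
        done += take;
    }
    return done;
}
```

With this in place, a caller that reads one byte at a time reaches the backend roughly once per `CACHE_SIZE` bytes rather than once per byte, which is the effect a working stdio buffer would normally provide.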
Stack Probing, Guard Pages, and Uninitialized Memory
- Discussion covers stack probing on Windows: a function with a large stack frame must touch each page of the frame in order, so that the guard page is hit (growing the stack or reporting an overflow) rather than jumped over.
- Some note that many real-world stack allocations simply “hope for the best,” but Windows compilers add probes for big frames.
- Several comments describe custom or system features that prefill stack memory with canary patterns or zeros (e.g., compiler switches, automatic stack initialization) to detect uninitialized-variable bugs.
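
The two ideas above can be sketched in portable C on an ordinary buffer. This is only a simulation of the access pattern, not what a compiler actually emits: `probe_region`, `fill_canary`, and `count_canary` are hypothetical helpers, the 4 KiB page size is an assumption, and `0xCC` is just one recognizable fill pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE   4096u  /* assumption: 4 KiB pages */
#define CANARY_BYTE 0xCCu  /* assumption: any recognizable pattern would do */

/* Read one byte every PAGE_SIZE bytes, from the high end of the region
 * down to its base, so that no step skips more than one page. A real
 * stack probe walks downward for the same reason: the stack grows toward
 * lower addresses and the guard page sits just below the committed part.
 * Returns the number of probe accesses made. */
size_t probe_region(const volatile unsigned char *base, size_t size)
{
    assert(base != NULL);
    size_t probes = 0;
    size_t off = size;
    while (off > 0) {
        off = (off > PAGE_SIZE) ? off - PAGE_SIZE : 0;
        (void)base[off];  /* volatile read: the access itself is the point */
        probes++;
    }
    return probes;
}

/* Prefill a region with the canary pattern. */
void fill_canary(unsigned char *buf, size_t size)
{
    memset(buf, CANARY_BYTE, size);
}

/* Count bytes that still hold the canary, i.e. were probably never written. */
size_t count_canary(const unsigned char *buf, size_t size)
{
    size_t n = 0;
    for (size_t i = 0; i < size; i++)
        if (buf[i] == CANARY_BYTE)
            n++;
    return n;
}
```

A value that still reads back as the fill pattern is a strong hint that it was used before being initialized, which is why a distinctive pattern can be more useful for debugging than zeros.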
Loop Unrolling and Code Bloat
- Loop unrolling is defended as a valid optimization, but many point out diminishing returns and code-size costs, especially when huge unrolled blocks bring cache penalties.
- The “256 KB of code to init 64 KB” example is widely seen as an extreme case of over-optimization and poor compiler behavior.
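
For scale, the figures quoted above work out to 256 KB / 64 KB = 4 bytes of code per byte initialized. The sketch below contrasts a rolled loop with a modest 4x unroll over a 64 KB buffer; the function names and the unroll factor are illustrative assumptions, not a reconstruction of the code from the story.

```c
#include <stddef.h>
#include <stdint.h>

#define BUF_BYTES 65536u                 /* the 64 KB buffer from the example */
#define BUF_WORDS (BUF_BYTES / sizeof(uint32_t))

/* Rolled loop: a handful of instructions, executed 16384 times. */
void init_rolled(uint32_t *buf, uint32_t value)
{
    for (size_t i = 0; i < BUF_WORDS; i++)
        buf[i] = value;
}

/* 4x unrolled: one quarter of the loop overhead for about four times the
 * loop-body code. BUF_WORDS is a multiple of 4, so no remainder loop is
 * needed here. */
void init_unrolled4(uint32_t *buf, uint32_t value)
{
    for (size_t i = 0; i < BUF_WORDS; i += 4) {
        buf[i]     = value;
        buf[i + 1] = value;
        buf[i + 2] = value;
        buf[i + 3] = value;
    }
}
```

Both functions produce the same buffer. Unrolling further keeps trimming loop overhead by ever smaller amounts while code size keeps growing, and a fully unrolled version would no longer fit comfortably in an instruction cache.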