What's up with all those equals signs anyway?
Context: odd “=” characters in released emails
- Several commenters had noticed garbled text and stray “=” in recently released Epstein-related email PDFs and initially blamed OCR or government print–scan workflows.
- The thread clarifies these are encoding artifacts, not intentional redactions or secret codes.
Quoted-printable & line endings
- Core issue: quoted‑printable encoding uses `=\r\n` as a soft line break and `=XX` (hex) for non‑ASCII or special bytes.
- At some point, `\r\n` (CRLF) appears to have been converted to `\n` (LF) without removing the preceding `=`, leaving lone “=” and dropped characters.
- There’s minor nitpicking over using “NL” vs “LF”, with clarification that U+000A has multiple historical names.
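The failure mode described above can be sketched in a few lines of Python. Both decoders below are hypothetical stand-ins invented for illustration (the thread does not identify the actual software involved), and the sample sentence is made up. The sketch only shows how a CRLF to LF conversion ahead of a rigid decoder could produce either a stray “=” or a swallowed character.

```python
import re

HEX_ESCAPE = re.compile(r"=([0-9A-F]{2})")

def decode_escapes(text: str) -> str:
    # Turn each =XX escape into one character (enough for this ASCII-only demo).
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

def strict_decoder(text: str) -> str:
    # Hypothetical decoder A: a soft break is recognised only as "=" + CR + LF.
    return decode_escapes(text.replace("=\r\n", ""))

def skip_three_decoder(text: str) -> str:
    # Hypothetical decoder B: on "=" at end of line, blindly drop "=" plus the
    # next two characters, assuming they are CR and LF.
    return decode_escapes(re.sub(r"=[\r\n].", "", text, flags=re.DOTALL))

# Made-up sample as it might look on the wire: one soft break, one =3D escape.
wire = "The meeting is resched=\r\nuled to Friday, cost =3D 100.\r\n"
mangled = wire.replace("\r\n", "\n")  # the suspected CRLF -> LF conversion

print(repr(strict_decoder(wire)))         # intact text
print(repr(strict_decoder(mangled)))      # stray "=" left in the middle of a word
print(repr(skip_three_decoder(mangled)))  # "=" removed, but the "u" is eaten too
```

With the original CRLF line endings both decoders recover the sentence. After the conversion, decoder A leaves `resched=` followed by a hard line break, and decoder B yields `reschedled`.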
Why email enforces line-length limits
- RFCs recommend wrapping lines at 78 characters and require a hard limit (1000 bytes, counting the trailing CRLF) to:
  - Fit 80‑column terminals and simple displays.
  - Allow line‑oriented, fixed‑buffer processing on low‑memory systems.
  - Avoid denial‑of‑service via extremely long lines.
- Quoted‑printable and Base64 both introduce line breaks partly for these reasons.
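As a rough illustration of that last point, Python's standard `quopri` and `base64` modules both wrap their output into short lines. The payload below is invented for the demo. The 76-character ceiling is the encoded-line limit MIME sets for both encodings, slightly tighter than the 78-character recommendation for message lines.

```python
import base64
import quopri

# One long line with no breaks of its own (60 words, 359 bytes).
payload = b" ".join([b"lorem"] * 60)

qp = quopri.encodestring(payload)    # quoted-printable, wrapped with soft breaks
b64 = base64.encodebytes(payload)    # Base64, wrapped with plain newlines

qp_widths = [len(line) for line in qp.splitlines()]
b64_widths = [len(line) for line in b64.splitlines()]

print("longest QP line:    ", max(qp_widths))
print("longest Base64 line:", max(b64_widths))

# The line breaks are transport framing only; decoding removes them again.
print(quopri.decodestring(qp) == payload)
print(base64.decodebytes(b64) == payload)
```

Every quoted-printable line except the last ends in `=`, the soft break that a decoder is expected to strip together with the line ending.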
How these artifacts likely arose
- Several suggest the emails passed through multiple mail systems (e.g., third‑party servers, Outlook PSTs, Apple Mail archives) that each did “helpful” transformations, possibly even double QP-encoding.
- Legal/evidentiary workflows are described as deliberately low‑skill, mechanical pipelines that mangle formats while prioritizing chain‑of‑custody and minimizing exposure.
- Result: raw quoted‑printable leaked into PDFs, then partially and incorrectly “cleaned up”.
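The double-encoding hypothesis is easy to reproduce with Python's standard `quopri` module. This is a sketch of the general mechanism, not a reconstruction of the actual pipeline, and the sample word is arbitrary.

```python
import quopri

original = "café".encode("utf-8")

once = quopri.encodestring(original)   # what a single mail system would emit
twice = quopri.encodestring(once)      # a second "helpful" system encodes again

print(once)    # b'caf=C3=A9'
print(twice)   # b'caf=3DC3=3DA9'

# A reader that decodes only once is left with raw quoted-printable text.
leaked = quopri.decodestring(twice)
print(leaked)  # b'caf=C3=A9'

# Only a second decoding pass recovers the original bytes.
print(quopri.decodestring(leaked) == original)  # True
```

If the text was encoded twice but decoded once, sequences such as `=C3=A9` end up in the rendered output, which matches the kind of debris described in the thread.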
Encoding vs “inserting characters”
- One camp sees servers modifying message bodies as “hacky” and argues that this kind of handling belongs in the UI layer.
- Others argue it’s standard encoding/escaping (like HTML entities or bit‑stuffing in link protocols); done correctly, it’s reversible and not a semantic change.
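The reversibility argument can be checked directly. The sketch below uses three escaping schemes from Python's standard library on an invented sample string. It assumes matching encoder and decoder pairs, which is exactly the condition the second camp attaches to its claim.

```python
import base64
import html
import quopri

# Made-up sample containing "=", "&", quotes, markup, non-ASCII, and a long run.
text = 'price = 100 & "naïve" <b>café</b> ' + "x" * 200
raw = text.encode("utf-8")

qp = quopri.encodestring(raw)     # "=" becomes =3D, non-ASCII bytes become =XX
b64 = base64.encodebytes(raw)
entities = html.escape(text)      # "&" becomes &amp;, "<" becomes &lt;, ...

# Each transformation changes the representation, not the content.
qp_ok = quopri.decodestring(qp) == raw
b64_ok = base64.decodebytes(b64) == raw
html_ok = html.unescape(entities) == text
print(qp_ok, b64_ok, html_ok)

# The mail encodings also guarantee a 7-bit-safe wire form.
print(qp.isascii(), b64.isascii())
```

The encoded forms look nothing like the input, yet each decodes back to it exactly. The trouble in the released PDFs would then come from an incomplete or mismatched decoding step rather than from the encoding itself.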
Legacy systems and CR/LF history
- Long subthread recounts why CR and LF were separate on teletypes (mechanical delays, overstriking tricks), and how this legacy persists.
- Line‑based protocols (SMTP, POP3, IMAP) and their constraints are revisited, along with POP3 vs IMAP usage patterns.
Email complexity & broader lessons
- Multiple commenters with experience writing mail clients/parsers note MIME and real‑world email are full of edge cases and bad headers.
- Email is cited as a rare “successful” messy standard that unified many incompatible systems.
- The incident is framed as an “abstraction leak” and “just enough knowledge to be dangerous”: like parsing HTML with regex, hand‑rolled QP decoding works until it catastrophically doesn’t.
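A small sketch of the “works until it doesn’t” pattern: the regex decoder below is a hypothetical hand-rolled shortcut, compared with Python's standard `quopri` module. It treats every `=XX` escape as one character, which holds for ASCII input and breaks as soon as a character spans several bytes.

```python
import quopri
import re

def regex_qp_decode(text: str) -> str:
    # Hypothetical shortcut: map each =XX escape straight to one character.
    return re.sub(r"=([0-9A-Fa-f]{2})", lambda m: chr(int(m.group(1), 16)), text)

def library_qp_decode(text: str) -> str:
    # Decode to bytes first, then interpret the bytes as UTF-8.
    return quopri.decodestring(text.encode("ascii")).decode("utf-8")

simple = "total =3D 42"
tricky = "caf=C3=A9"   # "é" is two bytes in UTF-8

print(regex_qp_decode(simple))    # total = 42
print(library_qp_decode(simple))  # total = 42

print(regex_qp_decode(tricky))    # cafÃ©  (each byte became its own character)
print(library_qp_decode(tricky))  # café
```

The shortcut passes any test that uses only ASCII samples, which is how this kind of code tends to survive until real-world mail reaches it. This sketch assumes the body is UTF-8; a real message declares its charset in the MIME headers, and the decoder has to honour that as well.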