Spotting base64-encoded JSON, certificates, and private keys
Recognizing Base64 Patterns
- Many commenters relate to “seeing” structures in base64 after enough exposure, especially JWTs, X.509 certs, keys, and Kubernetes secrets.
- Common telltale prefixes:
- eyJ / eyJhbG → JSON / JWT header (the bytes {" and, typically, "alg").
- LS0 / tLS → sequences of ----- (PEM headers/footers, YAML ---).
- MI / MII → ASN.1 DER SEQUENCE with long length (certs, keys, CRLs).
- AQAB → RSA exponent 65537.
- Also listed: R0lGOD (GIF), iVBOR (PNG), /9j/ (JPEG), PD94 (XML).
- Some note quasi-fixed points and “self-similar” base64 strings, and explain the bit-level mechanics behind {" → ey.
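The prefixes listed above can be reproduced directly. The following is a minimal Python sketch, not something taken from the thread: the helper names `b64` and `sextets` and the choice of sample byte strings are illustrative assumptions. It encodes the raw bytes behind each prefix and shows the 6-bit groups that turn {" into ey.

```python
import base64


def b64(data: bytes) -> str:
    """Standard base64 of data, returned as text."""
    return base64.b64encode(data).decode("ascii")


def sextets(data: bytes) -> list[str]:
    """Split the bits of data into the 6-bit groups base64 encodes one at a time."""
    bits = "".join(f"{byte:08b}" for byte in data)
    bits += "0" * (-len(bits) % 6)  # zero-pad the final group, as base64 does
    return [bits[i:i + 6] for i in range(0, len(bits), 6)]


# Raw byte sequences behind the prefixes (illustrative picks, not a complete list).
SAMPLES = {
    "JWT header start": b'{"alg"',
    "PEM header start": b"-----BEGIN",
    "DER SEQUENCE, 2-byte length": bytes([0x30, 0x82, 0x03]),
    "RSA exponent 65537": (65537).to_bytes(3, "big"),
    "GIF magic": b"GIF89a",
    "PNG magic": b"\x89PNG",
    "JPEG magic": b"\xff\xd8\xff",
    "XML declaration": b"<?x",
}

for label, raw in SAMPLES.items():
    print(f"{label:28} {b64(raw)}")

# The first two 6-bit groups of {" are 011110 and 110010, i.e. 'e' and 'y'.
print(sextets(b'{"'))
```

The third character depends on the byte that follows {". Any byte in the range 0x40 to 0x7F (which includes all ASCII letters) has 01 as its top two bits, so a JSON key that starts with a letter always yields eyJ.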
Wastefulness of JSON + Base64 (Especially in JWTs)
- Strong criticism of stacking JSON + base64 (often twice) + HTTP headers:
- Each base64 pass adds ~33%, so double encoding costs ≈ 78% overhead (4/3 × 4/3 ≈ 1.78), before counting JSON's own overhead.
- For security tokens, this bloat is carried in the headers of every request, or at least of every HTTP/2 connection.
- Example: a few fixed-size fields could be a compact binary TLV block, instead of kilobyte-scale JWT-like blobs.
- Some call embedding base64 inside JSON that’s itself base64-encoded “laughable” and “Russian nesting dolls.”
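The overhead figures can be checked with a few lines of Python. This is a rough sketch under stated assumptions: the 2304-byte payload is an arbitrary size chosen so that neither encoding pass needs padding, and the `overhead` helper is a hypothetical name introduced here.

```python
import base64


def overhead(original: bytes, encoded: bytes) -> float:
    """Fractional size increase of encoded relative to original."""
    return len(encoded) / len(original) - 1


# 2304 bytes: divisible by 9, so both passes encode without padding.
raw = bytes(range(256)) * 9
once = base64.b64encode(raw)    # e.g. a binary blob embedded in JSON
twice = base64.b64encode(once)  # e.g. that JSON base64-encoded again

print(f"single pass: {len(once)} bytes, {overhead(raw, once):.1%} overhead")
print(f"double pass: {len(twice)} bytes, {overhead(raw, twice):.1%} overhead")
```

Real tokens usually fare slightly worse, since the JSON syntax and key names wrapped around the inner blob are also expanded by the outer encoding.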
Alternatives to JSON/Base64 for Structured/Binary Data
- Suggestions:
- MessagePack, CBOR, BSON: JSON-like but binary, with native support for binary blobs.
- Simple TLV / IFF-style formats (AIFF/RIFF/PNG-like) as easy, efficient, schemaless encodings.
- ASN.1 and protobuf for structured data, albeit with schema overhead.
- Several argue binary formats are underrated and far faster to parse than JSON.
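To make the TLV suggestion concrete, here is a minimal sketch of one possible layout, using only Python's standard `struct` module. Everything specific in it is an assumption made for illustration: the 1-byte tag, the 2-byte big-endian length, the tag numbers, and the field names are hypothetical and do not correspond to any format named in the discussion.

```python
import struct

# Hypothetical tag numbers for a small token-like record.
TAG_USER_ID = 1
TAG_EXPIRY = 2
TAG_MAC = 3


def tlv_encode(fields: list[tuple[int, bytes]]) -> bytes:
    """Encode (tag, value) pairs as: 1-byte tag, 2-byte big-endian length, value."""
    out = bytearray()
    for tag, value in fields:
        out += struct.pack(">BH", tag, len(value))
        out += value
    return bytes(out)


def tlv_decode(blob: bytes) -> list[tuple[int, bytes]]:
    """Decode a blob produced by tlv_encode back into (tag, value) pairs."""
    fields = []
    offset = 0
    while offset < len(blob):
        tag, length = struct.unpack_from(">BH", blob, offset)
        offset += 3  # size of the tag + length header
        value = blob[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated TLV value")
        fields.append((tag, value))
        offset += length
    return fields


fields = [
    (TAG_USER_ID, struct.pack(">Q", 42)),         # 8-byte user id
    (TAG_EXPIRY, struct.pack(">I", 1700000000)),  # 4-byte Unix timestamp
    (TAG_MAC, bytes(32)),                         # placeholder 32-byte MAC
]
blob = tlv_encode(fields)
print(len(blob), "bytes")
print(tlv_decode(blob) == fields)
```

Each field costs three bytes of framing, and a decoder can skip tags it does not recognize, which is the schemaless property the commenters point to.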
Security and Misuse of Base64
- Repeated reminder: base64 is an encoding, not encryption or obfuscation.
- Storing secrets base64-encoded in repos or JWT payloads is unsafe unless separately encrypted.
- Some suggest light obfuscation (even ROT13-level) can reduce obvious leak visibility, but others implicitly see that as weak “security by obscurity.”
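The point that base64 is only an encoding is easy to demonstrate: anyone holding a JWT can read its header and payload without any key. The sketch below builds a sample token and decodes it with the standard library alone. The claims, the `api_key` value, and the placeholder signature are made up for illustration, and `peek_jwt` is a hypothetical helper that deliberately does not verify anything.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """URL-safe base64 without padding, as used for JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Decode a URL-safe base64 segment, restoring the stripped padding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def peek_jwt(token: str) -> tuple[dict, dict]:
    """Return (header, payload) of a JWT without checking the signature."""
    header_b64, payload_b64, _signature = token.split(".")
    return (
        json.loads(b64url_decode(header_b64)),
        json.loads(b64url_decode(payload_b64)),
    )


# Made-up example token; the signature is a placeholder, not a real MAC.
header = {"alg": "HS256", "typ": "JWT"}
payload = {"sub": "alice", "api_key": "not-actually-secret"}
token = ".".join([
    b64url_encode(json.dumps(header).encode("utf-8")),
    b64url_encode(json.dumps(payload).encode("utf-8")),
    "signature-placeholder",
])

print(token)
print(peek_jwt(token))
```

The signature only protects integrity. Confidentiality requires encrypting the payload separately.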
Experience, “Obviousness,” and Curiosity
- Split reactions: some say these patterns are “obvious” to anyone who’s handled certs/JWTs; others appreciate the post as a new, useful heuristic.
- Anecdotes about reading ASCII from hex, EBCDIC from logs, or sendmail.cf / core dumps highlight how pattern recognition grows with experience.
- Minor debate about whether the author should have explained why the patterns arise, and whether this reflects broader “incuriosity” in modern CS education.