Hackers use ZIP file concatenation to evade detection
Non-malicious and historical uses of ZIP concatenation
- The technique predates the current wave of attacks: it has long been used to build hybrid files (e.g., a JPEG cover image with a ZIP of eBooks appended, or a ZIP embedded in a JPEG ICC profile); a minimal construction sketch follows this list.
- Some communities reportedly abandoned the trick after it was abused to hide illegal content, prompting platforms to block images that also parse as ZIPs.
- Related ideas go back to at least the 1990s (zip bombs, JAR/GIF hybrids).
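
As a concrete illustration of the hybrid-file trick, here is a minimal Python sketch that appends an in-memory ZIP to a cover image by plain concatenation. The file names (cover.jpg, hybrid.jpg) and the contents are placeholders, not taken from the discussion; it relies on the fact that Python's zipfile locates the archive from the end-of-central-directory record at the end of the file.

```python
# Minimal sketch: build a "cover + archive" polyglot by plain concatenation.
# cover.jpg / hybrid.jpg are placeholder paths, not from the original article.
import io
import zipfile

# Build a small ZIP in memory (stands in for the "ZIP of eBooks").
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("book.txt", "hello")

with open("cover.jpg", "rb") as f:
    cover = f.read()

# The polyglot is literally the JPEG bytes followed by the ZIP bytes.
with open("hybrid.jpg", "wb") as f:
    f.write(cover + buf.getvalue())

# Image viewers read from the front and render the JPEG; ZIP readers that
# search for the end-of-central-directory record from the end still see the archive.
with zipfile.ZipFile("hybrid.jpg") as zf:
    print(zf.namelist())  # ['book.txt']
```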
Bypassing scanners in real-world workflows
- Encrypted ZIPs are a long-standing way to evade email/content filters.
- Workarounds include embedding payloads in DOCX/XLSX (which are themselves ZIP-based formats), base64-encoding binaries, and compress+split+encrypt pipelines (“shred/unshred”-style); a sketch of the latter follows this list.
- Corporate security often blocks “dangerous” extensions while letting opaque or split archives through, a form of security theater that remains trivial to bypass.
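
A rough sketch of what a “shred/unshred”-style pipeline might look like, assuming gzip for compression and base64 for text encoding; the chunk size, the partNNN.txt naming, and the helper names are arbitrary choices, and the encryption step is omitted for brevity. The point is only that a blocked binary becomes a handful of innocuous-looking text files and back.

```python
# Illustrative shred/unshred sketch: compress, base64-encode, split into
# text chunks, then reverse the steps. Names and chunk size are made up.
import base64
import gzip
from pathlib import Path

CHUNK = 64 * 1024  # bytes of base64 text per part

def shred(src: str, prefix: str = "part") -> None:
    data = base64.b64encode(gzip.compress(Path(src).read_bytes()))
    for i in range(0, len(data), CHUNK):
        Path(f"{prefix}{i // CHUNK:03d}.txt").write_bytes(data[i:i + CHUNK])

def unshred(dst: str, prefix: str = "part") -> None:
    data = b"".join(p.read_bytes() for p in sorted(Path(".").glob(f"{prefix}*.txt")))
    Path(dst).write_bytes(gzip.decompress(base64.b64decode(data)))
```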
ZIP format ambiguity and parser behavior
- Core issue: ZIP carries two parallel structures (per-entry local file headers and the central directory at the end of the file), and they can disagree about what the archive contains.
- Some tools scan forward through local headers; others treat the central directory as the sole source of truth. Behavior differs between WinRAR, 7-Zip, and Windows Explorer (the sketch after this list reproduces the disagreement).
- Debate over what the spec “really” intends:
- One side: only central directory entries are valid; extra headers are garbage except for recovery.
- Other side: spec implicitly allows “islands” of opaque data and append-only modification, for media spanning and streaming.
- This ambiguity has already led to real vulnerabilities (e.g., hidden add-on files, APK modification without breaking signatures).
- Several argue for a “strict ZIP” spec with explicit parsing rules.
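
The disagreement is easy to reproduce. The sketch below builds two single-entry archives, concatenates them, and compares a naive forward scan of local-file-header signatures against the central-directory view; Python's zipfile stands in here for a central-directory-trusting parser, and the file names are illustrative.

```python
# Sketch: the same bytes, read two ways. A forward scan of local file headers
# sees entries from both concatenated archives; zipfile, which trusts the
# trailing end-of-central-directory record, reports only the second.
import io
import struct
import zipfile

def make_zip(name: str, content: bytes) -> bytes:
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, content)
    return buf.getvalue()

blob = make_zip("benign.txt", b"hello") + make_zip("payload.txt", b"malware?")

# View 1: forward scan for local-file-header signatures (PK\x03\x04).
names_from_local = []
pos = 0
while (pos := blob.find(b"PK\x03\x04", pos)) != -1:
    name_len, extra_len = struct.unpack("<HH", blob[pos + 26:pos + 30])
    names_from_local.append(blob[pos + 30:pos + 30 + name_len].decode())
    pos += 4

# View 2: central directory located via the end-of-central-directory record.
with zipfile.ZipFile(io.BytesIO(blob)) as zf:
    names_from_central = zf.namelist()

print(names_from_local)    # ['benign.txt', 'payload.txt']
print(names_from_central)  # ['payload.txt']
```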
Format design, splitting, and philosophy
- Some criticize ZIP for violating single-source-of-truth principles, preferring simpler formats like tar (+ separate compression).
- Others defend integrated features (central directory, file spanning) as historically necessary and still useful for large or unstable transfers.
- There’s a Unix-style argument for separating archiving, compression, and sharding into distinct tools, versus a pragmatic argument for combining them to get random access and usability (contrasted in the sketch below).
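
To make the random-access point concrete, here is a small comparison sketch (the member names and sizes are made up): reading one late member from a ZIP only needs a lookup in the central directory, while the same read from a tar.gz has to decompress the whole stream ahead of it.

```python
# Sketch of the pragmatic argument: ZIP's central directory gives cheap random
# access to a single member; a tar.gz stream must be read sequentially.
import io
import tarfile
import zipfile

files = {f"file{i}.txt": b"x" * 1000 for i in range(100)}

zbuf, tbuf = io.BytesIO(), io.BytesIO()
with zipfile.ZipFile(zbuf, "w", zipfile.ZIP_DEFLATED) as zf:
    for name, data in files.items():
        zf.writestr(name, data)
with tarfile.open(fileobj=tbuf, mode="w:gz") as tf:
    for name, data in files.items():
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tf.addfile(info, io.BytesIO(data))

# ZIP: seek straight to one entry via the central directory.
with zipfile.ZipFile(zbuf) as zf:
    from_zip = zf.read("file99.txt")

# tar.gz: everything compressed before the member gets decompressed anyway.
with tarfile.open(fileobj=io.BytesIO(tbuf.getvalue()), mode="r:gz") as tf:
    from_tar = tf.extractfile("file99.txt").read()
```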
Defensive strategies and limitations
- Suggested defenses range from deep recursive unpacking to simply rejecting “weird” archives whose forward scan and central directory don’t describe the same contents; a minimal strictness check along these lines is sketched after this list.
- Some warn that making tools “smart” (deep recursive unpacking, auto-processing) increases attack surface; on this view, only AV engines should unpack deeply, while regular tools stay “dumb.”
- Email/HTTP perimeter scanning is justified as defense in depth, but multiple commenters note that trivial transformations (encryption, base64, simple XOR/ROT) already defeat signature-based detection.
- VirusTotal and many AV products reportedly struggle with nested archives and complex ZIP structures, often for performance reasons.
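
One way to operationalize the “reject weird archives” idea is a strictness heuristic like the sketch below. looks_strict is a hypothetical helper, not from the discussion, and it is deliberately conservative: it rejects prepended data and any local-file-header signature the central directory does not claim, which can also flag legitimate archives whose compressed data happens to contain those bytes.

```python
# Illustrative "strict ZIP" heuristic: accept only archives where the central
# directory accounts for every local header and nothing is prepended.
import io
import zipfile

def looks_strict(data: bytes) -> bool:
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            infos = zf.infolist()
    except zipfile.BadZipFile:
        return False
    # Reject prepended data: some entry must start at offset 0.
    if infos and min(i.header_offset for i in infos) != 0:
        return False
    # Reject local-file-header signatures the central directory does not claim
    # (may false-positive if compressed data happens to contain these bytes).
    claimed = {i.header_offset for i in infos}
    pos = 0
    while (pos := data.find(b"PK\x03\x04", pos)) != -1:
        if pos not in claimed:
            return False
        pos += 4
    return True
```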