Intel Unveils Lunar Lake Architecture
Intel, TSMC, and Foundry Trajectory
- Many see Intel’s full reliance on TSMC for Lunar Lake logic/IO as a major shift and a sign of continuing trouble after the 10nm “debacle.”
- Some hope this is the bottom and that Intel’s own nodes (Intel 3, 20A, 18A) will still recover; others point to Arrow Lake mixing Intel 20A and TSMC 3nm as a bad sign for Intel’s foundry business.
- Competition in high-end CPUs is widely seen as socially and geopolitically important.
Core Configuration and Hyper-Threading (SMT)
- Lunar Lake drops SMT, going 4 performance + 4 efficiency cores.
- Debate:
  - Pro-SMT: helps unoptimized, memory-bound, and especially compilation workloads; older generations saw 20–25% gains.
  - Anti-SMT: small or negative benefit on many modern multicore laptops; can hurt rendering and games; E-cores are argued to be a better, lower-overhead way to increase throughput.
- Some note Apple’s lack of SMT as evidence that SMT is less critical in mobile-focused designs.
On-Package LPDDR5X Memory and Soldering
- Lunar Lake integrates up to 32 GB LPDDR5X on package (128-bit bus).
- Pros cited: power savings from lower-voltage, short-trace DRAM and better perf/W, especially for thin-and-light laptops.
- Cons: no user RAM upgrades. Drawing comparisons to Xeons with HBM and to Apple’s on-package memory, some see this as customer-hostile, with only modest real-world power savings.
- 32 GB is seen as enough for many local AI assistant workloads (e.g., Windows Copilot) but not for all LLM use cases.
- Broader frustration with soldered components and e-waste; if soldered, people want higher default capacities.
Apple / AMD / Intel Efficiency Debate
- Multiple explanations for Apple’s perf/W:
  - Early access to leading TSMC nodes.
  - High IPC via very wide cores and large caches.
  - Willingness to ship large, expensive dies and optimize for perf/W over cost.
- Counterpoints:
  - On the same node (e.g., TSMC 5nm), some AMD CPUs match or beat Apple in specific benchmarks, especially when allowed high power.
  - TDP is criticized as misleading; real efficiency comparisons require measured power under load, which reviews rarely provide.
- Consensus: ISA (RISC vs CISC) is minor; process, microarchitecture, and product goals dominate.
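The point about TDP versus measured power can be made concrete with a small calculation. The sketch below is only an illustration: the `Run` class, the chip names, and every number in it are made up for the example and do not come from the thread or from any review. It derives average power from measured energy and run time, then divides the benchmark score by that figure instead of by a rated TDP.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One benchmark run with energy measured under load (hypothetical data)."""
    name: str
    score: float       # benchmark score, higher is better
    energy_j: float    # energy consumed during the run, in joules
    duration_s: float  # wall-clock duration of the run, in seconds

    @property
    def avg_power_w(self) -> float:
        # Average power is measured energy divided by elapsed time.
        return self.energy_j / self.duration_s

    @property
    def perf_per_watt(self) -> float:
        # Efficiency uses measured average power, not the rated TDP.
        return self.score / self.avg_power_w


# Made-up numbers purely for illustration; not real measurements.
runs = [
    Run("chip_a", score=1000.0, energy_j=3000.0, duration_s=100.0),
    Run("chip_b", score=1200.0, energy_j=6000.0, duration_s=100.0),
]

for r in runs:
    print(f"{r.name}: {r.avg_power_w:.1f} W avg, {r.perf_per_watt:.2f} points/W")
```

In this invented case chip_b wins on raw score while chip_a wins on perf/W, which is the kind of split that a TDP-only comparison would hide.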
GPU, Memory Bandwidth, and Small-Form Systems
- Some hope Lunar Lake’s iGPU advances carry over to desktop parts; integrated GPUs are still seen as bandwidth-limited compared with discrete GPUs, which have higher-bandwidth memory.
- Some are more excited about future AMD APUs (e.g., “Strix Halo” rumors: wider memory bus, larger iGPU) for ITX-sized gaming without a dGPU.
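The bandwidth argument comes down to simple arithmetic: theoretical peak bandwidth is bus width in bytes times the transfer rate. The sketch below applies that formula to the 128-bit bus mentioned in the discussion. The 8533 MT/s transfer rate and the 256-bit "wider bus" case are assumptions chosen for illustration, not figures taken from the thread, and the helper name `peak_bandwidth_gbs` is hypothetical.

```python
def peak_bandwidth_gbs(bus_width_bits: int, transfer_rate_mtps: float) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mtps * 1e6 / 1e9


# 128-bit bus as discussed; the 8533 MT/s rate is an assumed value.
narrow = peak_bandwidth_gbs(128, 8533)

# Hypothetical wider bus at the same assumed rate, for comparison only.
wide = peak_bandwidth_gbs(256, 8533)

print(f"128-bit: {narrow:.1f} GB/s, 256-bit: {wide:.1f} GB/s")
```

These are upper bounds; sustained bandwidth in practice is lower, and the CPU and iGPU share it.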
Design Complexity and Microarchitecture Details
- The “sea of FUBs” → “sea of cells” shift is discussed as moving from many small, latch-heavy partitions to larger, flop-dominated partitions, improving physical design flow at the cost of tool complexity.
- General agreement that modern superscalar OoO CPUs are extremely complex; even advanced university courses often study much older designs.
Security, Enclaves, and Democracy
- One tangent argues for hardware-enforced private-state “actors” (enclave-like units) as a foundation for secure democratic infrastructure, instead of today’s shared-state designs.
- Others respond that:
  - Secure computing can just as easily empower authoritarian regimes, since those regimes would control the keys.
  - High “security” via opaque enclaves may conflict with the transparency needed for democratic trust.
Windows on ARM, Emulation, and CPU Commoditization
- Some see Microsoft pushing toward ISA-agnostic Windows (x86, ARM, etc.) as a way to commoditize CPUs and strengthen Microsoft’s position.
- Historical note: NT has long run on multiple ISAs; market demand, not Microsoft, kept x86 dominant.
- Windows on ARM already exists (with the “Prism” x86 emulation layer), and Snapdragon X laptops are coming; past attempts (e.g., RT, early WoA) had weak uptake, so it’s unclear whether this wave will succeed.
NPUs, Local AI, and User Control
- Lunar Lake adds an NPU with ~45 TOPS; Intel markets “120 TOPS” including GPU/CPU.
- AMD’s NPUs are cited as delivering more TOPS per watt; both vendors claim large gen-on-gen efficiency gains, but real-world comparisons remain unclear.
- Some worry about always-on “AI coprocessors” as potential surveillance units and ask for true hardware kill switches; no clear answer from the thread on whether such switches will exist.
Scheduling and OS Support
- Lunar Lake’s efficiency gains are said to rely heavily on Windows 11’s improved heterogeneous-core scheduler.
- Concern that Linux may lag in exploiting P/E cores optimally, especially on client devices, though Intel has incentives to improve Linux scheduling for servers as well.
- Lunar Lake is simpler than Meteor Lake (no SMT, fewer core types), which may ease scheduling.
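For readers who want to experiment with P/E placement on Linux instead of relying on the scheduler, the following is a minimal sketch. It assumes a Linux system where the kernel exposes hybrid core types as `/sys/devices/cpu_core/cpus` and `/sys/devices/cpu_atom/cpus`; whether those paths exist depends on the kernel version and the CPU, so treat them as an assumption. The helper names are hypothetical. On other systems the functions return an empty set or `False`.

```python
import os
from pathlib import Path


def parse_cpu_list(text: str) -> set[int]:
    """Parse a kernel CPU list such as '0-3,8' into a set of CPU ids."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_core_set(pmu_name: str) -> set[int]:
    """Return CPUs listed in /sys/devices/<pmu_name>/cpus, or an empty set.

    The 'cpu_core' (P-core) and 'cpu_atom' (E-core) names are assumed here.
    """
    path = Path("/sys/devices") / pmu_name / "cpus"
    try:
        return parse_cpu_list(path.read_text())
    except OSError:
        return set()


def pin_to_efficiency_cores() -> bool:
    """Restrict the current process to E-cores if they can be identified."""
    e_cores = read_core_set("cpu_atom")
    if not e_cores or not hasattr(os, "sched_setaffinity"):
        return False
    # Only keep CPUs the process is already allowed to run on.
    allowed = e_cores & os.sched_getaffinity(0)
    if not allowed:
        return False
    os.sched_setaffinity(0, allowed)
    return True


if __name__ == "__main__":
    print("P-cores:", sorted(read_core_set("cpu_core")))
    print("E-cores:", sorted(read_core_set("cpu_atom")))
```

Manual pinning like this overrides the scheduler's own heuristics, so it is better suited to experiments and benchmarks than to general use.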