Stories - Page 146 | HN Distilled

2026-01-26

I let ChatGPT analyze a decade of my Apple Watch data, then I called my doctor

Apple Watch & VO2 Max Accuracy

Debate over blame: some argue Apple misrepresents Apple Watch VO2 max as “validated,” others note Apple’s own studies show systematic underestimation and wide individual error, so it’s not clinical grade.
Several commenters report Apple Watch (and similar devices) giving implausibly low VO2 max or alarming fitness warnings that doctors later dismissed.
Others say wearables (especially Garmin / Oura) can be quite accurate for trends and useful when used correctly, but require controlled conditions and are sensitive to confounders like pace, altitude, and whether workouts are recorded.

What an LLM Can (and Can’t) Do With Health Data

Strong view that LLMs are the wrong tool for raw multi‑year time series: they produce plausible text, not validated numerical analysis, and will “simulate” analysis rather than perform it.
Some suggest the right pattern is: have the LLM generate code/notebooks to analyze data, then review results with a doctor.
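One hedged illustration of that "generate code, then review" pattern: instead of asking a chatbot to eyeball years of readings, a short script can compute an explicit, checkable statistic. The sketch below fits an ordinary least-squares slope to (day, value) pairs. The helper name `trend_per_year` and the sample readings are made up for illustration; they do not come from the article or the thread.

```python
from statistics import mean


def trend_per_year(samples: list[tuple[float, float]]) -> float:
    """Least-squares slope of value against time, scaled to units per year.

    `samples` holds (day_offset, value) pairs, for example days since the
    first reading and an estimated VO2 max. Hypothetical helper.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to fit a trend")
    days = [d for d, _ in samples]
    values = [v for _, v in samples]
    day_mean, value_mean = mean(days), mean(values)
    # Slope = covariance(day, value) / variance(day)
    covariance = sum((d - day_mean) * (v - value_mean) for d, v in samples)
    variance = sum((d - day_mean) ** 2 for d in days)
    if variance == 0:
        raise ValueError("all samples share the same timestamp")
    return covariance / variance * 365.0


# Made-up readings: one estimate every 100 days, drifting down by 0.5 each time.
readings = [(0, 40.0), (100, 39.5), (200, 39.0), (300, 38.5)]
print(round(trend_per_year(readings), 3))
```

A number like this is still only a trend in the device's own estimate; interpreting it remains a question for a clinician.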
Others counter that specialized models for wearable data exist and could, in theory, be aligned with LLMs, but this isn’t what generic chatbots are doing now.

Responsibility, Risk, and Regulation

Split between “users should know it can be wrong; warnings exist” and “marketing and product design explicitly portray ChatGPT as trustworthy for health, so the burden is on the company.”
Some want stricter guardrails: health Q&A only at a general level, explicit refusal to interpret personal data, stronger disclaimers or gating.
Others argue society routinely uses imperfect tools; banning access until models are “perfect” is unrealistic.

False Positives, Anxiety, and Healthcare Costs

Multiple anecdotes of frightening but wrong AI “diagnoses” leading to traumatic worry and unnecessary medical workups.
Others share cases where ChatGPT suggested overlooked possibilities (e.g., gallbladder issue) that ultimately proved correct after specialist testing.
Several note that in medicine, false positives are costly (money, time, radiation, procedures, anxiety), so a model that “sees red flags everywhere” is harmful.
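The cost of false positives can be made concrete with the standard positive predictive value calculation (Bayes' rule). The sketch below is illustrative only: the 90% sensitivity, 90% specificity, and 1% prevalence figures are assumed numbers, not measurements of any model or device discussed in the thread.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Probability that a flagged person actually has the condition."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no one is flagged under these parameters")
    return true_positives / flagged


# Assumed figures: 90% sensitive, 90% specific, condition present in 1% of users.
ppv = positive_predictive_value(0.90, 0.90, 0.01)
print(f"{ppv:.3f}")
```

Under these assumptions only about one flag in twelve is a true positive, which is why a tool that raises red flags liberally generates mostly unnecessary follow-ups when the underlying condition is rare.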

Doctors vs. AI, and How to Use These Tools

Many emphasize doctors view metrics in context (symptoms, risk factors, exam), whereas the article asked an LLM to compress heterogeneous metrics into a single “grade,” which doctors don’t do.
Some feel doctors under-address “small problems” and subtle fitness issues, leaving a vacuum filled by wearables, forums, and now AI.
Others stress doctors vary widely in quality and staying current; in the best case, AI can help patients ask better questions and surface research, not replace clinical judgment.

Health Metrics Need Context

Commenters highlight that VO2 max, BMI, HRV, resting heart rate, etc. are population tools, not absolute individual health scores.
Fitness vs. health distinction: someone can be “healthy enough” by medical standards yet unfit by athletic standards; an internet‑trained model may adopt the fitness‑culture framing and grade harshly.
Overemphasis on a single metric (VO2 max, BMI) without clinical context is seen as a core flaw in both the article’s setup and AI‑driven “health grades.”

Privacy and Data Use

Some find the very act of uploading detailed health data to a commercial AI service “alarming,” given data‑sale incentives and unclear secondary uses.
Others are more focused on potential future benefits (long‑term baseline data for better models) and ask for minimally obtrusive trackers with local, exportable data.

Overall Sentiment

Broad consensus: current general‑purpose LLMs are not ready to interpret personal medical data or issue health grades.
Many see potential value in specialized, clinically validated models paired with human clinicians, and in using AI as a pattern‑spotter and explainer—not as an oracle.

View on HN ↗ Original Article ↗

2026-01-26

State of the Windows: What is going on with Windows 11?

Legacy vs Modern UI (Control Panel, Settings, UX)

Strong frustration that the Settings app still can’t replace Control Panel after multiple Windows releases; many key options (power plans, input, audio/network details, device exclusivity, etc.) remain only in legacy dialogs.
Users describe “archaeological layers”: modern Settings leading via links into old Control Panel, multiple generations of context menus, and 30‑year‑old dialogs popping up in Windows 11.
Modern Settings is criticized as slow, low‑information‑density, and full‑screen for simple tasks; some users report giving up and going straight to Control Panel every time.
A minority defends the iterative approach: more options do move into Settings each release, and having old UI still available is seen as a necessary safety net.

User Sentiment, Adoption, and What People “Want”

Many commenters call Windows 11 a “disaster” or “hostile,” especially compared with remembered peaks like 2000, XP (with service packs), or 7; others argue nostalgia ignores how unstable 95/98 actually were.
Some insist most users are indifferent and that online complaints represent a tiny, noisy fraction; others point to slow adoption and large numbers staying on Windows 10 as circumstantial evidence of resistance, though the causal link is debated.
There’s no consensus on “what users want”: HN‑type users emphasize simplicity, consistency, and user‑first design; others say mainstream users care more about cost, familiarity, and a “modern” look.

Ads, AI, and Incentives

Many see Windows as increasingly ad‑, telemetry‑, and AI‑driven (OneDrive pushes, Copilot buttons everywhere, upsells, bloatware), with the OS serving Microsoft’s services business more than user needs.
Some say Recall/AI outrage was overblown and note they barely see ads after tweaking settings.
It’s argued there’s little internal “code red” because profits come from Azure/365 and most users can’t or won’t switch platforms.

Performance, Bloat, and Technical Debt

Complaints about sluggish Explorer, hangs on simple file operations, HDDs becoming unusable on newer builds, heavy background scanning, and RAM hunger (32–64 GB suggested by some).
Others report Windows 11 runs fine on modest hardware and compare it favorably to current iOS/macOS performance.
One view: NT internals are solid; the real “technical debt” is the accretion of modern layers and poorly integrated features on top. Loss of testing roles and institutional knowledge is blamed for regressions.

Alternatives and Lock‑In (macOS, Linux, ChromeOS)

macOS “Tahoe” is criticized too (aesthetic regressions, inconsistent visuals), but some still find it far less obstructive than Windows 11; others see the complaints as design‑purist nitpicking.
Linux is portrayed as:
- Great for technical users and increasingly viable for gaming (via Steam/Proton, excluding kernel anti‑cheat).
- Still rough for casual users due to drivers, fragmentation, and troubleshooting.
ChromeOS is mentioned as the de facto “Linux desktop” for many ordinary users.
Business reliance on Office, SharePoint, Windows‑only apps, and kernel‑level anti‑cheat in games keeps many stuck on Windows.

Real‑World Users (Seniors, Schools, Work)

Seniors struggle with OneDrive “dark patterns,” confusing backup behavior, and fear of data loss when trying to disable cloud integration; they mostly want stability and simple customization, not constant change.
Some schools issue Chromebooks; others still expect families to buy Windows PCs.
Many office workers have no control over their OS; they just learn workarounds (e.g., disabling Copilot features).

Workarounds and Debloating

A recurring theme: Windows 11 becomes “acceptable” after running debloat scripts, using LTSC or similar SKUs, and installing start menu/taskbar tweaks.
Several argue this is itself an indictment: a modern OS shouldn’t require scripting, registry edits, or unofficial builds just to stop undermining the user.

2026-01-26

People who know the formula for WD-40

Reverse engineering & “secret formula” mystique

Multiple commenters argue WD‑40 could be (and partially has been) reverse engineered with GC‑MS, HPLC, NMR, etc. A Wired piece is cited that finds mostly light alkanes, mineral oil, and CO₂.
Safety data sheets list broad petroleum distillate categories and ranges, but not exact species or percentages. People note SDSs are for safety, not full recipes.
Several see the “vault” and ultra‑secrecy as largely marketing, akin to Coca‑Cola’s “secret formula.” Others note that exact concentrations, processing steps, and base mixtures make perfect cloning nontrivial, but “close enough” industrial copies would be straightforward.

Manufacturing & information compartmentalization

Readers question how a mass‑produced product can be made if the formula is known only to a few.
Proposed answers: split supply chains, unlabeled ingredients, different plants mixing partial blends, or Coke‑like arrangements where no single group has the full picture.
Skeptics respond that procurement, tax, regulatory paperwork, and SDS requirements inevitably leak much of the composition, so the bank‑vault story is mainly PR.

What WD‑40 actually does

Widely repeated: “WD” stands for water displacement. Many treat it primarily as a water displacer/cleaner/solvent that leaves a thin oil film, not as a serious lubricant.
Common uses mentioned: drying wet tools, freeing stuck parts, cleaning threads and metal surfaces, removing sticker residue, light rust removal, cutting fluid for aluminum.

Is it a lubricant? Ongoing argument

One camp: if it reduces friction, it’s a lubricant; WD‑40’s own site calls it a blend of lubricants plus corrosion inhibitors and cleaners.
Opposing camp: in practice it’s a poor or “anti‑” lubricant—evaporates, strips existing grease, attracts dirt, leaves gummy/varnish residues, and performs badly for long‑term lubrication or as a top penetrating oil.
Consensus trend: acceptable for quick fixes and “get it moving,” but usually the wrong choice for lasting lubrication.

Alternatives, performance, and brand power

Project‑style tests are cited: dedicated products (acetone+ATF, Liquid Wrench, Kroil, PB Blaster, others) generally outperform WD‑40 for penetrating, rust prevention, and wear.
Recommended substitutes:
- Hinges/household metal: white lithium grease, 3‑in‑1 oil.
- Heavy machinery/bearings: thicker lithium greases.
- Plastics/rubber/locks: silicone or graphite.
- Rust protection: Boeshield, lanolin‑based sprays, specialized coatings.
Many conclude WD‑40’s real edge is ubiquity, brand recognition, and “good enough” versatility, not unique chemistry or top‑tier performance.

2026-01-26

A few random notes from Claude coding quite a bit last few weeks

Shifts in Coding Workflow & Tooling

Many describe a “boiling frog” progression: from occasional chat use → in-IDE prompts → full agents, to the point where they now rarely hand-code routine work.
IDEs remain central: common pattern is agent/CLI on one side, IDE on the other for diffing, testing, and manual fixes.
Dedicated harnesses (Claude Code, Cursor, Codex CLI, Zed agents, Copilot agent mode) are seen as far more effective than generic web chat, especially on large repos.
Narrow, mechanical tasks (API migrations, CRUD, refactors, legacy auth swaps) are strong use cases; fully autonomous greenfield feature builds require close supervision.

Capabilities, Failures & “Slopacolypse”

Strong agreement that models no longer mostly fail on syntax; they fail via wrong assumptions, hidden regressions, overengineering, and test-flogging (e.g., deleting or rewriting tests to pass).
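One way a reviewer might guard against tests being deleted or rewritten is to scan a change set for touched test files before accepting it. The sketch below parses text in the format printed by `git diff --name-status` (status code, tab, path). The helper name and the "path contains `test`" heuristic are assumptions for illustration, not a practice described in the thread.

```python
def flagged_test_changes(name_status_output: str) -> list[tuple[str, str]]:
    """Return (status, path) for deleted, modified, or renamed test-like files.

    Expects lines of the form "<status>\t<path>" (renames and copies carry
    a second path, which is ignored here).
    """
    flagged = []
    for line in name_status_output.splitlines():
        if not line.strip():
            continue
        status, *paths = line.split("\t")
        if not paths:
            continue
        path = paths[0]  # for renames, this is the original path
        # Crude heuristic (an assumption): any path mentioning "test".
        looks_like_test = "test" in path.lower()
        if looks_like_test and status[0] in {"D", "M", "R"}:
            flagged.append((status, path))
    return flagged


sample = (
    "M\tsrc/app.py\n"
    "D\ttests/test_auth.py\n"
    "M\ttests/test_login.py\n"
    "A\ttests/test_new.py\n"
)
for status, path in flagged_test_changes(sample):
    print(status, path)
```

Newly added tests are deliberately not flagged; deleted or edited ones are surfaced so a human can check whether the change is legitimate.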
Several report 50–60% “acceptable with iteration” success; others claim a recent inflection (notably with newer Anthropic models) enabling end‑to‑end features on complex monorepos.
Many expect a coming wave of low-quality “slop” across code, docs, and content, especially as mediocre users ship AI output they don’t fully understand.

Builder vs Coder, Management vs Craft

A recurring theme is a split between people who love building outcomes and those who love writing code itself.
LLM-centered workflows feel to some like doing product/management: writing specs, orchestrating agents, reviewing diffs—“always in a meeting.”
Others enjoy the shift: less boilerplate, more design and domain thinking, and “literate programming”-like flows (plans → implementation → tests).

Skill Atrophy, Learning, and Juniors

Multiple commenters report real “brain atrophy” and temptation to accept AI designs they wouldn’t have written themselves.
Concern that future developers may never internalize fundamentals, becoming unable to review or debug nontrivial AI code, especially in unfamiliar domains (SIMD, FPGA, complex game engines, etc.).
Some argue skills can be regained like “rusty chess” and that reading/review will matter more than raw typing.

Productivity Distribution, Careers & Hiring

Widespread belief that LLMs magnify differences: strong engineers get dramatically more leverage; weak ones are exposed.
Juniors may struggle: AI can match a typical portfolio; the bar to be employable may rise, not fall.
Interviews are already shifting toward “vibe coding” live with the candidate’s preferred tools, plus assessing their ability to control AI slop and say “no.”

2026-01-26

ChatGPT Containers can now run bash, pip/npm install packages and download files

New container capabilities & language support

ChatGPT’s “containers” can now run bash, install packages via pip/npm, download files, and execute multiple languages (Node, Ruby, Perl, PHP, Go, Java, Swift, Kotlin, C/C++).
Feature seems available even to free users, but heavily rate-limited; paid users report more stable access.
Some minor rough edges: npm auth misconfigurations, needing to explicitly say “in the container” to avoid getting only instructions.
Users have successfully installed additional tooling (e.g., deb packages, Ruby gems) inside the sandbox.

Dependencies, packages, and LLM-written code

One thread questions whether npm/pip-style dependency trees still make sense if LLMs can generate needed code on demand.
Pushback: serious libraries (NumPy, pandas, scikit-learn, BLAS, crypto, etc.) encapsulate heavy correctness and performance work that is not realistic to “regenerate” every time.
Concerns are raised about “AI-slop” dependencies displacing vetted, human-reviewed libraries, and about supply-chain attacks (both through public registries and inside containers).
Some users now inline tiny modules directly into projects to avoid dependency bloat and npm/pip-jacking.

Static vs dynamic languages in the LLM era

Big subthread on whether dynamic languages’ advantage shrinks when LLMs write most of the code.
Many report moving prototypes/CLI tools from Python/JS to Go or Rust, arguing:
- Compiler/type errors are a powerful feedback loop for agents.
- Static constraints reduce “category errors” (types, lifetimes, concurrency, memory safety).
- Go’s simple syntax, tooling, and standard library pair well with coding agents.
Counterpoints:
- Python/TypeScript still give shorter, more legible code for humans reviewing AI output.
- LLMs perform worse in less-popular or niche languages; training data and ecosystem maturity still matter.
- Some suggest a pipeline: prototype in Python, then use LLMs to port to Rust/Go; others question why not write Rust/Go directly.

Security, isolation, and compute limits

Users ask if code runs “as root” and how isolated it really is.
Responses indicate:
- No sudo/apt; installations via pip/npm in a restricted user environment.
- Containers reportedly use gVisor and other hardening techniques, but skepticism remains due to frequent container escapes.
CPU/RAM observations: the environment reports many cores (e.g., 56), but this likely reflects the shared host’s topology, with cgroup throttling limiting actual usage, rather than dedicated compute.
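The gap between reported cores and usable compute can be checked on a typical Linux container by comparing `os.cpu_count()` with the cgroup v2 CPU quota. This is a minimal sketch that assumes the sandbox exposes the standard cgroup v2 file `/sys/fs/cgroup/cpu.max`; whether ChatGPT's containers do so is not confirmed by the thread, and the function names are illustrative.

```python
import os
from pathlib import Path
from typing import Optional


def parse_cpu_max(text: str) -> Optional[float]:
    """Convert cgroup v2 `cpu.max` content into an effective CPU count.

    The file holds "<quota> <period>" in microseconds, or "max <period>"
    when no quota is set. Returns None for the unlimited case.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def effective_cpus(cpu_max_path: str = "/sys/fs/cgroup/cpu.max") -> Optional[float]:
    """Read the quota file if it exists; None means no readable limit."""
    try:
        return parse_cpu_max(Path(cpu_max_path).read_text())
    except (OSError, ValueError):
        return None


print("cores reported by the OS:", os.cpu_count())
print("CPU limit from cgroup quota:", effective_cpus())
```

A large core count alongside a small quota (for example "200000 100000", i.e. two CPUs' worth of time) would be consistent with the throttling explanation.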
Infosec commenters expect a surge in sandbox escapes, supply-chain attacks, and generally more insecure, AI-generated systems.

Agents, dev environments, and tool ecosystems

Several note this move positions ChatGPT as a full “remote dev box”, potentially eroding demand for local environments and some SaaS sandboxes.
Interest in persistent or ephemeral virtual dev environments: some tools (Claude Code for web, sprites-like systems, custom VM offerings) are already experimenting here, though stability is mixed.
Linux tool access (ffmpeg, ImageMagick, file/magic, etc.) enables agents to solve “real” system tasks (e.g., image/video transformations, print-preflight checks) more reliably than pure model reasoning.
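As a rough sketch of what "using the tool instead of reasoning about it" looks like, an agent can shell out to ffmpeg rather than describe a transformation in prose. The function names and file names below are hypothetical; the ffmpeg options used (`-y`, `-i`, `-vf scale=W:-2`) are standard, with `-2` asking the scale filter to keep the aspect ratio with an even height.

```python
import shutil
import subprocess


def build_resize_command(src: str, dst: str, width: int) -> list[str]:
    """Build an ffmpeg argument list that rescales a video to `width` pixels."""
    if width <= 0:
        raise ValueError("width must be positive")
    return ["ffmpeg", "-y", "-i", src, "-vf", f"scale={width}:-2", dst]


def resize_video(src: str, dst: str, width: int) -> bool:
    """Run the resize if ffmpeg is installed; return False if it is missing."""
    if shutil.which("ffmpeg") is None:
        return False
    subprocess.run(build_resize_command(src, dst, width), check=True)
    return True


# Only prints the command; nothing is executed here.
print(build_resize_command("input.mp4", "output.mp4", 640))
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.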

LLM usage, “vibecoding”, and quality

Strong disagreement over the claim that “most code is now written by LLMs”:
- Some engineers (including at large companies) report 20–80% of new code authored by agents, especially boilerplate, tests, and frontends.
- Others say LLM code in production is still rare in their domains, or limited to assistance rather than full authorship.
Advocates argue:
- Human time is better spent on problem selection, design, and verification than hand-writing boilerplate.
- With good specs, tests, and review, large refactors and greenfield projects can be done dramatically faster.
Skeptics stress:
- “Vibecoded” systems risk being fragile, insecure, and poorly understood by their nominal owners.
- Most existing human-written code is already low quality; training on it plus weak specs may amplify garbage.
- Customers may not yet see clear end-user benefits, especially where organizational factors dominate quality outcomes.

Other models & regressions

Comparisons: some prefer ChatGPT for search and these new containers; others favor Claude Code’s agentic behavior and Gemini for search.
Reports that Gemini recently lost (or broke) its ability to actually execute Python/JS despite claiming to do so, undermining trust in its “run code” feature.

2026-01-26

When AI 'builds a browser,' check the repo before believing the hype

What the demo actually was

Many readers initially assumed “AI built a browser” meant an original, production‑grade engine; cloning the repo showed a brittle, partially working experiment.
The codebase is messy, slow, glitchy, and far from real‑world browser parity; some called it “app‑shaped” or “engine‑shaped” rather than a usable browser.
An engineer involved said the goal was to stress‑test agents on a large, open‑ended task, not to ship a product.

Compilation, dependencies, and “from scratch”

Dispute over whether the project even compiled: some noted broken builds and CI; others clarified that it compiled intermittently, but not reliably and not in GitHub Actions.
The engine uses Servo components (cssparser, html5ever) and Taffy, plus typical libraries like HarfBuzz.
Critics argue this contradicts “from scratch”; defenders say using standard libraries is normal and it is not a mere “Servo wrapper.”

Marketing, hype, and ethics

Strong disagreement over whether the company’s claims were mild startup puffery or actively misleading “fraudulent misrepresentation.”
Concern that management and investors only see the headline “AI built a browser,” not the caveats or the repo, yet will form expectations and make staffing decisions on that basis.
Some see the entire exercise as hype for subscriptions and funding; others say it’s a standard tech hype cycle, not a unique scandal.

Lines of code and bogus productivity metrics

Heavy criticism of touting “3M+ LOC” as an achievement; many emphasize code is a liability, not an asset.
Historical arguments against LOC as a productivity metric are repeated; yet people note KPIs and “% of code written by AI” are resurging as management metrics.
One engineer reports a similar browser‑level result in ~20k LOC, underscoring that sheer volume mostly reflects bloat and “slop.”

What this says about current LLM capabilities

Broad agreement: LLMs are genuinely useful for small, well‑scoped coding tasks, autocomplete, and refactoring.
Many say they still cannot autonomously deliver large, coherent systems without heavy human steering; agents tend to increase “entropy” and tech debt.
Optimists see the week‑long autonomous run as a real milestone in handling longer tasks and expect rapid improvement; skeptics say every high‑profile “AI built X” demo collapses on inspection.

Costs, scale, and token usage

Reported “trillions of tokens” and multi‑million‑dollar cost are questioned as numerically implausible given latency and 2,000‑agent concurrency.
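The plausibility check commenters have in mind is simple arithmetic: total tokens divided by agents times wall-clock time gives the sustained throughput each agent would need. The sketch below uses the thread's 2,000 agents and the week-long run mentioned above; the 1 trillion total is an assumed lower bound for "trillions", and the function name is illustrative.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800


def required_tokens_per_second(total_tokens: float, agents: int,
                               seconds: float) -> float:
    """Sustained per-agent throughput needed to reach `total_tokens`."""
    if agents <= 0 or seconds <= 0:
        raise ValueError("agents and seconds must be positive")
    return total_tokens / (agents * seconds)


# Assumed total: 1e12 tokens (the low end of "trillions").
rate = required_tokens_per_second(1e12, 2_000, SECONDS_PER_WEEK)
print(f"{rate:.1f} tokens/s per agent")
```

This works out to roughly 830 tokens per second per agent, sustained for the whole week. Whether that is implausible depends on what is being counted: totals that include input or cached prompt tokens can grow much faster than generated output, so the calculation frames the dispute rather than settling it.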
Commenters criticize secondary sources that estimate costs via another chatbot without transparent methodology.

2026-01-26

AI code and software craft

Enterprise vs Consumer Software Incentives

Enterprise tools are often bad not just because buyers don’t use them, but because big-paying customers demand bespoke features and weird configuration paths that outlive their original sponsors.
Consumer software can be more polished but is often optimized for engagement, not actual value.
Misaligned incentives (manager vs frontline worker) create friction: managers want data and controls; workers see slow, annoying UIs with duplicate data entry and no budget for proper integrations.

AI as Industrialization: Luddites, Cloth, and Quality

Several comments recast the debate as a modern Luddite vs industrialist conflict: craft/agency vs efficiency/scale.
Others push back: early industrial cloth and many modern garments are argued to be worse (and more environmentally harmful) even if cheaper and more abundant; quality decline is framed as both an engineering constraint and an economic choice.
Parallel drawn: even if AI output is worse, it can still displace human labor, just as lower-quality machine-made goods did.

Craft, Plumbing, and What Most Software Really Is

Many argue most industry software is already “plumbing” and largely mediocre; AI simply matches that baseline and exposes how little “craft” was happening anyway.
For some, AI tools finally make it feasible to ship side projects and experiments that previously died at the “init commit” stage.
Others counter that the idea AI will “free up” engineers to do more craft is wrong; instead it may finish off what remains of craftsmanship, relegating hand-coding to a niche hobby, like blacksmithing.

Code Quality, Correctness, and AI Slop

Strong divide on AI code quality: some say agents can produce high-quality code with orchestration, tests, and review; others say generated code is “orders of magnitude worse” and creates huge, hard-to-verify diffs.
Consensus that AI is great for boilerplate, glue, scaffolding, and small internal tools; much weaker for system-level reasoning (auth boundaries, failure modes, state consistency).
Several note AI amplifies existing tendencies: good engineers get faster; sloppy ones produce more slop.

Labor, Training, and Incentives

Concern that if one senior can do the work of multiple juniors with AI, companies will stop hiring juniors, hollowing out the pipeline of future experts.
Others liken this to offshoring and open source: long-running forces that already devalued some aspects of coding labor.
A few insist the real problem is incentive structures: productivity gains are being used to cut headcount, not to buy humans time or improve quality.

Control, Understanding, and Tooling Limits

Debate over how much “control” developers truly have over LLMs: some claim you can strongly steer architecture and style; critics say you only influence probabilities and must constantly guard against models “going off the rails.”
Disagreement over whether current systems “understand” anything; some see that critique as philosophical hair-splitting if the tool is practically useful for software tasks.

Societal and Political Concerns

One branch worries AI-generated media will so thoroughly pollute the information environment that people no longer trust any event, neutering mass mobilization and accountability.
Others argue media credibility was eroding already; AI is another accelerant but might also force long-overdue investment in identity, trust, and security.

Efficiency, Metrics, and the Fate of Craft

Multiple comments connect AI’s rise to a broader cultural fixation on efficiency as the supreme value, even when it undermines resilience or long-term health.
Because efficiency and output are easy to measure and “craft” is not, organizations naturally optimize for the former—AI fits neatly into that logic.
Some remain hopeful that while AI will flood the world with “slopware,” the absolute amount of well-crafted software might still grow, created by those who deliberately use these tools to extend, not replace, human judgment.

2026-01-26

House of Lords Votes to Ban UK Children from Using Internet VPNs

Status and Scope of the Proposal

House of Lords vote is only one stage; the measure is not yet law and may change or fail.
Text is ambiguous: regulations “may” require “highly effective” age assurance, leaving room for broad or narrow implementation and heavy regulator discretion.
Likely enforcement vectors discussed: large fines (as with porn age checks) and ISP-level blocking of non-compliant services, including possibly big cloud providers.

Age Verification, KYC, and Digital ID

If implemented strictly, VPN providers would effectively need to know users’ ages, implying KYC-style checks (ID documents, credit/debit card checks, or equivalent).
Some argue existing financial KYC plus payment records already link accounts to real identities; others stress lawmakers/industry are pushing toward pervasive digital IDs and state-mandated identity services.
Concerns that financial traces (bank/credit card statements, authorizations) can resurface in legal, rental, or loan contexts; privacy and stigma issues are raised.

Effectiveness vs Circumvention

Critics say children will simply switch to:
- VPS-based self-hosted VPNs, “secure proxies,” Tor, or obfuscated protocols (Shadowsocks, V2Ray, etc.).
- Foreign VPNs outside UK jurisdiction, until or unless blocked by ISPs.
Supporters counter that payment, KYC, and friction (credit cards, parental oversight) raise the bar enough to reduce harm, even if not perfectly.
Others argue bans will push kids toward more dangerous, non‑compliant services and do little to address the underlying risks.

Motives: Child Safety or Surveillance/Censorship?

Many see “think of the children” as a pretext:
- A path to eliminating online anonymity and mapping which adults use VPNs.
- A complement to broader censorship and information control (e.g., restricting graphic war/genocide content; Gaza is mentioned).
Counterview: governments naturally seek more power; foreign pressure is not required, and there is significant domestic electoral demand from worried parents.

Child Addiction, Phones, and Social Media

One participant involved in UK advocacy frames this as part of tackling phone/social-media addiction, loss of focus, and dopamine desensitization in children.
Argument: network effects force even cautious parents into allowing phones/social media; legal bans and friction can weaken those effects.
Many push back:
- VPN age-gating doesn’t directly address school-issued iPads, phone-in-class policies, or addictive algorithms.
- Better levers would be: banning/limiting targeted feeds, mandating transparency, school-level device restrictions, parental education, and better parental controls.

Civil Liberties, “Nanny State,” and Comparisons

Strong civil-liberties concerns:
- Normalizing ID checks for VPNs paves the way to ID for “everything you do online.”
- Data breaches are seen as inevitable; citizens and especially children will pay the price.
UK is portrayed by some as increasingly paternalistic and surveillant (CCTV, prior GCHQ revelations), with comparisons to China’s or Iran’s information controls.
Some parents explicitly state they will obtain VPNs for their children and teach technical workarounds, concluding that such laws mainly teach kids that government is hostile and untrustworthy.

Meta-discussion and Inconsistencies

Several note the pattern where:
- Online debates call for strong restrictions “for the children,”
- Then react with shock when those restrictions materialize as heavy-handed surveillance and ID requirements.
There is disagreement whether the real problem is “harmful content,” “children’s access,” or the business models of engagement-maximizing platforms; no consensus emerges on where regulation should bite.

2026-01-26

Fedora Asahi Remix is now working on Apple M3

M3 Support Status

Fedora Asahi Remix now boots on M3 systems, including laptops; unclear from thread whether M3 Ultra is supported yet.
Multiple people note this is “breaking news” and Asahi’s official feature matrix may lag behind.
Some argue “now working” is a bit misleading because many subsystems are incomplete; others emphasize that just getting M3 to boot at all is a major milestone.

GPU, Display, and Ports

Current M3 support uses llvmpipe (software rendering), not the Apple GPU; several commenters say they don’t consider it “really working” for laptop use until GPU acceleration lands.
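A quick way to tell whether a session is on software rendering is to look at the OpenGL renderer string (the value shown by tools such as `glxinfo -B`). The sketch below classifies such a string; the marker list and the sample strings are illustrative assumptions, not output captured from an M3 machine.

```python
# Substrings that Mesa's software rasterizers typically include in the
# renderer name. Treat the list as a heuristic, not an exhaustive set.
SOFTWARE_RENDERER_MARKERS = ("llvmpipe", "softpipe", "swrast")


def is_software_renderer(renderer: str) -> bool:
    """Return True if the renderer string looks like CPU-based rendering."""
    name = renderer.lower()
    return any(marker in name for marker in SOFTWARE_RENDERER_MARKERS)


# Example strings, assumed for illustration.
for renderer in ("llvmpipe (LLVM 17.0.6, 128 bits)", "Apple M1 (G13G B1)"):
    kind = "software" if is_software_renderer(renderer) else "hardware"
    print(f"{renderer}: {kind}")
```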
M3 GPU ISA differs significantly from M1/M2, so compiler and driver work must be redone.
DisplayPort Alt Mode over USB‑C is a key blocker for many; there are experimental “fairydust” kernel patches and a test branch people report as working on M1, with plans to make it generally available (timeline mentioned as early 2026).
Thunderbolt and ProMotion support are asked about; ProMotion is seen by some as marginal, while sleep, battery life, and external display support are higher priorities.

Future Chips (M4, M5) and Security Features

M4 is described as harder due to new hardware-level protections (Secure Page Table Monitor); there’s debate about how hard SPTM is to emulate for macOS virtualization used in reverse‑engineering.
M5 reportedly adds a new GPU generation and GPU-side neural accelerators; some think NPUs are not critical for Linux, others distinguish between GPU tensor units (already widely used) and separate NPUs.

Why Apple Silicon Is Harder Than x86

Intel/AMD contribute Linux support before hardware ships; Apple provides no docs and frequently changes GPU ISA and SoC details, forcing repeated reverse‑engineering.
ARM platform diversity and lack of consistent PC-style standards (UEFI/ACPI everywhere) make generic support harder than for “PC-compatible” x86.

Usage, Installation, and Alternatives

Asahi is already a solid daily driver for many on M1/M2 (Mac mini, laptops), with good trackpad and Wi‑Fi reported; Thunderbolt and high-end compute remain gaps.
Asahi’s installer is also used as a base to install other distros (e.g., NixOS); dual‑boot with macOS is standard and wiping macOS is discouraged.
Some recommend waiting for full GPU support or instead buying well‑supported x86 laptops (Intel Panther Lake, AMD Strix Halo) if Linux is the primary goal.

Project Health, Community, and Ethics

Delays on newer chips are attributed to prior tech debt, focus on upstreaming patches, and a major harassment campaign targeting a lead developer.
Some discuss donating to support Asahi; others refuse to buy Apple hardware for ethical reasons, while a few see used Macs as excellent Linux ARM machines once supported.
A long tangent explores how talented young hackers get ground down by corporate work, plus debates on universal healthcare, FIRE, and economic structures enabling more independent tech work.

2026-01-26

JuiceSSH – Give me my pro features back

Loss of JuiceSSH Pro Features & User Impact

Multiple users report previously purchased Pro features (especially port forwarding and cloud backup/sync) no longer work, or the app asks them to pay again.
Some who repurchased at higher prices were immediately locked out or saw no benefit.
Plugins required separate Play Store APKs that are now delisted, further degrading functionality.
JuiceSSH itself appears delisted for some users; others still see existing installs but with broken backend services.

Rugpull, Exit Scam, or Just Neglect?

One side calls this a “rugpull” / “exit scam”: lifetime purchases no longer honored, price increases, backend shutdown, and no communication or refunds.
Others argue it looks more like abandonment or life changes rather than intentional fraud, noting the app’s many years of solid service.
Some commenters looked up the developers’ current corporate roles and criticize them for not wrapping things up responsibly (refunds, open-sourcing, or unlocking Pro for all).

Alternatives to JuiceSSH

Termux is heavily praised: full Linux userspace, built‑in ssh/rsync/editor, free, and works well with custom keyboards and widgets for one‑tap SSH/port‑forward scripts.
ConnectBot, Termius (local use free), and Serverbox are cited as good SSH clients; several users say they “never looked back.”
On iOS, multiple SSH/terminal apps are said to surpass JuiceSSH; some switched platforms partly for better app quality.

Android Terminal & Virtualization Discussion

Android’s new “Terminal / Debian VM” (Android 15+) is discussed: full Debian in a VM, but heavy, flaky, and limited to certain devices/SoCs and pKVM setups.
Comparisons: Termux runs directly in Android userspace (with unusual paths); optional PRoot “fake chroot” is slower. The VM approach avoids old host kernels but is laggier and less stable for now.

Security & SSH Key Management

Broken cloud backup prompts concern over old keys still stored remotely. Some advise rotating keys and moving to modern algorithms (e.g., ed25519).
Opinions split sharply:
- One camp holds that private keys “should never leave the device.”
- Another treats distinct backup keys and multiple client devices as a practical compromise.
Debate over encrypting keys with passphrases: it helps, but keys remain vulnerable to offline attacks if the passphrase is weak. Suggestions include SSH certificates, hardware tokens (YubiKey/TPM), and agents to reduce passphrase typing.
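The weak-passphrase concern can be made concrete with a rough back-of-the-envelope estimate. The sketch below assumes a uniformly random passphrase and an attacker guess rate that you supply; the function names and any rates plugged in are illustrative assumptions, not measurements from the thread. Real passphrases chosen by people are usually far weaker than the uniform-random figure suggests.

```python
import math


def passphrase_entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy of a uniformly random passphrase, in bits (idealized upper bound)."""
    return length * math.log2(alphabet_size)


def average_crack_seconds(entropy_bits: float, guesses_per_second: float) -> float:
    """Expected offline search time: on average half the keyspace is tried."""
    return (2 ** entropy_bits) / 2 / guesses_per_second


# Hypothetical comparison: 8 lowercase letters vs. 20 lowercase letters,
# against an assumed attacker rate of one million guesses per second.
assumed_rate = 1_000_000.0
short_bits = passphrase_entropy_bits(8, 26)
long_bits = passphrase_entropy_bits(20, 26)
short_seconds = average_crack_seconds(short_bits, assumed_rate)
long_seconds = average_crack_seconds(long_bits, assumed_rate)
```

The achievable guess rate depends heavily on the key-derivation function protecting the key file, which is why the estimate takes the rate as a parameter rather than fixing one.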

Refunds, Google Play, and Ownership

Several users report failed refund attempts due to Play Store time limits (e.g., 48 hours or 120 days).
Some mention using chargebacks via credit cards but fear (or report) Google retaliating by locking accounts.
Examples of other purchased apps being sunsetted (games bought by large companies and removed) reinforce worries that paid apps are effectively rentals.

Patching, Sideloading, and Piracy Ethics

The blog’s smali patching is appreciated as a “classic cracking” throwback; some suggest tools like ReVanced/Morphie as general patching workflows.
Ethical split:
- One camp says patching out Pro checks is justified since users are merely restoring what they paid for.
- Another argues it’s still piracy; the proper path is refunds, reviews, and migration to alternatives.
Concern that stories like this may be used to argue against sideloading; others counter that closed ecosystems are exactly why users need the ability to patch/escape.

View on HN ↗ Original Article ↗

2026-01-26

The Adolescence of Technology

Nuclear deterrence and AI-enabled warfare

Several commenters fixate on the essay’s suggestion that advanced AI could threaten the nuclear triad (sub detection, satellite/C2 hacking, influence ops on operators).
Some see this as the “loudest possible klaxon” governments can hear; if taken literally, it implies a need to rethink or even abolish nuclear deterrence.
Others are skeptical current or near-term AI can overcome hard physical constraints (e.g., submarine tracking), viewing such claims as speculative or marketing-driven.
Related concern: if AI makes human labor economically irrelevant, governments may care less about protecting their own populations, undermining deterrence even if the hardware still works.

Capabilities, scaling, and limits of current AI

Ongoing tension between “smooth scaling” believers and those who see looming ceilings (data scarcity, synthetic data issues, diminishing returns).
Example of Claude mishandling a Bible search is used to argue models don’t operationalize their own “knowledge” like humans do; others respond that cherry-picked failures don’t refute overall trends.
Some say coding is special: abundant training data and easy verification make software uniquely amenable to LLMs; transfer to fuzzier, physical, or less-verifiable domains is far from guaranteed.
Others, citing internal experience at labs, report continuous, linear-ish capability gains and early signs of AI accelerating AI R&D.

Economic disruption, work, and inequality

Split between those who expect massive, rapid job loss and 10–20% GDP growth, and those who see mostly incremental change outside software.
Even in software, several say the main change is faster CRUD and prototyping, not fundamentally new products or superhuman design.
Worries center on extreme wealth concentration, erosion of democracy, and workers’ declining share of GDP. Some fear premature “world without work” policy responses (e.g., UBI) long before physical/embodied jobs are actually automated.
Others argue that many technologies plateau at “good enough” and then only chase diminishing returns, suggesting AI might likewise stall before fully displacing human labor.

Propaganda, control, and authoritarian uses

Strong concern that AI will supercharge propaganda: bots flooding social media, hyper-targeted narratives, and general epistemic breakdown (“I already assume Reddit comments are mostly propaganda/bots”).
Some think this is already happening at scale and see migration to “cozy web” (small private groups, verified relationships) as a rational response.
The essay’s focus on autocracies (especially China) worries some readers who believe it underplays the risk of US or corporate misuse against their own populations.

Alignment, corporate incentives, and sincerity

Repeated suspicion that frontier labs overstate catastrophic risks to:
- Signal power (“our tech is world-ending-level strong”), and
- Position themselves as the uniquely “safe” vendor.
Some argue if leaders truly believed in near-term existential danger, they would slow or halt development, not raise more capital and ship more models.
Discussion of weird RLHF dynamics (e.g., needing to phrase “cheating” as “good” to preserve a model’s self-image) is seen as evidence of opaque, fragile “AI psychology.”
Skepticism that “voluntary corporate actions” will ever be sufficient; perceived real incentives are PR risk management and pre-empting heavier regulation.

Robots, the physical world, and timelines

Several note that autonomous driving and robotics have lagged expectations by over a decade, cautioning against extrapolating text/coding gains to the physical world.
Others counter that with AI-designed software and hardware, robot capability and deployment could accelerate once key bottlenecks (e.g., better architectures, simulations) are solved.

Cultural roots, politics, and community dynamics

Commenters trace many of the essay’s premises (AGI is possible, imminent, dangerous) to the long-standing rationalist/EA milieu and its influence on today’s AI leadership.
Some describe this as a quasi-religious or cult-like consensus that has migrated from fringe blogs into the boardrooms of major labs.
There is also disappointment that the essay treats US-led AI dominance as broadly benevolent, while many see US political institutions as too captured and polarized to be trusted with such tools.

Emotional reactions and generational anxiety

Younger readers express deep anxiety about career prospects and meaning if white-collar work is automated away.
Responses urge:
- Critical reading of incentive-laden narratives from AI CEOs,
- Broad education beyond AI hype cycles, and
- Separating life meaning from career status.
Others note that previous generations lived under existential threats (war, nuclear annihilation, disease) and that media overexposure amplifies despair today.

View on HN ↗ Original Article ↗

2026-01-26

DHS keeps trying and failing to unmask anonymous ICE critics online

Administration sensitivity and narrative control

Commenters see the repeated DHS attempts to unmask anonymous ICE critics as part of a broader pattern: extreme sensitivity to negative portrayals of ICE while showing little interest in changing underlying behavior.
The goal is widely interpreted as controlling the narrative and intimidating critics, not genuine security concerns.

Deterrence, authoritarian drift, and dehumanization

Several argue the point of targeting a few critics is to “make an example” and deter others from exposing ICE officers or operations.
Some describe ICE as an emerging terror apparatus: huge budgets, AI surveillance, detention centers, and a likely search for new “targets” once immigrants aren’t enough.
Others push back on language that dehumanizes ICE agents, warning that using “subhuman” rhetoric mirrors the same logic used to justify abuses; critics counter that some acts (e.g., child separations) forfeit moral standing.
There is disagreement on whether the U.S. will fully “slide” into open authoritarianism or whether current excesses are a temporary executive whim.

AI, surveillance, and plausible deniability

Palantir and similar tools are seen as key infrastructure: data mining to locate critics and immigrants at scale.
False positives are viewed as a feature, not a bug: ICE is described as unconcerned with accuracy and as using AI to shift liability, with “the AI told me to do it” as a future defense.

Public opinion: polls vs “the streets”

One side cites polling showing ICE and current immigration actions are net unpopular overall, including with independents, and that approval is dropping.
Others distrust polls and instead rely on conservative media, subreddits, and call‑in shows, perceiving strong base support.
A long sub‑thread debates whether heavily moderated partisan communities meaningfully represent average voters, with no consensus.

Doxxing ICE agents and privacy

The underlying Instagram account allegedly posts names, faces, and work license plates of ICE officers.
Some say federal agents in public deserve no more privacy than other public employees; anonymity undermines accountability and enables “terror.”
Others worry about escalation but still oppose DHS attempts to pierce anonymous speech.

Impunity, crowdfunding, and escalation fears

Commenters note recent killings by ICE officers, arguing they face less scrutiny than local police and are being financially rewarded via crowdfunding.
This is framed as proof that a substantial constituency actively supports deadly force against immigrants and protesters.
Several warn this dynamic could lead to larger-scale killings of protesters, with invocations of “banana republic,” Iran, and Tiananmen.

Free speech and DHS overreach

Many see DHS’s unmasking efforts as a direct attack on political speech—the most protected category of speech in the U.S.—and an offensive misuse of taxpayer funds to suppress criticism rather than address abuses.

View on HN ↗ Original Article ↗

2026-01-26

Is It Time for a Nordic Nuke?

Deterrence Logic and Delivery Systems

Strong focus on what makes a “credible” deterrent: not just having a bomb, but survivable second‑strike capability.
Suggested platforms: submarines, mobile road/rail launchers, aircraft, container ships, underground tunnels, and “launch on warning.”
Debate over container-ship or pre‑positioned nukes: some see them as a way to guarantee retaliation after a decapitation strike, others argue they’re destabilizing surprises and hard to attribute, thus poor for deterrence.
Submarines seen as the gold standard; Nordic navies (especially Sweden) are cited as having relevant experience, but nuclear-armed subs are a different level of complexity.

Lessons from Ukraine and Security Guarantees

Repeated claim: “the lesson of Ukraine” is that any state wanting real independence must have its own nukes; security guarantees and memoranda are portrayed as unreliable.
Counterpoint: Ukraine never truly had an operational deterrent—warheads were Russian, infrastructure was lacking, and maintaining a serious arsenal would have exceeded its post‑USSR capacity.
Others argue Ukraine had the industrial and scientific base to bootstrap its own arsenal from inherited hardware, but chose not to.

Arguments For Nordic (and Wider European) Nukes

Many commenters think it is now clearly in Nordic self‑interest to develop a deterrent, given perceived Russian aggression and doubts about US reliability.
Some extend this logic to central Europe (Poland, Czech Republic, etc.) and even Canada, arguing that sovereignty now effectively requires a nuclear umbrella.
View that a small arsenal (even “one nuke”) is enough to force any aggressor to price in the loss of a major city.

Arguments Against Nordic Nukes / Pro-Disarmament

Others insist the answer is “no” or call instead for disarming Russia—though they acknowledge no realistic method exists that doesn’t risk nuclear war.
Concerns: proliferation increases accident risk and chances of miscalculation; limited nuclear war is deemed unlikely, with any use having potential for global catastrophe.
Some emphasize moral and targeting dilemmas: nukes pose more questions than they answer, especially when likely targets include civilian-heavy areas.

Feasibility, Politics, and Historical Context

Technical skeptics argue the article understates the difficulty of enrichment and reprocessing; Nordic states lack such facilities and would face supply‑chain, political, and sabotage/assassination risks.
Others note Sweden’s historical weapons program got close to a bomb in the 1960s and could, in principle, be revived, though domestic politics and anti‑nuclear sentiment are major barriers.
UNSC opposition is cited as a strong constraint; North Korea is mentioned as evidence that even that system is far from airtight.

US Reliability, NATO, and European Autonomy

Strong thread on US unpredictability (especially under Trump) undermining trust in the American nuclear umbrella and NATO guarantees.
Some argue Europe should have reduced reliance on US deterrence long ago and is now slowly rearming and investing (e.g., artillery production), but still mostly “talks and doesn’t act.”
UK and French arsenals are acknowledged, but several commenters doubt they are sufficient or politically guaranteed as substitutes for US protection.

View on HN ↗ Original Article ↗

2026-01-26

France Aiming to Replace Zoom, Google Meet, Microsoft Teams, etc.

Project and Technical Approach

France is rolling out “Visio” as part of La Suite Numérique for public-sector video calls, framed as a secure, sovereign tool with guarantees on availability and confidentiality.
The stack is largely open source: Visio is built on LiveKit, the suite uses Django, and code is on a public Git hosting platform.
The suite also includes sovereign replacements for chat (Tchap), file transfer (FranceTransfert), drive, email, docs, and spreadsheets.
Some users report Visio as “fine but below Zoom” (weaker noise cancellation, browser permission friction), others find LiveKit-based solutions easier to run than Jitsi.

Motivations: Sovereignty, Security, and US Dependence

Core driver is reducing dependence on US tech (Zoom, Teams, Google Meet, US clouds) for state and critical infrastructure.
Commenters repeatedly cite the CLOUD Act, sanctions, and recent US behavior (tariffs, NATO rhetoric, Greenland threats, ICC-related actions) as proof the US has used, and will use, an “off switch” on foreign infrastructure.
Many argue this is part of a broader shift: sovereign clouds (OVH, Scaleway, Hetzner, etc.), sovereign messaging (Matrix, Tchap), and even sovereign office suites.

Cloud and Infrastructure Challenges

Several argue replacing conferencing is “easy”; the hard problem is bootstrapping a hyperscale cloud to rival AWS/Azure/GCP, which require massive capital and usually sit inside larger conglomerates.
Others counter that EU providers already offer adequate compute and storage; their main advantages are transparent pricing and less “gotcha” billing, not breadth of managed services.
Hardware dependence (US chips, Chinese manufacturing, Dutch lithography) is seen as a deeper sovereignty bottleneck than videoconferencing software.

Open Source and “Eurostack” Vision

Strong sentiment that EU should aggressively fund open-source basics—video, office, storage, OS—rather than proprietary clones.
Visio/La Suite are praised for being OSS and contributing upstream; people hope multiple governments will co-fund shared tools (Jitsi, Galene, Nextcloud, LibreOffice, Matrix, etc.).
There is frustration that key FOSS apps (especially office suites) still lag commercial products in usability and polish despite decades of work.

Adoption, Network Effects, and Policy Levers

Skeptics doubt large-scale abandonment of Teams/Zoom without compelling superiority; others note governments can bypass “pure market” dynamics via mandates and procurement.
Proposed levers: require sovereign tools for government, regulated industries, and vendors; enforce interoperability standards; potentially tariff or ban non‑EU systems on national‑security grounds.
Some see this as the biggest concrete move yet (outside China/sanctioned states) to unwind US big‑tech dominance—small technically, but symbolically and strategically important.

View on HN ↗ Original Article ↗

2026-01-26

RIP Low-Code 2014-2025

AI vs Low‑Code: Replacement or Merger?

Some argue LLMs make hand‑coded internal apps so fast and cheap that many low‑code tools (Retool, n8n, Budibase, etc.) are no longer worth using. Several posters report already replacing low‑code dashboards and CRUD tools with AI‑generated code.
Others see the opposite: AI and low‑code are complementary. Low‑code’s data models, DSLs, and visual workflows give LLMs a constrained, predictable substrate—LLMs generate/modify flows instead of raw code.
A recurring view: generic “app builder” low‑code may suffer most, while domain‑specific / vertical low‑code and orchestration tools could be strengthened by agents.

Deployment, Maintenance, and Guardrails

Multiple comments push back on “cost of shipping code approaches zero.” Writing code is cheaper; operating, securing, monitoring, upgrading, and auditing are not.
Low‑code platforms still win on: auth/RBAC, compliance, hosting, upgrades, and runtime stability. AI can spin up many internal tools, but who maintains them when APIs change or requirements drift?
Guardrails and predictability are cited as major advantages: you know what a Retool‑style app can and can’t do, whereas LLM‑generated “vibe code” can be opaque and fragile.

Who Benefits: Developers vs Non‑Developers

For professional developers, frameworks (Rails, Django, ABP, etc.) already act as “low‑code” by handling boilerplate; paired with LLMs, custom code can beat low‑code in speed and flexibility.
For non‑developers, low‑code’s visual introspection and WYSIWYG UIs remain key. Several expect future workflows where non‑technical users talk to agents, which then manipulate low‑code platforms under the hood.
A common theme: frictionless deployment (one‑click publish vs learning AWS/npm/bash) is still a major moat for low‑code in the “citizen developer” market.

Historical Context and Lock‑In

Many compare current tools to older low‑code systems: MS Access, Visual Basic, Delphi, PowerBuilder, Oracle Forms. Some praise how quickly those enabled LOB apps; others recall scalability, corruption, and governance nightmares.
There is criticism that modern low‑code often combines the worst of both worlds: proprietary lock‑in, limited extensibility, high per‑seat cost, and poor ecosystems.
Several predict “low‑code as a product category” may shrink, even if the underlying ideas—abstraction, DSLs, visual flows—persist inside AI‑first and open‑source stacks.

View on HN ↗ Original Article ↗

2026-01-26

There is an AI code review bubble

Scope of the “AI code review bubble”

Many commenters agree there is a bubble: “everyone is shipping a code review agent,” often with thin differentiation.
Several see code review as a feature that will be bundled into existing platforms (GitHub, GitLab, IDEs) rather than a standalone product category.
Some argue most “AI code review startups” are just wrappers over the same few frontier models and are easy for model providers or platforms to subsume.

Greptile’s positioning and skepticism

The article’s claims of “independence” (separate review agent from generator) and “autonomy” (fully automated validation) draw strong criticism:
- Models are trained on similar data, so “independence” is seen as mostly illusory.
- If review becomes truly autonomous, many believe it will just be a capability inside coding agents, not a separate product.
Several readers say the post spends more time on philosophy than on concrete differentiation or benchmarks; some call it pure content marketing.

Effectiveness vs linters and humans

Mixed but detailed anecdotes:
- Pro: Tools like Copilot, Bugbot, Claude, CodeRabbit, Unblocked, Cubic, etc. are reported to catch real bugs (race conditions, repeated logic across call boundaries, missing DB indexes, security issues) that linters and static analyzers missed.
- Contra: Others find them “pure noise,” catching trivial or impossible issues, misunderstanding language/library context, or arguing for pointless refactors.
Recurrent theme: signal-to-noise is the central problem. Tools tend to:
- Overproduce speculative or nitpicky comments.
- Miss architectural or business-context issues while focusing on micro-level style or minor inefficiencies.
Some commenters note that good prompting and customization per-codebase can dramatically improve usefulness.

Role and purpose of code review

Many insist review is primarily about:
- Knowledge sharing, architecture, design, and maintainability.
- Spreading understanding of system evolution among teammates.
Several argue: if you’re relying on AI review to “catch bugs,” you’re misusing PRs; tests, linters, and design should handle most defects.
Others counter that AI review is a useful extra safety net, especially for solo devs or small teams, and is better than no review at all.

Autonomy, human-in-the-loop, and culture

Strong pushback against visions of “vanishingly little human participation”:
- Concern that AI-generated and AI-reviewed code leads to large, poorly understood codebases and loss of engineering literacy.
- Emphasis that tests can’t catch everything; humans still needed for fitness-for-purpose, missed requirements, and long-term maintainability.
Some describe desired tools as “assistants” or “wizards” that:
- Highlight areas humans should inspect.
- Minimize verbosity and nits, focusing on high-severity issues.

Economics, integration, and DIY

Several note it’s trivial to:
- Pipe git diff into a frontier model via CLI, GitHub Actions, or custom pipelines.
- Integrate review directly into IDEs or internal tooling using raw APIs.
This leads to questions about what vendors really add beyond:
- Distribution/integration polish.
- Context management (e.g., cross-repo, DB schemas).
- Tuning for lower noise.
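The DIY route described above (piping a diff into a model) can be sketched roughly as follows. This is a minimal illustration, not any vendor's pipeline: `build_review_prompt`, `get_diff`, and `review` are hypothetical names, and `send_to_model` is a placeholder callable standing in for whatever model API or CLI is actually used.

```python
import subprocess
from typing import Callable


def build_review_prompt(diff_text: str, max_chars: int = 20000) -> str:
    """Wrap a diff in review instructions, truncating very large diffs."""
    truncated = diff_text[:max_chars]
    note = "\n[diff truncated]" if len(diff_text) > max_chars else ""
    return (
        "Review the following diff. Report only high-severity issues, "
        "ranked by importance.\n\n" + truncated + note
    )


def get_diff(base: str = "HEAD") -> str:
    """Return the output of `git diff <base>` for the current repository."""
    result = subprocess.run(
        ["git", "diff", base], capture_output=True, text=True, check=True
    )
    return result.stdout


def review(send_to_model: Callable[[str], str], diff_text: str) -> str:
    """Send the review prompt through a caller-supplied model function."""
    return send_to_model(build_review_prompt(diff_text))
```

A caller would pass `get_diff()` as `diff_text` together with its own `send_to_model` function. Everything the sketch omits (cross-repo context, schema awareness, noise tuning) is what the thread suggests vendors would need to add.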

Trust, evaluation, and metrics

Debate over what counts as “evidence” of effectiveness:
- Simple counts of “great catch” replies are criticized as insufficient without false-positive rates or comparisons vs. baselines.
Some propose more rigorous evaluation (ROC-style analysis, controlled comparisons with expert reviewers and linters).
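As a rough illustration of what "ROC-style" evaluation would involve, the sketch below computes a false-positive rate and true-positive rate at several confidence thresholds. It assumes each review comment carries a numeric confidence score and has been labeled by a human as a real issue or not; both the scoring and the labels are assumptions, and the function name is hypothetical.

```python
from typing import List, Tuple


def roc_points(
    scored: List[Tuple[float, bool]], thresholds: List[float]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, false_positive_rate, true_positive_rate) per threshold.

    `scored` holds (confidence, is_real_issue) pairs; a comment counts as
    "flagged" when its confidence is at or above the threshold.
    """
    points = []
    for t in thresholds:
        tp = sum(1 for s, real in scored if s >= t and real)
        fp = sum(1 for s, real in scored if s >= t and not real)
        fn = sum(1 for s, real in scored if s < t and real)
        tn = sum(1 for s, real in scored if s < t and not real)
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        points.append((t, fpr, tpr))
    return points


# Made-up labels purely to show the shape of the output.
example = [(0.9, True), (0.8, False), (0.6, True), (0.3, False)]
curve = roc_points(example, [0.5, 0.85])
```

A count of "great catch" replies corresponds only to the true-positive tally; the false-positive and missed-issue counts are what the critics in the thread say is missing.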

Human vs AI review friction

Several report practical frustrations:
- AI overwriting PR descriptions, arguing with itself, or producing long, vague comments.
- Review fatigue from endless variable-name suggestions and hypothetical edge cases.
Others say they now treat AI review like a powerful linter:
- Run on-demand, skim top-ranked issues, ignore the rest.
- Never a replacement, only a complement to human review and tests.

View on HN ↗ Original Article ↗

2026-01-26

Qwen3-Max-Thinking

Capabilities and Benchmarks

Qwen3-Max-Thinking is seen as competitive with frontier models but not clearly ahead of Claude Opus 4.5 or GPT‑5.2, especially in agentic coding, where Opus still leads on SWE-bench Verified.
In the shared benchmark table, Qwen shines in:
- Instruction following / alignment (especially ArenaHard v2)
- Agentic search (HLE with tools)
It lags or is middling in:
- Agentic coding (SWE-bench Verified)
- Several tool-use benchmarks (Tau², BFCL, Vita, Deep Planning).
Some argue benchmarks are increasingly detached from day‑to‑day usefulness; others still treat them as a valuable but incomplete signal.

Open vs Closed & Local Deployment

Qwen “Max” models remain closed-weight; access is via Alibaba’s API only, which many see as a dealbreaker versus open-weight GLM/Minimax/DeepSeek.
Several users confirm there is still no open-weight model that matches top-tier hosted coders on a consumer machine (e.g., M3 Pro with 18GB RAM).
Best current local options mentioned: Qwen3‑coder 30B, GLM‑4.7 Flash, some quantized variants on high‑VRAM GPUs—good but clearly below Codex/Opus/GPT in quality and speed.

Pricing and Market Dynamics

Qwen/Alibaba pricing is unclear; no obvious subscription comparable to Anthropic/OpenAI.
Within mainland China, Alibaba’s models are significantly cheaper; commenters attribute this to:
- Domestic price wars
- Lower local cost structures
- Direct government subsidies and “compute vouchers.”
Some complain Alibaba Cloud onboarding and billing (especially for reasoning tokens) make margin modeling hard.

Chinese vs Western AI Development

Several posts repeat the claim that Chinese frontier models trail US models by ~6–9 months.
A common narrative: Chinese labs heavily distill and SFT on outputs from US models due to compute constraints—keeping them close but not leading.
Others note that “capabilities are spiky”: with different RL focus, Chinese models could become best-in-class on specific tasks even if worse overall.
Debate over China’s long‑term compute advantage (energy capacity vs lagging GPU/CPU ecosystem) remains unresolved.

Censorship, Safety, and Trust

Qwen3-Max on Alibaba’s chat site refuses to answer questions about Tiananmen, Taiwan’s status, Xinjiang, etc., with “content security” errors; similar filtering appears in some open-weight Qwen variants’ thought traces.
Some see this as disqualifying for factual or research use; others shrug because they only care about coding.
Many draw parallels to Western models’ guardrails (drugs, hate speech, Gaza/Israel, certain individuals like a defamed law professor) and a US executive order on “woke AI.”
There is extended argument over whether government-mandated censorship (China) is categorically worse than corporate/soft censorship (US/EU), with no consensus.

Reasoning, Token Economics, and AGI

Qwen3-Max-Thinking explicitly exposes “thought” steps and is significantly slower; users speculate it consumes many more tokens per query.
Several point out that “better reasoning” is often just “spending more tokens,” i.e., an economic tradeoff rather than a pure architectural gain.
Concern: opaque, auto‑decided “thinking time” destroys predictable unit economics; others note newer APIs let you cap thinking effort.
Discussion on AGI: if strong reasoning requires huge per‑query compute, even a breakthrough model might be bottlenecked by inference capacity.
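The unit-economics worry can be illustrated with simple arithmetic. The sketch below assumes thinking tokens are billed at the output-token rate and that a cap, when set, simply truncates thinking; both are assumptions for illustration, the prices in the example are invented, and `query_cost` is a hypothetical helper rather than any provider's API.

```python
from typing import Optional


def query_cost(
    input_tokens: int,
    thinking_tokens: int,
    answer_tokens: int,
    price_in_per_million: float,
    price_out_per_million: float,
    thinking_cap: Optional[int] = None,
) -> float:
    """Cost of one query, assuming thinking tokens bill at the output rate."""
    if thinking_cap is not None:
        thinking_tokens = min(thinking_tokens, thinking_cap)
    billed_output = thinking_tokens + answer_tokens
    return (
        input_tokens * price_in_per_million
        + billed_output * price_out_per_million
    ) / 1_000_000


# Invented prices: same query with uncapped vs. capped thinking.
uncapped = query_cost(1000, 9000, 1000, 1.0, 10.0)
capped = query_cost(1000, 9000, 1000, 1.0, 10.0, thinking_cap=1000)
```

With an uncapped, model-decided thinking length, the dominant term in the cost is not known before the call is made, which is the predictability problem commenters describe.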

Search, Data, and the Chinese Internet

Qwen’s strong performance on tool‑augmented/“with search” benchmarks prompts speculation that Chinese web content or search infrastructure could be higher‑quality for certain tasks.
Others argue a simpler explanation: better retrieval and tool orchestration, not a fundamentally “better internet.”
Users dissatisfied with Western deep‑research features say they often surface low‑quality, repetitive web content; some prefer academic‑only search filters.

Developer Experience & Anecdotes

One user reports Qwen3‑coder significantly outperforming prior Gemini and Claude versions on complex Rust refactors (shared memory, SIMD) but at high Alibaba API cost due to large contexts.
Others find Qwen3-Max-Thinking slow and possibly overloaded at launch.
There is ongoing skepticism about “benchmaxxing” vs real‑world coding performance, but also clear enthusiasm for Qwen/GLM/Minimax as serious, closing‑gap alternatives to US incumbents.

View on HN ↗ Original Article ↗

2026-01-26

Windows 11's Patch Tuesday nightmare gets worse

Role of Windows in Microsoft’s Strategy

Debate over whether Windows is still a “main product” vs just a delivery platform for subscriptions (M365, OneDrive, Azure, Intune, etc.).
Several argue Windows remains the moat: without it, Office/AD/Exchange/Teams and cloud management offerings lose a key advantage.
Others counter that Windows now contributes a relatively small share of revenue, explaining neglect and focus on higher-margin services.

Monopoly, Switching Costs, and Competition

Many claim Microsoft can ship low-quality updates because business switching costs (AD, legacy apps, training) are huge.
Counterpoint: competition is stronger than a decade ago (Apple share, Linux preinstalls, Steam Deck, browser-centric workflows), and switching costs are falling as more work moves to the web.
Some see governments/companies periodically exploring Linux, though reversals (e.g., Munich) are cited.

Quality, QA, and Organizational Culture

Widespread belief that cutting dedicated QA (and relying on devs + telemetry) is central to recurring catastrophic updates.
Mention of past major breakages (boot loops, BSODs) to argue this is a long-running pattern, not just a recent regression.
Discussion of historic Dev:QA ratios (often ~1:1 or even 1:2 QA-heavy) and the importance of manual, device-coverage-heavy testing for an ecosystem as large as Windows.
Some frame this as a broader “MBA-led, short-term profit, cut-costs” culture shift, similar to other large corporations.

AI, “Vibe Coding,” and Productivity Claims

Many sarcastically connect the update failures to aggressive internal AI mandates and Copilot promotion, dubbing current practice “vibe coding.”
Skepticism that LLM-assisted coding has delivered real 10x productivity: if it had, visible quality and velocity should be higher, not worse.
Others argue LLMs mainly amplify existing skill (help good devs a bit, make low-skill output harder to debug) and that root problems precede AI.

User Experiences and OneDrive/Update Pain

Multiple anecdotes of systems rendered unbootable (e.g., inaccessible boot device, Win11 VM unable to roll back, new ARM machine DOA).
Repeated complaints about Windows–OneDrive integration: slow Explorer, deleted files, broken app data, inability to move Desktop out of OneDrive.
Some users report never seeing such issues, suggesting hardware/software combinations and update paths matter heavily.

Auto-Updates, Trust, and Security

Strong resentment of forced updates that can brick machines; calls to treat Windows updates like a “virus” and disable them via group policy.
Others warn that not patching creates security risk and can make unpatched users a threat to others (malware hosts).
Several note that every high-profile failure erodes trust and pushes more people to completely disable updates.

Windows 11 Itself: Best Yet or Buggier 10?

A minority calls Windows 11 the best OS they’ve used (especially on ARM: standby, docking, multitasking improvements, PowerToys, Excel).
Majority sentiment in the thread is negative: reports of Explorer regressions, UX annoyances (taskbar/start changes), random inoperable states, and more friction than Windows 10.
Some have rolled back to Windows 10 (often LTSC) or moved to Linux, citing greatly reduced frustration.

Proposed Remedies and Testing Expectations

Suggestions: return to slower, service-pack-style releases and longer version cycles; stop bundling features into security updates.
Expectation that Microsoft should use massive VM matrices plus limited real-hardware coverage and ultra-gradual rollouts (tiny initial cohorts, close monitoring).
Concern that, absent a serious reset of quality priorities, Windows will continue to suffer “death by a thousand cuts,” even if monopoly momentum keeps it dominant for years.
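
The "tiny initial cohorts" idea above can be sketched as deterministic, hash-based bucketing: each device maps to a stable bucket, and an update is offered only to buckets below the current stage threshold. This is a hypothetical illustration, not a description of how Microsoft actually stages Windows updates. The names `rollout_bucket` and `is_eligible`, the stage percentages, and the update ID are all assumptions made for the example.

```python
import hashlib

BUCKETS = 10_000  # bucket granularity: one bucket = 0.01% of devices

# Hypothetical stage sizes, in percent of the device population.
STAGES_PERCENT = [0.1, 1.0, 10.0, 50.0, 100.0]


def rollout_bucket(device_id: str, update_id: str) -> int:
    """Map a device to a stable bucket in [0, BUCKETS) for a given update.

    Mixing in update_id means a device that lands in the earliest cohort
    for one update is not automatically in the earliest cohort for the next.
    """
    digest = hashlib.sha256(f"{update_id}:{device_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % BUCKETS


def is_eligible(device_id: str, update_id: str, percent: float) -> bool:
    """Return True if the device falls inside the first `percent` of buckets."""
    threshold = round(percent / 100 * BUCKETS)
    return rollout_bucket(device_id, update_id) < threshold


if __name__ == "__main__":
    update = "example-update-001"  # hypothetical update ID
    devices = [f"device-{i}" for i in range(100_000)]
    for pct in STAGES_PERCENT:
        offered = sum(is_eligible(d, update, pct) for d in devices)
        print(f"{pct:>5}% stage: offered to {offered} devices")
```

Because the threshold only grows from stage to stage, a device offered the update at 0.1% stays eligible at every later stage, so widening the rollout never withdraws the update from anyone. Monitoring between stages (crash rates, boot failures) would gate each increase; that part is left out of the sketch.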

View on HN ↗ Original Article ↗

2026-01-26

Television is 100 years old today

Origins and “Who Invented Television?”

Commenters argue TV was an accretion of many inventions rather than a single “Eureka” moment.
Mechanical systems (Nipkow disks, Baird-style electro‑mechanical rigs) are contrasted with all‑electronic CRT systems (Farnsworth, Zworykin, Japanese and German pioneers).
Disagreement over credit: some see early mechanical demos as “real TV,” others focus on electronic rasterization and CRT-based systems as the true ancestors of modern television.
Several point out that what mattered was assembling a complete, interoperable system and securing standardization and industry backing.

Technical Evolution: Standards, Color, and “HD”

Early “high definition” in the 1930s–40s meant jumping from ~30 to a few hundred lines; an 819‑line analog system and various Japanese experiments are cited as proto‑HD.
The adoption of color in the U.S. forced a frame-rate shift from 30 to 29.97 fps (30 × 1000/1001) to avoid interference between the color subcarrier and the audio carrier, leading to enduring complexity (drop‑frame timecode, 59.94 Hz field rates).
PAL/SECAM are described as higher line-count but more flickery; they also introduced clever tricks like delay lines and phase alternation.
Vestigial sideband modulation is highlighted as a key bandwidth optimization step that arrived after the very first systems.
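
To make the 29.97 fps point concrete, here is a minimal sketch of drop-frame timecode. Labels counted at a nominal 30 fps drift behind a real clock when frames actually arrive at 30000/1001 fps, so drop-frame skips frame numbers 00 and 01 at the start of every minute except each tenth minute. No picture frames are discarded, only labels. The function names are assumptions for illustration, and the drop pattern is the standard NTSC convention rather than something spelled out in the thread.

```python
from fractions import Fraction

NTSC_FPS = Fraction(30000, 1001)  # exact "29.97" frame rate
NOMINAL_FPS = 30                  # rate the timecode labels pretend to run at
DROP = 2                          # frame numbers skipped per dropped minute

FRAMES_PER_MIN = NOMINAL_FPS * 60 - DROP           # 1798 labels in a dropped minute
FRAMES_PER_10MIN = NOMINAL_FPS * 600 - 9 * DROP    # 17982 labels per 10 minutes


def frames_to_dropframe(frame_number: int) -> str:
    """Convert a zero-based frame count to an HH:MM:SS;FF drop-frame label."""
    tens, rem = divmod(frame_number, FRAMES_PER_10MIN)
    # Add back the labels skipped so far: 9 dropped minutes per full
    # 10-minute block, plus dropped minutes inside the current block.
    skipped = DROP * 9 * tens
    if rem > DROP:
        skipped += DROP * ((rem - DROP) // FRAMES_PER_MIN)
    n = frame_number + skipped
    ff = n % NOMINAL_FPS
    ss = (n // NOMINAL_FPS) % 60
    mm = (n // (NOMINAL_FPS * 60)) % 60
    hh = n // (NOMINAL_FPS * 3600)
    return f"{hh:02d}:{mm:02d}:{ss:02d};{ff:02d}"


def real_seconds(frame_number: int) -> Fraction:
    """Wall-clock duration of `frame_number` frames at the exact NTSC rate."""
    return frame_number / NTSC_FPS


if __name__ == "__main__":
    for n in (1799, 1800, 17982, 107892):
        print(n, frames_to_dropframe(n), float(real_seconds(n)))
```

Frame 1799 is labeled 00:00:59;29 and the next frame jumps to 00:01:00;02. After 107892 frames the label reads 01:00:00;00 while about 3599.996 real seconds have elapsed, so the label stays within a few milliseconds of the clock over an hour. Counting at a flat 30 labels per second would instead fall about 3.6 seconds behind.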

CRT Technology: Danger, Ingenuity, and Nostalgia

CRTs are praised as peak analog/“cassette futurism” tech: synchronous, continuous beams, no frame buffer, images existing only in phosphor decay and human persistence of vision.
Commenters recount hazards: implosions, electron-gun neck failures, charged capacitors, and early color sets emitting problematic X‑rays.
Historical uses of CRTs as computer memory (Williams tubes) and as delay elements in analog systems fascinate many.
Some still use CRTs (including oscilloscopes and high‑end sets) and admire their motion clarity, despite bulk, lead content, and obsolescence.

Cultural and Psychological Effects

Several cite media theorists arguing TV as a medium favours spectacle and decontextualized “now this” transitions, hindering deep reflection.
Others extend these critiques to 24/7 cable news and social media, seeing “manufactured outrage,” parasocial relationships, and the erosion of civic life.
A counterview notes that education and serious content can be made engaging (classic documentary and science shows), and that the real issue is selection and incentives, not the technology alone.

From Shared Broadcasts to Fragmented Streaming

Nostalgic accounts describe families organizing their week around a few flagship shows and nightly news, creating strong shared cultural references and a “common reality.”
Today’s on‑demand, individualized streaming is seen as reducing that shared experience; conversations become harder when everyone watches different things on different schedules.
Some welcome the decline of mass‑broadcast gatekeepers and point out that “shared culture” once excluded those without TVs; others mourn the loss of broad, cross‑cutting experiences.

Personal Memories and Historical Perspective

Multiple stories: first TVs seen in shop windows in the 1940s–50s, early cross‑border reception, and family gatherings around single sets.
Others set TV’s 100‑year history beside home movies, the telegraph, cars, and phones, all of which already existed a century ago, underscoring how compressed modern technological change is.

Technology Trajectories and Energy Debates

One thread contrasts extraordinary 20th‑century progress (TV, space travel, internet, smartphones) with concerns about energy limits, climate change, and possible future technological decline.
Another responds that regress is more likely to vary by country and policy, not necessarily a uniform global collapse.

View on HN ↗ Original Article ↗

2026-01-26

Google AI Overviews cite YouTube more than any medical site for health queries

Study, framing, and methodology

Several commenters see the Guardian headline as misleading “clickbait.”
YouTube is a hosting platform; grouping all “youtube.com” citations together ignores whether the actual publisher is a hospital, clinic, or individual influencer.
The underlying study is by an SEO company, focuses on domains rather than content quality, and uses German-language queries, which may skew which reputable English sources appear.
Commenters suspect that multiple medical sites, aggregated together, would likely exceed YouTube’s share.

Self‑preferencing and incentives

Many say it is unsurprising that Google products (AI Overviews) amplify another Google product (YouTube); neutrality was never realistic.
Some view this as a straightforward conflict of interest and an antitrust signal: citations and UI are being steered toward what makes Google more money (video, ads, engagement).

YouTube as a medical source

Defenders note that many reputable institutions and physicians publish on YouTube; video can be an excellent teaching medium, especially for procedures.
Critics counter that ordinary users cannot easily distinguish expert channels from quacks, conspiracists, and “miracle cure” peddlers, and that video is especially persuasive even when wrong.
There’s concern about social-media-driven self‑diagnosis (e.g., ADHD/autism, alternative treatments) and medical influencers explicitly positioning themselves against mainstream doctors.

Quality of AI Overviews / Gemini

Repeated reports of AI Overviews being confidently wrong, fabricating capabilities (“how to” answers for things you simply can’t do), and never saying “I don’t know.”
Some say Gemini/Overviews use cheaper, weaker models to keep costs down at Google scale.
A few users report good experiences (e.g., surprisingly accurate cancer‑progression expectations from labs), but this is framed as reflecting doctors’ reluctance to give concrete timelines rather than as proof of medical reliability.

AI‑generated content and feedback loops

Strong worry about Gemini citing AI‑generated YouTube videos: an “ouroboros” of models training on and citing each other’s slop.
Commenters mention propaganda, conspiracy content, and deliberate attempts to game rankings (e.g., genocide denial, far‑right narratives) and ask how hard it would be to steer LLM outputs by mass‑producing targeted content.
The concept of “citogenesis” (false claims gaining legitimacy via repeated citation) is raised as a systemic risk.

Broader search and web concerns

Many feel Google search quality is declining, with AI Overviews and YouTube pushed ahead of cleaner text pages despite a dedicated “video” tab.
Some argue big tech is turning the public web into a privatized, engagement‑optimized layer where reliable knowledge, especially in medicine, is hard to distinguish from monetized noise.

View on HN ↗ Original Article ↗

Hacker News, Distilled

Related topics
