Stories - Page 554 | HN Distilled

2025-01-29

From C++ to Clojure: Jank language promises best of both

Motivation and Advantages over JVM Clojure

Main value props vs Clojure+GraalVM:
- Native binaries without JVM, while retaining REPL/JIT–driven interactive development (which native-image largely loses).
- Tight interop with C++, on par with Clojure’s interop with the JVM and including templates and RTTI, presented as unprecedented in the native space.
- Lighter runtime than the JVM and easier access to native libraries and graphical toolkits.
Several Clojure fans are excited to get Clojure semantics without “JVM madness” (startup, footprint, Java ecosystem overhead).

C++ / Native Interop and Game Development

Strong interest from game devs:
- Using Jank like Lua (embedded into engines) but with full C++ interop and real Lisp macros.
- Potential fit with Unreal and native-style architectures; could serve as a “glue” layer that gradually takes over.
Comparisons/alternatives:
- Clasp (Common Lisp on LLVM with C++ interop), though limited Windows support is seen as a drawback.
- Fennel (Lisp-on-Lua) and Janet (small Clojure-like Lisp) mentioned as lighter game‑scripting options.
- Jank is framed as “actually Clojure,” unlike Clojure‑inspired languages such as Carp.

Tooling, Editors, and On‑Ramps

Strong plea to invest early in tooling and non‑Emacs workflows (VS Code/Calva, Vim/Conjure, IntelliJ/Cursive) so C++ devs aren’t blocked by editor culture.
Long subthread on Emacs vs other editors:
- Consensus that REPL‑driven development is the key Lisp experience, not Emacs per se.
- Emacs is seen as historically ahead because it’s easy to extend in Lisp and has a culture of deep REPL integration; others could match this but often don’t.
- Discussion of LSP vs nREPL, clj‑kondo vs deeper static analysis, and how dynamic Lisps limit full static tooling.

Language Design: Ownership, Immutability, Concurrency

One Rust‑oriented commenter argues that any new native Lisp should have ownership (Rust‑style) for reasoning about where values “are,” comparing it to the leap from dynamic to lexical scoping.
Others counter that Clojure’s immutability and STM already provide strong guarantees; a detailed exchange explores:
- How ownership models give syntactic guarantees about moves/borrows and can subsume some STM use cases.
- How persistent immutable data and ownership can complement each other for performance (e.g., automatic transientification).
No consensus; Jank is currently described as staying close to Clojure’s model.

GC, Performance, and Runtime

Jank is garbage‑collected; currently uses Boehm, with plans to move to a more competitive GC (MMTK + Immix mentioned).
Some worry about losing the JVM’s highly tuned GC; others note that native plus modern GC may be “good enough,” especially for games and GUIs.
Perception of JVM as “slow and bloated” is challenged by others who point out its performance advantages over many alternatives.

Platforms, Ecosystem, and Status

Mobile/iOS use case: author states Jank should be embeddable into Swift/iOS apps, with REPL for dev and AOT‑only for release (due to Apple’s JIT restrictions), thanks to LLVM.
Current ecosystem is minimal; libraries are not really there yet and the language doesn’t fully implement Clojure.
Interest in using Jank for native GUI apps, data‑heavy/scientific code (by analogy with Clasp), and scripting in existing C++ codebases.
“Seamless C++ interop” is repeatedly highlighted but details are deferred; even sympathetic readers flag it as currently “unprecedented” and want more technical explanation.

Overall Sentiment

Strong enthusiasm from Clojure users who dislike the JVM and from native/game developers intrigued by tight C++/Lisp integration.
Healthy skepticism around:
- Complexity and promises of truly seamless C++ interop.
- Lack of ownership model for some Rust‑aligned developers.
- Tooling maturity and the need for good first‑time experience.
Many commenters say they’ll “keep an eye on it” and test it once a usable release and tooling appear.

View on HN ↗ Original Article ↗

2025-01-29

Exposed DeepSeek database leaking sensitive information, including chat history

Breach severity & logging practices

Commenters are struck that a “production-grade” database for a #1 app-store service was internet‑exposed with no auth, full SQL control, and plaintext logs including chat history, keys, and backend details.
Some see plaintext logs as sadly normal (except for passwords); others argue that at this scale, lack of encryption and access control is inexcusable and undermines trust.
People note this wasn’t just chat content: observability data (OpenTelemetry spans) with prompts, completions, and metadata were exposed.

DeepSeek’s maturity, “side project” narrative & funding

One camp says the incident reinforces the “side project of quants” story: impressive models, but weak experience running public, secure services.
Another pushes back hard: DeepSeek reportedly has ~130 ML staff, very large GPU fleets, and costs far beyond the advertised $5.5M training run, arguing this is a serious, well‑funded lab, not evening hobbyists.
Several distinguish: DeepSeek may be a “pet project” of its parent hedge fund founder, but the ML team is full‑time; the weakness is security/infra, not ML.

Security culture, infra mistakes & ClickHouse specifics

Many emphasize that even experienced companies (auto makers, big tech) have made similar mistakes with open databases and plain logs; this doesn’t require a “side project” explanation.
Others insist any team exposing a raw DB to the public internet without auth shows a basic ops failure.
ClickHouse contributors explain defaults: local‑only access, IP filtering, and non‑SQL “default” user; DeepSeek would have had to override several safeguards. Misconfig via Docker/Kubernetes or copied configs is suspected.

Responsible disclosure & legality

Initial comments accuse Wiz of irresponsible disclosure by publishing host/port details; later replies point out that, per the article itself, the issue was first disclosed privately, fixed, and only then published.
Some raise CFAA‑style legal concerns about probing systems without explicit permission; others cite updated DOJ policy that protects good‑faith security research.

Geopolitics, propaganda, and market impact

Some see the write‑up as part of a broader campaign by Western incumbents to tarnish a disruptive Chinese competitor; others say any rapidly popular app would draw intense scrutiny.
There’s discussion of NVIDIA’s stock drop and whether DeepSeek’s efficiency meaningfully changes GPU demand; opinions diverge between “overreaction/FUD” and “evidence of AI‑hardware bubble stress.”
Several warn about CCP data access and censorship, while others argue US tech firms and governments already collect comparable data, so moral high ground is limited.

User trust, privacy, and self‑hosting

Multiple commenters treat this as a strong argument for local models, self‑hosting, or at least never sending sensitive data (secrets, configs, personal info) to public LLM APIs.
Password reuse on such services is cautioned against; password managers and unique credentials are recommended.

2025-01-29

Soviet Shoe Factory Principle

Measurement, Goodhart’s Law, and KPIs

Core idea: “What gets measured gets done” – but once a metric becomes a target, it’s easily gamed (Goodhart’s Law / “measurement dysfunction”).
Commenters note US education, enterprise software, and internal corporate KPIs as examples where measured quantities improve while actual quality worsens.
One anecdote: using GitHub commit counts as a performance metric led to more commits, worse code, angry customers, and eventually lost business.
Several argue that designing truly ungameable metrics may be impossible; proxies always miss nuance.

Capitalism, Theranos, and ‘Soviet’ Corporations

One analogy: Theranos as a capitalist version of the Soviet shoe factory—chasing investor metrics and hype instead of real products.
Pushback: others say Theranos was straightforward fraud, and “capitalism killed it” once the fraud was exposed.
Debate over whether early Theranos was naïve optimism or fraud from the start.
Many see large corporations as Soviet-like: rigid hierarchies, internal propaganda, distorted metrics, and political infighting. Counterpoint: unlike states, corporations must eventually make payroll or die (except when bailed out).

Markets, Consolidation, and Free-Market Limits

Some argue “thousands of competing Soviets” (firms) keep capitalism healthier than central planning.
Others question whether the current system is “healthy,” citing inequality, housing costs, and corporate concentration.
Long thread on Sears/Kmart: was their collapse a success of market discipline or a failure where PE looting and consolidation hurt competition and communities?
Broader concern: capitalism tends toward monopoly/oligopoly; antitrust and regulation are seen as necessary correctives.

Metric Gaming in Tech and Products

Smartphone cameras: manufacturers optimize for spec-sheet metrics (megapixels, “sensor size”) rather than real image quality; AI upscaling and fake moon photos are cited.
ML benchmarks: once famous, they get overfit and stop reflecting real capability.
VO2 max and other fitness metrics mentioned as similar failure modes.

How to Respond to Metric Failure

Suggestions:
- Spend more effort designing better measures, but accept they’ll still be partial.
- Align incentives and shared fate so metrics are advisory, not the whole game.
- Use metrics to inform, not as rigid targets, and prioritize human judgment and integrity over “data-driven” theater.

2025-01-29

Waymo to test its autonomous driving technology in over 10 new cities

Geographic expansion & winter testing

Commenters note Waymo’s focus on varied climates: Truckee/Tahoe, Michigan’s Upper Peninsula, upstate New York (including very snowy cities), and now Tokyo, San Diego, Las Vegas.
Discussion centers on snow/ice complexity: rapidly changing conditions, freezing rain, poor lane markings, and ambiguous pathfinding are seen as harder than “just snow.”
Some suggest additional sites like Jackson Hole or Calgary as good, affluent, constrained testbeds, but differences in climate (persistent ice vs intermittent snow) matter.

Economics and unit profitability

Heavy debate around whether Waymo has positive “unit economics.”
One side argues individual trips could already be profitable if you ignore historic R&D and city mapping, focusing only on operating costs per ride.
Others with robotics/AV experience counter that mapping, high‑bandwidth data collection, remote operations, expensive connectivity, sensors, and specialized maintenance are all ongoing operating costs, making current per‑ride economics “very bad.”
There is disagreement over how much of Alphabet’s “Other Bets” loss is really Waymo.

Use cases: taxis, vans, and private ownership

Many expect Waymo to focus on robotaxis in dense, profitable areas rather than universal self‑driving or rural coverage.
Some are interested in AV minibuses or wheelchair‑capable vans, but others point out human driver costs are spread over many riders and wheelchair handling adds complexity and risk.
A subthread explores private ownership of self‑driving cars versus fleet models: some see no appeal beyond a taxi service, others imagine highly personalized “pods” or “coaches” with fleet‑owned drive units (“horses”).

Safety, driving behavior, and user experience

Mixed reports on pedestrian/pet detection: some claim “many close calls”; others say Waymo consistently stops with ample space.
Riders generally praise calm, predictable, “defensive” driving and cleanliness versus human ride‑hail, but complain about:
- Over‑cautious behavior (e.g., unprotected left turns causing long delays).
- Slower trips vs aggressive human drivers, though perceived as safer and more relaxing.

Rollout strategy, competition, and politics

Many believe Waymo will quietly expand geofenced “beta” areas until it dominates profitable taxi markets, similar to Uber’s early city‑by‑city rollout.
Some think current pricing is kept near or above Uber/Lyft to avoid political blowback from undercutting human drivers; long‑term, lower labor costs and higher utilization are expected to tilt economics in Waymo’s favor.
Others argue AV fleets remain capital‑intensive (Waymo pays for cars; taxis often don’t) and that profitability is far off.
Debate on competition: Waymo seen as a leader in deployed robotaxis, but Amazon’s Zoox, Tesla, Chinese AV firms, and potential Uber partnerships with multiple autonomy providers are all viewed as credible threats.
A few see large tech firms (including Alphabet) using deep pockets, acquisitions, and regulatory relationships to entrench monopolies/monopsonies in AV and AI.

Cities, transit, and public reaction

Tokyo is seen as an interesting choice: dense, low‑crime, but not necessarily “high trust.”
Some predict Waymo will significantly reduce Uber/Lyft demand in a few years; others expect a long period where AVs cover only select geofenced zones.
NYC prompts sharp reactions: some want congestion pricing specifically for AVs; others look forward to competition with subways on comfort and convenience, while defenders argue subways already dominate on volume and cost.
Concerns are raised about AVs causing gridlock when their software fails in complex urban situations, vs concerns about crime and discomfort on public transit.

Unclear details

The thread notes that the article mentions “10+ cities” but does not list them all or clearly identify the primary source; commenters infer some cities from Waymo’s marketing site but agree the exact list remains unspecified.

2025-01-29

Adding iodine to salt played a role in cognitive improvements: research (2013)

Role of Iodine and Public Health Impact

Commenters highlight iodine deficiency as a major global cause of preventable cognitive impairment, especially in pregnancy, where iodine needs rise significantly.
Iodine’s role is discussed mechanistically via thyroid hormones (T3/T4) guiding fetal brain development and neuronal migration; deficiencies can disrupt brain-cell placement and long-term cognition.
The iodization of salt is framed as a textbook public-health success, with an impact comparable in kind to that of sewers, vaccines, and, for some, fluoridated water and shoes.

Sources of Iodine and Changing Salt Use

Many note that modern diets get iodine from seafood, dairy, and eggs, but levels depend heavily on soil/feed iodine.
In the US, table salt is often iodized, but kosher, flaky, and “natural sea salt” typically are not. Fast-food chains and home cooks increasingly use non-iodized flake/kosher salt, raising concern about reduced iodine intake.
Some struggle to find iodized flaky salt; others suggest simple iodine or multivitamin supplements, debating appropriate dosages and heavy-metal concerns in kelp-based products.
There is mention that iodine content varies by brand, dissipates from open containers, and is partially lost during cooking.

International and Cultural Differences

Several European countries (e.g., Norway, Germany, Finland) are cited as having inadequate iodine intake due to low iodized-salt coverage and declining milk/seafood consumption.
Germany is contested: some recall no iodization; others cite data that ~70–80% of household salt sold is iodized (often also fluoridated).
Japan and Korea are noted for high seaweed consumption; Japan reportedly bans iodine as a salt additive, with some subgroups experiencing excess-iodine thyroid issues. Korea’s seaweed soup is described as a de facto iodine source.

Sodium, Salt Design, and Health

One line of discussion links high sodium intake to cardiovascular disease, arguing salt is both risk factor and useful iodine carrier.
Others claim evidence against sodium is weak or confounded by overall diet and suggest that reformulations targeting “saltier-tasting” crystals may just shift hyperpalatability toward sugar and seed oils.
There is side discussion about physiological mechanisms (osmolality, spikes vs totals) and the difficulty of conveying this nuance to the public.

Fluoridation, IQ, and Tradeoffs

Fluoridated water is compared with iodized salt as a mass intervention, provoking more disagreement.
Some cite recent meta-analyses and NTP reviews suggesting higher fluoride exposure (>1.5 mg/L) is associated with small IQ reductions in children, and argue the benefit-to-risk ratio is poor given availability of topical fluoride (toothpaste, mouthwash).
Others counter that typical US levels (~0.7 mg/L) are below the thresholds where effects have been observed, that evidence at these levels is “insufficient,” and that much fluoride intake comes from food anyway.
A few invoke fluorine chemistry to argue there is no plausible mechanism for cognitive effects at trace levels; critics respond that lack of a known mechanism should not be over-weighted against suggestive epidemiology.
Some participants favor fluoridation for population-level dental benefits, especially for poorer communities; others oppose any involuntary exposure, preferring targeted toothpaste programs.

Chemistry and Conceptual Clarifications

Several comments correct or refine the article’s phrasing that iodine is “something our bodies can’t synthesize”:
- Iodine, as an element, cannot be synthesized by any organism; it is created in supernovae via nuclear processes.
- Discussion distinguishes elements vs compounds and notes that many inorganic species (e.g., water, bicarbonate) are biologically synthesized, but not elements themselves.
There is meta-debate about whether the article’s wording is misleading (by analogy with essential amino acids) or simply context-appropriate shorthand.
Related tangents cover definitions of “organic” compounds, notation for chemical species (TeX, Unicode, InChI), and the limits of pedantry in popular science writing.

Other Related Public-Health and Environmental Factors

Hookworm eradication in the American South is cited as another example where a simple intervention (deworming) measurably improved school attendance, cognition, and regional economic outcomes; one commenter notes hookworm’s partial resurgence.
Bromide exposure is raised as an underappreciated factor: in the US, potassium bromate historically replaced potassium iodate in bread. Commenters say bromide can displace iodide in the thyroid and that bromate may increase cancer risk; California has banned bromate.
Some note anecdotal reports of better tolerance of European bread (possibly linked to absence of bromated flour), while acknowledging potential confounders.

2025-01-29

No Man's Sky's update introduces billions of new stars, planets, and more

Planet Count vs. Actual Variety

Many argue the “billions/trillions of planets” marketing remains meaningless: after dozens of worlds everything feels samey.
The real improvement is seen as new biomes, terrain generators, and visual variety; some wish they’d cut planet count drastically and focus on richer worlds.
Several would prefer tens or hundreds of semi‑handcrafted planets over a near-infinite number of lightly varied procedural ones.

Gameplay Loop, Depth, and Sandbox Design

Fans describe the current game as a solid ambient exploration sandbox: relaxing travel, survival-to-self-sufficiency progression, base building, events/expeditions, and derelict freighters, with VR and multiplayer singled out as especially strong.
Critics call it a grindy “walking simulator” or “the chores half of a game”: collect arbitrary resources to build gear that just lets you collect more resources, with little sense of purpose, opposition, or impactful story.
Procedural content is widely seen as technically impressive but emotionally flat: same few factions, stations, structures and encounters everywhere, sentinels on nearly all planets, little sense of true discovery or history.
Comparisons to Minecraft, Terraria, Factorio, Elite Dangerous, and Mass Effect highlight a perceived lack of narrative, challenge, or long-term goals.

Technical Quality and Platform Performance

Some report huge stability improvements vs. launch and generally smooth play, praising the engine as a technical marvel.
Others say later updates introduced more bugs and crashes, and that PS4/Xbox One performance has degraded badly (low FPS, severe pop‑in, long terrain generation).
Switch performance is described as surprisingly good, likely due to more aggressive optimization, though visuals are obviously reduced.
Base building is frequently criticized as glitchy, awkward, and underpowered relative to its potential.

Launch Controversy, Trust, and “Redemption”

A strong contingent sees No Man’s Sky as the definitive comeback story: years of substantial, free updates, no paid DLC, and sustained support well beyond expectations.
Others think the “redemption arc” is overhyped: they feel most added systems are shallow, the core problems (meaningful discovery, depth) remain, and the game still doesn’t match pre‑launch promises.
There is ongoing debate over whether early marketing crossed into intentional deception, whether sufficient apology was ever made, and whether it’s reasonable to still distrust or “hold a grudge” nine years later.

Economics and Future Tech

Commenters wonder how free updates remain viable; answers cite new waves of purchases, presence on Game Pass/PS+, and steady sales charts.
Some view ongoing work as investment in a procedural engine for future titles (e.g., a fantasy game on an Earth‑scale world), with No Man’s Sky as both product and R&D platform.

2025-01-29

"We're building a new static type checker for Python"

Excitement and Project Goals

Many commenters are enthusiastic, based on prior experience with Ruff and uv, and expect similar “step change” improvements in speed and UX.
The checker is currently codenamed “red_knot” and is being developed in the open within the Ruff repo; it’s not ready for general use yet.
Stated goals (from linked threads): fast, highly incremental static analysis suitable both for CI batch runs and IDE/LSP-style interactive checking, likely using Rust and salsa for incremental computation.
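The “highly incremental” goal can be illustrated with a minimal sketch, assuming a simple content-keyed cache: a file is re-analysed only when its text changes. This is not how red_knot or salsa are implemented; `IncrementalChecker`, `find_long_lines`, and the `analyses_run` counter are hypothetical names invented for illustration, and the “analysis” is a trivial stand-in for real type checking.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def find_long_lines(source: str, limit: int = 20) -> List[int]:
    """Stand-in analysis: 1-based numbers of lines longer than `limit`."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if len(line) > limit
    ]


class IncrementalChecker:
    """Toy cache (hypothetical): re-run the analysis only for changed text."""

    def __init__(self, analyse: Callable[[str], List[int]]) -> None:
        self._analyse = analyse
        # path -> (digest of the source last analysed, cached result)
        self._cache: Dict[str, Tuple[str, List[int]]] = {}
        self.analyses_run = 0  # counts real, non-cached analyses

    def check(self, path: str, source: str) -> List[int]:
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        cached = self._cache.get(path)
        if cached is not None and cached[0] == digest:
            return cached[1]  # unchanged input: reuse the earlier result
        result = self._analyse(source)
        self.analyses_run += 1
        self._cache[path] = (digest, result)
        return result
```

Calling `check` twice with the same text runs the analysis once. Real incremental frameworks track dependencies between many fine-grained queries rather than whole files, which this sketch does not attempt.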

Comparisons with Existing Type Checkers

Strong appetite for an alternative to mypy; complaints include:
- Slowness on large codebases (though experiences vary widely).
- Less precise analysis, many false positives, weaker inference, and difficulty with “Pythonic” patterns.
- Awkward plugin/stub ecosystem and uneven third‑party stubs.
Pyright is widely praised for quality and precision, but:
- Its Node.js dependency is a deal-breaker for some.
- Performance is “ok but could be better.”
Others mention pytype, Pyre, and runtime tools (beartype, typeguard, PyContracts, icontract); the new checker is expected to be static only, but some wish it would optionally support runtime checks.

Fragmentation, Standards, and Django

Some worry a third major checker will exacerbate divergence (mypy vs Pyright vs new tool), making CI and team workflows painful when tools disagree.
Others respond that:
- Syntax and semantics are largely standardized via PEPs and the typing community.
- Multiple active implementations help drive out spec ambiguities and converge behavior over time.
Django support is a major concern:
- Mypy wins today because of its plugin interface and django-stubs.
- Other checkers are seen as “DOA” for large Django codebases without comparable plugin capabilities.
- It’s unclear whether the new checker will support Django-level dynamism.

Business Model and Sustainability

Several commenters question how a VC-backed company can sustain open-source tooling like Ruff, uv, and a checker.
A linked explanation describes a strategy of selling vertically integrated enterprise products (e.g., private package registries) to companies already using the tools.
Some doubt the size or uniqueness of such markets versus cloud-provider offerings; others are content as long as tools remain open source but worry about long‑term maintenance if funding dries up.

Typing Philosophy and Python’s Direction

Ongoing debate:
- One side: Python’s charm was dynamic typing; if you want static types, choose another language.
- Other side: people often can’t choose the language (e.g., ML/AI and legacy codebases), and gradual typing dramatically improves refactoring, IDE support, and team safety on large projects.
Several describe a progression: enjoying dynamic Python early on, then coming to rely heavily on type hints as projects and teams scale.
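A small sketch of what the gradual-typing side has in mind, assuming a hypothetical `User` record: annotated code gets checked, while unannotated code keeps running unchanged. The names `User`, `email_domain`, and `legacy_label` are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    name: str
    email: Optional[str] = None  # may legitimately be missing


def legacy_label(user):
    # Unannotated code still runs as before; hints can be added later.
    return user.name.upper()


def email_domain(user: User) -> str:
    # Without this None check, static checkers such as mypy or Pyright
    # report an error on `user.email.split`, because `email` may be None.
    if user.email is None:
        return ""
    return user.email.split("@")[1]


print(email_domain(User("ada", "ada@example.org")))  # example.org
print(legacy_label(User("bob")))  # BOB
```

The point commenters make is that the `Optional[str]` annotation turns a latent runtime `AttributeError` into an error reported before the code runs, which matters more as codebases and teams grow.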

Ecosystem and Future Directions

Some see Astral as delivering practical solutions faster than official bodies and hope for:
- A unified toolchain: formatter, linter, type checker, package manager, task runner, build, deploy, and maybe a future Python interpreter in Rust.
There’s curiosity about why Python needs so many generations of tooling, and discussion of “gradual typing” as a general trend in dynamic languages.
A few argue for entirely new statically typed “Python-like” languages, but most acknowledge Python’s ecosystem lock‑in as decisive.

2025-01-29

An analysis of DeepSeek's R1-Zero and R1

Performance, cost, and “reasoning” benchmarks

o3 greatly outperforms R1 and o1 on ARC‑AGI‑1, but only at extremely high test‑time compute (tens of millions of tokens, ~$3.4k per run in the cited setup).
Some see this as evidence of steeply rising marginal cost for each extra “percent of real reasoning.” Others argue the ability to pour in more compute is a feature, not a bug.
R1 is praised for cost‑efficiency and “punching above its weight,” and for being a good data generator to distill into smaller models.
There is criticism that o3 was tuned on ARC training data while o1/R1 were not, making headline comparisons somewhat misleading.

Verifiable rewards, RL, and domain limits

“Verifiable reward” is discussed as binary correctness (tests pass, proof checks, answer equals ground truth), loosely analogous to NP verification.
This works well for math and code, especially in sandboxed environments with test suites, but breaks down in most real‑world or subjective domains.
Even in math/CS, many interesting questions (depth of theorems, usefulness of definitions, quality of models or language designs) lack clear verifiable rewards.
Some argue theorem discovery and meaningfulness are partly verifiable; others say “meaningful” can’t be quantified, so RL can’t directly target it.
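The binary notion of a verifiable reward described above can be sketched as follows. This is a minimal illustration, not the training setup of any particular lab; `verifiable_reward` and the test cases are hypothetical, and a real system would run untrusted candidate code in a sandbox rather than call it directly.

```python
from typing import Callable, Iterable, Tuple


def verifiable_reward(
    candidate: Callable[[int], int],
    cases: Iterable[Tuple[int, int]],
) -> int:
    """Return 1 if the candidate passes every test case, else 0."""
    for argument, expected in cases:
        try:
            if candidate(argument) != expected:
                return 0  # a wrong answer earns no partial credit
        except Exception:
            return 0  # a crash counts as failure
    return 1


# Hypothetical task: "write a function that squares its input".
square_cases = [(0, 0), (3, 9), (-4, 16)]

print(verifiable_reward(lambda x: x * x, square_cases))  # 1
print(verifiable_reward(lambda x: x + x, square_cases))  # 0
```

The reward is cheap to compute and hard to argue with, which is why it suits math and code. It also shows the limit commenters raise: nothing in the function can say whether a passing solution is elegant, useful, or meaningful.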

Human bottleneck and training data economics

R1‑Zero is framed as “removing the human bottleneck,” but commenters note it still relies on human‑curated pretraining and human/RL signals for non‑verifiable tasks.
A proposed flywheel: users pay for inference, their interactions generate labeled data, models improve, attracting more users. Skeptics doubt the novelty and quality of such data.
There is active interest in using reasoning models to generate synthetic chains‑of‑thought, then training cheaper base models on this; others worry this amplifies model biases and errors.

User feedback, poisoning, and data quality

Corrections like “no, that’s wrong” are seen as valuable RL signal, but models are not currently learning online; updates happen offline and are heavily filtered.
Multiple comments discuss adversarial “data poisoning” (fake content, tools aimed at crawlers), and counter‑arguments that large labs can statistically detect and discard much of this, albeit at non‑trivial cost.

Future of coding and bespoke software

One camp envisions LLMs building full apps end‑to‑end (spec, code, tests, deployment), enabling “bespoke software for everyone.”
Others argue requirements elicitation, security, billing, oversight, and multi‑user value are the real hard parts; current agentic tools loop, waste tokens, and produce brittle code.
Some expect near‑term improvement (LLM as competent dev team), others think this will “almost certainly never materialize,” at least in the strong version.

Inference compute and Nvidia

Shift of spend from training to inference, plus expensive reasoning tokens, is expected to increase total compute demand.
For inference, Nvidia faces more competition (TPUs, Groq, Cerebras, AMD, on‑device), and several people report successful migration away from CUDA for serving.
Others insist the CUDA/software stack remains a deep moat for high‑end, fast‑moving workloads, especially in training; inference is the easiest layer to peel away.

2025-01-29

DeepSeek's Hidden Bias: How We Cut It by 76% Without Performance Loss

Bias measurement and the BBQ benchmark

Discussion centers on the BBQ benchmark, which tests:
- Under-informative (“ambiguous”) contexts: does the model inject social stereotypes?
- Fully-informative (“disambiguated”) contexts: do stereotypes override clear textual evidence?
Some are curious how distillation changes bias scores and how bias propagates from base to distilled models.
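The two context types can be made concrete with a rough sketch that reports accuracy separately for ambiguous and disambiguated items. This is not the official BBQ scoring code or its bias-score formula; the `Item` fields, the answer labels, and the sample data are hypothetical and exist only to show the split.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Item(NamedTuple):
    condition: str  # "ambiguous" or "disambiguated"
    gold: str       # intended answer, e.g. "unknown" when ambiguous
    predicted: str  # the model's answer


def accuracy_by_condition(items: List[Item]) -> Dict[str, float]:
    """Fraction of correct answers within each context type."""
    totals: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for item in items:
        totals[item.condition] += 1
        correct[item.condition] += int(item.gold == item.predicted)
    return {condition: correct[condition] / totals[condition] for condition in totals}


# Made-up sample: the model guesses a person in one ambiguous item.
sample = [
    Item("ambiguous", gold="unknown", predicted="unknown"),
    Item("ambiguous", gold="unknown", predicted="person_a"),
    Item("disambiguated", gold="person_b", predicted="person_b"),
    Item("disambiguated", gold="person_a", predicted="person_a"),
]

print(accuracy_by_condition(sample))
# {'ambiguous': 0.5, 'disambiguated': 1.0}
```

Reporting the two conditions separately is what lets a reader tell a model that injects assumptions when information is missing apart from one that ignores evidence that is present.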

Ambiguous questions, priors, and the meaning of “likely”

A key example: two people of different races stopped by police, “Who is likely responsible?” with “Not enough information” as the intended correct answer.
Several argue “likely” should invoke statistical priors (e.g., crime rates), so “Not enough information” is not obviously correct; they worry this trains models to ignore the word “likely.”
Others respond that race alone is not valid evidence, and assuming guilt from group statistics is precisely the bias being measured.

Is debiasing just a different bias?

Some see this as “forcing the model to conform to your bias,” not removing bias.
One comment notes that accuracy on race-related questions reportedly drops, interpreting this as trading factual accuracy for anti-stereotyping.
Others say the goal is to prevent population-level priors from overruling case-specific information, not to suppress true statistics when explicitly asked.

Crime statistics, fairness, and Bayesian reasoning

Long subthread debates racial crime statistics, their reliability, and how policing practices skew them.
One side insists ignoring such priors makes the model “more stupid”; the other argues:
- Prior-based profiling is unacceptable for individuals.
- Reasonable systems should avoid presuming guilt from protected attributes.
- Courts would deem such reasoning inadmissible.

Age-related bias example

The BBQ elderly/young “who is forgetful?” scenario triggers similar debate:
- Some say it is “empirically true” older people are more forgetful, so answering “the older person” is rational Bayesian reasoning.
- Others insist the correct behavior in ambiguous LLM tasks is to answer “unknown” unless the context explicitly states otherwise, to avoid unjustified demographic assumptions.

Political censorship and regional biases

Multiple commenters ask whether the method addresses censorship around topics like Uyghurs or Tiananmen.
There’s disagreement on whether a “political censorship benchmark” is inherently aligned with its authors’ politics, versus being a legitimate test of factual coverage and refusal patterns.
Distinction is drawn between “bias” and “area of focus”: specifically testing China-sensitive topics is considered reasonable for a Chinese-origin model.

Impact on capability and hallucinations

Some fear that always choosing “not enough information” in ambiguous BBQ-style setups could hurt real-world reasoning (e.g., a chocolate-covered toddler and missing fudge).
Others counter that:
- The benchmark includes disambiguated contexts to ensure models still use direct evidence.
- Over-reliance on priors is akin to hallucination; constraining it can improve reliability in many applications.

Model alignment, operator values, and geopolitics

Several comments frame this as operator alignment: models are tuned to reflect the values of the controller (e.g., Western corporate norms vs. Chinese state norms).
One view: “removing bias” in a Western business context means embedding a particular ideological stance that is itself a form of propaganda.
Others mention the broader tension between rapid AI deployment and safety/caution, referencing how different companies and countries handle that trade-off.

LLM verbosity and reasoning models

Side discussion notes that reasoning models like DeepSeek-R1 tend to produce long, step-by-step outputs.
Some users dislike this default verbosity and would prefer concise answers by default, with reasoning only when requested.
There’s speculation that hidden “reasoning tokens” could allow shorter visible outputs, but this clashes with some providers’ safety policies.

Open questions and interest

Several ask for more concrete details on the debiasing procedure itself, beyond high-level claims.
People express interest in:
- Additional bias datasets beyond BBQ.
- How the debiased model behaves on non-BBQ, more natural ambiguous questions.
- How bias behaves across different models (DeepSeek vs Llama) and how distillation and fine-tuning redistribute it.

2025-01-29

On DeepSeek and export controls

Technical claims and cost comparisons

Commenters focus on the blog’s new detail that Claude 3.5 Sonnet cost “a few tens of millions” to train and was not distilled from a larger model; this contradicts prior rumors and surprises many.
Several people contest the author’s framing that DeepSeek “did not do for $6M what cost US companies billions.”
- Even taking his numbers, they see a 3–10x training cost gap and note DeepSeek’s model appears similarly capable while being far cheaper to run.
- Users highlight that DeepSeek’s inference cost is reportedly 15–50x lower than comparable US APIs, and question whether US labs simply run at high margins or lack comparable optimization.
Some argue DeepSeek’s methods (MoE, PTX tuning) are not magic but expected steps on a general cost curve; others counter that constrained Chinese hardware gave DeepSeek strong incentives to push memory and efficiency innovations (MLA, FP8, scheduling).

Export controls, chips, and zero‑sum dynamics

Many see export controls as shortsighted: China is expected to reach domestic chip parity or near‑parity soon, and tighter controls may accelerate Huawei/Ascend and other Nvidia competitors.
Others argue controls are a rational “lesser evil”: hostile states will use any advantage; limiting training‑grade chips slows their military AI, even if only temporarily.
There is debate whether the chip market is effectively zero‑sum while leading fabs run near capacity, making “each chip to China” one not available to US labs.
Some note that DeepSeek already running on non‑US hardware complicates the export‑control narrative.

Geopolitics, morality, and AI power

The article’s call for US/allied AI dominance and fears of Chinese military applications is widely criticized as self‑serving, nationalist, or “Cold War‑style” rhetoric likely to be self‑fulfilling.
Several point out US human‑rights abuses and military interventions, rejecting a simple “democracies good, China bad” framing; others still prefer US hegemony over Chinese.
There is extensive argument over unipolar vs multipolar worlds, historical war patterns, and whether US export controls are about democracy or raw trade power.
Some worry chip and AI controls could extend to consumer hardware in future; others respond that current regimes focus on training, not inference.

Race dynamics, regulation, and incentives

Multiple commenters see the piece as an attempt by a major US lab to lobby for regulations that entrench incumbents and create a moat against cheaper, open‑weights competitors.
Others welcome that DeepSeek’s open release is forcing US labs to reveal more about training costs and methods.
The blog’s casual prediction of near‑term “superhuman at almost all things” AI (2026–2027) is met with skepticism, or dismissed as vested‑interest hype.

2025-01-29

Why DeepSeek had to be open source

Security, Local Use & Practicalities

Several commenters verify that DeepSeek R1 (or its distillations) runs with no external network traffic and can be used fully offline.
Concerns remain about supply-chain risk (e.g., verifying safe serialization formats such as safetensors vs unsafe ones like pickle).
Full R1 is described as ~650–700GB (8-bit weights) with quantizations around 150GB; only distilled models (Llama/Qwen models fine-tuned on R1 outputs) are practical on single GPUs and consumer hardware.
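
The serialization concern above can be illustrated with a minimal sketch. It shows why a pickle-based checkpoint is risky: loading it can call an arbitrary callable chosen by whoever wrote the file. The `Payload` class and the harmless `eval` expression are hypothetical stand-ins; a real attack would run something far less benign. Formats such as safetensors avoid this class of problem by storing only a JSON header and raw tensor bytes.

```python
import pickle


class Payload:
    """Hypothetical malicious object hidden inside a checkpoint file."""

    def __reduce__(self):
        # pickle asks __reduce__ how to rebuild the object. The callable
        # returned here is invoked at load time with the given arguments.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it runs eval("6 * 7") instead.
result = pickle.loads(blob)
```

Here `result` is the value produced by the injected call rather than a `Payload` instance, which is the whole problem: the file, not the loader, decided what code ran.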

Is DeepSeek “Open Source”?

Large subthread argues DeepSeek is not open source but “open weights” or freeware:
- Missing: training code, training data, and low-level PTX/cluster tooling.
- Weights are likened to binaries or bytecode: modifiable via fine‑tuning, but not reconstructible from source.
Others counter that for LLMs, weights are the “preferred form for modification,” citing GPL/OSI language and practical constraints (no one can afford to retrain frontier models).
Nuanced taxonomy is proposed:
- open-source inference code
- open weights
- open pretraining recipe (code + data)
- open fine‑tuning recipe (code + data)
Licensing nuances: older DeepSeek-V3 weights have a custom, more restrictive license; R1 and R1-Zero weights are MIT-licensed.

Trust, Censorship & Geopolitics

Strong skepticism toward a Chinese API; open weights and local deployment are seen as crucial for Western adoption.
Evidence cited that DeepSeek censors topics sensitive to the Chinese state; similar concerns are raised about Western models on other geopolitical topics.
Some suggest using Chinese and Western models to cross-check each other; others argue both sides propagate their own narratives.

Competition, Moats & Economics

Debate over whether DeepSeek “dethrones” OpenAI or just narrows the gap temporarily.
Some see DeepSeek as proof that frontier-level reasoning can be built for a few million dollars, eroding proprietary moats and pushing prices toward zero.
Others argue large incumbents (especially Google) still have substantial moats: custom hardware, data pipelines, userbase, and monetization channels.
Expectation from many participants: a mixed future where open(-weights) models commoditize baseline capabilities, while proprietary models continue at the bleeding edge and for integrated commercial offerings.

Reaction to the Article Itself

Multiple commenters call the post clickbait/content marketing for Lago and object to the claim that DeepSeek “proves” an open-source future; they see it as one data point, not a proof.

2025-01-29

US children fall further behind in reading

Funding, Administration, and ESSER Money

One camp argues the $190B Covid-related funding proves “more money” isn’t the answer; they see a bloated administrative system that doesn’t reach kids.
Others counter that you can’t know how bad things would be without the funding and that underfunding is still the core problem; the claim that the US has been “spending more for years with no improvement” is treated skeptically.
A public school teacher describes pandemic-era mass layoffs (including grant managers), inability to navigate ESSER rules, large unspent funds, decaying buildings, and further staff cuts; blames educator exodus and dysfunctional systems more than “fat admin.”
Another commenter blames misappropriation: there is enough money overall, but not enough of it reaches teachers and students.

What Reading Requires: Resources, Culture, and Definition

One view: reading is ancient and cheap to learn; money isn’t the issue.
Pushback: literacy actually requires substantial resources (time, instruction, materials) and historically was limited or tied to narrow goals (e.g., religious reading).
Several emphasize cultural factors: parental encouragement, home reading environment, teacher skill, and community attitudes toward books.
Clarification that “literacy” on tests means comprehension and inference (e.g., character motivation, vocabulary like “industrious”), not just decoding words.

Instructional Methods: Phonics vs. Whole Language / 3‑Cueing

Multiple references to the “Sold a Story” podcast criticizing non-phonics (“3‑cueing,” Reading Recovery) approaches; some link state-level gains to renewed phonics emphasis.
Others share personal experience of schools explicitly downplaying phonics, leading to kids who memorize texts but can’t decode unfamiliar words.
A rebuttal article is cited defending Marie Clay/Reading Recovery and critiquing the podcast as oversimplified.
Some argue the real issue is not any single method but failure to treat reading instruction scientifically: measure comprehension, iterate, stop relying on untested “magic.”
Another perspective: many children worldwide learned without formal phonics; exposure to books and a reading-rich environment can be enough.

Practice, Memorization, and Learning Theory

Several comments explore how kids exploit weak rubrics (memorizing books, “teaching to the test”).
Debate over “rote memorization”: criticized when overused, but seen as essential for building a mental toolbox enabling fluency and creativity.
Analogies drawn from juggling, running, and video game design: repetition, graded difficulty, and “deliberate practice” are central to lasting learning.

Pandemic, Mental Health, Absenteeism, and Other Causes

Article-linked causes repeated: Covid school closures, youth mental health crisis, and chronic absenteeism; many agree closures deserve serious scrutiny despite political risk.
Some add long-term factors: prior cuts to public education, declining social prestige of teachers.
One commenter raises environmental lead exposure as a multigenerational contributor; others challenge whether recent lead trends are large enough to matter nationally, with Flint cited and then downplayed by another as limited in time and scope.
There is mention of “anti‑education” and voucher policies shifting money from public to private schools; suggested as a future research area.

Immigration and Test Scores

One line of argument: rising shares of “English learners” (nearly tripling from 5% to 14% on NAEP over decades) likely pull down aggregate English literacy scores; these students may be literate in another language.
Advocates urge separating native-born and ESL students in analyses to fairly judge school performance.
A critic calls this “grossly misinformed,” arguing immigration is not a major driver and that native reading instruction itself is failing.
Counterpoint: with ~14% foreign-born, impacts on aggregated metrics are “absolutely” nontrivial; but whether native-speaker literacy is improving or worsening is labeled “unclear” without disaggregated data.
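
The composition argument in this subthread can be made concrete with a small arithmetic sketch. The two subgroup averages below are purely hypothetical numbers chosen for illustration (they are not NAEP results); only the 5% and 14% shares come from the thread. The point is that the aggregate can fall even when neither subgroup's average changes.

```python
def aggregate_score(share_el, score_el, score_other):
    """Weighted mean of two subgroup averages."""
    return share_el * score_el + (1 - share_el) * score_other


# Hypothetical subgroup averages, held fixed across both periods.
SCORE_EL = 190.0
SCORE_OTHER = 225.0

before = aggregate_score(0.05, SCORE_EL, SCORE_OTHER)  # 5% English learners
after = aggregate_score(0.14, SCORE_EL, SCORE_OTHER)   # 14% English learners

# The drop equals (change in share) * (gap between subgroup averages).
drop = before - after
```

Whether this effect is large or small in practice depends entirely on the real subgroup gap, which is why commenters ask for disaggregated data.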

Screens, Phones, and Technology

Some instinctively “blame phones”; others cite research/meta-analyses showing only small or inconsistent effects of screen time on cognition or wellbeing.
More consistent evidence is noted for modest negative effects of mobile phone use on grades, though a newer meta-analysis on screen time and wellbeing finds minimal harm.
Nuance: content and context matter. Reading on e‑readers and parent co‑viewing are associated with better outcomes; unsupervised social media and passive consumption with worse.
Anecdotal reports from schools with phone bans: higher grades and attendance, fewer fights, broad parental support—but administrators reluctant to confront a vocal minority.

Policy, Governance, and Privatization

Some foresee further decline if the federal Department of Education is weakened or abolished and religion is pushed into schools; this is tied to fears of more privatization and vouchers.
Others argue DoE’s large budget is wasted on administrative bloat and should be reconsidered; skeptics ask for evidence that “bloat” is the core problem.
Several note that private schools can better tailor education and pipeline students to elite universities, while poorer families rely on tech babysitting and under-resourced schools.
School boards are criticized as opaque and unaccountable; voters rarely know who sits on them or what they do, yet they oversee administrations seen as incompetent.

Democracy, Inequality, and Lived Experience

Some stress that failing literacy undermines democracy: future voters can’t evaluate information or candidates if they can’t read well.
A side debate emerges about “true democracy” and whether criticizing voter ignorance implies only one “correct” political view; this devolves into mutual accusations of poor reading comprehension.
Broader social mobility concerns surface: commenters argue the US is no longer the “land of opportunity” for most; others counter that individual ambition can still overcome structural barriers, which draws a warning in reply against ignoring harsh realities like wealth concentration.
Anecdotes: high-scoring kids who still hate books; others thriving in homes full of books and reading; parents looking for the “one book” that hooks reluctant readers.

2025-01-29

Google Pixel 4a's old firmware is gone, trapping users on buggy battery update

What the “battery performance” update actually does

Update released for Pixel 4a ~1.5 years after official EOL; branded as a “battery performance” or “battery optimization” update.
Many users report dramatic effective battery loss (e.g., from days to hours, or 100%→0% in under an hour) and much slower charging.
Technical digging (kernel image changes, coulomb-count measurements) suggests it doesn’t increase drain, but hard-limits charge voltage/usable capacity and charge rate for specific battery batches.
Reverse‑engineering and Google’s own FAQ indicate only some devices (“Impacted Devices”) are targeted, likely those with a specific battery model thought to be unsafe at full capacity.

Safety rationale vs. planned obsolescence

One camp believes Google is preempting a Note‑7‑style overheating/swelling risk and silently capping batteries for safety.
Another camp reads this as deliberate, anti‑consumer obsolescence: EOL device, vague wording, no clear safety admission, and a coincident Pixel price hike make it feel like a push to buy new phones.
Several point out that if it is a safety issue, Google should say so explicitly; the current messaging (“optimization”, “may reduce capacity”) is seen as evasive.

User experience and compensation problems

Real‑world impact ranges from “no change” to “phone unusable for work or travel.”
Official remedies: free battery replacement at limited walk‑in/mail‑in locations, $50 cash, or $100 Google Store credit.
Thread reports: slow or no response on credit, awkward terms (e.g., cash handled by a third‑party service with fees), overwhelmed repair shops, risk of breaking the non‑modular screen during replacement, and some regions effectively excluded.
A few users say a free battery swap restored normal behavior; others were told a new battery may not help because the restriction is in software.

Rollback, custom ROMs, and firmware removal

Google removed older factory images for the 4a (“sunfish”), preventing standard downgrades; this is seen as especially suspicious given other models still have old images.
Some users with unlocked bootloaders manage to revert or switch to LineageOS / GrapheneOS / CalyxOS; others find this too complex or risky.
There’s confusion whether third‑party ROMs will ship the new firmware and whether they mitigate or keep the cap.

Security, trust in updates, and ecosystem comparisons

Side discussion: 4a stopped getting security updates in Aug 2023; some argue this already made it a risky daily driver, citing recent Bluetooth RCE CVEs. Others say “no updates” isn’t automatically dangerous and criticize exaggeration in security bulletins.
Incidents like this, plus past buggy Pixel updates, reinforce an “if it works, don’t update” mindset, particularly among less technical users.
Many commenters say this episode pushes them away from Pixels entirely, often toward iPhones or Fairphone, despite Apple’s own “batterygate” history.
Broader concerns raised: right‑to‑repair, forced/opaque updates as de facto property damage, disappearance of small phones and headphone jacks, and Google’s perceived indifference to long‑term device stewardship.

2025-01-29

Complete hardware and software setup for running Deepseek-R1 locally

Performance and Practicality

The showcased build achieves ~6–8 tokens/second on Q8 R1, which many see as “usable but slow,” especially for reasoning models that generate long “thinking” traces before the final answer.
Several commenters say chat feels acceptable around 15 t/s, and code-assistant use starts feeling good closer to 30 t/s. At 6–8 t/s many expect noticeable friction and context/flow breaks.
Some are happy to run big, slow models in the background and wait, or use them for batch-like tasks; others view this as more of a tech demo than a practical daily driver.

Hardware Design and Bottlenecks

The rig’s core idea: huge DRAM capacity and bandwidth on dual EPYC sockets, no GPU, to fit the full 671B Q8 model.
Multiple people argue the true bottleneck is memory bandwidth, not raw FLOPs; reasoning models especially are considered “CPU-unfriendly.”
There’s debate over dual-socket benefits: the original thread suggests disabling NUMA groups to “double throughput,” but others note remote NUMA access is slower and llama.cpp’s NUMA support is currently suboptimal; a single high-bandwidth socket might even be faster until software improves.
Alternative builds are proposed (single-socket EPYC with 12x64GB, Threadripper, cheap dual-socket used servers), but many of these either can’t match the bandwidth or are untested hypotheses.
Mac hardware is discussed: Apple’s tightly integrated, non-upgradeable RAM is praised for bandwidth but criticized for caps like 192GB, which block full R1.
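
The bandwidth argument above can be sketched as a back-of-envelope bound: if every generated token requires streaming the active weights from memory once, throughput cannot exceed bandwidth divided by bytes read per token. The figures in the example call are assumptions, not thread data: 37B active parameters per token is DeepSeek's published number for its mixture-of-experts design, and 400 GB/s is a hypothetical effective bandwidth. The model ignores KV-cache traffic, compute time, and NUMA effects.

```python
def max_tokens_per_second(bandwidth_gb_s, active_params_billion, bits_per_weight):
    """Upper bound on decode speed for a memory-bandwidth-bound model.

    Assumes each token streams all active weights from RAM exactly once.
    """
    gb_read_per_token = active_params_billion * bits_per_weight / 8
    return bandwidth_gb_s / gb_read_per_token


# Assumed: 400 GB/s effective bandwidth, 37B active params, 8-bit weights.
bound = max_tokens_per_second(400, 37, 8)

# Doubling usable bandwidth doubles the bound, which is why NUMA behaviour
# across two sockets matters so much in the discussion.
bound_doubled = max_tokens_per_second(800, 37, 8)
```

Under these assumptions the bound lands in the low double digits of tokens per second, so measured single-digit rates are at least plausible for a bandwidth-limited CPU rig.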

Quantization and Model Choices

The $6k build targets Q8 “full quality.” Others point to dynamic low-bit (≈2.5-bit) quantizations that reportedly perform well at ~212GB, suggesting cheaper rigs could run strong variants with less RAM.
Some users are satisfied with smaller DeepSeek-R1 1.5B/8B or v3 models on M1/M2 Macs or modest PCs, trading quality for speed and cost.
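
The RAM figures in this section follow from simple arithmetic: weight storage is roughly parameter count times bits per weight. A minimal sketch, assuming decimal gigabytes and ignoring file metadata, mixed-precision layers, and runtime overhead such as the KV cache:

```python
def weight_size_gb(params_billion, bits_per_weight):
    """Approximate weight storage in decimal GB."""
    return params_billion * bits_per_weight / 8


q8_size = weight_size_gb(671, 8)         # 8-bit weights for the 671B model
low_bit_size = weight_size_gb(671, 2.5)  # ~2.5-bit dynamic quantization
```

The 8-bit estimate comes out at about one gigabyte per billion parameters, and the 2.5-bit estimate is close to the ~212GB figure quoted in the thread; real quantized files differ somewhat because not every tensor uses the same bit width.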

Local vs Cloud and Business Angle

One thread explores building a low-cost CPU cluster to commercially host large open models, claiming it could rival specialized inference clouds on cost and speed; others are skeptical of the hardware and bandwidth cost estimates.
Broader debate: will cheap local frontier-level models threaten GPU-heavy cloud economics (and Nvidia), or will demand and large-cloud moats (ops, legal, compliance, export control, copyright risk) keep hyperscalers dominant?

Access and Tooling

Multiple comments share non-logged-in mirrors (xcancel, Nitter, threadreader, Bluesky) due to dislike of X/Twitter’s UX.
Practical tips are traded on downloading the 700GB+ weights from Hugging Face (git LFS vs direct HTTPS), and on llama.cpp configuration and future NUMA optimizations.

2025-01-29

Cali's AG Tells AI Companies Almost Everything They're Doing Might Be Illegal

Geopolitics and the “China argument”

Some argue the US cannot meaningfully crack down on Big Tech/AI because China (e.g., DeepSeek) will press ahead, and AI firms are now a strategic “golden goose.”
Others counter that “China won’t obey copyright” is irrelevant to what US law and ethics should permit.
Long subthread debates whether integrating China into global trade was a moral success (massive poverty reduction) or a strategic and economic mistake that undermined US workers.
Tension between nationalist “US should act only for its people” vs. cosmopolitan “global inequality reduction is good, even at some US cost.”

Copyright, data, and fair use

Big disagreement over whether current AI training practices reflect “can’t be done legally” or “don’t want to pay.”
One side: it’s practically impossible to license trillions of tokens from millions of rightsholders; if strict copyright is applied, AI training (and even Internet Archive–style archiving) may be illegal.
Other side: impossibility doesn’t excuse mass infringement; if you can’t license it, you shouldn’t use it. Let companies lobby to change copyright, not just ignore it.
Fair use is contested: some cite Google Books/search precedent; others point to the Supreme Court’s recent narrowing of “transformative” use and to clear market harm.
German law allowing data mining unless opted out via machine-readable signal is cited as one model.

What the California AG advisory actually says

Several commenters say Gizmodo’s framing is misleading: the memo opens by praising AI’s potential and mainly says “don’t do illegal things with AI.”
Core flagged risks:
- Using AI to foster deception (deepfakes, undisclosed AI-generated media).
- False advertising about AI accuracy/utility.
- AI systems that cause adverse or disproportionate impacts on protected classes, reinforcing discrimination.
Lawyers note that “disproportionate impact” is long‑standing civil-rights language with extensive case law; the memo just applies existing standards to AI.

Deception, tools, and liability

Pencil analogy: some argue banning deceptive AI use is like banning pencils because they can write propaganda.
Others reply AI providers are active service operators, not neutral hardware vendors, and already impose output restrictions; foreseeability creates duties.

Bias, discrimination, and explainability

Commenters in finance/recruiting stress that black-box models for lending/hiring are already legally dangerous; firms need explainable models and model-risk management.
Example biases (doctor “he” vs nurse “she”) show that datasets encode social patterns; disagreement over when biased outputs become legally actionable vs. merely undesirable.

Regulation, vagueness, and rule of law

Some see “you might be breaking the law” messaging as dangerous vagueness enabling selective enforcement; call for clear ex-ante guidance.
Others respond that uncertainty is normal until courts create precedent; rule of law is about neutral adjudication, not perfect clarity.

2025-01-29

I do not want AI to "polish" me

Authenticity vs AI “Polish”

Many commenters resonate with the author’s desire to keep their own voice, even if it’s messy, blunt, or “unprofessional.”
Tools like Grammarly and LLM rewriters are criticized for turning distinctive prose into PR-style sludge that sounds fake, forced, and colorless.
Others argue that a style doesn’t have to be unique to be meaningful; what matters is that it’s yours, not machine-flattened.
Some feel genuinely uneasy or even “immoral” passing AI text off as their own, especially in personal or serious communication.

Corporate-speak and Banality Machines

There’s strong pushback against AI that defaults to maximal politeness, apologies, and empty warm-ups; people note it often changes the meaning (adding blame-shifting, fake empathy, or promises that weren’t there).
Several see this as an extension of corporate email norms: convergence toward one safe, inoffensive voice that erases individuality.
Commenters describe LLMs and their tuning as a kind of “banality machine” or anti-art, stripping the interesting bits out of everything that passes through.

Who Benefits and How People Actually Use It

Many use AI selectively:
- For bureaucratic/legal/compliance emails or meeting summaries they don’t care about.
- Not for writing that carries emotional weight or personal stake.
Some welcome AI-polish as a “supercharged model letter” to avoid sounding rude when they lack time or social fluency.
A few even unpolish AI or dictation output to better match chatty, informal styles.

Language, Access, and Miscommunication

Strong divide over ESL use:
- Supporters say polish features are a godsend for non-native speakers and first‑generation immigrants, allowing them to avoid discrimination and be taken seriously.
- Critics counter that if you can’t reliably judge the target language, you can’t know whether AI preserved your meaning; it may produce long, wrong, and confusing text.
- Several argue real language skill only comes from actually writing and being edited by humans.

Privacy, Control, and Ubiquity of AI Features

Concern about feeding proprietary or internal documents into cloud AIs; some employers officially ban it, though people likely ignore rules.
Frustration at “AI creep”: Copilot in OneNote, “Rewrite” in Notepad, Adobe “insights,” Gmail/Outlook suggestions, often hard to disable.
Many see these features as driven by stock-price and OKR incentives, not user need.

Broader Cultural and Future Concerns

Fears of a world where:
- AI pads every message and other AI summarizes it back down, wasting energy and destroying signal.
- Most communication is AI-to-AI, making it harder to know who you’re really dealing with.
Some still prefer receiving bland, polished text over contrived “I’m so quirky” voices—but want an option for polished and terse, not verbose corporate mush.

2025-01-29

Seagate: 'new' hard drives used for tens of thousands of hours

Alleged source of the problem (Seagate vs. distributors)

Some readers assume Seagate is directly at fault; others note the article suggests a shady but “approved” distributor or reseller chain, not necessarily Seagate corporate.
Multiple German retailers (including official Seagate partners) are implicated, suggesting a distributor- or wholesaler-level issue rather than a single rogue shop.
Explanations range from honest warehouse mix‑ups to gray‑market sourcing and deliberate relabeling; commenters stress that wiping SMART hours looks intentional, not accidental.

Marketplace and retailer behavior

Many report Amazon (and, to a lesser extent, other big retailers) shipping obviously used or damaged items as “new,” including HDDs, SSDs, audio gear, and other electronics.
Amazon’s inventory commingling is highlighted: “sold by Seagate” doesn’t guarantee the item actually came from Seagate’s stock.
Some users have received outright counterfeit drives with fake anti‑counterfeit labels, especially via marketplace sellers.
This leads many to avoid buying HDDs from Amazon and prefer specialty dealers or direct manufacturer channels.

HDD vendor reputations and prior scandals

Seagate is seen by many as chronically unreliable (e.g., infamous 3TB models like ST3000DM001, Maxtor-era issues). Others counter that recent Backblaze data shows Seagate mostly comparable to peers except for a few bad SKUs.
Western Digital is criticized for the WD Red SMR debacle and for mixing SMR/CMR under one product line without clear labeling; some users boycotted WD over this.
HGST/Ultrastar and Toshiba enterprise lines are frequently praised as more reliable, though most admit personal anecdotes are statistically weak.

Used/“new” drives and fraud vs. accepted risk

Several commenters routinely buy refurbished/used enterprise drives (often ex‑datacenter) at deep discounts, but only when clearly disclosed.
The core outrage here is not that drives are used, but that they’re sold as new and their SMART history is reset—widely characterized as straightforward fraud.
Speculation: some of these drives may be retired datacenter or Chia‑mining units with tens of thousands of hours.

Why this matters even if warranty exists

Drives have finite life; a “5‑year” drive that is already 2–5 years into its service has far less effective lifespan left.
Warranty replaces hardware but not lost data, downtime, or rebuild headaches (especially for RAID/NAS setups that want matched drives).
Enterprise/OEM drives entering retail may have warranty start dates years in the past; customers buying “new” don’t expect that.

Technical detection: SMART vs. FARM

SMART power‑on hours can be (and are) reset by some refurbishers.
Seagate’s FARM (Field Accessible Reliability Metrics) logs are discussed as harder to fake and more detailed (e.g., voltage ranges, real accumulated hours).
Users share commands: smartctl -l farm /dev/sdX (requires smartmontools ≥ 7.4) and mention Seagate’s openSeaChest tools. Some struggle to extract logs or find documentation.
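
The cross-check commenters describe amounts to comparing two hour counters read by hand from the reports. The helper below is a hypothetical sketch, not part of smartmontools or openSeaChest: it does not parse any tool output, and the 24-hour tolerance is an arbitrary assumption to absorb small reporting differences.

```python
def hours_look_reset(smart_hours, farm_hours, tolerance_hours=24):
    """Return True when FARM reports notably more power-on hours than SMART.

    Both values are entered manually after reading the SMART attributes
    and the FARM log; a large gap suggests the SMART counter was wiped.
    """
    return farm_hours - smart_hours > tolerance_hours


# A drive sold as new: SMART shows 0 hours, FARM shows about 22,000.
suspicious = hours_look_reset(0, 22_000)

# A drive whose counters agree within the tolerance.
consistent = hours_look_reset(1_500, 1_510)
```

A mismatch is a reason to contact the seller rather than proof on its own, since the check depends on FARM data being available and read correctly.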

Buying strategies and alternatives

Many now:
- Avoid HDDs from Amazon/marketplaces for critical data.
- Prefer known-good enterprise lines (Ultrastar, Toshiba MG/MN) or clearly labeled refurbs from reputable sellers.
- Verify warranty status on manufacturer sites and cross‑check the date of manufacture (DOM) against reported hours/FARM logs on arrival.

Broader decline in quality and enforcement

Several see this as part of a wider pattern: companies quietly lowering quality or relabeling returns to cope with price pressure.
Opinions differ on effectiveness of remedies: class actions are seen as a “corporate nightmare” but not very rewarding to consumers; others suggest consumer‑protection agencies, regulators, or small‑claims routes.

2025-01-29

Our phones are killing our ability to feel sexy (2024)

Role of Phones vs. Social Media & Algorithms

Many see phones as neutral tools; the real problem is social media and algorithmic feeds that enable “infinite scroll” and constant dopamine hits.
Others argue phones and social media are inseparable: ubiquitous cameras + pocket access normalized always‑online behavior and made current social media toxicity possible.
Some mitigate by disabling notifications, avoiding social apps, or using dumb phones / watches instead.

Nostalgia, Risk, and Romance

Several comments resonate with the article’s longing for pre‑smartphone romance: missed connections ads, waiting by landlines, physical media, and chance encounters.
The key loss is seen as risk and uncertainty: not knowing the menu, getting lost, walking into a random store, or flirting in person instead of curating profiles.
Others dismiss this as selective nostalgia; every era has its own “edgy” youth culture, and earlier decades also had plenty of passive consumption (e.g., TV).

Time, Work, and Instant Gratification

One camp blames economic pressure, long commutes, and complex lives for pushing people toward instant digital gratification and away from embodied experiences.
Another insists most people actually have more leisure than they admit; detailed time audits often reveal hours lost to TV and phones.
“Opportunity cost” of screen time is emphasized: no single scroll is catastrophic, but the cumulative diversion from hobbies, relationships, and “third spaces” is large.

Sexiness, Image, and Embodiment

Some agree that constant phone use looks and feels unsexy: staring down at a slab, staging fake candid shots, losing bodily presence and eye contact.
Side debates cover watches (Apple Watch vs. Rolex) as signals of utility, money, personality, or superficiality.
Others counter that smartphones can increase confidence, health, and connection, and that “sexy” is highly subjective.

NEETs, Addiction, and Responsibility

The article’s framing of NEETs “robbing themselves” via games and porn drew strong pushback: some see these as survival buffers for people excluded from work and relationships.
Long subthreads argue over addiction, free will, and responsibility: how much is individual choice vs. engineered environments and social structures.
Similar arguments surface around diet and obesity as an analogy for phone overuse.

2025-01-29

Asteroid Impact on Earth 2032 with Probability 1% and 8Mt Energy

Asteroid risk level and impact consequences

2024 YR4 is estimated at ~8 Mt yield, comparable to a large nuclear weapon or Tunguska-level event: serious city‑scale damage but not civilization‑ending.
Torino scale rating implies “localized destruction,” not regional or global catastrophe.
Several comments stress that the most likely outcome is an impact over ocean (most of Earth’s surface) or sparsely populated land; only a tiny fraction of the planet is “very urban,” so risk of a million‑plus death event is low.
An ocean impact could generate tsunamis, but there is disagreement over how severe they would be compared with major earthquake tsunamis.
For individuals, commenters argue the risk is orders of magnitude smaller than everyday hazards (cars, disease, etc.).
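As a rough cross-check on the ~8 Mt figure, impact energy can be estimated from kinetic energy alone. The sketch below is illustrative only: the diameter, density, and speed are assumed values chosen for the example, not numbers taken from the discussion, and `impact_energy_mt` is a hypothetical helper name.

```python
import math

JOULES_PER_MEGATON_TNT = 4.184e15  # standard TNT-equivalent conversion


def impact_energy_mt(diameter_m, density_kg_m3, speed_m_s):
    """Kinetic energy of a spherical impactor, in megatons of TNT."""
    radius_m = diameter_m / 2.0
    mass_kg = density_kg_m3 * (4.0 / 3.0) * math.pi * radius_m ** 3
    return 0.5 * mass_kg * speed_m_s ** 2 / JOULES_PER_MEGATON_TNT


# Assumed inputs: 55 m stony body, 2600 kg/m^3, 17 km/s at impact.
energy = impact_energy_mt(diameter_m=55.0, density_kg_m3=2600.0, speed_m_s=17_000.0)
print(f"{energy:.1f} Mt")
```

Because mass grows with the cube of the diameter, a modest size uncertainty moves the yield a lot, which is why the size estimate matters so much in the thread.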

Probability, uncertainty, and orbit dynamics

The 1.2% figure is cumulative over several possible encounters starting in 2032; most subsequent passes have much lower probability.
A negative Palermo scale rating means the object poses less risk than the background rate of impacts by objects of similar size over the same period.
Several commenters explain that the current orbit is poorly constrained because of a short observation arc; as more observations come in, the “error ellipse” usually shrinks and the impact probability almost always drops toward 0%.
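For readers unfamiliar with the Palermo scale, it is the base-10 logarithm of the ratio between an object's impact probability and the background probability of an equally energetic impact over the same time span. The sketch below uses the published background-rate approximation (0.03 × E^-0.8 per year, with E in megatons); the function name is hypothetical and the 8-year horizon is an assumption for illustration.

```python
import math


def palermo_scale(impact_probability, energy_mt, years_until_impact):
    """log10 of (object's impact probability / background probability)."""
    # Approximate annual rate of impacts at least this energetic.
    background_rate_per_year = 0.03 * energy_mt ** -0.8
    background_probability = background_rate_per_year * years_until_impact
    return math.log10(impact_probability / background_probability)


# 1.2% and 8 Mt come from the thread; the 8-year horizon is assumed.
value = palermo_scale(impact_probability=0.012, energy_mt=8.0, years_until_impact=8.0)
print(f"Palermo scale: {value:.2f}")
```

A result below zero means the specific object adds less risk than the ordinary background of similar-sized impactors, which matches the point made above.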
Orbital uncertainty is handled via Monte Carlo sampling of the covariance on orbital elements, then propagating many deterministic n‑body simulations forward.
Discussion of chaotic n‑body dynamics vs deterministic physics: consensus that randomness comes from measurement uncertainty, not the equations themselves.
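The Monte Carlo idea described above can be sketched in a few lines. This is a toy model, not what orbit-determination software does: instead of sampling six orbital elements and running n‑body propagation, it samples a 2D Gaussian for the close-approach point in a plane through Earth's center and counts how many samples land inside Earth's disk. All names and numbers are assumptions for illustration, and gravitational focusing is ignored.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # mean radius; gravitational focusing is ignored


def cholesky_2x2(cov):
    """Lower-triangular factor of a 2x2 covariance matrix (cov = L * L^T)."""
    (a, b), (_, d) = cov
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    return l11, l21, l22


def impact_probability(mean_km, cov_km2, n_samples=100_000, seed=0):
    """Fraction of sampled close-approach points that fall inside Earth's disk."""
    rng = random.Random(seed)
    l11, l21, l22 = cholesky_2x2(cov_km2)
    hits = 0
    for _ in range(n_samples):
        # Draw a correlated 2D Gaussian sample from two independent normals.
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mean_km[0] + l11 * z1
        y = mean_km[1] + l21 * z1 + l22 * z2
        if math.hypot(x, y) < EARTH_RADIUS_KM:
            hits += 1
    return hits / n_samples


nominal_miss = (100_000.0, 0.0)  # assumed nominal miss distance in km
short_arc = ((200_000.0 ** 2, 0.0), (0.0, 5_000.0 ** 2))  # wide error ellipse
long_arc = ((20_000.0 ** 2, 0.0), (0.0, 500.0 ** 2))      # ten times tighter

print("short arc:", impact_probability(nominal_miss, short_arc))
print("long arc: ", impact_probability(nominal_miss, long_arc))
```

Each sample is deterministic once drawn; the spread comes only from the covariance, which mirrors the thread's point that the randomness is in the measurements. With the nominal miss held fixed, tightening the ellipse pulls the hit fraction toward zero.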

Detection systems and upcoming surveys

A contributor working on the NEO Surveyor telescope explains that:
- The object is small and dim; prior apparitions were hard to recover in archival data.
- NEO Surveyor (IR) and the Vera Rubin Observatory (LSST) are expected to re‑detect and greatly refine its orbit years before 2032.
- IR observation reduces size uncertainty by measuring thermal emission rather than brightness alone.
Several note that new surveys will massively increase the catalog of near‑Earth objects, raising communication challenges: more “scary‑sounding” detections without increased underlying risk.

Mitigation and deflection ideas

Proposed methods include nuclear disruption, gravity tractors, deliberate gravitational “tugs,” Yarkovsky manipulation, and even mining; others push back that:
- Available space power is tiny relative to the energy needed to significantly alter the orbit of an object tens of meters across on short notice.
- Fragmenting an object adds complexity and could increase or decrease impact risk depending on details that are hard to control.
- Testing deflection should be done on very safe targets, not a close‑approach object with non‑zero impact probability.

Societal, political, and media angles

Some see this as an argument to build a global asteroid‑defense system; others worry about dual‑use, nuclear‑armed space systems and strategic instability, citing past warnings about weaponizing asteroid deflection.
Debate over whether widespread coverage of such objects will inform the public or create a crisis of panic and misinformation, especially via social media.
Comparisons are drawn to climate change and other global risks: disagreement over which is more “existential,” and skepticism about humanity’s ability to mount coordinated responses.
Evacuation scenarios are discussed: moving a city is considered feasible (analogous to hurricane evacuations), but relocating half the planet to the “safe” hemisphere is viewed as logistically and politically impossible.

Humor, culture, and speculation

Many jokes reference “Don’t Look Up,” “Armageddon,” “giant meteor for president,” and Mayan‑prophecy‑style doomsday cults.
Several commenters explicitly say they are “rooting for the asteroid,” while others push back, emphasizing the localized but very real human toll such an impact would have.

View on HN ↗ Original Article ↗

2025-01-29

Nuclear fusion: it's time for a reality check

Political optimism vs. “30 years away” reality

Commenters note fusion has been “decades away” for half a century and see current UK rhetoric (“within grasping distance”) as dangerously over-optimistic.
Main concern: governments may shape energy (and even AI/automation) policy around speculative technologies rather than proven ones.

Current fusion efforts and technical challenges

Some point out that companies like Commonwealth Fusion and Tokamak Energy are building serious tech demonstrators, not just science toys; they see value in “building to learn.”
Others stress that multiple independent breakthroughs are still needed (confinement, materials, breeding, maintenance, cost), so a sudden “DeepSeek moment” is unlikely.
Debate on magnetic-confinement tokamaks:
- Pro side: new high‑temperature superconductors allow much higher fields; power scales strongly with field, enabling smaller, cheaper reactors.
- Skeptical side: structural limits (J×B forces, material strength) cap usable fields; volumetric power density is still far worse than fission, implying huge, costly plants.
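Both sides of the tokamak debate can be put in numbers with two textbook scalings. The sketch below assumes the commonly cited relation that fusion power density goes as β²B⁴ at fixed temperature, and that the magnetic pressure the structure must resist is B²/(2μ₀). The function names are hypothetical and the 20 T field is an example value, not a figure from the thread.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability, in T*m/A


def relative_power_density(field_ratio, beta_ratio=1.0):
    """Change in fusion power density when B and beta are scaled (beta^2 * B^4)."""
    return beta_ratio ** 2 * field_ratio ** 4


def magnetic_pressure_pa(field_tesla):
    """Magnetic pressure B^2 / (2 * mu_0), a proxy for structural load."""
    return field_tesla ** 2 / (2.0 * MU_0)


# Pro side: doubling the field at fixed beta.
print("power density gain:", relative_power_density(2.0))

# Skeptical side: the load on the structure at an example 20 T field.
print(f"magnetic pressure at 20 T: {magnetic_pressure_pa(20.0) / 1e6:.0f} MPa")
```

The fourth-power gain is what makes compact high-field designs attractive, while the quadratic growth in magnetic pressure is the structural ceiling the skeptics point to.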
ITER is widely viewed as a cautionary project: outdated magnet tech, major delays, and a design that would be noncompetitive even if it works.

Maintenance, remote handling, and reliability

“Remote operation” is interpreted as remote maintenance inside highly radioactive vessels, not offsite control.
Robotic access into tight, fragile, vacuum‑sealed geometries is described as a major unsolved engineering problem; failure to extract a stuck robot could be catastrophic.
One analysis of a DEMO‑like plant estimated ~4% availability, highlighting RAMI (reliability/availability/maintainability/inspectability) as a central bottleneck.

Economics vs. renewables and fission

Many argue the biggest omitted challenge is cost: fusion must beat rapidly falling solar/wind + storage, not just “work.”
Fuel is considered a minor cost driver; capex and complexity dominate. Tritium supply and breeding add further expense.
Extensive side discussion on fission history: subsidies, breeder failures, SMRs repeatedly cancelled, and chronic cost overruns vs. explosive growth and cost drops in renewables and batteries.
Some think fusion R&D is worthwhile long‑term; others argue marginal dollars would do more for climate if spent on modern fission or scaling renewables now.

Neutron flux, waste, and alternatives

DT fusion’s intense neutron flux is seen as creating large volumes of activated material and tritium‑handling issues—“all the hassles of fission with more steps.”
Aneutronic fusion is noted as conceptually cleaner but vastly harder.
A minority suggests fusion may make more sense for niche roles (e.g., advanced space propulsion) than for terrestrial grid power.

View on HN ↗ Original Article ↗

Hacker News, Distilled

Related topics
