Stories - Page 749 | HN Distilled

2024-07-24

Ask HN: Am I crazy or is Android development awful?

Overall sentiment on Android development

Many describe native Android dev as frustrating, bloated, and fragile, especially for beginners and for anything “off the happy path.”
Several long-time Android devs say it has improved since the Eclipse/Ant days, but still “sucks,” just slightly less.
Some argue it’s mainly a steep learning curve comparable to modern web or Python ecosystems; others insist Android is qualitatively worse.

Tooling: Android Studio, Gradle, build systems

Android Studio is widely called slow and resource‑hungry, even on modern machines; just creating projects can feel sluggish.
Gradle is labeled the “single worst thing” by several: complex, hard to debug, yet unavoidable. Kotlin DSL is seen as only a minor improvement and still “just Gradle.”
A minority prefer Bazel or custom setups; others reminisce about Ant/Maven while acknowledging those were also painful.

APIs, architecture, and technical debt

Android is framed as burdened with deep, “baked‑in” technical debt from early rushed design and backwards‑compat decisions.
Complaints about inconsistent APIs, hostile or missing events (e.g., soft keyboard appearance), and esoteric, version‑dependent edge cases.
Deprecation churn and OS fragmentation force conditional code paths and tricky support for older devices.

Languages and UI frameworks

Kotlin is praised as pleasant and powerful, especially with Jetpack Compose, but some find it feature‑heavy and hard to learn.
XML‑based UI and data binding are considered legacy and slow; Compose is seen as a big improvement and roughly on par with React‑style declarative UIs.
Some prefer Flutter or React Native/Expo for faster iteration and cross‑platform support, though there’s skepticism about Google’s long‑term commitment to Flutter.

Native code, hardware access, and niche use cases

Accessing USB devices (e.g., UVC webcams) via libusb/libuvc inside Android apps is described as possible but painful and poorly supported.
Using NDK and native libraries can work but has sharp edges: build complexity, ABI quirks, page-size changes (Android 15), dependency hell.
For specialized hardware projects, several suggest small Linux SBCs instead of phones.

Comparisons with other platforms

iOS dev is portrayed as more coherent and consistent at the API level but with its own tooling and SwiftUI rough edges.
Web development is seen as easier for prototyping, especially via the browser’s camera APIs.
Some use WebView wrappers, Ionic/Capacitor, or Termux to escape parts of the native stack or to get a more “Linux‑like” environment.

2024-07-24

The secret of Minecraft (2014)

Impact of Microsoft’s Ownership

Strong split: some argue Microsoft “ruined” Minecraft (account issues, monetization, complexity, combat changes); others say stewardship has been “fantastic” and grew the franchise without breaking the core loop.
Microsoft praised for: cross‑platform Bedrock play, spin‑off games, education focus, keeping Java edition alive, and bundling Java+Bedrock.
Criticisms: lost or locked accounts during migration, more aggressive monetization (coins, skins, merch), chat moderation, and perception of prioritizing IP/merch over core gameplay or bug fixes.

“Secret Knowledge”, Tutorials, and Wikis

Original appeal for many: no in‑game instructions, emergent “secret” crafting knowledge, and social discovery via friends and wikis.
Others found the opaque design annoying and welcomed the recipe book and prompts; most players used wikis anyway.
Recipe book and hints were added years after release; some feel this killed part of the mystery, others say secret knowledge now lies in quirks, mechanics, and community techniques, not recipes.

Simplicity vs Content Bloat

Early versions seen as “just enough” content: a true sandbox, few blocks, minimal objectives, strong survival tension, and room for imagination.
Later updates introduce RPG‑like progression, many new blocks, villagers, elytra, structured endgame, which some see as diluting survival and “mining” focus.
Counter‑view: more content, goals, and achievements give non‑builders reasons to keep playing and periodically restart worlds.

Java vs Bedrock, Mods, and Alternatives

Java edition valued for mods, openness, and long‑running servers; mod ecosystem (e.g., tech/automation, building mods) often seen as more innovative than official updates.
Bedrock praised for performance, mobile/console reach, budding scripting APIs, but criticized for weaker modding and different technical limits.
Some suggest alternatives like Minetest/Mineclonia or Vintage Story for a more “beta‑like” or FOSS experience.

Generational, Creative, and Educational Aspects

Many describe multi‑generational play: adults sharing worlds with their children, each adopting different playstyles (building, survival, minigames, mods).
Minecraft seen less as a game and more as a creative platform or “digital Lego,” enabling custom minigames, redstone contraptions, and even early programming/modding paths (with some shift of that role to Roblox).

2024-07-24

Will Figma become an awkward middle ground?

Role of Figma in Workflow

Many see Figma as a “middle ground” between rough sketches and production code; some call it a “cursed midpoint” that causes double work.
Others argue that middle ground is exactly its value: fast, visual, shareable artifacts without needing to touch code.
Several people say Figma is unnecessary for solo devs or very small teams who can go straight from sketch/paper to HTML/CSS.

Design in Code vs Design Tools

A number of “codey designers” and engineers prefer to design directly in the browser (React + Tailwind, plain HTML/CSS, etc.), finding it faster and more realistic than pixel-perfect Figma work.
Others strongly caution against designing and coding at the same time for non-trivial products, citing wasted effort, poorer UX, and conflation of user needs with implementation constraints.
Some workflows: paper or Excalidraw for ideation → code; others: low-fi Figma/FigJam/Miro → code.

Collaboration, Scale, and Design Systems

Figma is praised for multiplayer collaboration, alignment across large teams, and shared design systems; this becomes critical at scale and for enterprise products.
It’s seen as especially useful for stakeholder alignment, hi-fi mockups, and consistent component usage when many developers are involved.
Several note that Figma is better for UX flows and interaction mapping than as a pixel-perfect graphics tool.

Code Export, AI, and Future of Tools

Opinions on Figma’s CSS/HTML/JS export range from “good” to “useless,” especially for complex responsive apps and emails.
Some expect AI and codegen to blur or replace the Figma layer (wireframe → code, design system–aware generation); others are skeptical due to missing training data, integration complexity, and platform variability.
Alternative or adjacent directions mentioned: tools that design closer to code (Webflow, Framer, Judo, tldraw, new “design–frontend hybrid” tools), or even Figma evolving into an IDE / “export as final app.”

Limitations, Pain Points, and Overdesign

Complaints: Figma feels too technical, laggy on some setups, poor at complex interaction prototyping, and creates a “fantasy world” for things like HTML email where constraints are harsh.
Some blame modern design culture: over-investment in esoteric Figma features, pixel-perfect fantasies, and design systems that never fully translate to working software.
Others defend UX as a distinct, necessary discipline; the problem is not the tool but designers who lack implementation awareness.

Designers, Developers, and Hybrid Roles

Strong debate over “designers who code”: some see combining roles as ideal (especially early-stage), others note it’s rare, tiring, and often slower than having separate specialists.
There’s interest in “design engineer” roles that sit between UX and frontend, but their exact mandate (how vs implementation) remains somewhat unclear.

2024-07-24

X redesigns water pistol emoji back to a firearm

Technical implementation

Several comments explain X/Twitter uses custom SVG images (Twemoji) embedded as <img> tags, not OS-native emoji fonts.
Others discuss alternative approaches: full custom emoji fonts, single-glyph override fonts, and browser/OS font fallback behavior.
Concerns raised about font-based emoji: large file sizes, lazy loading complexity, color font complexity, and rendering inconsistencies.
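
The image-replacement approach described above can be sketched roughly as follows. This is a minimal illustration, not X's or Twemoji's actual code: the base URL is a placeholder, the function names are hypothetical, and the "hex code points joined by hyphens" file naming is an assumption that ignores special cases such as variation selectors.

```python
def emoji_to_codepoint_name(emoji: str) -> str:
    """Join the hex code points of an emoji with '-' (assumed naming scheme)."""
    return "-".join(f"{ord(ch):x}" for ch in emoji)


def emoji_img_tag(emoji: str, base_url: str) -> str:
    """Build an <img> tag that shows an SVG instead of the raw character.

    The alt text keeps the original character so copy/paste and screen
    readers still see the emoji itself.
    """
    name = emoji_to_codepoint_name(emoji)
    return f'<img class="emoji" alt="{emoji}" src="{base_url}/{name}.svg">'


PISTOL = "\U0001F52B"
# base_url is a placeholder, not a real asset host.
print(emoji_img_tag(PISTOL, "https://example.invalid/svg"))
```

Because the artwork is selected by the site rather than by the operating system's font, swapping one SVG file changes how every past and future message renders.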

Platform vs custom emoji

Some want sites to stop overriding platform emojis to maintain OS-native look and accessibility.
Others argue custom emoji are necessary to ensure consistent appearance and line-wrapping across platforms, and to avoid cross-platform misinterpretation (e.g., water pistol vs gun).

Censorship, politics, and “woke”

One camp sees the original shift from pistol to water gun as ideologically driven “censorship” or “Newspeak,” pushing anti-gun norms.
Another camp argues it’s a benign design choice, not censorship, and in some contexts a responsible attempt to reduce normalization of real guns.
Some see changing it back as similarly political, driven by culture-war signaling rather than user need.
Others frame the revert as restoring the Unicode-intended meaning (“PISTOL”) and resisting ideological changes by big tech.

Semantic consistency & Unicode

Strong concern that changing glyphs retroactively alters the perceived meaning of old messages, effectively “rewriting history.”
Multiple people argue emojis should be as immutable as possible; others note language and emoji meanings evolve regardless.
Debate over whether Unicode should have included emoji at all, given that their meaning is tightly coupled to specific artwork, unlike letters.
Some point out that Unicode still names the character “PISTOL,” even though the locale data's short name for it is “water pistol.”
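
The formal character name mentioned above can be checked with Python's standard `unicodedata` module. This is a small sketch; it only reads the name from the Unicode database bundled with the interpreter and says nothing about how any platform draws the glyph.

```python
import unicodedata

PISTOL = "\U0001F52B"

# Look up the formal Unicode character name for the code point.
name = unicodedata.name(PISTOL)
print(f"U+{ord(PISTOL):04X} {name}")
```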

Safety, threats, and kids

Some argue a gun emoji enables or amplifies death threats; others counter that words or water-gun emojis serve the same purpose.
A few worry about children unknowingly sending something that reads as a real-gun threat on other platforms.

Product priorities and corporate behavior

Critics see the change as trivial, ideological, or attention-seeking while more serious X issues remain unresolved.
Defenders say it was a tiny effort with outsized marketing impact, and that companies can work on “small” and “big” things simultaneously.

Meta / culture-war fatigue

Several comments lament that an emoji design has become another culture-war battlefield and express exhaustion with such debates dominating technical forums.

2024-07-24

The rich history of ham radio culture

Morse Code Requirement & Accessibility

Some miss the Morse test as a historical “rite of passage” and backup skill.
Others argue it was an unnecessary barrier: hard for people with musical, motor, or hearing limitations.
Consensus from multiple comments: dropping the requirement increased participation without killing interest in Morse; CW is reportedly thriving in niches (QRP, SOTA, contesting, nostalgia rigs).

Ham as Technical / Experimental Hobby

Strong emphasis that ham radio at its best is about experimentation: building rigs and antennas, low-power (QRP) challenges, satellite and moon-bounce, digital modes, and software-defined radio (SDR).
Historically, magazines and kit culture taught practical RF and electronics; many see ham radio as intertwined with professional electrical engineering.
Some note today’s most popular mode is a digital weak-signal one (FT8), not voice.

Culture, Demographics, and Social Dynamics

Recognized stereotype: older men, health chatter, nostalgia gear, sometimes conservative or off-color humor; experiences vary widely by region and band.
Many report deep, long-lasting friendships forged through the hobby.
Ongoing irritation among experienced operators at terms like “HAM” (all caps) and “broadcasting,” seen as markers of outsiders.

Regulation, Ethics, and Censorship

Clear constraints: no encryption (with narrow exceptions in some countries), no broadcasting to the general public, no commercial use, content restrictions.
Several stress that ham radio is not a tool for mass uncensored communication; all traffic is inherently public and heavily self-policed to protect spectrum access.
Licensing costs and renewal fees differ by country; some regions have no ongoing government fees.

Emergency Communications & Public Service

Some are drawn by emergency communications (ARES, SKYWARN), particularly in disaster-prone areas.
Debate over relevance vs. modern portable cell infrastructure; defenders argue ham can come up faster and cover longer distances when infrastructure fails.

On-Ramps, Activity Levels, and Modern Use

New Technicians report quiet repeaters and difficulty finding engaging activity; suggestions include 10m, POTA/SOTA, satellites, DMR, and storm spotting.
Several note VHF/UHF activity has declined since the 1980s, with the internet displacing the novelty of long-distance free communication.

2024-07-24

CrowdStrike global outage to cost US Fortune 500 companies $5.4B

Outage Cost and Scale

Some think the $5.4B estimate is low, given multi‑day airline disruptions and knock‑on effects (hotels, car rentals, missed connections, hospital impacts).
Others argue that focusing on one airline’s revenue shows how hard it is for cancelled flights alone to justify that figure; overall damage remains uncertain.

Liability, Contracts, and Lawsuits

Many expect CrowdStrike’s contracts to cap liability to refunds or a small multiple of fees paid; without “gross negligence,” large payouts seem unlikely.
There is debate whether liability waivers hold when human life is impacted or when software is used in environments the vendor explicitly disclaims (air traffic control, hospitals, etc.).
Some predict the real outcome will be discounts at renewal, not massive judgments; class actions are expected but seen as mostly enriching lawyers.

Product Value vs. Compliance Checkbox

Several comments say tools like CrowdStrike are bought mainly to pass audits, not because buyers truly understand or value their capabilities.
Others, including people with operational experience, argue it’s a technically strong EDR product and widely respected in practice, despite compliance being the purchasing driver.

Root Cause, Testing, and Deployment Practices

Strong criticism that a kernel‑mode, boot‑critical component could be updated globally without staggered rollout, robust validation, or safe rollback.
Explanations include under‑staffing, management pressure, or policy‑only (not enforced-by-code) processes.
Some technical discussion suggests a subtle bug (uninitialized pointer / probabilistic crash) that automated tests might miss, but many still see this as systemic failure.
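
The "staggered rollout" that commenters say was missing can be sketched as a deterministic percentage gate. This is an illustrative sketch only and does not describe CrowdStrike's systems: the host IDs, function names, and stage percentages are all hypothetical.

```python
import hashlib


def rollout_bucket(host_id: str) -> int:
    """Map a host ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def should_receive_update(host_id: str, rollout_percent: int) -> bool:
    """A host gets the update once the rollout percentage passes its bucket."""
    return rollout_bucket(host_id) < rollout_percent


# Hypothetical fleet and stages: widen the rollout only if earlier stages stay healthy.
hosts = [f"host-{i}" for i in range(1000)]
for percent in (1, 10, 50, 100):
    count = sum(should_receive_update(h, percent) for h in hosts)
    print(f"{percent:3d}% stage -> {count} of {len(hosts)} hosts")
```

Because the bucket is derived from the host ID, each stage is a superset of the previous one, so a bad update caught at the 1% stage never reaches the remaining hosts.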

Enterprise Inertia and Business Outlook

Most expect limited customer churn due to switching costs, regulatory constraints, and organizational inertia; comparisons are made to other vendors that survived major incidents.
Views diverge on long‑term impact: from “eventually a fraction of current size” to “stock dip, then back to business as usual.”

Apology Gift Cards and PR Backlash

The $10 Uber Eats voucher (later reported canceled in some cases) is widely mocked as insulting and darkly comic, especially relative to the losses incurred.
Questions arise over who at a large client would even receive such a card and how it could be anything but a token gesture.

Platform, Architecture, and Resilience Debates

Heavy criticism that airlines and other critical operators allowed a single vendor’s update to become a global single point of failure.
Proposed mitigations: diversified platforms (e.g., mix of OSes and tools), stricter update gating, non‑auto‑updating critical systems, or “un‑brickable” architectures with simple, independent subsystems.
Arguments over Windows’ brittleness vs. Linux, and whether Microsoft should restrict third‑party kernel drivers or change Defender’s architecture.

Responsibility, Regulation, and Ethics

Some assign primary blame to CrowdStrike given the power of a kernel driver; others say organizations (and regulators enforcing checkboxes) are responsible for over‑reliance on such tools.
Calls appear for stronger regulation, higher engineering standards, and making vendors bear more of the true cost—though some warn this could discourage innovation.

2024-07-24

Anyone can access deleted and private repository data on GitHub

Scope of the Behavior

Not new: multiple commenters say they reported or noticed this behavior years ago; GitHub classified it as “known, low risk” and documented it later.
Affects GitHub fork networks: objects (commits/blobs) are shared across forks; refs are per-fork.
Key edge cases discussed:
- Deleted forks of public repos: commits remain reachable via the upstream by hash.
- Private forks of private repos that later become public: pre‑existing fork commits can be accessed via the now‑public upstream, even if forks stay private or are deleted.
- Purely private repos and forks that never become public appear unaffected.
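
The "shared objects, per-fork refs" behavior can be illustrated with a toy model. This is a sketch under stated assumptions, not GitHub's actual storage design: the `ForkNetwork` class and its methods are hypothetical, and hashing the content with SHA-1 merely stands in for a real Git commit ID.

```python
import hashlib


class ForkNetwork:
    """Toy model: one object store shared by all forks, refs kept per fork."""

    def __init__(self):
        self.objects = {}  # commit hash -> content, shared across the network
        self.refs = {}     # fork name -> {branch name: commit hash}

    def add_fork(self, fork: str) -> None:
        self.refs[fork] = {}

    def commit(self, fork: str, branch: str, content: str) -> str:
        sha = hashlib.sha1(content.encode("utf-8")).hexdigest()
        self.objects[sha] = content
        self.refs[fork][branch] = sha
        return sha

    def delete_fork(self, fork: str) -> None:
        # Only the fork's refs go away; the shared objects are not collected.
        del self.refs[fork]

    def fetch_by_hash(self, sha: str):
        return self.objects.get(sha)


network = ForkNetwork()
network.add_fork("upstream")
network.add_fork("alice/fork")
sha = network.commit("alice/fork", "main", "API_KEY=hypothetical-secret")
network.delete_fork("alice/fork")
print(network.fetch_by_hash(sha))  # still retrievable by hash
```

In this model, deleting a fork removes the names that pointed at a commit but not the commit itself, which is the surprise the thread describes.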

Mechanics: Why Data Sticks Around

Git’s content‑addressable storage and GitHub’s shared storage for forks mean objects are retained until garbage‑collected.
Several users assert that GitHub effectively never garbage‑collects unreachable commits within fork networks.
Short SHA support (down to 4 hex chars) makes brute forcing commit IDs feasible.
Public events archives (including third‑party GH event mirrors) leak commit hashes, enabling targeted retrieval.
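
The brute-force claim is easy to sanity-check with arithmetic. The sketch below only counts possible hex prefixes; the request rate is a made-up assumption for illustration, not a measured or documented limit.

```python
def prefix_search_space(hex_chars: int) -> int:
    """Number of distinct hexadecimal prefixes of the given length."""
    return 16 ** hex_chars


ASSUMED_REQUESTS_PER_SECOND = 10  # hypothetical rate, for illustration only

for length in (4, 7, 40):
    space = prefix_search_space(length)
    hours = space / ASSUMED_REQUESTS_PER_SECOND / 3600
    print(f"{length:2d} hex chars: {space:.3g} candidates, ~{hours:.3g} hours to enumerate")
```

At four characters there are only 65,536 candidates, which is why short-prefix lookup turns "unguessable" hashes into something enumerable.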

How Serious Is It?

One camp: only the “private fork becomes de facto public when upstream is made public” case is a real vulnerability; everything that was ever public should be assumed permanently public anyway.
Other camp: this is a major Principle of Least Astonishment violation; “private” and “delete” strongly imply stronger guarantees than users actually get.
Several note real‑world impact: leaked API keys, proprietary algorithms, console SDKs, and internal forks unknowingly exposed.

GitHub’s Handling and Bug Bounties

Multiple reports through HackerOne were closed as “working as intended,” no bounty.
Debate over bug bounty ethics: companies don’t pay for known or architectural issues; researchers argue this discourages reporting systemic problems.
Some see GitHub’s documentation as insufficient UX; burying a surprising security property in help docs is viewed as user‑hostile.

Mitigations and Alternatives

Practical advice from the thread:
- Don’t use GitHub “private forks” for sensitive work; make a separate private repo (clone/template) instead.
- Never open‑source an existing private repo; create a fresh public repo and copy selected code.
- Always rotate secrets once committed, regardless of later deletion.
- For stricter control, consider other hosts or self‑hosted Git (GitLab, Gitea, etc.), though similar patterns may exist elsewhere.
Some note GitHub offers a manual path for full data removal via support for legal/privacy reasons, but this is not automatic.

2024-07-24

Intel confirms oxidation and excessive voltage in 13th and 14th Gen CPUs [video]

Scope of the Intel 13th/14th Gen Issues

Thread centers on Intel’s confirmation of oxidation and excessive voltage issues on recent desktop CPUs, especially unlocked 13th/14th gen parts.
Some say oxidation only affected early 13th gen runs and is already fixed; others highlight claims that it still contributes to current failures.
There is uncertainty over which exact SKUs and production windows are affected; at least one commenter explicitly asks for a clear list and notes none has been published.

User Impact and Buying Decisions

Multiple users report long‑running instability (BSODs, crashes under multi‑core load, RAM issues) on 13th/14th gen builds, sometimes after extensive and costly troubleshooting and multiple RMAs.
These experiences significantly damage goodwill toward Intel; several state they will avoid Intel for future systems.
Some potential buyers of Intel laptops (e.g., Legion 7i) are reconsidering, delaying purchases, or switching to AMD, though availability and pricing of Ryzen options vary by region.
Others plan to wait for the August microcode patch and post‑patch reviews before deciding.

Responsibility, PR, and Possible Recall

Several posts criticize Intel’s communication: delayed disclosure, shifting explanations (oxidation vs microcode/voltage), blaming partners, and inconsistent statements.
Some view the situation as evidence of corner‑cutting on voltage/safety margins to stay competitive with AMD.
There is debate over whether manufacturing defects or design/firmware choices are primarily at fault; consensus is that public information is still incomplete.
Speculation appears about potential recalls, lawsuits, and stock price impact, but scale and likelihood are described as unclear.

Broader Context: Reliability and Competition

Discussion compares Intel’s recent reliability issues to historically robust CPUs, and to competitors (AMD/TSMC, Apple, ARM server chips).
Several note Intel’s long‑running fab struggles and see this as a “second Pentium 4 era” with high power use to chase performance.
Others stress that complex semiconductor manufacturing is inherently fragile and miraculous when it works, but also note that competing vendors are not seeing similar headline failures right now.

2024-07-24

Phish-friendly domain registry ".top" put on notice

Perceived Abuse of .top and Other TLDs

Many commenters say .top is heavily used in phishing and smishing; several report recent USPS/package and government procurement scams using .top.
Some note similar patterns with .xyz, .io, .site, .cc, .zip, .tk, etc., and say they block these entire TLDs at the DNS, SMTP, or firewall level.
Others argue this harms legitimate small or hobby users who pick cheap TLDs (e.g., homelabs, teaching domains, novelty names).
A few suggest that if a TLD’s phishing rate is much higher (e.g., .top 4.2% vs .com 0.2%), blocking is justified to minimize collateral damage.
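
Whole-TLD blocking of the kind commenters describe amounts to a suffix check on the hostname. The sketch below is a minimal illustration: the blocklist contents and function names are hypothetical, and real DNS or mail filters have their own configuration formats.

```python
BLOCKED_TLDS = {"top", "zip"}  # hypothetical policy, not a recommendation


def tld_of(hostname: str) -> str:
    """Return the last DNS label, lowercased, ignoring a trailing root dot."""
    return hostname.rstrip(".").rsplit(".", 1)[-1].lower()


def is_blocked(hostname: str, blocked=BLOCKED_TLDS) -> bool:
    """True if the hostname's top-level domain is on the blocklist."""
    return tld_of(hostname) in blocked


for host in ("login.example.top", "example.com", "EXAMPLE.ZIP."):
    print(host, "->", "blocked" if is_blocked(host) else "allowed")
```

The check is cheap and blunt: every legitimate site under a blocked TLD is rejected along with the phishing domains, which is the collateral damage the thread argues about.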

Debate on Default Blocking and “Allow Lists”

One camp favors browsers/mail clients shipping with a “default allow” list of safer TLDs, with users able to opt in to others.
Critics argue browsers must stay neutral and such mechanisms would invite pay-to-play abuse (large providers charging registries for inclusion).
Some individuals already approximate this via custom DNS services that block most new gTLDs.

.zip TLD and Auto-Linking Risks

Multiple anecdotes about .zip domains being globally blocked by organizations due to phishing concerns.
Key risk: auto-linkification in email, chat, and document tools turning plain filenames like package.zip into clickable links to attacker-controlled .zip domains.
Commenters detail how users can be tricked into thinking they’re downloading an attachment rather than visiting a website, blurring trust cues.
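
The auto-linking risk can be reproduced with a deliberately naive linkifier. This is a sketch only: the regular expression and the short TLD list are assumptions made for illustration and do not reflect any particular mail or chat client.

```python
import re

# Naive "bare domain" pattern: labels followed by one of a few known TLDs.
BARE_DOMAIN = re.compile(
    r"\b[a-z0-9-]+(?:\.[a-z0-9-]+)*\.(?:com|org|net|zip)\b",
    re.IGNORECASE,
)


def linkify(text: str) -> str:
    """Wrap anything that looks like a bare domain in an anchor tag."""
    return BARE_DOMAIN.sub(
        lambda m: f'<a href="https://{m.group(0)}">{m.group(0)}</a>',
        text,
    )


print(linkify("Please open package.zip from the shared folder"))
```

Once `zip` is a valid TLD, the linkifier cannot tell a filename from a hostname, so an innocent sentence gains a link to whoever registered that domain.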

Responsibility of Registries and ICANN

One side argues registries historically must act on abuse complaints; otherwise they risk losing accreditation.
Others see content-policing by registries/ICANN as a slippery slope toward censorship and believe ICANN should stay content-agnostic.
Some note that ICANN’s enforcement in practice is weak and slow, especially with overseas registrars.

Cost, Incentives, and Enforcement Limits

Cheap TLDs lower phishers’ costs and enable rapid domain churn; raising prices might only dent margins.
Suggestions include better anti-abuse teams, automated similarity/content checks, and stronger cross-border enforcement—but many doubt feasibility due to jurisdiction and geopolitical constraints.

2024-07-24

CrowdStrike offers a $10 apology gift card to say sorry for outage

Overall reaction to the $10 Uber Eats apology

Most see the $10 card as tone‑deaf and insulting relative to the outage’s scale and damages (airlines, hospitals, 911, etc.).
Many argue “no compensation” would be better than an obviously trivial one, as this signals “we don’t think this matters.”
People point out that $10 on Uber Eats often doesn’t even cover fees/tax/tip, so its real value is perceived as even lower.

Questions about authenticity and execution

Some suspect the whole thing could be a prank, a phishing attempt, or TechCrunch being trolled due to the weak sourcing (few tweets, some deleted).
Others believe it’s real but note that the article suggests the vouchers went to “partners” (MSPs, channel, etc.), not end customers.
Reports that some codes were canceled or failed to redeem trigger jokes that CrowdStrike “updated the card with a null value” or crashed checkout machines.
A few say canceling a widely shared multi‑use code would actually be reasonable security practice, but then question why such a code existed in the first place.

Perception of CrowdStrike’s judgment and PR

Commenters see the gift card move as evidence of poor crisis management, lack of internal review, and a repeat of “insufficient testing” but in marketing form.
There’s debate over long‑term impact: some predict massive legal liability and brand damage; others cite examples (Microsoft, Okta, SolarWinds, BP, Equifax, Boeing, etc.) to argue such crises rarely destroy entrenched vendors.
Several emphasize that leadership seems insulated from shame; frontline engineers likely worked extreme hours without corresponding compensation.

Harm, responsibility, and ethics

Some share personal or hypothetical stories of missed flights and degraded medical/911 services, arguing the outage likely had serious real‑world consequences.
Others counter that similar tragedies could arise from any disruption (weather, late bus) but agree this was a preventable corporate failure, not an “act of God.”
A few suggest the cards might be close to “bribes” for certain organizations in some jurisdictions.

Related tangents and analogies

Extended discussion on token “consideration” in contracts (e.g., $1–$20 payments, IP assignment, patent bonuses), used as an analogy for how trivial sums are used to formalize or sanitize one‑sided arrangements.
Numerous anecdotes about laughably small corporate rewards (pizza parties, tiny bonuses) for huge employee contributions, used as parallels to the $10 gesture.
Some broader criticism of EDR as a business category and of OS ecosystems that require such tools.

2024-07-24

AI models collapse when trained on recursively generated data

Overall intuition about “model collapse”

Many see the result as intuitive: LLMs are lossy compressors of their training data; recursively training on their own outputs further erodes information, especially in the tails of the distribution.
Analogies used: photocopies of photocopies, VHS/JPEG re‑saving, echo chambers/navel‑gazing, inbreeding/incest, “breathing your own exhaust.”
From a control‑theory / Markov‑chain perspective, unconstrained feedback loops are expected to drift and lose diversity or stability.

Synthetic data: good vs bad use

Thread distinguishes “indiscriminate” reuse of model output from deliberate synthetic data generation.
Synthetic data is already used by major labs (self‑play, RLHF, prover–verifier setups, curated problem sets) and is reported to work when:
- There is a clear fitness metric or verifier (e.g., math correctness, human raters).
- Generated data is filtered, edited, or selected by humans or other models.
Without such external feedback, synthetic data can only rearrange existing information and tends to smooth away rare but important events.
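
The "generate, then verify" pattern can be sketched with a toy task. Everything here is a stand-in chosen for illustration: `noisy_generator` plays the role of a model that is sometimes wrong, the 30% error rate is arbitrary, and the verifier works only because addition has a cheap exact check.

```python
import random


def noisy_generator(a: int, b: int, rng: random.Random) -> int:
    """Stand-in for a model: proposes a + b but is wrong about 30% of the time."""
    answer = a + b
    if rng.random() < 0.3:
        answer += rng.choice([-2, -1, 1, 2])
    return answer


def verifier(a: int, b: int, answer: int) -> bool:
    """External check that does not depend on the generator."""
    return answer == a + b


def build_dataset(n: int, seed: int = 0):
    """Generate n candidates and keep only the ones the verifier accepts."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        a, b = rng.randint(0, 99), rng.randint(0, 99)
        answer = noisy_generator(a, b, rng)
        if verifier(a, b, answer):
            kept.append((a, b, answer))
    return kept


data = build_dataset(200)
print(f"kept {len(data)} of 200 generated examples")
```

The filter is where new information enters: the verifier knows something the generator does not reliably know, so the surviving data is cleaner than the raw output.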

Web scraping and AI contamination

Concern that future web corpora will be heavily mixed with LLM‑generated text, making “indiscriminate” scraping dangerous.
Detecting AI content reliably is seen as unsolved; rough filters and “AI detectors” may help at aggregate level but are imperfect.
Some argue high‑quality, licensed, and educational data are becoming more important than raw web crawl; others worry AI‑assisted writing will still quietly pollute even “professional” sources.

Critiques of the Nature paper and theory

Several commenters argue the experimental setup is unrealistic: repeatedly fine‑tuning on a fixed synthetic dataset from the same model resembles catastrophic forgetting, not how modern labs use synthetic data.
Statistical objections: claims that collapse is mathematically inevitable are challenged with counter‑examples (e.g., normal distributions), though there is debate about finite‑sample effects and variance drift.
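
The finite-sample side of that debate can be explored with a small simulation. This is an illustrative sketch, not the paper's experiment: each generation fits a Gaussian to a handful of samples drawn from the previous generation's fit, and the sample size, generation count, and seed are arbitrary choices.

```python
import random
import statistics


def recursive_gaussian_fit(generations: int, samples_per_gen: int, seed: int = 0):
    """Refit a Gaussian to its own samples repeatedly; return sigma per generation."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    history = [sigma]
    for _ in range(generations):
        data = [rng.gauss(mu, sigma) for _ in range(samples_per_gen)]
        mu = statistics.mean(data)      # refit the mean on synthetic samples
        sigma = statistics.stdev(data)  # refit the spread on synthetic samples
        history.append(sigma)
    return history


history = recursive_gaussian_fit(generations=1000, samples_per_gen=10)
print(f"sigma at generation 0:    {history[0]:.3g}")
print(f"sigma at generation 1000: {history[-1]:.3g}")
```

With an infinite sample the normal distribution would reproduce itself exactly, which is the counter-example; with small samples the estimation noise compounds across generations and the fitted spread tends to shrink, which is the variance-drift side of the argument.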
Some criticize the publication venue’s ML track record, calling the work more of a warning about naive practices than a deep, general theorem.

Mitigations and open questions

Proposed mitigations: human‑in‑the‑loop curation, external ground truth, discriminators/verifiers, better quality filters, and maintaining a base of fresh human data.
Disagreement remains on how serious “model collapse” is in practice: some think frontier labs already control it; others see systemic risks, especially for uncontrolled web‑scale training.

2024-07-24

Large Enough

Perceived model quality & rankings

Many commenters say Claude 3.5 Sonnet is currently the best “everyday” and coding model, often “blowing away” GPT‑4/4o and Copilot in real workflows, especially for complex code reasoning and self‑correction.
Others report opposite experiences, finding GPT‑4o better or at least not worse; several suggest performance depends heavily on task type and prompting style.
Initial tests comparing Mistral Large 2 and Llama 3.1 405B against prior Claude prompts often rank them roughly tied and slightly below Claude 3.5 Sonnet.
Some see GPT‑4 as having degraded over time (more boilerplate, laziness, shallow outputs) while 4o optimizes more for cost/latency than raw capability.

Coding assistants & tooling

Claude 3.5 + tools like Aider or OpenWebUI is repeatedly praised as a highly effective coding partner with strong project‑/codebase‑wide context.
Cursor, Copilot, and other IDE tools get mixed reviews: good for inline suggestions but weaker on large refactors, continuity across edits, or complex reasoning.
Some users report massive productivity gains (e.g., shipping new apps or navigating complex Unreal C++), others find LLM code help too error‑prone to trust.

Benchmarks, evaluation, and “strawberry”

Commenters debate the value of leaderboards (e.g., LMSys, ArtificialAnalysis, Aider’s coding boards) vs “mass anecdata” from real use.
The “how many r’s in strawberry” question becomes a focal example: many top models answer incorrectly unless guided through step‑by‑step reasoning or via tools.
This sparks long discussion of tokenization limits, counting/math weaknesses, hallucination confidence, and the need for better tests of reasoning and long‑context competence.
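
The tokenization point can be made concrete with a short sketch. The character count is exact; the subword split is a hypothetical example invented for illustration and is not the output of any real tokenizer.

```python
word = "strawberry"

# With access to the characters (for example via a code tool), counting is trivial.
print(word.count("r"))

# Hypothetical subword split: a model sees opaque token IDs, not letters.
tokens = ["str", "aw", "berry"]
assert "".join(tokens) == word

# Per-token letter counts are information the model never sees directly.
per_token = [t.count("r") for t in tokens]
print(per_token, sum(per_token))
```

This is why "via tools" changes the outcome: handing the string to code sidesteps the question of what the model can infer about letters inside its tokens.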

Costs, scale, and possible plateau

Some argue frontier models are converging and we may be near the limits of scaling transformers; incremental benchmark gains are costly and may be marginal in practice.
Others think bigger or better‑trained models (and new architectures, internal reasoning, tools integration) still have significant headroom.
There’s concern that proprietary leaders are shifting from capability to cost/latency optimization and that open models plus local deployment will erode their advantage.

Licensing, openness, and deployment

Mistral Large 2’s open weights with a non‑commercial license are welcomed but viewed as less attractive than fully open Llama 3.1 for many use cases.
Anthropic’s restrictive commercial terms (no using Claude outputs to “compete”) worry some; others doubt such clauses are enforceable.
Many users now run multiple models via unified UIs (OpenWebUI, local Ollama, API multiplexing) and select per‑task based on speed, cost, and refusals rather than a single “winner.”


2024-07-24

Google is the only search engine that works on Reddit now, thanks to AI deal

Reddit’s robots.txt change and Google deal

Reddit’s robots.txt now disallows all generic crawlers, while serving Google a different, more permissive version.
Many see this as part of a broader strategy: close the API, block generic crawling, then sell data access (e.g., ~$60M Google AI deal).
Some argue this is driven by financial pressure: Reddit is still loss‑making despite licensing revenue.
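
How a blanket robots.txt disallow looks to a generic crawler can be sketched with Python's standard `urllib.robotparser`. Both robots.txt bodies, the bot name, and the URL below are hypothetical stand-ins, not Reddit's actual files.

```python
from urllib.robotparser import RobotFileParser

# HYPOTHETICAL robots.txt contents, not Reddit's actual files.
GENERIC_ROBOTS = """\
User-agent: *
Disallow: /
"""

PERMISSIVE_ROBOTS = """\
User-agent: *
Disallow: /login
"""

def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

url = "https://example.com/r/some_subreddit/"
print(allowed(GENERIC_ROBOTS, "ExampleBot", url))     # version served to everyone
print(allowed(PERMISSIVE_ROBOTS, "ExampleBot", url))  # version served to a partner
```

A well-behaved crawler that receives the first file stops entirely, which is why serving different files to different clients matters even though robots.txt is only advisory.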

Impact on search and users

Non‑Google engines (Bing, DDG, Mojeek, etc.) either lose fresh Reddit results or must buy access indirectly (e.g., via Google or licensing).
Users who relied on site:reddit.com for high‑signal answers feel pushed back to Google.
Others welcome the change, happy to see less Reddit in search results, claiming many threads are low‑quality or heavily moderated “hiveminds.”

Scraping, robots.txt, and legality

Several comments note U.S. case law that public pages can be scraped regardless of robots.txt, though copyright and ToS still constrain use.
Others point out technical blocks (datacenter IP bans, Cloudflare, anti‑bot features) make scraping costly even if legally permitted.
Some suggest a future of “data laundering”: independent scrapers repackaging Reddit content for AI or search under fair‑use arguments.

Competition and antitrust concerns

One side: blame lies almost entirely with Reddit; any search engine can pay too, so not anti‑competitive.
Other side: in practice only giants can afford many such deals, raising barriers to entry and reinforcing Google’s search monopoly.
Some think truly exclusive indexing deals could trigger antitrust scrutiny; whether this deal is exclusive is unclear.

Ethics of monetizing user-generated content and AI

Strong disagreement over whether Reddit is ethically entitled to sell access to user posts.
Some emphasize users hold copyright but have granted Reddit a broad license; others stress the moral problem of monetizing unpaid labor while restricting broader access.
Many tie this to “enshittification” of the web: platforms closing off, chasing short‑term profit, and reacting to AI scraping by becoming walled gardens.

Alternatives and broader web trends

Lemmy and federated “distributed Reddit” are mentioned, but network effects and moderation/spam burdens are seen as major obstacles.
Some hope this fragmentation pushes people back to independent forums and hobbyist sites; others think LLM‑driven scraping and spam will only worsen.


2024-07-24

Physicists may now have a way to make element 120

Element 120 and Naming

Element 120 currently has the systematic placeholder name unbinilium (“one-two-zero-ium”) and will be renamed if confirmed.
It would sit under radium; some prefer “eka-radium”–style naming that encodes periodic position.
Discussion of IUPAC’s conservative naming process; “fun” fictional names are seen as unlikely.
One comment notes element 121 would enter a new “g-block” region of the periodic table.

Experimental Method and Technical Challenges

The discussed approach: accelerate titanium ions to ~0.1c and collide them with a plutonium target.
This has already produced a few atoms of livermorium as a benchmark.
Main difficulty: compound nuclei are created “hot” and tend to break apart; lowering beam energy helps survival but cuts fusion rates.
Producing and accelerating titanium beams is itself hard: vaporizing or ion-sourcing Ti at high purity and temperature is a major materials-science challenge.

Stability, Island of Stability, and Nuclear Structure

Oganesson (118) is the heaviest confirmed element; only a handful of atoms have been made.
An “island of stability” is predicted around the low 110s, potentially giving half-lives of seconds or longer, but commenters note that confidence in these models has dropped as more data has accumulated.
Even proton/neutron numbers tend to be more stable; several even-Z elements were discovered before neighboring odd-Z ones.
Binding-energy arguments suggest stability broadly peaks near iron; heavier nuclei rely on special “magic numbers” and quickly become more unstable.
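
The binding-energy argument can be illustrated with the textbook semi-empirical (liquid-drop) mass formula. This is a rough sketch, not something taken from the thread: the coefficients are one commonly quoted set and vary between fits, the model ignores the shell effects behind “magic numbers”, and the A = 304 nucleus is a hypothetical example.

```python
import math

# One commonly quoted set of semi-empirical mass formula coefficients (MeV).
# Published fits differ slightly; treat these values as approximate.
A_VOL, A_SURF, A_COUL, A_ASYM, A_PAIR = 15.75, 17.8, 0.711, 23.7, 11.18

def binding_energy_per_nucleon(Z: int, A: int) -> float:
    """Liquid-drop estimate of binding energy per nucleon in MeV."""
    N = A - Z
    if Z % 2 == 0 and N % 2 == 0:
        pairing = A_PAIR / math.sqrt(A)    # even-even: extra binding
    elif Z % 2 == 1 and N % 2 == 1:
        pairing = -A_PAIR / math.sqrt(A)   # odd-odd: reduced binding
    else:
        pairing = 0.0
    total = (A_VOL * A
             - A_SURF * A ** (2 / 3)
             - A_COUL * Z * (Z - 1) / A ** (1 / 3)
             - A_ASYM * (A - 2 * Z) ** 2 / A
             + pairing)
    return total / A

for name, Z, A in [("C-12", 6, 12), ("Fe-56", 26, 56), ("U-238", 92, 238),
                   ("Z=120, A=304 (hypothetical)", 120, 304)]:
    print(f"{name}: {binding_energy_per_nucleon(Z, A):.2f} MeV per nucleon")
```

In this estimate the value peaks in the iron region and falls off for heavier nuclei as the Coulomb term grows; the pairing term is the same even/odd effect mentioned above.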

Scientific Value vs. “What’s the Point?”

Enthusiasts see this as:
- A critical testbed for nuclear-structure theory and the strong force.
- Input to understanding early-universe and neutron-star nucleosynthesis.
- Possible path to longer-lived superheavy isotopes with future applications (e.g., medical).
Others are skeptical given millisecond lifetimes and atom-scale yields, comparing it to “playing with expensive toys” and noting national prestige and competition as drivers.

Limits of the Periodic Table and Extreme Matter

Debate over whether the periodic table is “infinite” centers on definitions: existence requires nuclei that live long enough (~10⁻¹⁴ s) to form an electron cloud.
Arguments highlight: growing proton repulsion vs the short-range strong force; eventual unbinding via proton/neutron emission; relativistic electron effects at very high Z.
Some invoke gravity and neutron stars; others counter that gravitational effects are irrelevant at nuclear scales, though neutron stars can be viewed (loosely) as giant nuclear systems, not atoms, and do not support chemistry.


2024-07-24

You got a null result. Will anyone publish it?

Publication bias & incentives

Many see strong bias toward novel, positive findings, especially at elite journals; null results are rarely accepted, even when well done.
Career incentives (tenure, funding, h-index) push researchers to prioritize “sexy” results over careful nulls or replications.
Some argue this systemic behavior is now close to scientific misconduct; others frame it as a predictable outcome of misaligned incentives rather than fraud.

Replication crisis & statistics

With only positive outcomes published, false positives are inevitable and often unrecognized; regression to the mean then makes replications “fail.”
Multiple commenters stress that one study is just one sample; robust knowledge requires different samples and replication.
Misunderstandings of p-values and “statistical significance” recur; some note that insignificant results don’t prove the null, and huge samples can make trivial effects “significant.”
Alternatives like confidence intervals, Bayesian methods, and higher significance thresholds in some fields are mentioned.
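
Two of the statistical points above can be sketched with a small simulation: publishing only significant results inflates the reported effect, and a huge sample makes a trivial effect “significant”. The effect sizes, sample sizes, and the known-variance z-test are simplifying assumptions chosen for illustration.

```python
import math
import random

def simulate_published_effects(true_effect, n, studies, seed=0):
    """Run many small studies; 'publish' only those with z > 1.96."""
    rng = random.Random(seed)
    se = 1.0 / math.sqrt(n)   # standard error, assuming a known sd of 1
    all_estimates, published = [], []
    for _ in range(studies):
        sample_mean = sum(rng.gauss(true_effect, 1.0) for _ in range(n)) / n
        all_estimates.append(sample_mean)
        if sample_mean / se > 1.96:   # positive and conventionally significant
            published.append(sample_mean)
    return all_estimates, published

def z_score(effect, n):
    """z statistic for a fixed observed effect with sd = 1 and n samples."""
    return effect * math.sqrt(n)

all_estimates, published = simulate_published_effects(0.2, n=20, studies=2000)
print("mean of all studies:      ", sum(all_estimates) / len(all_estimates))
print("mean of published studies:", sum(published) / len(published))
print("z for effect 0.01, n=1e6: ", z_score(0.01, 1_000_000))
```

Because only estimates above the significance threshold survive the filter, the published mean necessarily overstates the true effect, so an honest replication will tend to come in lower.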

Value and risks of null results

Nulls can bound effect sizes, correct false beliefs, and prevent wasted effort, but not all nulls are equally interesting.
Concern: trivial or sloppy nulls could flood literature or be used strategically against rivals. Others respond that peer review and low citation impact limit this.

Who should do replications?

Suggestions: make replication and logging of nulls a standard part of PhD training, or pre-register methods before data collection.
Pushback: this is seen by some students as drudgery and bad for careers; others say it’s essential training but must be properly funded and recognized.

Alternative venues & formats

Mention of journals and workshops dedicated to negative or “unsurprising” results, plus arXiv, preprints, blogs, and “living papers.”
Objection: these outlets often carry less professional weight, so researchers deprioritize them despite personal willingness to share nulls.

Structural and process issues

Peer review often evaluates results, not just methods; proposals for “results-blind” review and conditional acceptance based on pre-registered protocols.
Cost and bureaucratic load of publishing, limited replication resources, and opaque reviewing further discourage null-result publication.


2024-07-24

We've built the Ultimate e-Bike Battery that you can Repair and Refill

Product concept & design

Modular e‑bike battery pack using standard 18650 cells, designed to be user-repairable and refillable.
Cells are held in slots (no welding) so users or shops can swap cells and even the electronics/BMS.
Target use: e‑bikes (including conversions) and fleet/shared mobility; current version is described as v3, with earlier prototypes used in France for several years.

Chemistry, cells & cost claims

Chemistry stated as classical NMC, with one cited model: DMEGC 32E (a “tier 2” supplier).
Company claims to buy cells around $1.20 each and suggests a full refill in the $48 range every ~3 years.
Several commenters doubt that high‑quality cells can be that cheap in small quantities and question the advertised cost-per-year.
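
The quoted figures can be checked with simple arithmetic. This sketch uses only the numbers claimed in the thread; the resulting cell count and yearly cost are implied values, not specifications confirmed by the company.

```python
# Figures quoted in the thread, expressed in cents to avoid float rounding.
cell_price_cents = 120      # ~$1.20 per cell (company's claim)
refill_cost_cents = 4800    # ~$48 per full refill (company's claim)
refill_interval_years = 3   # "every ~3 years"

implied_cells = refill_cost_cents // cell_price_cents
cost_per_year = refill_cost_cents / 100 / refill_interval_years

print(implied_cells)   # cells implied by the two price claims
print(cost_per_year)   # dollars per year implied by the refill interval
```

Taken together the claims imply a pack of about 40 cells at roughly $16 per year, so the yearly figure scales directly with whatever per-cell price a buyer can actually obtain.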

Fireproof casing & safety

Strong marketing emphasis on a “fireproof” aluminum casing that contains thermal runaways and vents fumes.
Link to test documentation is provided; casing allegedly withstands full pack burn-down without external flames.
Critics note that “fireproof case” ≠ no fire risk to surroundings and question whether it can really contain the heat.
Design includes per-cell fusing and physical guidance to reduce risks from reversed cells; there is still concern about novices handling high currents and many cells.

Smartphone app, API & openness

Battery works without the app; app mainly for data, alerts, and firmware updates.
App is not open source today; protocol and possibly firmware may be opened later.
Some users strongly dislike dependency on a proprietary app and want documented, open protocols for long-term usability.

Kickstarter, legitimacy & marketing concerns

Product is slated for a Kickstarter; reason given is upfront capital for batch production, despite product already existing in B2B form.
Multiple commenters find the website “scammy”: vague pricing, percentage discounts without context, and testimonials that appear temporally inconsistent.
Company acknowledges testimonials are adapted from early beta feedback and agrees they were confusing.
Some remain uneasy, preferring batteries from large, heavily certified manufacturers; others see the idea as valuable but dislike the marketing tone.

Compatibility & integration

Battery is “compatible with major brands” via a supplied dock: users connect bare motor/controller wires into a dock PCB and select the correct CAN protocol.
Skepticism about how this works across diverse proprietary connectors and protocols; details beyond this are unclear.


2024-07-24

CrowdStrike CEO summoned to explain epic fail to US Homeland Security committee

Accountability for the Failure and Congressional Hearing

Many expect the CEO’s testimony to resemble past tech/political hearings, with limited concrete answers and mostly theatre.
Some see value in at least symbolically holding top leadership to account, rejecting the idea that only frontline engineers should face consequences.
Others argue consequences are often limited to resignations and PR damage, especially for private companies.

Business Continuity, Critical Infrastructure, and “Act of God” Risk

Strong criticism of hospitals, airlines, and other critical services for lacking effective business continuity / disaster recovery (BC/DR) despite likely having formal plans.
Debate over whether this incident was an unforeseeable “act of God” versus a foreseeable “all computers go down” scenario that should have been on risk registers.
Some argue critical infrastructure must plan for total IT failure and have manual or alternate workflows; others say planning for everyone being down simultaneously approaches nuclear-war-level contingency.

Vendor vs Customer Responsibility

Split view:
- One side: CrowdStrike bears primary responsibility due to grossly negligent testing and a global, simultaneous rollout; this should carry major financial and possibly legal consequences.
- Other side: Critical organizations also share blame for granting kernel/root-level access to a single vendor and not designing for vendor failure.
Discussion of contract terms that explicitly disclaim life-critical guarantees, and whether hospitals using such vendors are themselves negligent.

Endpoint Security, Kernel Design, and OS Monoculture

Many criticize kernel-level security tools as dangerous single points of catastrophic failure, with large attack surfaces and high privileges.
Debate on whether OS monoculture (primarily Windows) is itself the core problem versus misclassification of what should count as true infrastructure.
Some advocate more diverse or simpler systems (e.g., different OSes per function, legacy DOS systems) to reduce blast radius; others call that unrealistic due to complexity, cost, and fragmentation.

Financial and Structural Issues

Discussion of how large asset managers and institutional shareholders may blunt accountability for executives, since their incentives are fee-based, not strictly performance-based.
Some argue corporate and financial structures systematically diffuse responsibility, making large-scale negligence hard to punish adequately.


2024-07-24

More delays for Euston's HS2 station

HS2 Design, Scope, and Mismanagement

Many see HS2 as catastrophically mismanaged: excessive tunnelling in rural areas for political reasons, fragmented responsibilities, and poor organisational design.
Euston oversite development: costs sit on HS2’s books while property-development profits go to the Treasury, distorting incentives and making sensible station investment look like “overruns”.
Some works (e.g. extra unused platforms at Birmingham Curzon Street) are continuing because cancelling would cost even more, highlighting contractual lock‑in.
Disclosure of budget envelopes allegedly encouraged contractors to bid at the maximum, unlike HS1 where the internal budget was kept secret.

Political and Institutional Issues

Debate over Labour’s role: some argue they should have spent years preparing a detailed rail strategy; others say opposition lacks civil service resources and real‑time project detail.
Labour’s nationalisation plan for rail is criticised as vague and bureaucratic, rather than a clear, integrated British Rail–style strategy or a plan to reintegrate HS2.
Treasury accounting rules and Whitehall culture are blamed for short‑termism and for “deliberately” hobbling HS2.

Capacity, Connectivity, and the North

Strong view: HS2’s main purpose is capacity, not speed—removing express services from existing main lines to free up slots for regional and local trains.
Many Northern and “Northern Powerhouse” improvements were contingent on HS2 (directly or via released capacity), so cancellation of Phase 2 undermines wider upgrades.
Others argue HS2’s benefits were oversold and that it cannot fix all congestion; some question whether better east‑west links between northern cities might be higher priority.

Euston vs Old Oak Common

One camp: Old Oak Common termination would severely worsen many end‑to‑end journeys, adding changes and hassle (especially for connections to Eurostar, Thameslink, Northern/Victoria lines, and key central destinations).
Another camp: Crossrail and Overground at Old Oak Common are highly valuable; investments in those networks may offer better returns than forcing HS2 into Euston.
Some suggest deep‑level through stations or alternative alignments, but acknowledge the complexity and cost.

Costs, Debt, and Value for Money

UK’s high debt, high tax burden, and under‑funded public services are cited as constraints; big new spending must compete with basic repairs (schools, existing rail).
Counterargument: continued “austerity” has harmed growth; high‑impact infrastructure like HS2 and systemic rail reform could be exactly what improves long‑term prosperity.
One commenter notes HS2’s ~£100bn projected cost vs ~£10bn annual passenger revenue for the entire rail system as strikingly disproportionate.

Comparisons with Other Rail Systems

Examples from France, Germany, Italy, Japan, China, Indonesia, and the Netherlands illustrate that:
- Many countries struggle with city‑centre access, feeder links, and technical or procurement failures (e.g., Dutch Fyra).
- Others (notably Japan, some Chinese projects) have delivered extensive HSR with far less drama, feeding a perception of UK exceptional underperformance.

Cars, ULEZ, and Modal Priorities

“War on drivers” rhetoric surfaces; critics say policy changes (ULEZ, fuel, parking, insurance) fall hardest on poorer drivers who can’t easily upgrade vehicles or switch modes.
Others respond that:
- ULEZ compliance doesn’t require an EV for most cars.
- Drivers are heavily subsidised via road spending and unpriced externalities (pollution, crashes).
- Overabundance of cars harms those who can’t afford cars and rely on buses/cycling.
Consensus across sides: non‑London transport is poor, and alternatives to driving outside major cities remain inadequate.

Project Delivery and Specification

Some advocate “agile” infrastructure: open partial segments early, add stations later.
Counterpoints:
- Retrofitting stations and repeated resignalling can be more expensive than building once.
- The Elizabeth line already used a phased approach (TfL Rail staging, gradual through‑running).
- Its large stations and tunnels are defended as appropriately future‑proof; shrinking them 20% is seen as false economy.


2024-07-24

"Doors" in Solaris: Lightweight RPC Using File Descriptors (1996)

How Solaris Doors Work

Doors provide synchronous RPC between processes on the same machine.
Conceptually, a client “enters” a server’s address space via door_call, runs the service handler, then returns via door_return.
Implementation uses a kernel “shuttle” to hand off the scheduler’s thread/time slice from client to server and back, avoiding normal run-queue scheduling.
Server-side, a pool of user threads is created by the doors library; these wait via door_return for work and are woken by the kernel when a call arrives.
Arguments and return values are copied or page-mapped between address spaces; descriptors can be passed too.

Performance and Concurrency Characteristics

Main advantage: low latency and deterministic behavior compared to sockets/pipes, because the service runs immediately in the caller’s time slice.
Hardware assists on SPARC (ASIDs, register windows, TLB behavior) were used to minimize context-switch overhead; Spring OS fast-path calls were cited as historically very fast.
Benefits diminish when services perform slow or blocking I/O; then you’d rather use asynchronous I/O and more conventional IPC.
Some see concurrency complexity as no worse than normal threads if code is thread-safe; others worry about harder-to-reason-about cross-process call chains.

Relationship to Other IPC Mechanisms

Compared to classic message-passing RPC, doors are framed as a control-transfer primitive rather than a recv-based message queue.
Conceptually similar to CPU task gates or a specialized syscall that jumps into another user process.
Related ideas surface in Android Binder, BeOS/Palm IPC, scheduler activations, and Linux proposals like switchto/sched_ext.
Several commenters argue you could approximate doors in user space with Unix domain sockets, SCM_RIGHTS, and mmap, but without the same scheduling optimizations.
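
The descriptor-passing part of that user-space approximation can be sketched with Python's `socket.send_fds` and `socket.recv_fds` (Python 3.9+, Unix only), which wrap SCM_RIGHTS. For brevity both ends live in one process here, and nothing in this sketch reproduces the scheduling handoff that doors provide.

```python
import os
import socket

# A connected pair of Unix domain sockets stands in for client and server.
client_sock, server_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# The "client" owns a pipe and hands its read end to the "server".
read_fd, write_fd = os.pipe()
socket.send_fds(client_sock, [b"take this descriptor"], [read_fd])

# The "server" receives the message plus a new descriptor for the same pipe.
msg, fds, _flags, _addr = socket.recv_fds(server_sock, 1024, 1)
received_fd = fds[0]

# Data written by the client is readable through the received descriptor.
os.write(write_fd, b"hello through a passed fd")
payload = os.read(received_fd, 1024)
print(msg, payload)

for fd in (read_fd, write_fd, received_fd):
    os.close(fd)
client_sock.close()
server_sock.close()
```

In a real client/server split the two sockets would belong to different processes, and each request would still pass through the normal run queue rather than borrowing the caller's time slice.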

Real-World Use & Debugging Experience

One team implemented an in-memory ad-targeting server accessed from Apache via doors; reported it as “extremely fast,” though it never went to production.
A SmartOS user debugging hangs had to trace door calls across multiple processes, noting that caller threads were paused while separate server threads ran, confirming the server-thread-pool model.

Debate Over Semantics and Message Passing

Some participants initially claimed the calling thread itself continues executing in the server process.
Others, after reading Solaris/Illumos source and assembly, clarified that:
- The kernel transfers scheduling state/quantum, not the literal user-space stack.
- Separate server threads execute the handler, with door data copied onto their stacks.
There is disagreement over whether doors should be described as “message passing”; most converge on “RPC-like with direct control handoff and data copy.”

Solaris Zones, Jails, and Containers

Thread drifts into praising Solaris as “ahead of its time” (Zones, ZFS, DTrace, Crossbow, STMF), along with FreeBSD Jails.
Linux container tech is seen as later and initially weaker, with Docker adding distinctive features (one-process-per-container, layered FS, pipeline-oriented tooling).
Zones were designed mainly for server consolidation; some admins found them powerful, others found Solaris administration painful.
HP-UX Vaults and older systems (CP-67, Plan 9 namespaces, chroot) are mentioned as related historical container/partitioning mechanisms.
Question raised whether Docker-like developer UX could have been built atop Zones; some believe yes, but it wasn’t pursued.

Other Notes

Some wish for Doors-like primitives on Linux/FreeBSD integrated with epoll/kqueue; others argue this conflicts with doors’ inherently synchronous, quantum-transfer design.
Brief side comments note COM/RPC as a very strong IPC design, the age of “staff engineer” titles, a naming clash with SideFX’s “Solaris,” and curiosity about surviving copies of the Spring OS.


2024-07-24

Preliminary Post Incident Review

Root Cause and Technical Design

Thread agrees that a malformed “Rapid Response Content” file (“problematic content” / Channel File 291) triggered an out‑of‑bounds read in a kernel‑space “Content Interpreter”, causing BSODs.
Some participants note reports of a file consisting entirely of zeros; others point out CrowdStrike later said the crash was not directly caused by all‑zero content, and that the zeros likely came from a failed/partial download.
Debate over error handling: returning NULL / error pointers is normal for kernel C code, but many argue the interpreter should never crash on bad input, especially data fetched from the internet.

Validator vs Interpreter

Strong criticism that a separate “Content Validator” passed content that then crashed the interpreter.
Several argue the validator and interpreter should share the same parsing/execution path, or that the interpreter should run in a mocked environment during validation (“parse, don’t merely validate”).
Others note that separate validators can still miss bugs or undefined behavior, so architectural hardening of the interpreter is essential, not just more checks.
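
The shared-path idea can be sketched as a single bounds-checked parser that both the validator and the interpreter depend on. The binary format, class names, and field count below are hypothetical and are not CrowdStrike's actual channel-file layout.

```python
import struct
from dataclasses import dataclass

class ContentError(ValueError):
    """Raised for malformed content instead of reading out of bounds."""

@dataclass(frozen=True)
class ParsedContent:
    fields: tuple

def parse_content(blob: bytes, expected_fields: int) -> ParsedContent:
    """HYPOTHETICAL format: 1-byte field count, then 4-byte LE integers."""
    if len(blob) < 1:
        raise ContentError("empty content")
    count = blob[0]
    if count != expected_fields:
        raise ContentError(f"expected {expected_fields} fields, got {count}")
    needed = 1 + 4 * count
    if len(blob) != needed:
        raise ContentError(f"expected {needed} bytes, got {len(blob)}")
    return ParsedContent(struct.unpack_from(f"<{count}I", blob, 1))

def validate(blob: bytes, expected_fields: int) -> bool:
    """Validator reuses the parser, so the two cannot disagree."""
    try:
        parse_content(blob, expected_fields)
        return True
    except ContentError:
        return False

def interpret(content: ParsedContent) -> int:
    """Interpreter accepts only parsed content, never raw bytes."""
    return sum(content.fields)

good = bytes([3]) + struct.pack("<3I", 1, 2, 3)
print(validate(good, 3), interpret(parse_content(good, 3)))
print(validate(bytes(16), 3))   # an all-zero blob is rejected, not crashed on
```

Because the interpreter only ever receives a `ParsedContent`, content that fails the checks has no route to the code that indexes into it.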

Testing, Rollout, and QA

Central complaint: Rapid Response content was not actually executed in realistic environments before global rollout.
No apparent end‑to‑end smoke tests, canary fleet, or staggered deployment for this content type; some call this “using customers as QA.”
Many highlight missing fuzzing of the kernel driver and poor defenses against crash loops; suggestions include watchdogs, automatic rollback to last‑known‑good configurations, and timeouts.
Some see mention of “local developer testing” as evidence of amateurish process; others say the real failure is the continuous-delivery (CD) strategy, not the absence of any validation.
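
The canary and last-known-good suggestions can be sketched as a ring-by-ring rollout that halts and restores on the first failure. Every name here is hypothetical, and `apply_update` stands in for whatever mechanism pushes content and reports host health.

```python
from typing import Callable, Dict, List

def staged_rollout(
    rings: List[List[str]],
    new_version: str,
    apply_update: Callable[[str, str], bool],
    installed: Dict[str, str],
) -> bool:
    """Deploy ring by ring; on any failure, restore last-known-good everywhere.

    All names are hypothetical; apply_update(host, version) returns True
    if the host stayed healthy after receiving the update.
    """
    last_known_good = dict(installed)   # snapshot before touching anything
    for ring in rings:
        for host in ring:
            if apply_update(host, new_version):
                installed[host] = new_version
            else:
                # Halt the rollout and roll back every host touched so far.
                installed.clear()
                installed.update(last_known_good)
                return False
    return True

# Usage: a canary ring catches a bad update before the wider fleet sees it.
fleet = {"canary-1": "v1", "prod-1": "v1", "prod-2": "v1"}
rings = [["canary-1"], ["prod-1", "prod-2"]]
attempted = []

def bad_update(host: str, version: str) -> bool:
    attempted.append(host)
    return False   # simulate a host crashing on the new content

ok = staged_rollout(rings, "v2", bad_update, fleet)
print(ok, fleet, attempted)
```

A host stuck in a boot loop cannot be rolled back remotely, so a sketch like this would need to be paired with the on-device watchdog that commenters also suggest.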

Customer Control and Risk

Heavy criticism that customers had no ability to delay, stage, or roll back Rapid Response updates, especially for critical infrastructure (hospitals, government).
Some point to compliance regimes (PCI DSS, FedRAMP, insurers, large enterprises) as effectively forcing deployment of such agents, reducing customer choice.
Others argue organizations that accept auto‑updating kernel‑level agents without internal staging bear part of the blame.

Quality of the PIR and Organizational Issues

Many view the preliminary incident report as marketing‑heavy, vague (“problematic content”), and focused on minor technical mitigations rather than deep root causes or organizational failures.
A minority call it a reasonably written preliminary brief, not a full RCA, and appropriate for a mixed audience.
Broader worries center on incentives: speed vs safety, reduced QA, aggressive SLAs, and the risk that similar incidents will recur without cultural and process change.


Hacker News, Distilled
