2025: The Year in LLMs

Perceived progress in 2025 LLMs

  • Some see 2025 as a major step: coding agents and reasoning modes turned LLMs from “cute demos” into tools that can meaningfully assist experts.
  • Others describe the year as stagnant compared to earlier ML breakthroughs (RBMs, RNNs, early deep learning), arguing that most 2025 changes were tooling and distribution, not fundamental model advances.
  • Several note that baselines differ: for many users, LLMs are their first exposure to two decades of ML progress, which amplifies the sense of revolution.

Creativity, “reproducing the past,” and thinking

  • One camp argues LLMs and diffusion models fundamentally sample from past data distributions, so they remix rather than create truly novel concepts; this is seen as a hard limit on scientific breakthroughs.
  • Others counter that humans also mostly recombine prior knowledge, that stochastic generation can still yield meaningful novelty, and that insisting on some “magic” non-derivative creativity standard is unrealistic.
  • There is ongoing disagreement about whether LLMs “think” or have any notion of truth, versus only modeling linguistic patterns.

Coding agents and developer workflows

  • Many developers report large productivity gains: agents that run code, observe failures, and iterate are said to handle a majority of minor code changes and refactors in some workflows.
  • Critics say generated code is brittle, architecture is poor, subtle bugs are common, and everything still requires expert review; claimed speedups are often vague or overstated.
  • Reliability is framed as “good enough to be a useful assistant, nowhere near replacing a competent engineer.”
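The run/observe/iterate loop described above can be sketched in a few lines. This is a minimal illustration, not any particular product's implementation; `propose_fix` is a hypothetical stand-in for a model call that edits files in response to a failure.

```python
import subprocess

def run_tests(cmd: list[str]) -> tuple[bool, str]:
    """Run the project's check command; return (passed, combined output)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def agent_loop(propose_fix, cmd: list[str], max_iters: int = 5) -> bool:
    """Minimal run/observe/iterate loop: run the checks, feed any failure
    output to the model (propose_fix is hypothetical), and retry until the
    checks pass or the iteration budget runs out."""
    for _ in range(max_iters):
        passed, output = run_tests(cmd)
        if passed:
            return True
        propose_fix(output)  # model sees the failure and edits files
    return run_tests(cmd)[0]
```

The feedback signal (a failing exit code plus captured output) is what distinguishes these agents from one-shot code generation.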

Agents, MCP, Bash, and tools

  • Strong interest in architectures: MCP as a standardized tool interface vs “bash-as-universal-tool” in code execution environments.
  • Some foresee MCP fading as cheap, sandboxed shells become ubiquitous; others argue MCP’s auditability, security, and interoperability make it long-lived infrastructure, more akin to REST APIs.
  • Skills, CLIs, and custom MCP servers are all being used to connect LLMs to CRMs, JIRA, and other systems.
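The two architectures being debated can be contrasted in a toy sketch. This is illustrative shape only, not the actual MCP wire protocol; `TOOLS`, `call_tool`, and `call_bash` are hypothetical names. The point is what a declared-tool interface buys you (validation, auditability) versus what a raw shell buys you (universality).

```python
import subprocess

# MCP-style: each tool is declared with a description and input schema,
# so calls can be validated, logged, and permissioned before they run.
# (Illustrative shape only, not the real MCP protocol.)
TOOLS = {
    "read_file": {
        "description": "Read a UTF-8 text file",
        "input_schema": {"required": ["path"]},
        "handler": lambda args: open(args["path"], encoding="utf-8").read(),
    },
}

def call_tool(name: str, args: dict) -> str:
    """Validate arguments against the declared schema, then dispatch."""
    tool = TOOLS[name]
    missing = [k for k in tool["input_schema"]["required"] if k not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return tool["handler"](args)

# Bash-as-universal-tool: the model emits a shell command and the host
# just runs it; maximally flexible, but nothing structured to audit.
def call_bash(command: str) -> str:
    return subprocess.run(["bash", "-c", command],
                          capture_output=True, text=True).stdout
```

The same file read works through either path; the difference is that the first rejects malformed calls up front, while the second will run whatever it is handed.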

Economics, labor, and productivity

  • Fears center on junior developer hiring drying up and potential broader knowledge-work automation; some predict manual labor will outlast white-collar work, while others dispute this on the grounds that most knowledge work, unlike software, lacks cheap automated verification of outputs.
  • Several note that macro unemployment has barely moved, and that efficiency gains may translate into lower prices and new demand rather than mass job loss.
  • Debate continues about whether measured productivity reflects any “exponential” capability gains.

Environment, data centers, and hardware

  • Commenters worry about energy, water use, subsidies, and e‑waste from massive data center buildouts and GPU churn, especially in rural areas.
  • Some highlight that AI demand is heavily distorting DRAM/NAND markets and fear future bailouts or “enshittification” as a few hyperscalers dominate.
  • Others, especially hardware-focused participants, emphasize that AI capex is accelerating progress in semiconductors, memory, packaging, and interconnects, similar to the smartphone era.

Safety, “YOLO” practices, and harms

  • Concerns about “normalization of deviance”: running coding agents with broad system access, accidental destructive actions (like deleting home directories), and the lack of mature safety culture among web-style developers.
  • Various sandboxing strategies are discussed: Firejail, separate users, VMs, Docker-in-Docker, dedicated VPSs.
  • There is unease about LLM-linked self-harm and “AI psychosis” cases; some see genuine risk and note labs’ mitigation efforts, others think this is moral panic compared to underlying economic stressors.
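As a lighter-weight cousin of the sandboxing strategies listed above (Firejail, VMs, containers), one can at least confine an agent-run command to a throwaway directory with CPU and memory rlimits. This is a Unix-only sketch under stated assumptions, with a hypothetical `run_sandboxed` helper; it does not block network or filesystem access the way a real sandbox would.

```python
import resource
import subprocess
import tempfile

def run_sandboxed(command: list[str], cpu_seconds: int = 5,
                  mem_bytes: int = 1 << 30) -> subprocess.CompletedProcess:
    """Run a command in a fresh temp directory with CPU-time and
    address-space rlimits applied in the child process. NOT a substitute
    for Firejail, a separate user, or a VM: network and the wider
    filesystem remain reachable."""
    def limit():
        # Applied in the child between fork and exec.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
    workdir = tempfile.mkdtemp(prefix="agent-")
    return subprocess.run(command, cwd=workdir, preexec_fn=limit,
                          capture_output=True, text=True, timeout=60)
```

Running the agent's commands through a wrapper like this at least caps runaway loops and keeps scratch files out of the home directory, which is precisely the accident class (deleted home directories) raised above.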

UX, slop, and user backlash

  • Strong resentment toward intrusive AI chatbots on websites and in apps, which are seen as worsening UX to satisfy “we added AI” mandates and usage metrics.
  • “Slop” (low-value AI-generated media) is already saturating search, music, images, and video; some predict AI labels and filtering will be needed, others doubt platforms will resist content that drives engagement and ad revenue.

Polarization, hype, and community dynamics

  • The discussion reflects a wide spectrum: from “bigger than the internet” optimism to “marginally useful autocomplete” skepticism.
  • Many distinguish between real, narrow utility (coding help, search assistants, document analysis) and overblown AGI narratives and corporate hype.
  • Meta-discussion touches on distrust of corporate motives, previous tech bubbles (crypto, Web3, metaverse), and frustration with both LLM evangelism and total dismissal.