Talkie: a 13B vintage language model from 1930

Local hardware & deployment

  • Several commenters discuss VRAM needs. 13B BF16 weights alone come to roughly 26 GB, so a 20 GB card cannot hold them outright; splitting layers across CPU/GPU via llama.cpp works, but is slower.
  • Some compare high‑VRAM GPUs with large shared‑memory desktops; the rough consensus is that GPUs give more “usable” local LLMs, but you won’t “make your money back,” so buy what you’re happy to pay for.
  • No GGUF is yet available; people note it should be convertible from the PyTorch checkpoint for use with tools like Ollama (see the sketch after this list).
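
A rough sketch of the memory arithmetic behind those points; the bytes‑per‑weight figures for the GGUF quantizations are approximate, and real footprints also need headroom for the KV cache, activations, and framework overhead:

    # Rough VRAM math for a 13B-parameter model at different precisions.
    # Bytes-per-weight values are approximate; real usage also needs room
    # for the KV cache, activations, and runtime overhead.

    PARAMS = 13e9

    precisions = {
        "BF16 (original checkpoint)": 2.00,  # bytes per weight
        "GGUF Q8_0":                  1.06,  # ~8.5 bits/weight, approximate
        "GGUF Q4_K_M":                0.56,  # ~4.5 bits/weight, approximate
    }

    for name, bytes_per_weight in precisions.items():
        gb = PARAMS * bytes_per_weight / 1e9
        fits = "fits" if gb < 20 else "does not fit"
        print(f"{name:28s} ~{gb:5.1f} GB  -> {fits} in 20 GB of VRAM")

Conversion would typically go through llama.cpp’s convert_hf_to_gguf.py followed by llama-quantize, after which the -ngl flag controls how many layers are offloaded to the GPU; exact script and flag names vary by llama.cpp version, and the model’s architecture has to be one the converter supports.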

“Vintage” concept, data leakage & contamination

  • The authors frame “vintage LMs” as models trained solely on pre‑cutoff data, so as to avoid benchmark contamination and knowledge that post‑dates the cutoff.
  • Commenters point out evidence of temporal leakage (e.g., anachronistic political facts, terminology, and future knowledge), arguing the model doesn’t fully meet its own “vintage” standard.
  • A distinction is drawn between contamination by benchmark answers and contamination by generic post‑cutoff text; some see them as nearly the same issue.

Behavior, style & capabilities

  • Many are charmed by the 19th/early‑20th‑century prose: ornate, confident, discursive, and very different from modern LLM tone.
  • Examples show it:
    • Treats “computer” as a human job and interprets “digital” as “using fingers.”
    • Gives period‑typical takes on India, empire, American Civil War causes, women, yoga, industrialization, etc.
    • Produces speculative future visions (2025/2026, moon travel, computers) that feel like historical futurism.
  • Users note a common pattern: the first sentences may be accurate, but it then drifts into plausible‑sounding yet wrong explanations, so it can “pollute your brain” if you don’t already know the answer.

Historical bias, racism & ethics

  • Commenters report explicitly racist, colonialist, and sexist outputs and stress that these reflect the surviving texts and power structures of the era.
  • Some see this as historically honest and even desirable for future “uncensored” historical models; others find it troubling and question the value of partial moderation layered on top.

Epistemic snapshot & scientific testing

  • Strong interest in using such models as “time capsules” or “epistemic snapshots” of a given era, comparable to other history‑only LLM projects.
  • Several propose research uses: training models only on data from before key breakthroughs (e.g., relativity, nuclear weapons) to see whether they can rediscover them or predict subsequent events, though many doubt current LLMs could.

Speculation, simulations & future models

  • People imagine combining era‑locked models with VR or personal archives to simulate past periods or one’s younger self, edging toward “time travel” or “simulation” experiences.
  • Some are excited; others push back on simulation talk as philosophically dubious or psychologically risky.

Cost & practicality

  • Back‑of‑the‑envelope FLOP and cloud‑pricing estimates suggest pretraining costs on the order of tens of thousands of dollars, seen as impressively affordable for bespoke models; the arithmetic is sketched below.
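
A minimal sketch of that estimate using the common 6·N·D FLOP rule of thumb; the token count, per‑GPU throughput, utilization, and hourly price below are illustrative assumptions, not figures from the thread:

    # Back-of-the-envelope pretraining cost for a 13B model.
    # Uses the standard ~6 * params * tokens FLOP estimate; every other
    # number here is an illustrative assumption.

    params = 13e9
    tokens = 200e9            # assumed dataset size (illustrative)
    flops = 6 * params * tokens

    gpu_flops_peak = 1.0e15   # assumed ~1 PFLOP/s BF16 per accelerator
    utilization = 0.40        # assumed model FLOPs utilization
    price_per_hour = 2.50     # assumed cloud price per GPU-hour, USD

    gpu_seconds = flops / (gpu_flops_peak * utilization)
    gpu_hours = gpu_seconds / 3600
    cost = gpu_hours * price_per_hour

    print(f"Total compute:  {flops:.2e} FLOPs")
    print(f"GPU-hours:      {gpu_hours:,.0f}")
    print(f"Estimated cost: ${cost:,.0f}")

With these assumed numbers the total lands around $25k–30k, consistent with the “tens of thousands of dollars” ballpark cited in the discussion.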