Using generative AI as part of historical research: three case studies
LLMs as Historical Research Tools
- Many commenters like the article’s concrete case studies and “layered” testing (OCR, translation, interpretation).
- Several see strong potential: rapid transcription of difficult manuscripts, first-pass translations, and surfacing possibly relevant secondary sources.
- Some working with Neo-Latin, German, and early modern texts report good but imperfect translations, especially when experts can validate samples and estimate error rates.
- Others note that a large share of historical work is reinterpretation of known material, where LLMs could function as powerful research assistants.
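The point above about experts validating samples and estimating error rates can be made concrete. What follows is a minimal sketch, not a method described in the thread: it assumes an expert has checked a random sample of translated passages and counted how many contain an error, then uses a Wilson score interval to bound the true error rate. The function name `wilson_interval` and the sample figures are hypothetical.

```python
import math


def wilson_interval(errors: int, sample_size: int, z: float = 1.96) -> tuple[float, float]:
    """Return a (low, high) Wilson score interval for an error rate.

    errors      -- passages in the checked sample that contained an error
    sample_size -- passages the expert checked (assumed randomly sampled)
    z           -- normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= errors <= sample_size:
        raise ValueError("errors must be between 0 and sample_size")

    p = errors / sample_size
    z2 = z * z
    denom = 1 + z2 / sample_size
    centre = (p + z2 / (2 * sample_size)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / sample_size + z2 / (4 * sample_size ** 2)
    ) / denom
    # Clip to [0, 1] to guard against tiny floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    # Hypothetical figures: 6 flawed passages found in a sample of 100.
    low, high = wilson_interval(errors=6, sample_size=100)
    print(f"Estimated error rate: between {low:.1%} and {high:.1%}")
```

The interval only describes the kind of material that was sampled. If the checked passages are easier than the rest of the corpus, the real error rate may be higher.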
Trust, Expertise, and Hallucinations
- Persistent worry: non‑experts cannot reliably judge when an LLM is wrong, especially on nuanced historical questions.
- Experienced users say LLMs are very useful within domains where they already have deep knowledge, but not for evaluating “PhD‑level” work in unfamiliar fields.
- Suggested mitigations include cross‑checking multiple models, keeping context short, asking for references and verifying them, integrating search or retrieval‑augmented generation (RAG), and designing tools that highlight disagreement.
- Others argue this still fails novices: if you’re not already expert, you don’t know when to backtrack.
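As an illustration of the "cross‑check multiple models and highlight disagreement" idea, here is a minimal sketch using only Python's standard library. It assumes the outputs of several models have already been collected as plain strings; the model names, the sample transcriptions, and the 0.9 threshold are all hypothetical. It compares outputs pairwise at the word level with `difflib.SequenceMatcher` and flags pairs that diverge.

```python
from difflib import SequenceMatcher
from itertools import combinations


def flag_disagreements(
    outputs: dict[str, str], threshold: float = 0.9
) -> list[tuple[str, str, float]]:
    """Return (model_a, model_b, similarity) for pairs below the threshold.

    Similarity is SequenceMatcher's ratio over whitespace-separated,
    lower-cased tokens: 1.0 means identical token sequences.
    """
    flagged = []
    for name_a, name_b in combinations(sorted(outputs), 2):
        tokens_a = outputs[name_a].lower().split()
        tokens_b = outputs[name_b].lower().split()
        ratio = SequenceMatcher(None, tokens_a, tokens_b).ratio()
        if ratio < threshold:
            flagged.append((name_a, name_b, ratio))
    return flagged


if __name__ == "__main__":
    # Hypothetical transcriptions of the same manuscript line.
    outputs = {
        "model_a": "anno domini 1623 the ship arrived",
        "model_b": "anno domini 1628 the ship arrived",
        "model_c": "anno domini 1623 the ship arrived",
    }
    for a, b, ratio in flag_disagreements(outputs):
        print(f"{a} and {b} disagree (similarity {ratio:.2f}); check the source")
```

Agreement between models is not evidence of correctness, since models can share the same mistake. A check like this only tells a reader where to look first, which is consistent with the objection that it does not by itself help novices.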
Impact on Humanities and Education
- Some fear LLMs will be used to justify cutting funding for history/humanities (“80% of a historian for a few chat queries”).
- Others think education can adapt, with LLMs as accelerators for learning if critical thinking and source literacy are emphasized.
OCR, Translation, and Existing Tools
- Debate over whether LLM-based OCR/translation is truly better than specialized tools (e.g., Transkribus, DeepL, Google Translate); critics note the article lacked systematic comparisons.
- Supporters counter that existing OCR struggles badly with early modern handwriting and that LLMs can handle at least “intermediate” paleography, dramatically speeding triage.
Bias, Consensus, and Rewriting History
- LLMs are described as “consensus distillation” or “median viewpoint” machines, which risks reproducing popular myths and institutional PR.
- Concern that centralized, opaque training and RLHF could make them tools for subtly rewriting history; others argue multiple competing models will make coordinated rewriting harder.
Creativity, Intelligence, and Art
- A long subthread debates whether LLMs show genuine creativity or merely sophisticated remixing.
- Some compare them to cameras or instruments: value lies in the human using them; others insist lack of lived experience makes LLM‑generated literature/poetry inherently hollow.