Ask HN: COBOL devs, how is AI coding affecting your work?
Direct COBOL + AI Experiences
- Some banking/mainframe environments report success with fine‑tuned models on their own COBOL codebases, citing COBOL’s verbosity and English‑like syntax as a good fit for LLMs, but without much concrete detail on workflows.
- Others find generic LLMs notably worse at COBOL than at mainstream languages: useful for tedious tasks (file layouts, boilerplate, test data) and for “chatting with manuals” or PDFs, but limited by system‑specific context and huge legacy codebases.
- In COBOL‑to‑Java migration projects, models can occasionally help debug small issues or summarize business rules, but are frequently confidently wrong; without RAG/finetuning, the impact is “just OK.”
- Compliance and security constraints (no code leaving the bank, locked‑down VDIs) block use of cloud models in many financial shops; local models are too heavy. Strict COBOL formatting (columns, periods) also trips models and adds linting overhead.
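The fixed-format rules mentioned above are mechanical enough to lint: in classic fixed-format COBOL, columns 1–6 are the sequence area, column 7 is the indicator area (space, `*` for comments, `-` for continuation), Area A spans columns 8–11, Area B columns 12–72, and text past column 72 is ignored. A minimal Python sketch of such a check (the column rules are standard; real shops would add dialect-specific rules on top):

```python
def lint_fixed_format(lines):
    """Check basic fixed-format COBOL column rules.

    Columns (1-based): 1-6 sequence area, 7 indicator area,
    8-11 Area A, 12-72 Area B; text beyond column 72 is ignored
    by the compiler. Returns a list of (line_number, message).
    """
    findings = []
    for n, line in enumerate(lines, start=1):
        if len(line.rstrip()) > 72:
            findings.append((n, "text beyond column 72 is ignored"))
        # Valid indicators: blank, '*' or '/' comment, '-' continuation, 'D' debug.
        if len(line) >= 7 and line[6] not in " */-D":
            findings.append((n, f"unexpected indicator {line[6]!r} in column 7"))
        # Division headers must begin in Area A (column 8).
        area = line[7:].rstrip()
        if area.endswith("DIVISION.") and len(line) > 7 and line[7] == " ":
            findings.append((n, "DIVISION header should start in Area A"))
    return findings

good = "000100 IDENTIFICATION DIVISION."
bad = "000200     IDENTIFICATION DIVISION."
```

Here `lint_fixed_format([good])` returns no findings, while the `bad` line (header indented into Area B) is flagged, which is exactly the kind of formatting slip LLM-generated COBOL tends to make.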
COBOL Ecosystem, Legacy, and Workforce
- Multiple comments emphasize COBOL’s persistence in banking and large institutions, often running decades‑old critical code, sometimes on emulated mainframes. In some shops, Java or other systems generate COBOL, which then runs on emulators.
- COBOL’s tight coupling of logic and data structures is seen as a reason for its stickiness.
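That coupling is concrete: a COBOL program's DATA DIVISION pins down exact byte-level record layouts (PIC clauses) that the PROCEDURE DIVISION logic slices against, so changing a field width ripples through every program that reads the file. A hypothetical Python sketch of consuming one such fixed-width record (field names and widths are invented for illustration, echoing something like `05 CUST-ID PIC X(6).  05 CUST-NAME PIC X(20).  05 BALANCE PIC 9(7)V99.`):

```python
# Hypothetical copybook-style layout: (field_name, width_in_bytes) pairs.
LAYOUT = [("cust_id", 6), ("cust_name", 20), ("balance", 9)]

def parse_record(record: str) -> dict:
    """Slice a fixed-width record exactly as the PIC clauses dictate.

    Widen one field and every consumer of the file must change in
    lockstep -- the "stickiness" described above.
    """
    fields, pos = {}, 0
    for name, width in LAYOUT:
        fields[name] = record[pos:pos + width].rstrip()
        pos += width
    # PIC 9(7)V99 has an implied decimal point: the last two digits are cents.
    fields["balance"] = int(fields["balance"]) / 100
    return fields

rec = "000042" + "JANE DOE".ljust(20) + "000123456"
```

With this layout, `parse_record(rec)` yields `cust_id` `"000042"`, `cust_name` `"JANE DOE"`, and `balance` `1234.56` — the decimal point exists only in the code, not in the data, which is why layout knowledge lives in every program touching the file.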
- At the same time, some European banks are running multi‑year programs to replace COBOL with cloud‑based Java/Spring because COBOL developers are aging out and not being replaced.
Model Training, Dialects, and Potential
- Much COBOL is proprietary and never public, and there are many dialects, so generic LLMs likely lack training on the exact variants used in big financial systems.
- This fragmentation limits transferability: a model fine‑tuned on one institution’s dialect may not generalize well.
- Several commenters think it’s only a matter of time before large banks/airlines fund serious COBOL‑focused models plus tooling; they see this as an augmentation, not a threat, with domain experts remaining essential for review and interpretation.
Broader AI Coding Debate
- Strong disagreement on how “good” AI coding is overall: some report substantial productivity gains (especially with Go, TypeScript, SQL, Python, C), others are repeatedly burned by hallucinations, subtle bugs, and security oversights.
- Common ground: AI is best for scaffolding, boilerplate, configs, and documentation lookup; poor at complex refactors, nontrivial type systems, or deep architecture without extensive prompting and human review.
- Many stress the danger of “vibe coding” in critical systems: AI can accelerate both good practice and sloppy, barely‑understood code, making rigorous review and clear responsibility even more important.