Ask HN: Is anyone doing anything cool with tiny language models?
Overview
- Thread surveys many real-world uses of “tiny” and small LMs (tens of millions to a few billion params), mostly running locally on CPUs or modest GPUs.
- Strong interest in privacy, low latency, and narrow, specialized tasks rather than general chat.
Training & Deployment of Tiny Models
- 0.1B–0.125B models (GPT‑2 scale, JetBrains’ autocomplete) are reported as trainable on consumer hardware; rules of thumb like ~8× params in VRAM are mentioned.
- Examples include training ~0.125B models on ~1B tokens in minutes on rented H100s using NanoGPT variants.
- Disk sizes around ~70MB zipped per model are seen as small enough for static web apps and browser-based inference (WebGPU, transformers.js, web-llm).
- Several projects use llama.cpp, Ollama, or custom C++ microservices to self-host small models with low latency.
Practical Use Cases
- Developer tooling: code completion, bash/sed/awk one-liners, git commit messages, variable naming, address and job parsing, cookie-banner detection, Excel-like formula suggestion/repair, email or SMS agents, and structured extraction (e.g., nutrition labels, job attributes).
- Productivity and research: Excel add‑ins and small classifiers to filter or prioritize scientific abstracts, detect urgent maternal-health messages, or categorize playlist songs/titles.
- System integration: wake-word detection on microcontrollers, Android text firewalls, on-device “activity analysis” assistants, robot/voice command interfaces, and Raspberry Pi personal RAG assistants.
Creative & Playful Uses
- Story generators on tiny displays, endless “office gossip” audio streams, Magic card generators, Tic‑Tac‑Toe opponents, NPC dialogue and bargaining in games, and persona-based rewriting (e.g., hiding personal style or imitating fictional characters).
- Local LMs used to anonymize code/questions before sending to large remote models, then reinsert real identifiers afterward.
Capabilities, Limits & Skepticism
- Consensus that small, fine‑tuned models excel at narrow, text-classification or transformation tasks (summarization, paraphrasing, filtering, translation in some cases).
- Reported weaknesses: complex logic, math, temporal expressions, truly diverse creativity, and safety-critical reasoning.
- Some argue specialized small models are more useful than giant generalists; others worry comparisons to legacy >100B models can be misleading.
- Noted gaps: fine‑tuning expertise, good training data, and a way to package and share small, task-specific models (plus prompts and parsers) cleanly.
Privacy, Security & Ethics
- Strong focus on keeping data on-device (HIPAA-sensitive EHR queries, SMS and call filtering, private editors).
- Applications to detect prompt injection, rewrite toxic or abusive text, or classify unethical stock suggestions.
- Some concern that playful uses (e.g., engaging spam) may inadvertently strengthen adversaries or raise new privacy risks.