Ask HN: Is anyone doing anything cool with tiny language models?

Overview

  • Thread surveys many real-world uses of “tiny” and small LMs (tens of millions to a few billion params), mostly running locally on CPUs or modest GPUs.
  • Strong interest in privacy, low latency, and narrow, specialized tasks rather than general chat.

Training & Deployment of Tiny Models

  • Models of ~0.1B–0.125B parameters (GPT‑2 scale; JetBrains’ local autocomplete) are reported as trainable on consumer hardware; rules of thumb such as needing roughly 8× the parameter count in VRAM are mentioned.
  • Examples include training ~0.125B models on ~1B tokens in minutes on rented H100s using NanoGPT variants.
  • Zipped sizes of roughly 70 MB per model are seen as small enough for static web apps and browser-based inference (WebGPU, transformers.js, web-llm).
  • Several projects use llama.cpp, Ollama, or custom C++ microservices to self-host small models with low latency.
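The sizing heuristics above can be sketched as a quick back-of-the-envelope calculator. The ~8×-params VRAM figure is the thread's rule of thumb, not an exact requirement, and the 4-bit disk estimate is an assumption chosen to land near the ~70 MB zipped figure:

```python
def training_vram_gb(params: float, bytes_per_param: float = 8.0) -> float:
    """Rough training VRAM: the thread's ~8x-params rule of thumb
    (weights plus gradients plus optimizer state)."""
    return params * bytes_per_param / 1e9

def disk_size_mb(params: float, bits_per_param: float = 16.0) -> float:
    """Approximate on-disk size at a given precision, before compression."""
    return params * bits_per_param / 8 / 1e6

# A 0.125B-parameter model (GPT-2 small scale):
print(training_vram_gb(0.125e9))                 # 1.0 GB by the rule of thumb
print(disk_size_mb(0.125e9, bits_per_param=4))   # 62.5 MB at 4-bit quantization
```

At 4-bit quantization the estimate comes out close to the ~70 MB zipped sizes reported in the thread, which is what makes browser delivery plausible.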

Practical Use Cases

  • Developer tooling: code completion, bash/sed/awk one-liners, git commit messages, variable naming, address and job parsing, cookie-banner detection, Excel-like formula suggestion/repair, email or SMS agents, and structured extraction (e.g., nutrition labels, job attributes).
  • Productivity and research: Excel add‑ins and small classifiers to filter or prioritize scientific abstracts, detect urgent maternal-health messages, or categorize playlist songs/titles.
  • System integration: wake-word detection on microcontrollers, Android text firewalls, on-device “activity analysis” assistants, robot/voice command interfaces, and Raspberry Pi personal RAG assistants.

Creative & Playful Uses

  • Story generators on tiny displays, endless “office gossip” audio streams, Magic card generators, Tic‑Tac‑Toe opponents, NPC dialogue and bargaining in games, and persona-based rewriting (e.g., hiding personal style or imitating fictional characters).
  • Local LMs are used to anonymize code/questions before sending them to large remote models, with the real identifiers reinserted afterward.
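The anonymize-then-reinsert workflow can be sketched with a reversible placeholder map. In the thread a local model proposes which identifiers are sensitive; here the list is passed in by hand, and the placeholder format is an assumption:

```python
def anonymize(text: str, identifiers: list[str]) -> tuple[str, dict[str, str]]:
    """Replace sensitive identifiers with stable placeholders before the
    text leaves the machine; returns scrubbed text and the reverse map."""
    mapping: dict[str, str] = {}
    for i, name in enumerate(identifiers):
        placeholder = f"__ID{i}__"
        text = text.replace(name, placeholder)
        mapping[placeholder] = name
    return text, mapping

def deanonymize(text: str, mapping: dict[str, str]) -> str:
    """Reinsert the real identifiers into the remote model's reply."""
    for placeholder, name in mapping.items():
        text = text.replace(placeholder, name)
    return text

scrubbed, mapping = anonymize("acme_client pays acme monthly", ["acme"])
print(scrubbed)                          # placeholders in place of "acme"
print(deanonymize(scrubbed, mapping))    # original text restored
```

Plain string replacement is the naive version; a real pipeline would match on token boundaries so that substrings of longer identifiers are not scrubbed by accident.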

Capabilities, Limits & Skepticism

  • Consensus that small, fine‑tuned models excel at narrow text-classification and transformation tasks (summarization, paraphrasing, filtering, and in some cases translation).
  • Reported weaknesses: complex logic, math, temporal expressions, truly diverse creativity, and safety-critical reasoning.
  • Some argue specialized small models are more useful than giant generalists; others caution that favorable comparisons against older >100B-parameter models can be misleading.
  • Noted gaps: fine‑tuning expertise, good training data, and a way to package and share small, task-specific models (plus prompts and parsers) cleanly.

Privacy, Security & Ethics

  • Strong focus on keeping data on-device (HIPAA-sensitive EHR queries, SMS and call filtering, private editors).
  • Applications to detect prompt injection, rewrite toxic or abusive text, or classify unethical stock suggestions.
  • Some concern that playful uses (e.g., engaging spam) may inadvertently strengthen adversaries or raise new privacy risks.