2024-06-07

Ask HN: Machine learning engineers, what do you do at work?

Reality of the ML Engineer Role

Work is far from “just training models.”
Many describe ~80–95% of time on data collection/cleaning, feature engineering, ETL, infra, and tooling; only a small fraction on fitting/tuning models.
Others say their title is ML engineer but the day-to-day is mostly backend/software engineering or MLOps for ML systems.
A minority argues that if you don’t spend most of your time on model development/research, it’s not really an ML role.

Role Boundaries and Team Structure

Frequent confusion between “ML engineer,” “data scientist,” “applied scientist,” and “data engineer”; in small orgs these often blend.
Some argue for specialization (research vs infra vs ops) due to limited time and deep expertise needs.
Others insist engineers should understand at least one layer above and one layer below their own part of the stack (e.g., drivers and infra on one side, model math on the other) to avoid Conway’s Law-style coordination problems.

Data, Experimentation, and Model Work

Emphasis on being “knee‑deep” in data to discover patterns and ask the right questions.
Tasks include experiment design, A/B testing, metrics definition, model deployment, retraining pipelines, and long-running experiments with careful monitoring.
Classical ML required explicit feature engineering; with deep learning, architectural choices and data quality/diversity matter more.
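As a rough illustration of the A/B testing work mentioned above, here is a minimal sketch of a two-proportion z-test on conversion counts. It is not taken from the thread; the function name, its arguments, and the example numbers are hypothetical, and a real experiment would also need sample-size planning and guardrail metrics.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / conv_b are conversion counts and n_a / n_b are sample sizes
    for the control and treatment groups (hypothetical names).
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; rates are all 0 or all 1")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up counts, purely for illustration.
    z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

The pooled standard error is the textbook choice for testing "no difference"; teams that monitor long-running experiments often layer sequential-testing corrections on top, which this sketch does not attempt.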

Tooling, Environments, and Pain Points

Heavy frustration with Python environments, native/CUDA dependencies, and package managers.
pip, conda/mamba, venv, poetry, Nix, Docker, pyenv, and newer tools like uv are all mentioned; each has tradeoffs and failure modes.
ARM MacBooks are seen as problematic for cutting-edge local ML; many prefer Linux GPU servers or cloud images.
Dependency hell and constant breakage are seen as a systemic drag on productivity.
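When chasing the kind of breakage described above, a small environment report can help compare a working machine with a broken one. The sketch below uses only the Python standard library; the function name and the package list are hypothetical, and it does not inspect CUDA or other native libraries.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages):
    """Collect basic facts about the running interpreter and chosen packages."""
    report = {
        "python": platform.python_version(),
        "machine": platform.machine(),  # e.g. "arm64" or "x86_64"
        # True inside a venv-style virtual environment.
        "in_venv": sys.prefix != sys.base_prefix,
        "packages": {},
    }
    for name in packages:
        try:
            report["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly instead of failing.
            report["packages"][name] = None
    return report


if __name__ == "__main__":
    # The package names here are only examples.
    for key, value in environment_report(["numpy", "torch"]).items():
        print(key, value)
```

Diffing two such reports will not fix a broken environment, but it narrows down whether the interpreter, the architecture, or a package version is the difference.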

LLMs and Changing Work Patterns

Some roles have shifted from training models to integrating LLM APIs, prompt engineering, and RAG; to some, this feels closer to standard software engineering.
Commenters observe that very few people work on LLM training itself, compared with the many “AI engineers” who call APIs.
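To make the RAG pattern concrete, here is a toy sketch of the retrieve-then-prompt step. It uses bag-of-words cosine similarity in place of a real embedding model and stops short of calling any LLM API; all function names, the prompt wording, and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(query, documents, k=2):
    """Assemble a prompt that places the retrieved context before the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    docs = [
        "Conda manages Python environments and native packages.",
        "Subrogation recovers claim costs from a third party.",
        "A/B tests compare two variants on a metric.",
    ]
    print(build_prompt("How do I manage Python environments?", docs, k=1))
```

In practice the similarity function would be replaced by embeddings and a vector index, and the prompt would be sent to whichever LLM API the team uses; the overall shape of the code stays much the same, which is partly why some commenters find the work close to ordinary software engineering.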

Collaboration, Domain Experts, and Explainability

Collaborating with nontechnical domain experts (e.g., in healthcare, business units) is seen as highly valuable and rewarding.
Explaining stochastic model behavior and misclassifications is hard; expectations from traditional deterministic software often clash with ML reality.
Teaching Python and basic tooling to less-technical colleagues is a notable part of some MLE jobs.

Healthcare, Privacy, and Ethics

Several work on healthcare ML (claims, diagnosis from images, vital signs, etc.).
There is concern that sensitive health data is widely accessible to engineers despite HIPAA; “dead privacy” is discussed.
Insurance/claims ML can move millions of dollars (e.g., subrogation, upcoding detection); some criticize models used to increase billing.

Career Satisfaction and Demand

Mixed feelings about the GenAI hype: more demand for flashy LLM work, less focus on “boring” but valuable ML.
Some feel marginalized as “old-school” data/ML people while budgets flow to AI APIs and non-coding “AI scientists.”
Others report high satisfaction when they have few meetings, good infra, and real ownership of ML products.
