MLOps is not optional if you want ML in production
MLOps is the operational layer that turns machine learning models into reliable production systems.

MLOps is not a nice-to-have add-on to machine learning; it is the difference between a demo and a system that survives contact with production.
Thomas Nys puts the failure mode plainly: most ML projects never make it into production, and the blocker is usually not model quality. It is the inability to reproduce training environments, serve predictions at the right latency, monitor degradation, update safely, and govern the whole lifecycle. That is why a notebook success does not count as business value.
Production is where ML stops behaving like software
Traditional software is largely deterministic: given the same code and the same inputs, the behavior is the same. ML systems are not like that. The model, the data, and the features all shape the output, so a passing test in development does not guarantee a stable result in production. That is the core reason DevOps alone falls short here.

One concrete example is training-serving skew. A feature calculated one way during training and another way in production produces a model that looks correct on paper and fails in the real world. This is not a theoretical edge case. It is a common cause of silently degraded predictions and hard-to-debug incidents. MLOps exists to prevent that gap from opening in the first place.
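A minimal sketch of a parity check for training-serving skew, in Python. The feature (average item price), both function names, and the sample records are hypothetical and invented for illustration. The idea is only that the training and serving code paths are run on the same records and any disagreement is flagged before deployment.

```python
import math
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]


def find_skew(
    train_fn: Callable[[Record], float],
    serve_fn: Callable[[Record], float],
    records: List[Record],
    tol: float = 1e-9,
) -> List[Tuple[Record, float, float]]:
    """Return every record where the two feature implementations disagree."""
    mismatches = []
    for rec in records:
        train_value, serve_value = train_fn(rec), serve_fn(rec)
        if not math.isclose(train_value, serve_value, rel_tol=tol, abs_tol=tol):
            mismatches.append((rec, train_value, serve_value))
    return mismatches


# Hypothetical feature as computed in the training pipeline.
def train_avg_item_price(rec: Record) -> float:
    return rec["basket_total"] / rec["item_count"]


# Hypothetical serving implementation with a subtle difference:
# it rounds the basket total to whole currency units first.
def serve_avg_item_price(rec: Record) -> float:
    return round(rec["basket_total"]) / rec["item_count"]


samples = [
    {"basket_total": 20.0, "item_count": 4},   # both paths agree
    {"basket_total": 19.99, "item_count": 3},  # rounding makes them differ
]
skewed = find_skew(train_avg_item_price, serve_avg_item_price, samples)
for rec, train_value, serve_value in skewed:
    print(f"skew on {rec}: train={train_value:.4f} serve={serve_value:.4f}")
```

A check like this only catches skew on the records it is given. The more durable fix is usually to have training and serving import the same feature function, so there is only one implementation to get wrong.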
Reliability matters more than model accuracy
The article’s strongest claim is also the most operationally important: building the model is only a slice of the work. Nys argues that model creation is about 20% of the effort while operating it is the other 80%. That ratio matches how production systems behave. The expensive part is not the first successful run; it is keeping the system healthy after data changes, traffic grows, or business rules shift.
Monitoring makes that reality visible. A model can maintain good offline metrics and still fail because the input distribution changed after a product launch, a market shift, or a new user cohort. Without data drift, concept drift, and infrastructure monitoring, the team notices only after revenue, risk, or customer experience has already taken the hit. MLOps makes degradation observable before it becomes a business incident.
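One way to make input drift observable is the Population Stability Index (PSI), sketched below in Python using only the standard library. PSI is a swapped-in example technique, not something the article prescribes. The bin count, the smoothing constant, and the 0.25 alert threshold are assumptions: the threshold is a common rule of thumb, not a standard.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index of `actual` against the `expected` baseline.

    Bins are equal-width over the baseline's range; values outside that
    range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic data for illustration only.
baseline = [i / 100 for i in range(100)]   # feature values seen in training
shifted = [x + 0.5 for x in baseline]      # same feature after a cohort shift

ALERT_THRESHOLD = 0.25  # assumed rule-of-thumb cutoff, tune per feature
score = psi(baseline, shifted)
if score > ALERT_THRESHOLD:
    print(f"input drift alert: PSI={score:.2f}")
```

PSI only watches the input distribution. Concept drift, where the relationship between inputs and labels changes, needs labeled outcomes or a proxy metric to detect.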
Reproducibility and governance are not bureaucracy
Another reason MLOps matters is that production ML needs traceability. If a model underperforms, teams must know which version is deployed, what data trained it, what parameters were used, and how it performed in validation. That is not paperwork for its own sake. It is the only way to roll back safely, compare versions, and satisfy compliance requirements in regulated environments.
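The traceability described above can be sketched as a small in-memory registry in Python. Everything here is hypothetical: the class names, the fields, and the sample values are illustrative, and a real team would more likely use an existing model registry than write one.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


def fingerprint(data: bytes) -> str:
    """Content hash that identifies exactly which training data was used."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class ModelRecord:
    version: str
    training_data_sha256: str
    params: dict
    validation_metrics: dict


class Registry:
    def __init__(self) -> None:
        self._records: Dict[str, ModelRecord] = {}
        self._history: List[str] = []  # deployed versions, newest last

    def register(self, record: ModelRecord) -> None:
        if record.version in self._records:
            raise ValueError(f"version {record.version} already registered")
        self._records[record.version] = record

    def deploy(self, version: str) -> None:
        if version not in self._records:
            raise KeyError(f"unknown version {version}")
        self._history.append(version)

    def deployed(self) -> ModelRecord:
        return self._records[self._history[-1]]

    def rollback(self) -> ModelRecord:
        """Return to the previously deployed version."""
        if len(self._history) < 2:
            raise RuntimeError("no earlier deployment to roll back to")
        self._history.pop()
        return self.deployed()


# Illustrative values only.
registry = Registry()
registry.register(ModelRecord("v1", fingerprint(b"january rows"),
                              {"max_depth": 4}, {"auc": 0.90}))
registry.register(ModelRecord("v2", fingerprint(b"february rows"),
                              {"max_depth": 6}, {"auc": 0.91}))
registry.deploy("v1")
registry.deploy("v2")
restored = registry.rollback()
print(restored.version, restored.validation_metrics)
```

The point of the sketch is that rollback is a lookup, not an investigation: the previous version, its data fingerprint, and its validation numbers are already on record when something goes wrong.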

The same applies to retraining. Manual notebook reruns do not create dependable systems. Automated pipelines, pinned dependencies, experiment tracking, and model registries do. A scheduled or drift-triggered retraining process is only useful when it is tested, approved, and reversible. In other words, governance is what makes ML change safe enough to ship repeatedly.
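A retrained model being "tested, approved, and reversible" can be expressed as an explicit promotion gate. The Python sketch below is one possible shape for such a gate. The field names, the single validation score, and the 0.01 regression tolerance are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Candidate:
    version: str
    validation_score: float          # higher is better in this sketch
    tests_passed: bool               # pipeline and data tests
    approved_by: Optional[str]       # reviewer, None if not yet approved
    previous_version: Optional[str]  # rollback target, None if there is none


def promotion_decision(candidate: Candidate, production_score: float,
                       max_regression: float = 0.01) -> Tuple[bool, List[str]]:
    """Return (promote?, reasons for blocking). An empty list means promote."""
    reasons = []
    if not candidate.tests_passed:
        reasons.append("pipeline tests failed")
    if candidate.approved_by is None:
        reasons.append("no human approval recorded")
    if candidate.previous_version is None:
        reasons.append("no previous version to roll back to")
    if candidate.validation_score < production_score - max_regression:
        reasons.append("validation score regressed beyond tolerance")
    return (not reasons, reasons)


# Illustrative candidates produced by a scheduled or drift-triggered retrain.
good = Candidate("v3", 0.91, True, "reviewer-a", "v2")
bad = Candidate("v4", 0.85, True, None, "v2")

print(promotion_decision(good, production_score=0.90))
print(promotion_decision(bad, production_score=0.90))
```

Returning the blocking reasons, rather than a bare boolean, keeps the gate auditable: the pipeline log shows why a retrained model was held back.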
The counter-argument
The best objection is practical: many organizations do not need a full MLOps stack. If the model is experimental, low-stakes, or rarely updated, heavy infrastructure can slow teams down and waste money. For a small team, managed services and minimal automation are often the right first step. The article itself acknowledges this by recommending incremental adoption and warning against trying to build everything at once.
There is also a real cost to overengineering. Feature stores, orchestration platforms, model registries, and custom monitoring can become a pile of tools that no one owns. If the use case is not business critical, that complexity is unjustified.
That objection is correct about scope, not about principle. The answer is not to skip MLOps; it is to right-size it. Even a modest setup needs experiment tracking, versioning, and basic monitoring. Once a model affects revenue, risk, or customer experience, operational discipline stops being overhead and becomes the minimum cost of using ML responsibly.
What to do with this
If you are an engineer, start with experiment tracking, model versioning, and a simple deployment path before chasing advanced tooling. If you are a PM, treat monitoring and rollback as product requirements, not implementation details. If you are a founder, budget for the unglamorous work: data quality, ownership, governance, and maintenance. ML that cannot be operated is not a product; it is a prototype with a dashboard.