November 22, 2022

Your MLOps process is probably broken — here’s how to fix it


Even with a perfect stack of tools, teams still struggle to deliver ML products. So if tools aren’t the only piece of the puzzle, what does that leave? In my last post, Another tool won’t fix your MLOps problems, I argued the remaining pieces are:

  • Culture
  • Process

Let’s dive into the MLOps process — in particular, what I think most teams get wrong. What does a successful MLOps process look like, and how can individual ML practitioners help to build that process?

TL;DR

  • Start with a product, not a model
  • Survey the data in production, not in your warehouse
  • Start simple — with data and models
  • Partner with engineers


Start with an ML product


Maybe the most important practice that enables successful ML projects is to design a product, not a model. One of the biggest pitfalls I’ve seen across dozens of companies is to hand “projects” to data teams, instead of involving them in the product design phase.


To build a successful ML product, three stakeholders need to be involved in designing the product:

  • PM / Business Stakeholder: What does success look like?
  • ML Person: What is (likely) possible with ML?
  • Product Engineer: What is feasible, what are the constraints?

Most important: these three personas need to be highly aligned throughout the development of an ML product. When ML projects fail, it’s typically due to an alignment problem!

Some examples of poorly aligned teams:

  • ML person optimizes for model accuracy (instead of business outcomes!)
  • Projects are started that may not be feasible to solve with ML
  • The model doesn’t meet performance constraints in production
  • The features are challenging or impossible to compute in production

Some examples of good alignment:

  • ML person understands the tradeoffs of accuracy vs. time to market
  • Monitoring built on day 0 to ensure business outcomes are consistently measured
  • Engineer helps ML person understand the production data landscape
  • Model SLAs are clearly defined and measured (a minimal sketch of such a check follows this list)
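
To make the last two bullets concrete, here’s a minimal sketch of what “clearly defined and measured” can look like in code, assuming a latency budget agreed with the product engineer and a business metric agreed with the PM. The metric names, thresholds, and values are hypothetical:

# Hypothetical day-0 check: measure the model against SLAs agreed upfront by the
# PM (business outcome) and the product engineer (latency budget).
from dataclasses import dataclass

@dataclass
class ModelSLA:
    max_p95_latency_ms: float   # agreed with the product engineer
    min_conversion_rate: float  # agreed with the PM / business stakeholder

def check_sla(sla: ModelSLA, p95_latency_ms: float, conversion_rate: float) -> list:
    """Return a list of SLA violations; an empty list means the model is healthy."""
    violations = []
    if p95_latency_ms > sla.max_p95_latency_ms:
        violations.append(f"p95 latency {p95_latency_ms:.0f}ms exceeds {sla.max_p95_latency_ms:.0f}ms")
    if conversion_rate < sla.min_conversion_rate:
        violations.append(f"conversion {conversion_rate:.2%} is below {sla.min_conversion_rate:.2%}")
    return violations

# In a real system these numbers would come from your metrics store.
print(check_sla(ModelSLA(max_p95_latency_ms=200, min_conversion_rate=0.03),
                p95_latency_ms=240, conversion_rate=0.041))

The specific numbers don’t matter; what matters is that each stakeholder has a number they agreed to before the model shipped.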

Production is never as easy to navigate as a data warehouse. (image generated by DALL-E)

Survey the data in production


This is an example of “good alignment” in the last section, but it deserves its own section. Almost all ML builders I’ve seen start their ML project with a survey of the available data. The problem? They typically survey the data that is available for training, not the data that will be available in production.


Shouldn’t all of the data available for training be available in a production system?


Most of the time the answer is yes, but with a bunch of asterisks. How quickly is that data available? How fresh is that data? How much preprocessing needs to be done on the prod data to make it consumable? Who owns that data?


So many ML projects stall out because of issues with production data. I’ve seen over and over again that there is a huge disconnect between the ML person and the product engineer. Consider two innocuous-looking features with dramatically different production requirements (a rough sketch of each follows this list):

  • A user’s home zip code: probably dirt simple to use in production. Query a database.
  • A user’s average location in the last five minutes: probably a PITA! Streaming? How fresh does it need to be? Streaming aggregations are hard!
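
To make the gap concrete, here’s a rough sketch of both features in Python. The table name, schema, and windowing details are hypothetical; the point is the difference in moving parts, not any particular tool:

# Feature 1: a user's home zip code is a single point lookup against a database.
import sqlite3

def home_zip_code(conn: sqlite3.Connection, user_id: int):
    row = conn.execute("SELECT zip_code FROM users WHERE id = ?", (user_id,)).fetchone()
    return row[0] if row else None

# Feature 2: a user's average location over the last five minutes needs a stream of
# location events, a sliding time window, and answers to questions like "how fresh
# does this need to be?" and "who owns the event stream?".
from collections import deque
from time import time

class RollingAverageLocation:
    """In-memory sliding-window average. A production version would also need a
    stream processor, state that survives restarts, and late-event handling."""
    def __init__(self, window_seconds=300.0):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, lat, lon)

    def add(self, lat, lon, timestamp=None):
        self.events.append((timestamp if timestamp is not None else time(), lat, lon))

    def average(self):
        cutoff = time() - self.window_seconds
        while self.events and self.events[0][0] < cutoff:
            self.events.popleft()
        if not self.events:
            return None
        lats = [lat for _, lat, _ in self.events]
        lons = [lon for _, _, lon in self.events]
        return sum(lats) / len(lats), sum(lons) / len(lons)

The first feature is one query; the second drags in event streams, windowing, freshness, and ownership questions before you’ve trained anything.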


Spend extra time understanding what data is available in production and what constraints apply to that data. It’ll save you months on the way to a functional solution.

Fast and simple is always the right starting point. (image generated by DALL-E)

Start simple


This is probably the most common ML advice, but it’s good advice. Start with a simple solution.


My contribution — most folks will tell you to start with a simple model, but it’s equally important to start with simple data! To play off of the example above:

  • A user’s average location in the last five minutes: hard
  • A user’s most recent location: probably much easier!


You may discover, when building your model, that “a user’s average location in the last five minutes” produces a more accurate model, but you may be able to get a model built with “a user’s most recent location” into production two weeks earlier.


You probably take that trade every time. You can always build a V2 with the fancier feature, and it will be a lot easier to build incremental improvements than to ship something complicated the first time.
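
As a sketch of what that V1 could look like end to end: the simple feature, a plain logistic regression, and one honest metric. The file, column names, and label below are hypothetical stand-ins for whatever your product actually predicts:

# V1: simple data (most recent location) plus a simple model (logistic regression).
# The windowed five-minute feature can wait for V2, once V1 is shipped and monitored.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

df = pd.read_parquet("training_snapshot.parquet")  # hypothetical training snapshot
X = df[["most_recent_lat", "most_recent_lon"]]     # the feature you can serve today
y = df["converted"]                                 # the outcome the PM cares about

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("V1 AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

When the fancier windowed feature proves worth the streaming work, it slots into the same pipeline as a V2, and you already have a baseline number to beat.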

Partner with engineers

Odds are, if you’re a data scientist, not all of the questions posed above are obvious to answer. When I was last hands-on building ML models, I had no idea what streaming data was (let alone how to think about it).


The solution is to become friends with engineers and work with them throughout the development of an ML model. No software project should be built in a silo, and ML in a silo is even worse. Engineers can help with “what data is available,” “what constraints should I know about,” “what SLAs are feasible,” and more.


Work with an engineer early and often. You’ll build projects way faster.

Conclusion


These steps aren’t a comprehensive view of an MLOps process — there are a lot of moving pieces that lead to success (code reviews, CI/CD, monitoring, …). This is a starting point. As I mentioned above, most of the ML failures I’ve seen are alignment issues. These process guidelines are primarily meant to help you align your team for success.


You need a strong foundation to build an exceptional MLOps practice.


David Hershey is an investor at Unusual Ventures, where he invests in machine learning and data infrastructure. David started his career at Ford Motor Company, where he founded their ML infrastructure team. Recently, he worked at Tecton and Determined AI, helping MLOps teams adopt those technologies. If you’re building a data or ML infrastructure company, reach out to David on LinkedIn.
