MLOps [3] - What is MLOps?
This video and post are part of a One Dev Question series on MLOps - DevOps for Machine Learning. See the full video playlist here, and the rest of the blog posts here.
Now we get to the core question! What even is "MLOps"? Have we invented a new term for something everyone is already doing? Is it just DevOps applied blindly to ML projects? Surely there are some important differences!?
MLOps sounds like just another buzzword, but it presents some important concepts and practices for maturing your machine learning projects.
MLOps, or "DevOps for Machine Learning", is really just the practice of applying DevOps strategies, techniques, and ideas to machine learning projects rather than to traditional software projects.
Over the past decade or so, DevOps has represented a way of ensuring high-quality software is released into production systems as quickly and safely as possible. It has driven an increase in maturity in so many organisations, and has been shown to dramatically improve not just software delivery, but an organisation's overall performance.
But before talking about how we apply DevOps principles to ML projects, let's have a look at Microsoft's definition for DevOps:
DevOps is the union of people, process, and products to enable continuous delivery of value to the end user.
The emphasis is mine, but value is the core of this definition. Without knowing whether you're delivering value, you're just guessing whether all the work you're doing is improving your product or not.
It's exactly the same for a predictive model you're deploying to a production system.
Let's take an image recognition model as an example. If you've made a change you think might improve the accuracy of that model in production, it's extremely powerful to be able to commit that change to source control and have an automated pipeline kick off a training run, model validation, packaging into a portable format, and deployment to a test or production system.
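To make that flow concrete, here's a minimal sketch of those pipeline stages as plain Python functions. The function names, the toy threshold "model", and the accuracy gate are all illustrative assumptions standing in for real CI/CD tasks and training code, not any specific tool's API:

```python
def train(data):
    """Stand-in for a training run: 'fit' a trivial threshold classifier."""
    # Use the smallest positively-labelled value as the decision threshold.
    positives = [x for x, label in data if label == 1]
    return {"threshold": min(positives)}

def validate(model, data, min_accuracy=0.8):
    """Quality gate: only models above a minimum accuracy may proceed."""
    correct = sum(
        1 for x, label in data
        if (x >= model["threshold"]) == bool(label)
    )
    return correct / len(data) >= min_accuracy

def package(model, version):
    """Stand-in for packaging into a portable format (e.g. ONNX, a container)."""
    return {"model": model, "version": version}

def deploy(artifact, environment):
    """Stand-in for deployment to a test or production environment."""
    return f"{artifact['version']} deployed to {environment}"

def run_pipeline(data, version):
    """The end-to-end flow a commit would trigger."""
    model = train(data)
    if not validate(model, data):
        raise RuntimeError("Validation failed; deployment blocked.")
    artifact = package(model, version)
    return deploy(artifact, "test")
```

The important idea is the shape, not the toy model: each stage hands its output to the next, and the validation step can stop a bad model from ever reaching deployment.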
Most importantly, as a data scientist, your day-to-day focus can be on the changes needed to improve that model. You don't need to do the "how do we get this into production" dance!
But it's not just about automation. Having all your code in source control means you can work more closely with your team (especially if everyone works from home!), and with the right tooling, you have a historical record of what changed, and the results of that change.
Once a training run has finished, you can use techniques commonly used in traditional software development to ensure your changes actually provide value in a production environment. Gradual rollout techniques like canary deployments, rolling deployments, using deployment rings, and even A/B testing of models or shadow deployments can be extremely useful.
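As a taste of how a canary deployment works under the hood, here's a minimal routing sketch. The hashing scheme and the 10% canary fraction are illustrative assumptions; in practice a load balancer, service mesh, or your serving platform handles this routing for you:

```python
import hashlib

def choose_model(user_id: str, canary_fraction: float = 0.10) -> str:
    """Deterministically route a stable slice of users to the new model."""
    # Hash the user id so the same user always lands in the same bucket,
    # and therefore always sees the same model version.
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "model-v2-canary" if bucket < canary_fraction * 100 else "model-v1"
```

As production metrics confirm the new model is delivering value, you ramp `canary_fraction` up towards 1.0; if they don't, you drop it back to zero with no redeployment needed.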
Of course, not everything we do in DevOps for traditional software engineering can be applied blindly to ML projects. There are some significant differences between ML projects and traditional projects that need some consideration. We'll talk about them in more depth in a future post.
One tool that's extremely good at managing the MLOps lifecycle is Azure Machine Learning. As well as features that make creating models easier than ever, it can manage pipelines, track experiment and model versions, and even package and deploy to production.
If you want to get started with Azure Machine Learning, the best place to get hands-on experience (for free) is Microsoft Learn. The "Build AI solutions with Azure Machine Learning" learning path is an awesome in-depth resource.
And of course, there's the usual collection of comprehensive documentation on Microsoft Docs to answer any additional questions you have along the way.