What is MLOps, and why is it important? Part 1

Introduction

MLOps stands for Machine Learning Operations. It is an engineering discipline that aims to unify the development and deployment of Machine Learning systems in order to streamline the delivery of high-performing models in production.

In the end, it is all about deploying a model that brings value! But we will see that doing this consistently, with the same precision and coordination that DevOps has brought to software, is a really hard problem.

But don't worry: today, many companies put a lot of effort into simplifying those processes, to help you mature your machine learning work in a lasting way.

Speaking of maturity, a widely used model distinguishes 3 levels of MLOps maturity:

  • Lvl 0 is about building and deploying models manually, at every step and for every iteration.
  • Lvl 1 is getting more interesting and is mostly about Continuous Training, achieved by automating the ML pipeline.
  • Lvl 2 is when your ML pipelines look just like your code pipelines, with DevOps practices such as Continuous Integration and Continuous Delivery.
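
To make the jump from Lvl 0 to Lvl 1 a bit more concrete, here is a minimal sketch of what "automating the ML pipeline" can mean: every step becomes a function, and one entry point chains them so that a retrain can be triggered without manual work. Everything in it is an assumption made for illustration (the step names, the toy one-parameter model, the `max_error` gate); it is not a specific tool's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Row = Tuple[Optional[float], Optional[float]]


@dataclass
class PipelineResult:
    model: Callable[[float], float]
    validation_error: float
    deployed: bool


def prepare(raw_rows: List[Row]) -> List[Tuple[float, float]]:
    """Data preparation step: drop rows with a missing feature or label."""
    return [(x, y) for x, y in raw_rows if x is not None and y is not None]


def train(rows: List[Tuple[float, float]]) -> Callable[[float], float]:
    """Training step: fit the toy model y = slope * x by least squares.

    Assumes at least one row has a non-zero x.
    """
    slope = sum(x * y for x, y in rows) / sum(x * x for x, _ in rows)
    return lambda x: slope * x


def evaluate(model: Callable[[float], float], rows: List[Tuple[float, float]]) -> float:
    """Evaluation step: mean absolute error on held-out rows."""
    return sum(abs(model(x) - y) for x, y in rows) / len(rows)


def run_pipeline(raw_rows: List[Row], max_error: float) -> PipelineResult:
    """Chain all steps; the model is only marked as deployed if it passes the gate."""
    rows = prepare(raw_rows)
    split = max(1, int(len(rows) * 0.8))
    train_rows, val_rows = rows[:split], rows[split:]
    model = train(train_rows)
    error = evaluate(model, val_rows or train_rows)
    return PipelineResult(model, error, deployed=error <= max_error)


if __name__ == "__main__":
    new_data = [(1, 2), (2, 4), (None, 3), (3, 6), (4, 8), (5, 10)]
    result = run_pipeline(new_data, max_error=0.1)
    print(result.validation_error, result.deployed)
```

At Lvl 0 each of these steps would be run by hand, typically from a notebook. At Lvl 1, something like `run_pipeline` would be called by a scheduler or whenever new data arrives.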

What is NOT MLOps

Now that we have introduced what MLOps is, I want to point out what MLOps is NOT.

Just two examples to illustrate this:

MLOps is not about forcing tools that were never designed for production into production, such as finding obscure ways to deploy Jupyter Notebooks so that data scientists can keep their everyday way of working.

Notebooks are a really important and unavoidable tool when you start dealing with data science in general. But even if some companies like Netflix tell amazing stories about how their processes are notebook-focused and how they manage pipelines with notebooks in them, this is not for everyone: they have put a lot of work into building their own tools and platform so they can do so.

So keep in mind that trying to "productionize" notebooks is generally a bad idea.

The second one is monitoring. If you monitor a lot of things about your models, but never look at some of them, most of them are unrelated to your business Key Performance Indicators, and all you really check is that your model is alive, then there are many important insights that you are not taking advantage of! This is the “Act now, reflect never” way of thinking, and it prevents you from going further in your MLOps journey.

MLOps lvl 0

Let’s start with the first level of MLOps maturity. Do not worry if you recognize yourself in some of these elements: it’s normal, and not many people or companies worldwide have reached the other levels. This article may help you reflect on those points and see how you can improve your Machine Learning game.

The pattern we saw in many companies, even those where we started out as ML Engineers, was that the business people came with a ‘Client Problem’ or a ‘Brand new Project’ that had to involve AI. They handed the requirements to the data science team, which then had to work in its corner for a long time to find and clean data, and then spend weeks trying to optimize a model until it reached a certain percentage of accuracy.

What defines this level 0 the most is the manual way of doing everything in the so-called ‘pipeline’.

But then what?

Usually, what happens is that the clients or the projects have more complicated needs than those anticipated at the beginning and, because of the iterative, test-and-learn nature of AI, everything has to be done again from the beginning. This is not really in line with the Agile practices that usually govern project and software management.

Another characteristic of this level is that AI models are usually deployed when they can be deployed, not when they should be, which makes a huge difference in how well they solve real-life business problems.

Of course, the testing of Machine Learning scripts has always been far behind the testing of software, and the tests, if there are any, usually live inside the script itself.
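
As an illustration of what moving tests out of the script could look like, here is a minimal sketch: a small preprocessing function plus two standalone tests written as plain `test_` functions, the naming convention that pytest discovers. The function `min_max_scale` and the test names are hypothetical examples, not code from a real project.

```python
from typing import List


def min_max_scale(values: List[float]) -> List[float]:
    """Scale values linearly into [0, 1]; a constant feature maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Avoid dividing by zero when every value is identical.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_scaled_values_span_unit_range():
    scaled = min_max_scale([3.0, 7.0, 5.0])
    assert min(scaled) == 0.0
    assert max(scaled) == 1.0


def test_constant_feature_does_not_divide_by_zero():
    assert min_max_scale([4.0, 4.0]) == [0.0, 0.0]


if __name__ == "__main__":
    # The tests can also be run directly, without a test runner.
    test_scaled_values_span_unit_range()
    test_constant_feature_does_not_divide_by_zero()
    print("all tests passed")
```

Because the tests are separate functions instead of checks buried in a training script, they can run on every change, which is a first step towards the Continuous Integration practices mentioned for Lvl 2.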

Continuous Deployment is rarely considered, or is considered unattainable, mostly because of the manual steps I was talking about and because there is no need for large-scale, high-frequency deployments. And finally, the overall ML system is rarely monitored, or is monitored using anti-patterns.

To wrap up on this level: when you deploy ML models this way, the whole system cannot adapt to real-world changes as they occur, and sadly the models often end up being irrelevant.
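
To give an idea of what "adapting to real-world changes" can involve, here is a minimal sketch of a drift check: it compares a feature's recent production values with the values seen at training time and flags the model for retraining when the mean has moved too far. The function names, the metric (shift of the mean, measured in training standard deviations) and the threshold of 2.0 are all assumptions chosen for illustration; real systems often rely on more robust statistical tests.

```python
from statistics import mean, stdev
from typing import Sequence


def mean_shift_in_std(reference: Sequence[float], live: Sequence[float]) -> float:
    """How far the live mean has moved, in units of the reference standard deviation."""
    ref_std = stdev(reference)
    shift = abs(mean(live) - mean(reference))
    if ref_std == 0:
        # A constant reference feature: any change at all counts as infinite drift.
        return 0.0 if shift == 0 else float("inf")
    return shift / ref_std


def needs_retraining(
    reference: Sequence[float], live: Sequence[float], threshold: float = 2.0
) -> bool:
    """Flag the model when the feature has drifted past the (assumed) threshold."""
    return mean_shift_in_std(reference, live) > threshold


if __name__ == "__main__":
    training_values = [10, 11, 9, 10, 12, 8]
    print(needs_retraining(training_values, [10, 11, 9, 10]))   # similar data
    print(needs_retraining(training_values, [15, 16, 14, 15]))  # shifted data
```

In a Lvl 0 setup nobody runs a check like this, so a shift in the data goes unnoticed until the business results degrade.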

In our next articles we will talk about Lvl 1 MLOps, that is to say, continuous training :) We'll go in depth into the Picsellia built-in features that could help you level up your MLOps game!
