ML Engineering and ML Ops – what’s the buzz about?

November 24, 2021


The term ‘ML Engineering’ has exploded in popularity over the past few years, and the role is touted as the ‘hottest job’ in technology. However, the term is nebulous and is used in a variety of ways. Recently, it has become commonplace to add it to a job title simply to make the job appear more attractive. This post offers some clarification. We hope it helps you understand what the term actually means and make decisions about developing your career in this domain.

Machine Learning (ML) is a branch of Artificial Intelligence in which algorithms learn to perform specific tasks (such as making predictions or decisions) from data in an automated manner, i.e., without explicit human instructions on how to perform those tasks. These are typically complex statistical algorithms that operate on very large data sets, so making them successful in practice requires fairly elaborate engineering. The engineering associated with ML is what we call "ML Engineering".

Figure: Machine Learning Project Lifecycle

However, this definition is too abstract, as there are different forms of engineering associated with ML. When exploring or developing a career in "ML Engineering", it pays to split the term into two buckets, because they require fairly different technical backgrounds and the nature of the engineering work also tends to be fairly different. Hence, we categorize "ML Engineering" into the following two buckets:

  • Engineering for Development of ML Models: This is the engineering work required to train an ML model, experiment with different choices of models and hyperparameters (known as "hyperparameter tuning"), experiment with different choices of features used to train the model (known as "feature engineering"), and invest in the software engineering needed for model efficiency, reusability, and readability.

This engineering work is typically highly entangled with the mathematical and statistical aspects of ML model development. So, it's impossible to do this type of engineering work without a deep understanding of the mathematics of ML (for example, cross-entropy, gradient descent, embeddings, etc.).

Fortunately, there are several good books and videos (plus open-source code) for learning this topic. This material typically teaches the mathematics and the engineering of ML model development together, precisely because the two are so entangled.
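To make the entanglement of mathematics and engineering a little more concrete, here is a minimal sketch of hyperparameter tuning in plain Python. It is our own illustration, not taken from any of the resources mentioned in this post: the synthetic data, the one-feature logistic model, and the list of candidate learning rates are all assumptions chosen to keep the example small. It trains the model by gradient descent on the cross-entropy loss and keeps the learning rate that gives the lowest loss on held-out validation data.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def cross_entropy(w, b, data):
    """Mean binary cross-entropy of a one-feature logistic model."""
    eps = 1e-12  # avoids log(0)
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def train(data, learning_rate, steps=200):
    """Fit w and b by plain batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean cross-entropy with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in data) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def make_data(n, rng):
    """Synthetic data (an assumption): label is 1 when a noisy copy of x is positive."""
    data = []
    for _ in range(n):
        x = rng.uniform(-3, 3)
        y = 1 if x + rng.gauss(0, 0.5) > 0 else 0
        data.append((x, y))
    return data


rng = random.Random(0)
train_set = make_data(200, rng)
valid_set = make_data(100, rng)

# Hyperparameter tuning: try several learning rates and keep the one
# with the lowest cross-entropy on the held-out validation set.
candidates = [0.001, 0.01, 0.1, 1.0]
results = {}
for lr in candidates:
    w, b = train(train_set, lr)
    results[lr] = cross_entropy(w, b, valid_set)

best_lr = min(results, key=results.get)
print("validation loss per learning rate:", results)
print("best learning rate:", best_lr)
```

Even in this toy setting, choosing the candidates and reading the results requires knowing what the loss measures and how the step size affects convergence, which is why the mathematics and the engineering are hard to separate in this bucket.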

  • Engineering for Deployment of ML Models: This is the engineering work required to deploy and support ML model training pipelines and inference in production. This area is sometimes called "MLOps". Although we used the somewhat narrow term "Deployment", many aspects of this work concern real-time performance in production. These include caching for immediate inference, ensuring the right model version is used, debugging production errors reliably and quickly, and collecting and assessing performance metrics to feed back into the model.

Much of the engineering work involved here strongly resembles traditional software deployment, which an engineer without a background in ML should be very familiar with. In fact, this is why this area is a much easier entry point into the world of ML for an engineer. We'd argue that one can do the "MLOps" job with only a surface-level understanding of ML, as long as one has significant experience in the traditional engineering world of "DevOps".
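The deployment concerns listed above (caching, model versioning, error tracking, and metrics) can be sketched with nothing but traditional software engineering. The toy server below is our own hypothetical illustration: `ModelServer` and its methods are not part of any real library, and the "models" are simple stand-in functions. A production system would rely on dedicated serving, registry, and monitoring tools.

```python
import time
from collections import OrderedDict


class ModelServer:
    """Toy in-process model server (hypothetical, for illustration only)."""

    def __init__(self, cache_size=128):
        self._models = {}            # version -> callable model
        self._active = None          # version currently serving traffic
        self._cache = OrderedDict()  # (version, features) -> prediction
        self._cache_size = cache_size
        self.metrics = {"requests": 0, "cache_hits": 0, "errors": 0, "latency_s": []}

    def register(self, version, model):
        self._models[version] = model

    def activate(self, version):
        # Ensure the right model version is used: only known versions can serve.
        if version not in self._models:
            raise KeyError(f"unknown model version: {version}")
        self._active = version

    def predict(self, features):
        start = time.perf_counter()
        self.metrics["requests"] += 1
        # The version is part of the cache key, so a new version never
        # returns predictions cached by an older one.
        key = (self._active, tuple(features))
        try:
            if key in self._cache:
                self.metrics["cache_hits"] += 1
                self._cache.move_to_end(key)
                prediction = self._cache[key]
            else:
                prediction = self._models[self._active](features)
                self._cache[key] = prediction
                if len(self._cache) > self._cache_size:
                    self._cache.popitem(last=False)  # evict least recently used
            # Returning the version with each prediction helps when debugging.
            return {"version": self._active, "prediction": prediction}
        except Exception:
            self.metrics["errors"] += 1
            raise
        finally:
            self.metrics["latency_s"].append(time.perf_counter() - start)


# Stand-in "models": simple threshold rules instead of trained models.
server = ModelServer()
server.register("v1", lambda f: sum(f) > 1.0)
server.register("v2", lambda f: sum(f) > 2.0)

server.activate("v1")
first = server.predict([0.9, 0.6])
second = server.predict([0.9, 0.6])  # same input and version: served from cache

server.activate("v2")
third = server.predict([0.9, 0.6])   # new version: computed again

print(first, second, third)
print(server.metrics["requests"], server.metrics["cache_hits"])
```

Nothing in this sketch requires knowing how the model was trained, which illustrates why an engineer with a DevOps background can be productive here quickly.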

We’d now like to suggest some reading and coding material for learning about the world of “MLOps”. We present it in three layers, starting with a quick, introductory read and ending with an entire hands-on course that will train a traditional software engineer in the world of “MLOps”.

  1. https://stackoverflow.blog/2020/10/12/how-to-put-machine-learning-models-into-production/ is a short blog post on deploying ML models in production (introductory content)
  2. https://mlinproduction.com/deploying-machine-learning-models/ is a series of blog posts that breaks the world of “MLOps” into its different aspects and explains each of them in some detail.
  3. We all know that an engineer truly understands a subject only by “doing”. Hence, we recommend a wonderful course taught in the Computer Science department at Stanford (Disclaimer: We might be a bit biased here with the university choice): https://stanford-cs329s.github.io/index.html. The good news is that you don’t have to be a Stanford student to learn this material. All the lecture notes and slides are openly available. More importantly, the GitHub repo https://github.com/mrdbourke/cs329s-ml-deployment-tutorial is the codebase that serves as the tutorial throughout this course. We want to emphasize that you can’t learn MLOps by simply reading a textbook. You have to write code to actually deploy and test an ML model in order to truly grasp this subject. We hope you enjoy this coding experience!

We hope this article has provided some clarity on ML Engineering and ML Operations and how you can get started on this journey. There is a world of information and resources available no matter which career path you take – it’s about understanding how this could make a difference to your engineering career and deciding what’s right for you.

Authors

Anupama Joshi, Senior Director, Technology, Target
Ashwin Rao, Vice President, AI, Target


