Podcast Episode Overview: What Machine Learning Engineers Need to Know


The original podcast episode can be found here: O'Reilly Media: What Machine Learning Engineers Need to Know

You may have noticed job ads for a position called Machine Learning Engineer (MLE) and wondered what it is. The term is new, so there's some confusion around it.

In this episode, Ben Lorica interviewed two people on this topic. They discussed whether it's actually a thing or just a fad and in which cases it makes sense to hire (or be) a Machine Learning Engineer.

What are MLEs?

There's no consensus as of now.

  • Most people think of them as a hyper-specialized role: someone whose sole responsibility is putting ML systems into production, thereby bridging the gap between data scientists and data engineers.

    The rationale is that they are needed because data scientists aren't suited to writing the systems side of products, and data engineers lack the necessary skills for it too.

"I see a Machine Learning Engineer as someone who knows enough machine learning to get a product out; but they still need help from a data scientist for fine-tuning." Ben Lorica

  • However, some people think of Machine Learning Engineers as a generalist role (much like Full-Stack Engineers in the world of Web Development), i.e. someone who knows enough about the whole data science pipeline to get a working product out the door.

    In other words, someone who knows enough data science, data engineering and general software engineering to work on all parts of the product.

Domain expertise

While the data scientist is the role closest to the actual business problems to be solved, Machine Learning Engineers are expected to be better at translating business objectives for systems engineers and programmers.

Role of uncertainty

It's been said that engineers, in general, have difficulty dealing with the ambiguity and uncertainty common in many businesses, especially those where data science permeates the decision-making process.

"If you just have pure risk, you buy insurance. But uncertainty is where you're going to find the upside." Paco Nathan

Once again, MLEs are expected to bridge the gap between the precise definitions needed in systems engineering and the softer, more abstract notions of the data science world (think likelihoods, confidence intervals and continuous distributions).

Pipeline maturity

Whether MLEs are a genuinely needed role will probably depend a lot on the scale you are working at and how mature your data pipeline is.

If you have a mature pipeline (e.g. model monitoring, automation, off-the-shelf feature stores, smooth and well-tested data engineering processes) then having an MLE (in the sense of someone who is just focused on operationalizing models) may make sense because there's actual demand for that specialized role.

"The scramble to get to the point of having a feature store; that's the game. Until you get to that point, your models are very ad hoc." Paco Nathan

On the other hand, if you work at a small startup, it is probably overkill to have someone focused only on productionizing ML systems (unless you think of MLEs as generalist engineers), and having such a role hints at a dysfunctional team.

My 2 cents

I would add that building tools (be they shell scripts, command-line applications or web apps) is also a very important part of data science work in general, and MLEs could be just the role to help with that.


Quotes

"I see a Machine Learning Engineer as someone who knows enough machine learning to get a product out; but they still need help from a data scientist for fine-tuning." Ben Lorica

"First make your batch processes solid, then move into streaming data." Jesse Anderson

"The scramble to get to the point of having a feature store; that's the game. Until you get to that point, your models are very ad hoc." Paco Nathan

"If you just have pure risk, you buy insurance. But uncertainty is where you're going to find the upside." Paco Nathan

This short post is part of the Data Newsletter.

