- What does it mean for a model to fail?
- Generally applicable approach: Multiple lines of defense
- Generally applicable approach: monitoring input and output
- My 2 Cents
This post is largely inspired by the O'Reilly podcast episode "Managing risk in machine learning models".
What does it mean for a model to fail?
Bad predictions due to bad input - The model outputs bad predictions because its input data is missing or faulty.
Bad predictions due to model error - Input data is fine but the model itself is badly trained, overfits, etc.
Model drift - The quality/accuracy of a model's predictions tends to get worse over time, unless the model is re-trained.
Feedback loops - This can happen if the model predictions affect the data that is used to re-train the model itself.
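The first failure mode above (missing or faulty input) is the easiest one to guard against in code. Here is a minimal sketch of such a guard, not a definitive implementation. The feature names, the ranges in `EXPECTED_RANGES` and the `validate_row` helper are all hypothetical and only serve as illustration.

```python
import math

# Hypothetical schema (an assumption for this sketch): feature name -> (min, max).
EXPECTED_RANGES = {"age": (0, 120), "income": (0, 1e7)}


def validate_row(row, expected_ranges=EXPECTED_RANGES):
    """Return a list of problems found in one input row; an empty list means it looks usable."""
    problems = []
    for name, (low, high) in expected_ranges.items():
        value = row.get(name)
        if value is None or (isinstance(value, float) and math.isnan(value)):
            # Absent keys, None and NaN are all treated as missing data.
            problems.append(f"{name}: missing")
        elif isinstance(value, bool) or not isinstance(value, (int, float)):
            # bool is a subclass of int in Python, so it is excluded explicitly.
            problems.append(f"{name}: not numeric")
        elif not low <= value <= high:
            problems.append(f"{name}: {value} outside [{low}, {high}]")
    return problems


# Usage: only call the model when the row passes the checks.
row = {"age": -3, "income": float("nan")}
issues = validate_row(row)
if issues:
    print("Skipping prediction:", issues)
```

Rejecting or flagging such rows before they reach the model does not fix the upstream data problem, but it turns a silent bad prediction into a visible event.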
Generally applicable approach: Multiple lines of defense
The three lines of defense are:
Development time (during the model's construction)
Validation time (after the model is ready)
Monitoring (over time, once the model is in production)
Generally applicable approach: monitoring input and output
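Running statistical tests on both the input data and the predictions can help detect, and sometimes prevent, model failures. For example, you can compare the distribution of a feature (or of the predictions) in production against the distribution seen at training time.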
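As an illustration, here is a minimal sketch of one such test: a two-sample Kolmogorov-Smirnov check comparing a reference sample (for example, a feature at training time) with a current sample from production. The podcast does not prescribe a specific test, so the choice of test is my own assumption. The function names `ks_statistic` and `drift_detected` are hypothetical, and the threshold uses the standard large-sample approximation, so it is less reliable for small samples.

```python
import math


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    n, m = len(ref), len(cur)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(ref[i], cur[j])
        # Step past every value equal to x in both samples so ties are handled.
        while i < n and ref[i] == x:
            i += 1
        while j < m and cur[j] == x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the asymptotic critical value."""
    n, m = len(reference), len(current)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))  # about 1.358 for alpha = 0.05
    threshold = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > threshold


# Usage with deterministic toy data: the second sample is shifted by 50.
training_values = list(range(100))
production_values = [v + 50 for v in range(100)]
print(drift_detected(training_values, production_values))
```

The same check can be run on the model's predictions instead of an input feature. In practice you would likely use a tested library implementation rather than hand-rolled code. The version above only avoids external dependencies.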
My 2 Cents
Judging from the references, there seems to be a heavy focus on documentation.
I think this adds unnecessary hurdles for everyone involved. Modern tools like Jupyter notebooks can be used for training, exploratory analysis and documentation all at once. In addition, all model code should live in a version-controlled repository (on GitHub, for example) anyway, which keeps track of all changes and the current state of the codebase.
"The default notion within the world of engineering is that you should make your model as predictive as possible, as accurate as possible. That’s not always the right default. There’s a tradeoff between predictive power and interpretability." - Andrew Burt
"We need machine learning to monitor machine learning" - Ben Lorica
- Document written by the podcast guests.
- Suggests a specific approach to managing the entire model pipeline.
- Nice introduction to model interpretability, with general research directions and practical advice.
- This is SR 11-7, a document written by the Federal Reserve (the Fed) on how banking organizations should manage model risk.
- It's very readable and quite short.