Practical Tips for More Robust Real-time ML Models

WIP Alert: This is a work in progress. Current information is correct but more content may be added in the future.

Feature vectors used during real-time ML scoring may break for several reasons:

  • Normal model degradation (as time goes by)
  • Operational problems (upstream feature sources break, services time out, etc.)
  • Adversarial attacks

There are many ways to make an ML model more robust in such scenarios. Some are trade-offs that cost some predictive performance; others are pure upside.

Let's see:

Tune feature_fraction, dropout and similar parameters

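Decreasing feature_fraction in gradient-boosted tree algorithms makes each tree train on a random subset of the features, so the model cannot lean on a few very important features in every tree. This spreads their impact across more features, which helps soften the blow if those features get attacked.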
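
To make the mechanism concrete, here is a minimal sketch in plain Python (no boosting library needed). The params dict mimics a LightGBM-style configuration with illustrative, untuned values, and trees_that_can_use is a hypothetical helper written only for this example. It simulates per-tree feature sampling and reports how many trees are even allowed to split on a given feature.

```python
import random

# Hypothetical LightGBM-style parameter dict; values are illustrative, not tuned.
params = {
    "objective": "binary",
    "feature_fraction": 0.5,  # each tree sees a random 50% of the features
}

def trees_that_can_use(feature_idx, n_features, n_trees, feature_fraction, seed=0):
    """Return the share of trees whose random feature subset includes feature_idx."""
    rng = random.Random(seed)
    k = max(1, int(n_features * feature_fraction))  # features sampled per tree
    hits = 0
    for _ in range(n_trees):
        subset = rng.sample(range(n_features), k)  # sample without replacement
        if feature_idx in subset:
            hits += 1
    return hits / n_trees

share = trees_that_can_use(
    feature_idx=0,
    n_features=10,
    n_trees=1000,
    feature_fraction=params["feature_fraction"],
)
print(f"Trees that can split on feature 0: {share:.0%}")
```

With feature_fraction at 1.0 every tree can use the dominant feature; at 0.5 roughly half of them must find signal elsewhere, which is what spreads the importance around.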

Similarly, increasing the dropout rate in neural networks forces the model not to rely on specific features too much.
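
As a rough illustration of why this works, the sketch below implements inverted dropout on an input feature vector in plain Python. The function name inverted_dropout and the choice to apply dropout directly to the input features are assumptions made for this example; in a real network you would normally use the dropout layer provided by your framework.

```python
import random

def inverted_dropout(features, rate, rng):
    """Zero each value with probability `rate`; rescale survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)  # keeps the expected value of each input unchanged
    return [0.0 if rng.random() < rate else x * keep_scale for x in features]

rng = random.Random(42)
x = [0.9, 0.1, 0.4, 0.7]

# Each training step sees a different random subset of the features,
# so the model cannot count on any single one always being present.
for step in range(3):
    print(inverted_dropout(x, rate=0.5, rng=rng))
```

A higher rate hides more features per step, which pushes the network toward redundant representations at the cost of slower or noisier training.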
