paper-summary machine-learning-engineering
Paper Summary: Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift
23 Dec 2019 Summary of the 2019 article "Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift" by Rabanser et al.
Python Unittest Examples: Mocking and Patching
09 Dec 2019 Simple examples to help you understand when/where to use mocking and patching, so you don't need to skip testing any part of your code.
Pandas Dataframe Examples: Duplicated Data
17 Nov 2019 Deal with duplicated data in pandas: drop, count, show, and mark duplicate rows in DataFrames.
paper-summary neural-networks sequence-learning
Paper Summary: Long Short-Term Memory
16 Nov 2019 Summary of the 1997 article "Long Short-Term Memory" by Hochreiter and Schmidhuber.
Spark SQL Case/When Examples
09 Nov 2019 Case/when clauses are useful for mimicking if/else behaviour in SQL, and in Spark via when/otherwise clauses.
Spark SQL Date/Datetime Function Examples
09 Nov 2019 Examples of how to use date and datetime functions for commonly used transformations on Spark SQL DataFrames.
paper-summary machine-learning-engineering
Paper Summary: 150 Successful Machine Learning Models: 6 Lessons Learned at Booking.com
09 Nov 2019 Summary of the 2019 article "150 Successful Machine Learning Models: 6 Lessons Learned at Booking.com" by Bernardi et al.
gnu macos unix linux command-line data-science
Using Command-line Tools for Text Data Preprocessing: Examples and Reference
09 Nov 2019 Use native command-line tools for common tasks related to text preprocessing, like stripping bad characters, normalizing whitespace/newlines, replacing regular expressions, text normalization, etc. They're very fast and work surprisingly well.
Matplotlib Errorbar Examples
06 Nov 2019 Examples of how to plot data together with spread information such as standard deviation or variance.
Git Diff: Reference and Examples
24 Oct 2019 How to compare and show differences between files across branches, commits, etc.