queirozf.com

paper-summary machine-learning-engineering technical-debt

Paper Summary: Hidden Technical Debt in Machine Learning Systems

23 Mar 2020   Summary of the 2015 article "Hidden Technical Debt in Machine Learning Systems" by Sculley et al.

pandas scikit-learn

Scikit-learn Pipelines: Custom Transformers and Pandas integration

08 Mar 2020   Examples and reference on how to write custom transformers and how to build a single sklearn pipeline that combines preprocessing steps with a final classifier, so that you can pass pandas dataframes directly to fit.

numpy statistics

Numpy Sampling: Reference and Examples

07 Mar 2020   Sample from probability distributions and from lists, with and without weights. Examples using Python, Numpy and Scipy.

spark scala

Spark dataframe Examples: Reading and Writing Dataframes

23 Feb 2020   Examples of how to read and write Spark dataframes to and from sources such as S3 and the Databricks file system.

paper-summary machine-learning-engineering software-engineering

Paper Summary: Software Engineering for Machine Learning: A Case Study

25 Jan 2020   Summary of the 2019 article "Software Engineering for Machine Learning: A Case Study" by Amershi et al.

git

Git branching: Reference and Examples

23 Jan 2020   Common use cases and examples for working with branches in Git, both locally and on remotes.

paper-summary attention sequence-learning machine-translation

Paper Summary: Neural Machine Translation by Jointly Learning to Align and Translate

11 Jan 2020   Summary of the 2014 article "Neural Machine Translation by Jointly Learning to Align and Translate" by Bahdanau et al.

paper-summary machine-learning-engineering

Paper Summary: Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift

23 Dec 2019   Summary of the 2019 article "Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift" by Rabanser et al.

python testing

Python Unittest Examples: Mocking and Patching

09 Dec 2019   Simple examples to help you understand when/where to use mocking and patching, so you don't need to skip testing any part of your code.

pandas

Pandas Dataframe Examples: Duplicated Data

17 Nov 2019   Deal with duplicated data in pandas dataframes: drop, count, show and mark duplicates.
