paper-summary machine-learning-engineering

Paper Summary: 150 Successful Machine Learning Models: 6 Lessons Learned at

09 Nov 2019   Summary of the 2019 article "150 Successful Machine Learning Models: 6 Lessons Learned at" by Bernardi et al.

gnu macos unix linux command-line data-science

Using Command-line Tools for Text Data Preprocessing: Examples and Reference

09 Nov 2019   Use native command-line tools for common text preprocessing tasks, such as stripping bad characters, normalizing whitespace and newlines, regex-based replacement, and text normalization. They're very fast and work surprisingly well.


Git Diff: Reference and Examples

24 Oct 2019   How to compare and show differences in files from different branches, different commits, etc.

paper-summary natural-language-processing

Paper Summary: TextRank: Bringing Order into Texts

16 Sep 2019   Summary of the 2004 article "TextRank: Bringing Order into Texts" by Mihalcea and Tarau.

pip conda virtualenv

Conda vs Pip and Virtualenv: Commands Compared

16 Sep 2019   Equivalent commands for conda on the one hand and pip plus virtualenv on the other.

compression gzip

GNU Gzip examples

15 Sep 2019   Examples of how to perform simple tasks using the gzip command (along with sister commands such as gunzip and zcat).

python files

Python Open File: Reference and Examples

15 Sep 2019   Quick reference for opening files in Python: encodings, modes, etc.

paper-summary language-models

Paper Summary: Language Models are Unsupervised Multitask Learners

31 Aug 2019   Summary of the 2019 article "Language Models are Unsupervised Multitask Learners" by Radford et al.

spark dataframes scala

Spark Dataframe Examples: Window Functions

22 Aug 2019   Examples of how to do common operations using window functions on Apache Spark DataFrames, written with the Spark Scala API.

pandas dataframes

Pandas Indexing Examples: Accessing and Setting Values on DataFrames

21 Aug 2019   Some common ways to access rows in a pandas DataFrame, including label-based (loc) and position-based (iloc) access.
