Spark SQL Date/Datetime Function Examples
09 Nov 2019 Examples of how to use date and datetime functions for common transformations on Spark SQL DataFrames.
paper-summary machine-learning-engineering
Paper Summary: 150 Successful Machine Learning Models: 6 Lessons Learned at Booking.com
09 Nov 2019 Summary of the 2019 article "150 Successful Machine Learning Models: 6 Lessons Learned at Booking.com" by Bernardi et al.
gnu macos unix linux command-line data-science
Using Command-line Tools for Text Data Preprocessing: Examples and Reference
09 Nov 2019 Use native command-line tools for common tasks related to text preprocessing, like stripping bad characters, normalizing whitespace/newlines, replacing regular expressions, text normalization, etc. They're very fast and work surprisingly well.
Matplotlib Errorbar Examples
06 Nov 2019 Examples of how to plot data together with spread information such as standard deviation or variance.
Git Diff: Reference and Examples
24 Oct 2019 How to compare and show differences in files from different branches, different commits, etc.
paper-summary natural-language-processing
Paper Summary: TextRank: Bringing Order into Texts
16 Sep 2019 Summary of the 2004 article "TextRank: Bringing Order into Texts" by Mihalcea and Tarau.
Conda, Pip, Virtualenv and Pyenv: Commands Compared
16 Sep 2019 Equivalent commands for conda on the one hand and pip plus virtualenv on the other.
GNU Gzip examples
15 Sep 2019 Examples of how to perform simple tasks using the gzip command (along with sister commands such as gunzip and zcat).
Python Open File: Reference and Examples
15 Sep 2019 Quick reference for how to open files in Python, encodings, modes, etc.