Evaluation Metrics for Regression Problems: Quick examples + Reference
26 May 2018 Regression problems are evaluated against specific metrics that analyze whether the residuals (differences between actual and predicted values) indicate that a fitted model is a good fit for the data. Here are some of the most commonly used metrics in that domain.
Vim Examples: Search and Replace
24 May 2018 Examples on how to search and replace text in Vim: simple examples, using regexes, etc.
paper-summary embeddings compositionality natural-language-processing
Paper Summary: A Simple but Tough-to-beat Baseline for Sentence Embeddings
13 May 2018 Summary of the 2017 article "A Simple but Tough-to-beat Baseline for Sentence Embeddings" by Arora et al.
Scikit-Learn examples: Making Dummy Datasets
02 May 2018 Make dummy datasets to test out classifiers and/or parameter configurations in Scikit-learn.
paper-summary compositionality embeddings natural-language-processing
Paper Summary: Context is Everything: Finding Meaning Statistically in Semantic Spaces
01 May 2018 Summary of the 2018 article "Context is Everything: Finding Meaning Statistically in Semantic Spaces" by Zelikman, where the author introduces CoSal weighting for bag-of-words vectors.
data-science peopleware data-newsletter-5 machine-learning-engineering
Podcast Episode Overview: What Machine Learning Engineers need to Know
23 Apr 2018 Overview of a great podcast episode on whether, and to what extent, data teams need a new role, namely Machine Learning Engineers.
matplotlib machine-learning scikit-learn
Visualizing Machine Learning Models: Examples with Scikit-learn, XGB and Matplotlib
23 Apr 2018 Examples on how to use matplotlib and Scikit-learn together to visualize the behaviour of machine learning models, conduct exploratory analysis, etc.
Pandas Dataframe: Merge and Join Examples
17 Apr 2018 Examples on how to use pandas.merge to do SQL-style joins on pandas dataframes.
machine-learning data-science model-evaluation
Introduction to AUC and Calibrated Models with Examples using Scikit-Learn
15 Apr 2018 Inspired by a podcast episode by Linear Digressions, which talks about what AUC is and what it is not, and why you need well-calibrated models if you want to treat their outputs as probabilities.
Corda Framework Overview + Examples
07 Apr 2018 Overview of the main concepts of the Corda framework for building decentralized applications based on Distributed Ledger Technology (DLT).