embeddings structure paper-summary neural-networks

Paper Summary: Translating Embeddings for Modeling Multi-relational Data

30 Sep 2017   Summary of the 2013 article "Translating Embeddings for Modeling Multi-relational Data" by Bordes et al.

data-science python data-preprocessing

Feature Scaling: Quick Introduction and Examples using Scikit-learn

26 Sep 2017   Feature scaling techniques (rescaling, standardization, mean normalization, etc.) are useful for all sorts of machine learning approaches and critical for things like k-NN, neural networks and anything that uses SGD (stochastic gradient descent), not to mention text processing systems.

Included examples: rescaling, standardization, scaling to unit length, using scikit-learn.
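As a rough illustration of the three techniques listed above, here is a minimal pure-Python sketch. The function names are hypothetical and chosen for this example only; in scikit-learn the corresponding tools are `MinMaxScaler`, `StandardScaler` and `Normalizer`, which the full post presumably uses instead.

```python
import math
from statistics import mean, pstdev


def rescale(values):
    """Min-max rescaling: map values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Standardization: shift to zero mean, scale to unit population std."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def to_unit_length(vector):
    """Scale one sample (a vector) so that its Euclidean norm is 1."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0:
        return [0.0 for _ in vector]
    return [v / norm for v in vector]


if __name__ == "__main__":
    print(rescale([1, 2, 3, 4, 5]))
    print(standardize([2, 4, 4, 4, 5, 5, 7, 9]))
    print(to_unit_length([3, 4]))
```

Note that rescaling and standardization operate on one feature (a column) across samples, while scaling to unit length operates on one sample (a row) across features.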

python data-visualization data-newsletter-5

Matplotlib, Pyplot, Pylab etc: What's the difference between these and when to use each?

26 Sep 2017   Do you often get confused by terms like matplotlib, pyplot, pylab, figures, axes, gcf, gca, etc. and wonder what they mean? Matplotlib is the toolkit, PyPlot is an interactive way to use Matplotlib and PyLab is the same thing as PyPlot but with some extra shortcuts.

data-newsletter-5 data-science best-practices

5 Tips for moving your Data Science Operation to the next Level

25 Sep 2017   Principles for disciplined data science include: Discoverability, Automation, Collaboration, Empowerment and Deployment.

recommender-systems data-newsletter-5

Highlights of the Talk with Dr. Konstan on Recommender Systems

23 Sep 2017   Some highlights of the Podcast Episode with Dr. Joseph Konstan on interesting topics related to Recommender Systems. Discussed topics include serendipity, serpentining, diversity and temporal effects.

python data-visualization plotting

Seaborn by Example: Data Visualization and Plotting using Python

09 Sep 2017   Seaborn is a higher-level interface to Matplotlib. It has a more convenient API and has useful data visualization functions right out of the box.

data-newsletter-5 data-science

Data Provenance: Quick Summary + Reasons Why

07 Sep 2017   Data Provenance (also called Data Lineage) is version control for data. It refers to keeping track of modifications to datasets you use and train models on. This is crucial in data science projects if you need to ensure data quality and reproducibility.
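One simple way to picture "version control for data" is to fingerprint a dataset after every modification and keep a log of those fingerprints. The sketch below is only an assumption about what such tracking could look like: `fingerprint` and `record_step` are hypothetical helpers, not part of any provenance tool mentioned in the post.

```python
import hashlib
import json


def fingerprint(rows):
    """Return a SHA-256 hex digest of a JSON-serializable dataset."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def record_step(log, description, rows):
    """Append a provenance entry: what was done and what the data looked like after."""
    log.append({"step": description, "sha256": fingerprint(rows)})
    return log


if __name__ == "__main__":
    log = []
    data = [{"age": 31, "city": "Oslo"}, {"age": None, "city": "Lima"}]
    record_step(log, "loaded raw data", data)

    # A modification: drop rows with a missing age.
    cleaned = [row for row in data if row["age"] is not None]
    record_step(log, "dropped rows with missing age", cleaned)

    for entry in log:
        print(entry["step"], entry["sha256"][:12])
```

If a model was trained on data with a given fingerprint, the log shows which sequence of steps produced that exact dataset, which is the reproducibility guarantee the summary refers to.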

data-newsletter-4 recommender-systems

Lessons from the Netflix Prize: Changing Requirements and Cost-Effectiveness

03 Sep 2017   Netflix never really used the #1 winning solution to the Netflix Challenge. One reason was that it just wasn't cost-effective to implement the full thing; another was that requirements had changed.

data-newsletter-4 kaggle data-science

Winning Solutions Overview: Kaggle Instacart Competition

03 Sep 2017   The Instacart "Market Basket Analysis" competition focused on predicting repeated orders based upon past behaviour. Among the best-ranking solutions, there were many approaches based on gradient boosting and feature engineering and one approach based on end-to-end neural networks.

technology data-newsletter-4 machine-learning

A Quick Summary of Ensemble Learning Strategies

01 Sep 2017   Ensemble learning refers to mixing the outputs of several classifiers in various ways, so as to get a better result than each classifier individually.
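Majority voting is probably the simplest way of mixing classifier outputs. The sketch below uses made-up predictions from three hypothetical classifiers (no real models are trained) to show how the combined vote can be right even where each individual classifier makes a mistake; `majority_vote` and `accuracy` are illustrative helpers, not a library API.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists by taking the most common label per sample.

    `predictions` is a list of equally long label lists, one per classifier.
    """
    combined = []
    for labels in zip(*predictions):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def accuracy(predicted, truth):
    """Fraction of samples where the predicted label matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


if __name__ == "__main__":
    truth = [1, 0, 1, 1]
    clf_a = [1, 0, 1, 0]  # wrong on the last sample
    clf_b = [1, 1, 1, 1]  # wrong on the second sample
    clf_c = [0, 0, 1, 1]  # wrong on the first sample

    ensemble = majority_vote([clf_a, clf_b, clf_c])
    for name, preds in [("a", clf_a), ("b", clf_b), ("c", clf_c), ("vote", ensemble)]:
        print(name, accuracy(preds, truth))
```

The improvement relies on the classifiers making their errors on different samples; if all three were wrong on the same sample, the vote would be wrong too.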
