technology   data-newsletter-5 data-science

Data Provenance: Quick Summary + Reasons Why

07 Sep 2017   Data Provenance (also called Data Lineage) is version control for data. It refers to keeping track of modifications to datasets you use and train models on. This is crucial in data science projects if you need to ensure data quality and reproducibility.

Read More ›

technology   data-newsletter-4 recommender-systems

Lessons from the Netflix Prize: Changing Requirements and Cost-Effectiveness

03 Sep 2017   Netflix never really used the #1 winning solution to the Netflix Challenge. One reason was that it just wasn't cost-effective to implement the full thing; another was that requirements had changed.

Read More ›

technology   data-newsletter-4 kaggle data-science

Winning Solutions Overview: Kaggle Instacart Competition

03 Sep 2017   The Instacart "Market Basket Analysis" competition focused on predicting repeated orders based upon past behaviour.

Read More ›

technology   data-newsletter-4 machine-learning

A Quick Summary of Ensemble Learning Strategies

01 Sep 2017   Ensemble learning refers to combining the outputs of several classifiers in various ways, so as to get a better result than any of the classifiers would individually.

Read More ›

technology   data-newsletter-4 machine-learning

Evaluation Metrics for Classification Problems: Quick Examples + References

31 Aug 2017   There are multiple ways to measure your model's performance in machine learning, depending upon what objectives you have in mind. Some of the most important are Accuracy, Precision, Recall, F1 and AUC.

Read More ›

technology   data-newsletter-4 pandas performance

Pandas for Large Data

13 Aug 2017   To work successfully with large data in Pandas, there are some ways to reduce memory usage and keep performance fast.

Read More ›

business   metrics linkedin

Suggestions on how to make LinkedIn more relevant

04 Aug 2017   LinkedIn is a nice platform for connecting to professional peers, but its real value lies, in my opinion, in its potential to be the global professional rating system. But it needs some improvement.

Read More ›

technology   reminder hierarchy clustering

Quick Reminder: Clustering

29 Jul 2017   Quick reminder on key points regarding clustering (hierarchical and otherwise).

Read More ›

technology   nodejs npm

Install NodeJS and NPM on Ubuntu 16.04

28 Jul 2017   Installing the latest NPM + NodeJS on Ubuntu 16.04.

Read More ›

technology   codebuild codepipeline docker beanstalk continuous-integration continuous-deployment

Using AWS CodePipeline to Automatically Deploy and Build your App Stored on GitHub as a Docker-based Beanstalk Application

07 Jul 2017   A full guide on how to set up a continuous deployment pipeline using GitHub and AWS CodePipeline, in order to deploy a Docker-based Beanstalk Application.

Read More ›