queirozf.com

Entries by tag: data-newsletter

Including child/synonym tags

Podcast Episode Overview: What Machine Learning Engineers need to Know  23 Apr 2018    data-science peopleware data-newsletter-5 machine-learning-engineering
Overview of a great podcast episode on how much (if at all) we need a new role for data teams, namely Machine Learning Engineers. Read More ›

Package a Python Project and Make it Available via pip install: Simple Example  15 Nov 2017    python data-newsletter-6
It's easy to turn code you wrote into a package. Publish your Python code to PyPI so other people can use and contribute to it! Read More ›

Scaling Data Teams  09 Oct 2017    data-science data-newsletter-5
Needs of data teams are mostly around data access and sharing; Columnar databases are often more efficient for analytics; MS Excel is useful at many scales; Stakeholder communication is important to make your work more relevant; Use metrics to get to know how data products are being used. Read More ›

Matplotlib, Pyplot, Pylab etc: What's the difference between these and when to use each?  26 Sep 2017    python data-visualization data-newsletter-5
Do you often get confused by terms like matplotlib, pyplot, pylab, figures, axes, gcf, gca, etc. and wonder what they mean? Matplotlib is the toolkit, PyPlot is an interactive way to use Matplotlib and PyLab is the same thing as PyPlot but with some extra shortcuts. Read More ›

5 Tips for moving your Data Science Operation to the next Level  26 Sep 2017    data-newsletter-5 data-science best-practices
Principles for disciplined data science include: Discoverability, Automation, Collaboration, Empowerment and Deployment. Read More ›

Highlights of the Talk with Dr. Konstan on Recommender Systems  24 Sep 2017    recommender-systems data-newsletter-5
Some highlights of the Podcast Episode with Dr. Joseph Konstan on interesting topics related to Recommender Systems. Discussed topics include serendipity, serpentining, diversity and temporal effects. Read More ›

Data Provenance: Quick Summary + Reasons Why  07 Sep 2017    data-newsletter-5 data-science
Data Provenance (also called Data Lineage) is version control for data. It refers to keeping track of modifications to datasets you use and train models on. This is crucial in data science projects if you need to ensure data quality and reproducibility. Read More ›
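
As a rough illustration of the "version control for data" idea, the sketch below logs a content hash each time a dataset is modified. The helper names (`dataset_fingerprint`, `record_step`) and the toy rows are hypothetical, and this is only one small ingredient of provenance tracking, not a complete system.

```python
import hashlib
import json


def dataset_fingerprint(rows):
    """Return a SHA-256 hex digest of the dataset's content (hypothetical helper)."""
    # Serialise deterministically so the same content always gives the same hash.
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def record_step(log, description, rows):
    """Append a description of the modification and the resulting fingerprint."""
    log.append({"step": description, "sha256": dataset_fingerprint(rows)})
    return log


# Toy data, invented for illustration only.
raw = [{"user": "a", "age": 31}, {"user": "b", "age": None}]
cleaned = [row for row in raw if row["age"] is not None]

provenance_log = []
record_step(provenance_log, "loaded raw data", raw)
record_step(provenance_log, "dropped rows with missing age", cleaned)
```

Storing the fingerprint of the training set next to a trained model makes it possible to check later whether the model was trained on exactly the data you think it was.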

Lessons from the Netflix Prize: Changing Requirements and Cost-Effectiveness  04 Sep 2017    data-newsletter-4 recommender-systems
Netflix never really used the #1 winning solution to the Netflix Challenge. One reason was that it just wasn't cost-effective to implement the full thing; another was that requirements had changed. Read More ›

Winning Solutions Overview: Kaggle Instacart Competition  04 Sep 2017    data-newsletter-4 kaggle data-science
The Instacart "Market Basket Analysis" competition focused on predicting repeated orders based upon past behaviour. Among the best-ranking solutions, there were many approaches based on gradient boosting and feature engineering and one approach based on end-to-end neural networks. Read More ›

A Quick Summary of Ensemble Learning Strategies  01 Sep 2017    data-newsletter-4 machine-learning
Ensemble learning refers to mixing the outputs of several classifiers in various ways, so as to get a better result than each classifier individually. Read More ›
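
Majority voting is one of the simplest ways to mix classifier outputs. The sketch below is a minimal illustration with made-up predictions, not the approach of any particular library; `majority_vote` and the toy labels are assumptions for the example.

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine label predictions from several classifiers by majority vote.

    predictions_per_classifier: list of equal-length lists, one per classifier.
    Ties go to the label that appears first among that sample's votes.
    """
    combined = []
    # zip(*...) groups the votes that all classifiers cast for the same sample.
    for sample_votes in zip(*predictions_per_classifier):
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


# Toy data: each classifier gets exactly one of the four samples wrong.
truth = [1, 0, 1, 1]
clf_a = [1, 0, 1, 0]
clf_b = [1, 1, 1, 1]
clf_c = [0, 0, 1, 1]

ensemble = majority_vote([clf_a, clf_b, clf_c])
```

In this toy case each classifier errs on a different sample, so the vote recovers every true label. The benefit depends on the classifiers making different mistakes; if they all err on the same samples, voting cannot help.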

Evaluation Metrics for Classification Problems: Quick Examples + References  31 Aug 2017    data-newsletter-4 machine-learning model-evaluation
There are multiple ways to measure your model's performance in machine learning, depending upon what objectives you have in mind. Some of the most important are Accuracy, Precision, Recall, F1 and AUC. Read More ›
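
For a binary problem, the first four of these metrics follow directly from the counts of true/false positives and negatives. The sketch below uses plain Python and invented labels; `classification_metrics` is a hypothetical helper, and AUC is left out because it needs predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels: 2 true positives, 1 false negative, 1 false positive, 4 true negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

Here accuracy is 0.75, while precision, recall and F1 are each 2/3, which shows how accuracy can look better than the positive-class metrics when negatives dominate.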