data-science data-newsletter-5

Scaling Data Teams

09 Oct 2017   Data teams mostly need data access and sharing; columnar databases are often more efficient for analytics; MS Excel is useful at many scales; stakeholder communication makes your work more relevant; use metrics to learn how data products are being used.


pyplot matplotlib

Matplotlib: Pyplot By Example

05 Oct 2017   Examples of common PyPlot operations, such as changing the figure size, the title and tick sizes, and the legend.
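As a rough illustration of the operations listed above, here is a minimal sketch. The data, labels, and sizes are made up for the example, and the non-interactive `Agg` backend is an assumption so that the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt

# Made-up data for illustration only
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

# Figure size is given in inches as (width, height)
plt.figure(figsize=(8, 4))
plt.plot(xs, ys, label="x squared")

# Title text and font size
plt.title("Example plot", fontsize=16)

# Font size of the tick labels on both axes
plt.tick_params(axis="both", labelsize=12)

# Legend position and font size
plt.legend(loc="upper left", fontsize=10)

fig = plt.gcf()
```

Calling `fig.savefig("example.png")` afterwards would write the figure to disk; the file name is only a placeholder.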


paper-summary embeddings tags

Paper Summary: WSABIE: Scaling Up To Large Vocabulary Image Annotation

05 Oct 2017   Summary of the 2011 article "WSABIE: Scaling Up To Large Vocabulary Image Annotation" by Weston et al.


paper-summary tags neural-nets embeddings

Paper Summary: Recursive Neural Language Architecture for Tag Prediction

05 Oct 2017   Summary of the 2016 article "Recursive Neural Language Architecture for Tag Prediction" by Kataria.



Thoughts on App Monetization with Examples from Popular Apps

05 Oct 2017   A couple of thoughts on which approaches seem to work best when optimizing monetization in web and mobile apps. Tips include: focus on the first purchase, mix free and paid features in the same interface, and give away freebies consistently.


python parallel

Parallel For Loops in Python: Examples with Joblib

02 Oct 2017   joblib's Parallel is a simple way to run the iterations of a for loop in parallel across multiple cores.
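A minimal sketch of the idea, assuming a toy function (`slow_sqrt` is a hypothetical stand-in for real per-item work) and two worker processes:

```python
from math import sqrt

from joblib import Parallel, delayed


def slow_sqrt(x):
    # Hypothetical stand-in for an expensive, independent computation
    return sqrt(x)


# Sequential version of the loop, for comparison
sequential = [slow_sqrt(i ** 2) for i in range(10)]

# Parallel version: delayed() captures the function and its arguments,
# and Parallel dispatches the calls to n_jobs workers.
# Results come back in the same order as the inputs.
results = Parallel(n_jobs=2)(delayed(slow_sqrt)(i ** 2) for i in range(10))
```

Setting `n_jobs=-1` asks joblib to use all available cores. For very cheap functions like this one, the overhead of starting workers will likely outweigh any gain, so the pattern pays off mainly when each iteration does substantial work.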


ubuntu animations

How to Make Gif Animations from Screencasts on Ubuntu

01 Oct 2017   To make short GIF videos on Ubuntu, record a screencast with Kazam and then use Gifify to turn the video into a GIF animation.



How to Change the Default Application for a given Extension on Ubuntu

01 Oct 2017   How to change the default application used to open files with a given extension.


embeddings structure paper-summary neural-networks

Paper Summary: Translating Embeddings for Modeling Multi-relational Data

01 Oct 2017   Summary of the 2013 article "Translating Embeddings for Modeling Multi-relational Data" by Bordes et al.


data-science python data-preprocessing

Feature Scaling: Quick Introduction and Examples using Scikit-learn

27 Sep 2017   Feature scaling techniques (rescaling, standardization, mean normalization, etc.) are useful for all sorts of machine learning approaches and critical for k-NN, neural networks and anything that uses SGD (stochastic gradient descent), as well as for text processing systems.

Included examples: rescaling, standardization, scaling to unit length, using scikit-learn.
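The three techniques named above could be sketched with scikit-learn roughly as follows. The small matrix `X` is made-up data, and the choice of `MinMaxScaler`, `StandardScaler` and `Normalizer` is an assumption about which scikit-learn classes map to each technique.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, Normalizer, StandardScaler

# Made-up data: 3 samples (rows), 2 features (columns) on different scales
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

# Rescaling: map each feature (column) to the [0, 1] range
rescaled = MinMaxScaler().fit_transform(X)

# Standardization: give each feature zero mean and unit variance
standardized = StandardScaler().fit_transform(X)

# Scaling to unit length: give each sample (row) an L2 norm of 1
unit_length = Normalizer(norm="l2").fit_transform(X)
```

Note that the first two scalers work column by column, while `Normalizer` works row by row. In a real pipeline the scaler would typically be fit on the training data only and then applied to the test data with `transform`.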
