# Paper Summary: A Simple but Tough-to-beat Baseline for Sentence Embeddings

Last updated:

Please note: this post is mainly intended for my personal use. It is not peer-reviewed work and should not be taken as such.

## WHAT

It's an unsupervised method to build a sentence embedding from the embeddings of the individual words in the sentence.

## HOW

• 1) Compute the weighted average of the word vectors in the sentence, where the weight of each word $$w$$ is its SIF (Smooth Inverse Frequency):

$$SIF(w)=\frac{a}{a+p(w)}$$

where $$a$$ is a hyper-parameter and $$p(w)$$ is the estimated word frequency in the corpus.

• 2) From each sentence embedding obtained in step 1), subtract its projection onto the first principal component of the matrix with all sentence embeddings as columns.
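
To make the two steps concrete for myself, here is a minimal NumPy sketch. It is not the authors' code: the function names, the toy vocabulary, the random word vectors, the frequencies, and the default `a=1e-3` are all assumptions made for illustration. It also stores sentence embeddings as rows (not columns) and takes the first principal component from an uncentered SVD, which is my own reading of step 2.

```python
import numpy as np

def sif_weight(p_w, a=1e-3):
    """Smooth Inverse Frequency weight a / (a + p(w)); a=1e-3 is an assumed default."""
    return a / (a + p_w)

def weighted_average(sentences, word_vectors, word_freq, a=1e-3):
    """Step 1: SIF-weighted average of word vectors, one row per sentence."""
    dim = len(next(iter(word_vectors.values())))
    emb = np.zeros((len(sentences), dim))
    for i, sent in enumerate(sentences):
        tokens = [t for t in sent if t in word_vectors]
        if not tokens:
            continue  # sentences with no known words keep an all-zero row
        weights = np.array([sif_weight(word_freq.get(t, 0.0), a) for t in tokens])
        vectors = np.stack([word_vectors[t] for t in tokens])
        emb[i] = weights @ vectors / len(tokens)
    return emb

def remove_first_component(emb):
    """Step 2: subtract each row's projection onto the first principal direction.

    Assumption: the direction is the first right singular vector of the
    uncentered (n_sentences, dim) matrix, i.e. the first left singular
    vector of the matrix that has the sentence embeddings as columns.
    """
    _, _, vt = np.linalg.svd(emb, full_matrices=False)
    u = vt[0]
    return emb - np.outer(emb @ u, u)

def sif_embeddings(sentences, word_vectors, word_freq, a=1e-3):
    return remove_first_component(
        weighted_average(sentences, word_vectors, word_freq, a)
    )

# Toy data, entirely made up: random 4-d "word vectors" and guessed frequencies.
rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "dog", "ran"]
word_vectors = {w: rng.normal(size=4) for w in vocab}
word_freq = {"the": 0.05, "cat": 0.001, "sat": 0.002, "dog": 0.001, "ran": 0.002}
sentences = [["the", "cat", "sat"], ["the", "dog", "ran"], ["cat", "ran"]]

emb = sif_embeddings(sentences, word_vectors, word_freq)
print(emb.shape)
```

With these made-up frequencies, a frequent word like "the" gets a weight of about 0.02 while a rare word like "cat" gets 0.5, so rare words dominate the average. In practice the word vectors would come from something like pretrained GloVe and $$p(w)$$ from corpus counts.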

## CLAIMS

• It's a simple, unsupervised approach, yet it performs better (on both unsupervised and supervised tasks) than more complex methods that need supervision, such as RNNs and LSTMs.

## NOTES

• In the experiments, TF-IDF-weighted GloVe embeddings also gave satisfactory results, sometimes better than all other methods (supervised or otherwise).