Paper Summary: Recursive Neural Language Architecture for Tag Prediction

Last updated: 04 Oct 2017

Please note: this post is mainly intended for my personal use. It is not peer-reviewed work and should not be taken as such.

WHAT

The authors jointly learn a classifier that predicts tags for documents, together with distributed representations for both documents and tags.

The objective is to predict the tags assigned to a document from its contents.

HOW

They build on WSABIE, but define a different similarity function for document-tag pairs, adding an extra dimension by replacing matrices with tensors.

Each tensor slice represents a "context", along which the similarity between each word and each tag is computed.

This enables the model to learn a separate representation for each tag under each context (e.g. "apple" may refer to a technology company or to a fruit, depending on the content).
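
To make the tensor idea concrete for myself, here is a minimal sketch in plain Python. It is not the paper's exact formulation. I am assuming that a document is already reduced to a single vector, that each tensor slice acts as one bilinear form between document and tag, and that the per-context scores are combined by taking the maximum. The function names (bilinear, slice_scores, similarity) and the toy numbers are made up for illustration.

```python
def bilinear(doc_vec, slice_matrix, tag_vec):
    """Compute doc_vec^T * slice_matrix * tag_vec for one tensor slice."""
    return sum(
        doc_vec[i] * slice_matrix[i][j] * tag_vec[j]
        for i in range(len(doc_vec))
        for j in range(len(tag_vec))
    )


def slice_scores(doc_vec, tensor, tag_vec):
    """Return one similarity score per slice ("context") of the tensor."""
    return [bilinear(doc_vec, matrix, tag_vec) for matrix in tensor]


def similarity(doc_vec, tensor, tag_vec):
    """ASSUMPTION: combine contexts by keeping the best-matching slice."""
    return max(slice_scores(doc_vec, tensor, tag_vec))


# Toy setup (hypothetical values): 2-d embeddings and 2 contexts.
tensor = [
    [[1.0, 0.0], [0.0, 0.0]],  # context 0 only looks at dimension 0
    [[0.0, 0.0], [0.0, 1.0]],  # context 1 only looks at dimension 1
]
apple_tag = [1.0, 1.0]
tech_doc = [0.9, 0.1]   # mostly "technology"
fruit_doc = [0.2, 0.8]  # mostly "fruit"

print(slice_scores(tech_doc, tensor, apple_tag))
print(slice_scores(fruit_doc, tensor, apple_tag))
```

With a single matrix (as in WSABIE) there would be one score per document-tag pair. Here the same "apple" tag matches the two documents through different slices, which is the intuition behind the extra dimension.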

This similarity function is optimized via SGD with negative sampling.
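
As a reminder of what SGD with negative sampling looks like in practice, here is a rough sketch. It is a simplification and not the paper's training procedure: I am assuming a plain dot product as the similarity, a pairwise hinge loss with a fixed margin (WSABIE itself uses a rank-weighted loss), one sampled negative tag per step, and updates to the tag vectors only. The name sgd_step and all values are hypothetical.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sgd_step(doc_vec, tag_vecs, positive, rng, lr=0.1, margin=1.0):
    """One SGD step on a pairwise hinge loss with one sampled negative tag.

    ASSUMPTION: loss = max(0, margin - s(doc, pos) + s(doc, neg)),
    where s is a dot product. Returns the loss before the update.
    """
    # Negative sampling: pick one tag that is not the true tag.
    negatives = [tag for tag in tag_vecs if tag != positive]
    negative = rng.choice(negatives)
    loss = margin - dot(doc_vec, tag_vecs[positive]) + dot(doc_vec, tag_vecs[negative])
    if loss <= 0:
        return 0.0  # margin already satisfied, nothing to update
    # Gradient w.r.t. the positive tag is -doc_vec, w.r.t. the negative +doc_vec.
    tag_vecs[positive] = [p + lr * d for p, d in zip(tag_vecs[positive], doc_vec)]
    tag_vecs[negative] = [n - lr * d for n, d in zip(tag_vecs[negative], doc_vec)]
    return loss


rng = random.Random(0)
doc = [1.0, 0.5]
tags = {"python": [0.0, 0.0], "cooking": [0.0, 0.0], "travel": [0.0, 0.0]}
losses = [sgd_step(doc, tags, "python", rng) for _ in range(20)]

print(losses[0])
print({name: dot(doc, vec) for name, vec in tags.items()})
```

Each step pulls the true tag towards the document and pushes the sampled wrong tag away, so the true tag ends up scoring higher than the others.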

CLAIMS

They claim to beat the then state-of-the-art approaches (including WSABIE) on two datasets, measured by Recall@k and MAP.
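
For reference, this is how I would compute the two metrics, using their usual textbook definitions. The paper may differ in details (e.g. how documents without tags are handled), so treat this as a sketch. It assumes each ranked list contains no duplicate tags, and the example tags are made up.

```python
def recall_at_k(ranked_tags, true_tags, k):
    """Fraction of the true tags that appear in the top-k predictions."""
    if not true_tags:
        return 0.0
    hits = len(set(ranked_tags[:k]) & set(true_tags))
    return hits / len(set(true_tags))


def average_precision(ranked_tags, true_tags):
    """Mean of precision@i over the ranks i at which a true tag appears."""
    true_set = set(true_tags)
    if not true_set:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, tag in enumerate(ranked_tags, start=1):
        if tag in true_set:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(true_set)


def mean_average_precision(all_ranked, all_true):
    """MAP: average precision averaged over all documents."""
    aps = [average_precision(r, t) for r, t in zip(all_ranked, all_true)]
    return sum(aps) / len(aps)


ranked = ["nlp", "vision", "embeddings", "audio"]
truth = ["nlp", "embeddings"]

print(recall_at_k(ranked, truth, 1))     # 0.5: one of two true tags in the top 1
print(recall_at_k(ranked, truth, 3))     # 1.0: both true tags in the top 3
print(average_precision(ranked, truth))  # (1/1 + 2/3) / 2
```

Recall@k can only go up as k grows, which is worth keeping in mind when comparing numbers reported at different values of k.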

NOTES

It's called "recursive" because of the way tag representations are learned (iteratively).
They fix tensor parameters to avoid overfitting.
They report Recall@k with unusually large values for k (50, 100, 150, 200, 250 and 300).

References

https://arxiv.org/abs/1603.07646

Felipe 05 Oct 2017 04 Oct 2017 paper-summary tags neural-nets embeddings