Paper Summary: Multi-Label Classification on Tree- and DAG-Structured Hierarchies

Paper Summary: Multi-Label Classification on Tree- and DAG-Structured Hierarchies

Last updated:

Please note This post is mainly intended for my personal use. It is not peer-reviewed work and should not be taken as such.


The authors propose a method for learning from multi-label data where the labels form an is-a hierarchy.


Other approaches for structured learning in a multi-label context focus on trees only, not DAGs (which are a generalization of trees).

In the few cases they do (e.g. Barutcuoglu and Toryanskaya 2006) they focus on algorithm adaptation methods (rather than problem adaptation methods) which are restricted to a single technique.


First they reduce the dimensionality of the label space using a technique called Kernel Density Estimation (KDE) that produces a lower-rank matrix where the depencies in the original matrix are kept. This happens because they use Kernel PCA instead of PCA.

Then they frame the label assignment problem as an optimization problem, with the added constraint that if a label \(l\) is given to an instance, all \( l_p \ | \ l_p \text{ is a parent of } \ l \) must also be given.

They make a few simplifying assumptions and conclude that this problem can be reduced to a well-known problem called the fractional knapsack problem 1 which in this case can be solved by an algorithm called CSSA (Condensing Sort and Select Algorithm)


  • The state-of-the art in DAG-structured multi-label classification was a method called CLUS-HMC by Vens et al., 2008

  • The method proposed by the authors support both tree-based and DAG-based label hierarchies and outperform CLUS-HMC and Hierarchical-SVM on multiple datasets.


  • They use an interesting metric called AUPRC (Area under the Precision-Recall Curve) that's analogous to AUC but for precision-recall curves.

    • This is very interesting for Information Retrieval problems.

1: The Fractional Knapsack Problem is a simplified version of the Knapsack Problem and it can be solved in polynomial time. The original problem is NP-HARD.