Hierarchical Multi-Label Classification
This post covers two papers on hierarchical multi-label classification (HMC), a setting that imposes a hierarchy constraint on the classes.
C-HMCNN(h)
This section is based on Giunchiglia, Eleonora, and Thomas Lukasiewicz. “Coherent Hierarchical Multi-Label Classification Networks.” In Advances in Neural Information Processing Systems, 33:9662–73. Curran Associates, Inc., 2020.
C-HMCNN(h) has two basic elements:
- a constraint layer built on top of $h$, ensuring that the predictions are coherent by construction.
- a loss function teaching C-HMCNN(h) when to exploit the prediction on the lower classes in the hierarchy to make predictions on the upper ones
Two basic approaches exist (illustrated with two classes, where $A$ is a subclass of $B$):
- one output per class, with post-processing to enforce coherence via $\min$ or $\max$: e.g., $f^+_A = \min(f_A, f_B)$ with $f_B^+=f_B$, or $f_B^+ = \max(f_A, f_B)$ with $f_A^+ = f_A$.
- a network with two outputs, one for $A$ and one for $B\backslash A$. An additional post-processing step for $B$ is $g_B^+ = \max(g_{B\backslash A}, g_A)$.
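As a toy illustration, both post-processing schemes fit in a few lines (a minimal NumPy sketch; the function names and the toy scores are mine, not from the paper):

```python
import numpy as np

def coherent_min(f_A, f_B):
    """Per-class outputs, min variant: clip the subclass score
    by its superclass score so that f_A <= f_B holds."""
    return np.minimum(f_A, f_B), f_B

def coherent_max(g_A, g_B_minus_A):
    """(A, B\\A) parameterisation, max variant: the score of B
    is the max of the scores of its two disjoint parts."""
    return g_A, np.maximum(g_A, g_B_minus_A)

# toy scores for a single example
f_A, f_B = 0.9, 0.4            # incoherent: subclass A scored above B
fA_plus, fB_plus = coherent_min(f_A, f_B)
assert fA_plus <= fB_plus       # coherence restored by construction
```

The $\min$ variant trusts the superclass score and clips the subclass; the $\max$ variant propagates evidence upward, which is the idea the max constraint module (MCM) below generalizes to full hierarchies.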
Ideally, we want a neural network with roughly the same performance as
- $f^+$ when $R_1\cap R_2 = R_1$ (one class's region contains the other's), and
- $g^+$ when $R_1\cap R_2 = \emptyset$ (the regions are disjoint).
Given a class $A\in S$, $D_A$ is the set of subclasses of $A$ in $S$ (including $A$ itself). The output of the max constraint module (MCM) for a class $A$ is
\[MCM_A = \max_{B\in D_A} (h_B)\]

HMC-LMLP
HMC problems can be solved by either local or global approaches
Silla et al. (2010): different strategies can be used in the local approach:
- one local classifier per node (LCN)
- one local classifier per parent node (LCPN)
- one local classifier per level (LCL)
The global approach trains only one classifier to cope with all hierarchical classes.
Hierarchical Multi-label Classification with Local Multi-layer Perceptrons (HMC-LMLP)
HMC-LMLP learns MLP networks sequentially, one for each level of the class hierarchy.
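A rough sketch of this level-by-level structure (forward pass only, with random untrained weights; the layer sizes are hypothetical, and feeding each level's predictions to the next level's network is one common HMC-LMLP variant, while others feed the hidden activations instead):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, W1, W2):
    """One hidden-layer MLP with sigmoid activations throughout."""
    h = 1.0 / (1.0 + np.exp(-(x @ W1)))
    return 1.0 / (1.0 + np.exp(-(h @ W2)))

# hypothetical hierarchy: 3 levels with 2, 4, and 8 classes
n_features, hidden, level_sizes = 16, 8, [2, 4, 8]

x = rng.normal(size=(1, n_features))
inp = x
level_predictions = []
for n_classes in level_sizes:
    # one MLP per level; HMC-LMLP trains these sequentially,
    # here we only sketch the forward pass
    W1 = rng.normal(size=(inp.shape[1], hidden))
    W2 = rng.normal(size=(hidden, n_classes))
    pred = mlp_forward(inp, W1, W2)
    level_predictions.append(pred)
    # the next level's network receives the current level's predictions
    inp = pred
```

Each level's classifier thus conditions on the level above it, so predictions are refined as the hierarchy is descended.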