# Hierarchical Multi-Label Classification


This post covers two papers on hierarchical multi-label classification (HMC), which imposes a hierarchy constraint on the classes.

## C-HMCNN(h)

This section is based on Giunchiglia, Eleonora, and Thomas Lukasiewicz. “Coherent Hierarchical Multi-Label Classification Networks.” In Advances in Neural Information Processing Systems, 33:9662–73. Curran Associates, Inc., 2020.

C-HMCNN(h) has two basic elements:

- a constraint layer built on top of $h$, ensuring that the predictions are coherent by construction.
- a loss function teaching C-HMCNN(h) when to exploit the predictions on the lower classes in the hierarchy to make predictions on the upper ones.

There are two basic approaches. Consider two classes, where $A$ is a subclass of $B$:

- one output per class plus post-processing: e.g., keep $f_A^+ = f_A$ and lift $f_B^+ = \max(f_A, f_B)$; or cap $f_A^+ = \min(f_A, f_B)$ and keep $f_B^+ = f_B$.
- a network with two outputs, one for $A$ and one for $B \setminus A$, with an additional post-processing step for $B$: $g_B^+ = \max(g_{B\setminus A}, g_A)$.
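A tiny sketch of these two approaches on the two-class example (all values are illustrative):

```python
# Two classes, with A a subclass of B, so a coherent prediction needs f_A <= f_B.
f_A, f_B = 0.8, 0.3  # raw, incoherent outputs: the subclass outscores the superclass

# Approach 1, post-processing variant (a): lift the superclass.
fA_lift, fB_lift = f_A, max(f_A, f_B)  # -> (0.8, 0.8)

# Approach 1, post-processing variant (b): cap the subclass.
fA_cap, fB_cap = min(f_A, f_B), f_B    # -> (0.3, 0.3)

# Approach 2: the network outputs g_A and g_{B\A}; recover B in post-processing.
g_A, g_BminusA = 0.8, 0.3
gB = max(g_BminusA, g_A)               # -> 0.8
```

Either way, the post-processed scores satisfy the subclass constraint by construction.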

Ideally, one would build a neural network that achieves roughly the same performance

- of $f^+$ when $R_1\cap R_2 = R_1$
- of $g^+$ when $R_1\cap R_2 = \emptyset$

Given a class $A\in S$, $D_A$ is the set of subclasses of $A$ in $S$ (each class counts as a subclass of itself, so $A\in D_A$). The output of the max constraint module (MCM) for a class $A$ is the maximum of $h_B$ over all $B\in D_A$.
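As a minimal sketch (assuming the hierarchy is given as a map from each class to its subclasses, itself included; the two-class setup is hypothetical):

```python
import numpy as np

# Hypothetical hierarchy: class 0 = B, class 1 = A, with A a subclass of B.
# subclasses[c] plays the role of D_c: the subclasses of c in S, including c itself.
subclasses = {0: [0, 1], 1: [1]}

def mcm(h):
    """Max constraint module: MCM_A = max of h_B over B in D_A."""
    return np.array([h[subclasses[c]].max() for c in sorted(subclasses)])

h = np.array([0.2, 0.9])  # raw outputs: h_B = 0.2, h_A = 0.9
out = mcm(h)              # B is lifted to 0.9, so the subclass constraint holds
```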

\[MCM_A = \max_{B\in D_A} (h_B)\]

## HMC-LMLP

HMC problems can be solved by either local or global approaches.

Silla et al. (2010): different strategies can be used in the local approach:

- one local classifier per node (LCN)
- one local classifier per parent node (LCPN)
- one local classifier per level (LCL)

The global approach trains only one classifier to cope with all hierarchical classes.

Hierarchical Multi-label Classification with Local Multi-layer Perceptrons (HMC-LMLP) follows the per-level (LCL) strategy: it learns one MLP per level of the class hierarchy, training the networks sequentially from the top level down.
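A rough sketch of this sequential, per-level scheme. The "MLP" here is a stand-in random linear map so the example stays self-contained, and augmenting the original features with the previous level's predictions is one illustrative variant of how levels can be chained:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_mlp(X, Y):
    # Hypothetical stand-in for training a single MLP; a random linear map
    # followed by a sigmoid keeps the sketch self-contained and runnable.
    W = rng.normal(size=(X.shape[1], Y.shape[1]))
    return lambda Z: 1.0 / (1.0 + np.exp(-Z @ W))

# Toy data: 10 samples, 4 features; an assumed hierarchy with 2 classes
# at level 1 and 3 classes at level 2 (one label matrix per level).
X = rng.normal(size=(10, 4))
Y_levels = [rng.integers(0, 2, size=(10, 2)),
            rng.integers(0, 2, size=(10, 3))]

# One MLP per level, trained sequentially; each level's input is the
# feature matrix augmented with the previous level's predictions.
mlps, preds = [], None
for Y in Y_levels:
    inputs = X if preds is None else np.hstack([X, preds])
    mlp = train_mlp(inputs, Y)
    preds = mlp(inputs)
    mlps.append(mlp)
```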