Exploring DNN via Layer-Peeled Model
Neural collapse: the emergence of certain geometric patterns in the last-layer features and the last-layer classifiers when a neural network for a balanced classification problem is well trained, in the sense that it is driven toward not only zero misclassification error but also negligible cross-entropy loss. Four patterns emerge:
- the last-layer features from the same class tend to be very close to their class mean
- the $K$ class means, centered at the global mean, have the same length and form the largest possible equal angles between any pair, i.e., they collapse to the vertices of a simplex equiangular tight frame (ETF) up to scaling
- the last-layer classifiers become dual to the class means in the sense that they are equal to each other for each class up to a scaling factor
- the network's decision collapses to simply choosing the class whose class mean is closest, in Euclidean distance, to the activations of the test sample.
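The geometry in the bullets above can be checked numerically. The following is a minimal sketch, assuming NumPy and a feature dimension `p >= K`; the helper names `simplex_etf` and `nearest_class_mean` are made up for illustration and do not come from the paper. It builds one simplex ETF, then checks that the nearest-class-mean rule agrees with a linear classifier whose weights equal the class means.

```python
import numpy as np

def simplex_etf(K, p, seed=0):
    """Return a (K, p) array whose rows are the vertices of a simplex ETF in R^p.

    Illustrative construction; requires p >= K.
    """
    rng = np.random.default_rng(seed)
    # P has orthonormal columns (shape p x K), obtained from a reduced QR.
    P, _ = np.linalg.qr(rng.standard_normal((p, K)))
    # Center the standard basis and rescale so every vertex has unit norm.
    M = np.sqrt(K / (K - 1)) * P @ (np.eye(K) - np.ones((K, K)) / K)
    return M.T  # one class mean per row

def nearest_class_mean(x, means):
    """Index of the class mean closest to x in Euclidean distance."""
    return int(np.argmin(np.linalg.norm(means - x, axis=1)))

K, p = 4, 10
means = simplex_etf(K, p)
gram = means @ means.T

# Equal lengths on the diagonal, equal pairwise cosines of -1/(K-1) elsewhere.
print(np.round(gram, 3))

# With equal-norm means, the linear rule argmax_k <m_k, x> and the
# nearest-class-mean rule pick the same class.
x = np.random.default_rng(1).standard_normal(p)
print(nearest_class_mean(x, means), int(np.argmax(means @ x)))
```

The two rules agree because $\|m_k - x\|^2 = \|m_k\|^2 - 2\langle m_k, x\rangle + \|x\|^2$, and the first and last terms do not depend on $k$ once all class means have the same length.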
The paper shows that this phenomenon emerges in a surrogate model, the Layer-Peeled Model, rather than proving it for the multilayer neural network itself.
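For reference, the surrogate treats the last-layer features as free optimization variables alongside the last-layer classifiers. Below is a sketch of the program in the notation I recall from the paper; the symbols $w_k$ (classifier for class $k$), $h_{k,i}$ (last-layer feature of the $i$-th sample in class $k$), $n_k$ (class sizes, with $N = \sum_k n_k$), $\mathcal{L}$ (cross-entropy loss) and the thresholds $E_W$, $E_H$ do not appear elsewhere in these notes, so check the paper for the exact normalization.

$$
\begin{aligned}
\min_{W,\,H}\quad & \frac{1}{N}\sum_{k=1}^{K}\sum_{i=1}^{n_k}\mathcal{L}\left(W h_{k,i},\, y_k\right) \\
\text{s.t.}\quad & \frac{1}{K}\sum_{k=1}^{K}\|w_k\|^2 \le E_W, \qquad
\frac{1}{K}\sum_{k=1}^{K}\frac{1}{n_k}\sum_{i=1}^{n_k}\|h_{k,i}\|^2 \le E_H
\end{aligned}
$$

The norm constraints stand in for the effect of weight decay on the peeled-off layers.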
Neural collapse occurs in the Layer-Peeled Model
In the Layer-Peeled Model, the last-layer classifiers corresponding to the minority classes collapse to a single vector when the imbalance ratio $R$ (majority class size over minority class size) is sufficiently large. The paper calls this Minority Collapse.
In slightly more detail,
- when $R$ is below a threshold, the minority classes are distinguishable in terms of their last classifiers
- when $R$ is above the threshold, they become indistinguishable
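One way to make "distinguishable" concrete is to look at the angles between the minority classifiers. The following is a minimal sketch, assuming NumPy; the function `mean_pairwise_cosine` and the two toy matrices are hypothetical illustrations, not results from the paper. An average cosine close to 1 means the classifiers point in essentially the same direction.

```python
import numpy as np

def mean_pairwise_cosine(W):
    """Average cosine similarity over all pairs of distinct rows of W."""
    W = np.asarray(W, dtype=float)
    # Normalize each classifier (row) to unit length.
    U = W / np.linalg.norm(W, axis=1, keepdims=True)
    C = U @ U.T
    # Keep only the strictly upper triangle: one entry per unordered pair.
    iu = np.triu_indices(len(W), k=1)
    return float(C[iu].mean())

# Hypothetical toy classifiers for three minority classes.
distinct = np.eye(3)                           # mutually orthogonal rows
collapsed = np.tile([1.0, 2.0, 3.0], (3, 1))   # three identical rows

print(mean_pairwise_cosine(distinct))   # orthogonal rows give cosine 0
print(mean_pairwise_cosine(collapsed))  # identical rows give a cosine close to 1
```

Applied to the minority rows of a trained last-layer weight matrix, this quantity could be tracked as $R$ grows to see where the classifiers stop being separable.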
The Minority Collapse phenomenon reveals a fundamental difficulty in using deep learning for classification when the dataset is highly imbalanced: the difficulty already appears at the level of optimization, before generalization is even considered.
For Explaining Neural Collapse
Extensions to Other Loss Functions