Computer Science > Computer Vision and Pattern Recognition

arXiv:2212.12965 (cs)

[Submitted on 25 Dec 2022 (v1), last revised 14 Dec 2024 (this version, v2)]

Title: BD-KD: Balancing the Divergences for Online Knowledge Distillation

Authors: Ibtihel Amara, Nazanin Sepahvand, Brett H. Meyer, Warren J. Gross, James J. Clark

Abstract: We address the challenge of producing trustworthy and accurate compact models for edge devices. While Knowledge Distillation (KD) has improved model compression by yielding highly accurate compact models, the calibration of these models has been overlooked. We introduce BD-KD (Balanced Divergence Knowledge Distillation), a framework for logit-based online KD. BD-KD improves accuracy and model calibration simultaneously, eliminating the need for post-hoc recalibration techniques, which add computational overhead to the overall training pipeline and degrade performance. Our method encourages student-centered training by adjusting the conventional online distillation loss for both the student and the teacher, using sample-wise weighting of the forward and reverse Kullback-Leibler divergences. This strategy balances student network confidence and boosts performance. Experiments on the CIFAR10, CIFAR100, TinyImageNet, and ImageNet datasets with various architectures demonstrate improved calibration and accuracy compared to recent online KD methods.
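
To make the phrase "sample-wise weighting of forward and reverse Kullback-Leibler divergence" concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say how the per-sample weights are chosen, so `weights` is a hypothetical input supplied by the caller. The function names, the `temperature` parameter, and the convention that "forward" means KL(teacher || student) are likewise assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert one sample's logits into a probability distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def balanced_kd_loss(teacher_logits, student_logits, weights, temperature=1.0):
    """Mean over samples of w * forward KL + (1 - w) * reverse KL.

    `weights` holds one value in [0, 1] per sample. How BD-KD derives these
    values is not stated in the abstract, so they are a hypothetical input.
    """
    total = 0.0
    for t_logits, s_logits, w in zip(teacher_logits, student_logits, weights):
        p_teacher = softmax(t_logits, temperature)
        p_student = softmax(s_logits, temperature)
        forward = kl_divergence(p_teacher, p_student)  # assumed: KL(teacher || student)
        reverse = kl_divergence(p_student, p_teacher)  # assumed: KL(student || teacher)
        total += w * forward + (1.0 - w) * reverse
    return total / len(weights)
```

With a weight of 1.0 the loss reduces to the forward divergence alone, and with 0.0 to the reverse divergence alone. Because the two divergences are generally unequal for the same pair of distributions, the per-sample weight changes which kind of mismatch between student and teacher is penalized more heavily.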

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2212.12965 [cs.CV]
	(or arXiv:2212.12965v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2212.12965

Submission history

From: Ibtihel Amara
[v1] Sun, 25 Dec 2022 22:27:32 UTC (3,036 KB)
[v2] Sat, 14 Dec 2024 18:40:10 UTC (3,525 KB)
