Computer Science > Machine Learning

arXiv:2401.03619 (cs)

[Submitted on 8 Jan 2024]

Title: AA-DLADMM: An Accelerated ADMM-based Framework for Training Deep Neural Networks

Authors: Zeinab Ebrahimi, Gustavo Batista, Mohammad Deghat

Abstract: Stochastic gradient descent (SGD) and its many variants are the most widely used optimization algorithms for training deep neural networks. However, SGD suffers from inevitable drawbacks, including vanishing gradients, a lack of theoretical guarantees, and substantial sensitivity to input. The Alternating Direction Method of Multipliers (ADMM) has been proposed as an effective alternative to gradient-based methods that addresses these shortcomings, and it has been successfully employed for training deep neural networks. However, ADMM-based optimizers converge slowly. This paper proposes an Anderson Acceleration for Deep Learning ADMM (AA-DLADMM) algorithm to tackle this drawback. The main idea of AA-DLADMM is to treat ADMM as a fixed-point iteration and apply Anderson acceleration to it, attaining a nearly quadratic convergence rate. We verify the effectiveness and efficiency of the proposed AA-DLADMM algorithm through extensive experiments on four benchmark datasets, in comparison with other state-of-the-art optimizers.
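
To make the fixed-point view in the abstract concrete, the following is a minimal sketch of Anderson acceleration applied to a generic map x = g(x). It is not the authors' AA-DLADMM algorithm: the function name `anderson_fixed_point`, the window size `m`, the tolerance, and the small linear test map are all assumptions chosen for illustration. In an ADMM setting, `g` would stand for one full sweep of the ADMM updates.

```python
import numpy as np

def anderson_fixed_point(g, x0, m=5, max_iter=100, tol=1e-10):
    """Anderson-accelerated iteration for x = g(x) (illustrative sketch).

    g: callable mapping an ndarray to an ndarray of the same shape.
    m: number of past differences used in the least-squares mixing step.
    Returns the final iterate and the number of iterations performed.
    """
    x = np.asarray(x0, dtype=float)
    g_hist, f_hist = [], []          # recent values g(x_i) and residuals g(x_i) - x_i
    for k in range(max_iter):
        gx = g(x)
        f = gx - x                   # fixed-point residual
        if np.linalg.norm(f) < tol:
            return x, k
        g_hist.append(gx)
        f_hist.append(f)
        g_hist, f_hist = g_hist[-(m + 1):], f_hist[-(m + 1):]
        if len(f_hist) == 1:
            x = gx                   # plain fixed-point step on the first pass
        else:
            # Columns are successive differences of residuals and of g-values.
            dF = np.stack([f_hist[i + 1] - f_hist[i] for i in range(len(f_hist) - 1)], axis=1)
            dG = np.stack([g_hist[i + 1] - g_hist[i] for i in range(len(g_hist) - 1)], axis=1)
            # Mixing coefficients: minimize ||f - dF @ gamma|| in the least-squares sense.
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x = gx - dG @ gamma
    return x, max_iter

# Hypothetical test map: a linear contraction g(x) = A x + b.
A = np.array([[0.9, 0.05], [0.0, 0.8]])
b = np.array([1.0, 2.0])
x_acc, iters = anderson_fixed_point(lambda x: A @ x + b, np.zeros(2))
x_exact = np.linalg.solve(np.eye(2) - A, b)   # exact fixed point for comparison
```

On this toy map the unaccelerated iteration contracts by a factor of only 0.9 per step, whereas the accelerated version combines a few past residuals to extrapolate toward the fixed point. How much of this benefit carries over to nonconvex neural-network training is exactly what the paper evaluates empirically.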

Comments:	18 pages, 5 figures, 5 tables
Subjects:	Machine Learning (cs.LG); Systems and Control (eess.SY)
Cite as:	arXiv:2401.03619 [cs.LG]
	(or arXiv:2401.03619v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2401.03619

Submission history

From: Zeinab Ebrahimi
[v1] Mon, 8 Jan 2024 01:22:00 UTC (1,000 KB)
