arXiv:2301.01651 (cs)

[Submitted on 4 Jan 2023 (v1), last revised 9 Jan 2023 (this version, v2)]

Title: On the Convergence of Stochastic Gradient Descent in Low-precision Number Formats

Authors: Matteo Cacciola, Antonio Frangioni, Masoud Asgharian, Alireza Ghaffari, Vahid Partovi Nia

Abstract: Deep learning models are dominating almost all artificial intelligence tasks such as vision, text, and speech processing. Stochastic Gradient Descent (SGD) is the main tool for training such models, where the computations are usually performed in single-precision floating-point number format. The convergence of single-precision SGD is normally aligned with the theoretical results of real numbers since they exhibit negligible error. However, the numerical error increases when the computations are performed in low-precision number formats. This provides compelling reasons to study the SGD convergence adapted for low-precision computations. We present both deterministic and stochastic analysis of the SGD algorithm, obtaining bounds that show the effect of number format. Such bounds can provide guidelines as to how SGD convergence is affected when constraints render the possibility of performing high-precision computations remote.

Subjects:	Machine Learning (cs.LG); Numerical Analysis (math.NA)
Cite as:	arXiv:2301.01651 [cs.LG]
	(or arXiv:2301.01651v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2301.01651

Submission history

From: Alireza Ghaffari
[v1] Wed, 4 Jan 2023 14:54:15 UTC (4,733 KB)
[v2] Mon, 9 Jan 2023 15:26:45 UTC (4,733 KB)
