Dipartimento di Matematica, Aula Magna.
When neural networks are trained in low precision with the gradient descent method (GD),
deterministic rounding errors often cause stagnation or otherwise degrade the convergence
of the optimizer. Unbiased stochastic rounding (SR) can, with a certain probability,
partially capture gradient updates that are smaller than the minimum rounding precision.
We provide a theoretical explanation for the stagnation observed in GD when training neural
networks with low-precision computation, and we analyze the impact of floating-point
roundoff errors on the convergence behavior of GD, with a particular focus on convex problems.
We propose two biased stochastic rounding methods, signed-SR$_\varepsilon$ and SR$_\varepsilon$, and show
that they eliminate the stagnation of GD and converge significantly faster than SR in
low-precision floating-point computation.
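
The stagnation effect can be reproduced on a toy problem. The following is a minimal sketch, not the speaker's implementation: it runs GD on the convex function $f(x) = x^2/2$, stores the iterate in 16-bit floating point, and compares round-to-nearest with unbiased SR. The function names, the step size, the starting point, and the choice to compute each update in 64-bit arithmetic before rounding are all assumptions made for illustration. The biased variants signed-SR$_\varepsilon$ and SR$_\varepsilon$ are not defined in this abstract and are therefore not sketched.

```python
import numpy as np

def round_nearest(x):
    """Round a float64 value to float16 with round-to-nearest."""
    return np.float16(x)

def round_stochastic(x, rng):
    """Unbiased stochastic rounding of a float64 value to float16.

    Rounds up with probability proportional to the distance from the
    lower float16 neighbour, so the expected result equals x.
    """
    nearest = np.float16(x)
    if float(nearest) == x:
        return nearest  # x is exactly representable
    if float(nearest) < x:
        lo, hi = nearest, np.nextafter(nearest, np.float16(np.inf))
    else:
        lo, hi = np.nextafter(nearest, np.float16(-np.inf)), nearest
    p_up = (x - float(lo)) / (float(hi) - float(lo))
    return hi if rng.random() < p_up else lo

def gradient_descent(x0, step, n_steps, rounder):
    """GD on f(x) = x^2 / 2 with the iterate stored in float16."""
    x = np.float16(x0)
    for _ in range(n_steps):
        grad = float(x)  # f'(x) = x
        # Assumption: the update is formed in float64, then rounded once.
        x = rounder(float(x) - step * grad)
    return float(x)

rng = np.random.default_rng(0)
x_rn = gradient_descent(1.0, 1e-4, 20_000, round_nearest)
x_sr = gradient_descent(1.0, 1e-4, 20_000, lambda v: round_stochastic(v, rng))
print("RN final iterate:", x_rn)
print("SR final iterate:", x_sr)
```

With these settings each update is smaller than half the spacing between neighbouring 16-bit values, so round-to-nearest returns the iterate unchanged at every step, while SR still moves toward the minimizer on average.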
We validate our theoretical analysis by training a binary logistic regression model on
the CIFAR-10 dataset and a 4-layer fully-connected neural network model on the MNIST
database, using a 16-bit floating-point representation and various rounding techniques.
In these experiments, signed-SR$_\varepsilon$ and SR$_\varepsilon$ can achieve higher classification
accuracy than round-to-nearest (RN) and SR for the same number of training epochs.
The new rounding methods with 16-bit floating-point representation can also converge
faster than RN with 32-bit floating-point representation.
Further information is available on the event page on the Indico platform.