ADINE: an adaptive momentum method for stochastic gradient descent

Srinivasan, Vishwak and Sankar, Adepu Ravi and Balasubramanian, Vineeth N (2018) ADINE: an adaptive momentum method for stochastic gradient descent. In: Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, 11-13 January 2018, Goa, India.

Full text not available from this repository.

Abstract

Momentum-based learning algorithms are among the most successful learning algorithms in both convex and non-convex optimization. Two major momentum-based techniques that have achieved tremendous success in gradient-based optimization are Polyak's heavy ball method and Nesterov's accelerated gradient. A crucial step in all momentum-based methods is the choice of the momentum parameter m, which is always set to less than 1. Although the choice of m < 1 is justified only under very strong theoretical assumptions, it works well in practice. In this paper we propose a new momentum-based method, ADINE, which relaxes the constraint m < 1 and allows the learning algorithm to use an adaptive, higher momentum. We motivate our relaxation on m by experimentally verifying that a higher momentum (≥ 1) can help escape saddles much faster. ADINE uses this intuition and weighs the previous updates more, inherently setting the momentum parameter to be greater in the optimization method. To the best of our knowledge, the idea of increased momentum is the first of its kind. We evaluate ADINE on deep neural networks and show that it helps the learning algorithm converge much faster without compromising the generalization error.
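
As a rough illustration of the role of the momentum parameter m, the following Python sketch runs a plain heavy-ball update near a saddle point and counts the steps needed to move away from it. This is not ADINE: the abstract does not give the adaptive rule, so none is reproduced here. The toy function f(x, y) = 0.5*x^2 - 0.5*y^2, the learning rate, the starting point, the escape radius and the name heavy_ball_escape_steps are all assumptions chosen for illustration only.

```python
def heavy_ball_escape_steps(m, lr=0.01, start=(1.0, 1e-3), radius=1.0,
                            max_steps=10_000):
    """Count heavy-ball steps until |y| exceeds `radius`.

    The objective is the toy saddle f(x, y) = 0.5*x**2 - 0.5*y**2
    (an assumption, not taken from the paper). Returns None if the
    iterate has not escaped within `max_steps`.
    """
    x, y = start
    vx, vy = 0.0, 0.0  # velocity (accumulated previous updates)
    for step in range(1, max_steps + 1):
        gx, gy = x, -y             # gradient of the toy saddle
        vx = m * vx - lr * gx      # heavy-ball velocity update
        vy = m * vy - lr * gy
        x, y = x + vx, y + vy      # parameter update
        if abs(y) > radius:        # left the neighbourhood of the saddle
            return step
    return None


if __name__ == "__main__":
    # Compare a few fixed momentum values, including m = 1.
    for m in (0.5, 0.9, 1.0):
        print(f"m = {m}: escaped after {heavy_ball_escape_steps(m)} steps")
```

On this toy problem a larger fixed m leaves the saddle in fewer steps, which is in line with the motivation stated in the abstract. Note that with m = 1 the stable x-coordinate oscillates without decaying in this sketch, so a constant m ≥ 1 is not a usable optimizer by itself; how ADINE adapts the momentum is described in the paper, not here.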

IITH Creators: Balasubramanian, Vineeth N (ORCiD: unspecified)
Item Type: Conference or Workshop Item (Paper)
Subjects: Computer science
Divisions: Department of Computer Science & Engineering
Depositing User: Team Library
Date Deposited: 17 May 2019 06:06
Last Modified: 17 May 2019 06:06
URI: http://raiithold.iith.ac.in/id/eprint/5220
Publisher URL: http://doi.org/10.1145/3152494.3152515