A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization

Sankar, Adepu Ravi and Khasbage, Yash and Vigneswaran, Rahul and Balasubramanian, Vineeth N (2021) A Deeper Look at the Hessian Eigenspectrum of Deep Neural Networks and its Applications to Regularization. In: 35th AAAI Conference on Artificial Intelligence, AAAI 2021, 2 February 2021 through 9 February 2021, Virtual, Online.

Full text not available from this repository.

Abstract

Loss landscape analysis is extremely useful for a deeper understanding of the generalization ability of deep neural network models. In this work, we propose a layerwise loss landscape analysis in which the loss surface at every layer is studied independently, along with how each layer's surface correlates with the overall loss surface. We study the layerwise loss landscape through the eigenspectra of the Hessian at each layer. In particular, our results show that the layerwise Hessian geometry is largely similar to that of the entire Hessian. We also report an interesting phenomenon: the Hessian eigenspectra of the middle layers of a deep neural network are observed to be the most similar to the overall Hessian eigenspectrum. We also show that the maximum eigenvalue and the trace of the Hessian (both full-network and layerwise) decrease as training progresses. We leverage these observations to propose a new regularizer based on the trace of the layerwise Hessian. Penalizing the trace of the Hessian at every layer indirectly forces Stochastic Gradient Descent to converge to flatter minima, which are shown to have better generalization performance. In particular, we show that such a layerwise regularizer can be applied to the middlemost layers alone, which yields promising results. Our empirical studies on well-known deep nets across datasets support the claims of this work. Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
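
To make the idea of a layerwise Hessian-trace penalty concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Hutchinson-style estimator, in which the trace of one layer's Hessian block is approximated by averaging v^T H v over random ±1 probe vectors v, and it obtains v^T H v from a central finite difference of the loss. This record does not state which estimator the paper uses. The toy loss, the layer names `layer1` and `layer2`, and the functions `directional_curvature`, `layer_trace` and `regularized_loss` are all hypothetical names introduced for illustration.

```python
import random


def loss(layers):
    """Toy two-layer loss (hypothetical, for illustration only)."""
    w, u = layers["layer1"], layers["layer2"]
    quad_w = 0.5 * sum(a * x * x for a, x in zip((1.0, 2.0, 3.0), w))
    quad_u = 0.5 * sum(b * x * x for b, x in zip((4.0, 5.0), u))
    coupling = w[0] * u[0]  # cross-layer term; it does not enter either layer's block
    return quad_w + quad_u + coupling


def directional_curvature(loss_fn, layers, name, v, eps=1e-3):
    """Central-difference estimate of v^T H v, where H is the Hessian
    block of the loss with respect to the parameters of layer `name`."""
    def shifted(scale):
        moved = dict(layers)  # shallow copy; only the chosen layer is replaced
        moved[name] = [x + scale * eps * vi for x, vi in zip(layers[name], v)]
        return loss_fn(moved)

    return (shifted(1.0) - 2.0 * loss_fn(layers) + shifted(-1.0)) / (eps * eps)


def layer_trace(loss_fn, layers, name, num_probes=8, seed=0):
    """Hutchinson-style trace estimate: average v^T H v over random +/-1 probes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_probes):
        v = [rng.choice((-1.0, 1.0)) for _ in layers[name]]
        total += directional_curvature(loss_fn, layers, name, v)
    return total / num_probes


def regularized_loss(loss_fn, layers, penalized, strength=0.01):
    """Loss plus `strength` times the summed trace estimates of the chosen layers."""
    penalty = sum(layer_trace(loss_fn, layers, name) for name in penalized)
    return loss_fn(layers) + strength * penalty


if __name__ == "__main__":
    params = {"layer1": [0.5, -1.0, 2.0], "layer2": [1.5, -0.5]}
    for layer_name in params:
        print(layer_name, layer_trace(loss, params, layer_name))
    # Penalize a single layer only, in the spirit of regularizing selected layers.
    print(regularized_loss(loss, params, penalized=["layer1"]))
```

In this toy loss each layer's Hessian block is diagonal, so every ±1 probe yields the same value and the estimate has no sampling noise; for a real network the estimate is stochastic and improves with more probes. A trainable version would also need the penalty to be differentiable, which in practice means computing Hessian-vector products with an automatic differentiation framework in place of finite differences.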

IITH Creators:

IITH Creators	ORCiD
Balasubramanian, Vineeth N	https://orcid.org/0000-0003-2656-0375

Item Type:

Conference or Workshop Item (Paper)

Additional Information:

This work has been partly supported by the funding received from DST, Govt of India, through the MATRICS program (MTR/2017/001047), MHRD and the Intel India PhD Fellowship. We also acknowledge IIT-Hyderabad and JICA for provision of GPU servers for the work. We thank the anonymous reviewers for their valuable feedback that improved the presentation of this work.

Uncontrolled Keywords:

Eigenspectrum; Generalization ability; ITS applications; Landscape analysis; Layer-wise; Middle layer; Neural network application; Neural network model; Regularisation; Regularizer

Subjects:

Computer science

Divisions:

Department of Computer Science & Engineering

Depositing User:

. LibTrainee 2021

Date Deposited:

03 Aug 2022 09:32

Last Modified:

03 Aug 2022 09:32

URI:

http://raiithold.iith.ac.in/id/eprint/10072