On the benefits of defining vicinal distributions in latent space
Mangla, Puneet and Singh, Vedant and Havaldar, Shreyas and Balasubramanian, Vineeth N (2021) On the benefits of defining vicinal distributions in latent space. Pattern Recognition Letters, 152. pp. 382-390. ISSN 0167-8655
Pattern_Recognition_Letters.pdf - Published Version. Available under License Creative Commons Attribution.
Abstract
The vicinal risk minimization (VRM) principle is an empirical risk minimization (ERM) variant that replaces Dirac masses with vicinal functions. There is strong numerical and theoretical evidence that VRM outperforms ERM in terms of generalization if appropriate vicinal functions are chosen. Mixup Training (MT), a popular choice of vicinal distribution, improves the generalization performance of models by introducing globally linear behavior in between training examples. Apart from generalization, recent works have shown that mixup-trained models are relatively robust to input perturbations/corruptions and at the same time are better calibrated than their non-mixup counterparts. In this work, we investigate the benefits of defining vicinal distributions such as mixup in the latent space of generative models rather than in the input space itself. We propose a new approach, VarMixup (Variational Mixup), to better sample mixup images by using the latent manifold underlying the data. Our empirical studies on CIFAR-10, CIFAR-100 and Tiny-ImageNet demonstrate that models trained by performing mixup in the latent manifold learned by VAEs are inherently more robust to various input corruptions/perturbations, are significantly better calibrated and exhibit more locally linear loss landscapes. © 2021 Elsevier B.V.
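The contrast the abstract draws between mixup in input space and mixup in latent space can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `encode` and `decode` functions are hypothetical toy stand-ins for a trained VAE encoder and decoder, the function names are chosen here for illustration, and the mixing coefficient is drawn from a Beta(alpha, alpha) distribution as in standard mixup.

```python
import numpy as np


def sample_lambda(alpha, rng):
    """Draw a mixing coefficient from Beta(alpha, alpha), as in standard mixup."""
    return rng.beta(alpha, alpha)


def input_mixup(x_i, x_j, y_i, y_j, lam):
    """Interpolate two examples and their one-hot labels directly in input space."""
    return lam * x_i + (1 - lam) * x_j, lam * y_i + (1 - lam) * y_j


def latent_mixup(x_i, x_j, y_i, y_j, lam, encode, decode):
    """Interpolate the latent codes of two examples, then decode the mixture."""
    z_mix = lam * encode(x_i) + (1 - lam) * encode(x_j)
    return decode(z_mix), lam * y_i + (1 - lam) * y_j


# Hypothetical toy encoder/decoder (assumption): a real setup would use the
# encoder mean and the decoder of a VAE trained on the dataset.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))  # maps an 8-dim input to a 4-dim latent code


def encode(x):
    return np.tanh(W @ x)


def decode(z):
    return W.T @ z


x_i, x_j = rng.normal(size=8), rng.normal(size=8)
y_i, y_j = np.eye(3)[0], np.eye(3)[2]  # one-hot labels for classes 0 and 2

lam = sample_lambda(1.0, rng)
x_in, y_in = input_mixup(x_i, x_j, y_i, y_j, lam)
x_lat, y_lat = latent_mixup(x_i, x_j, y_i, y_j, lam, encode, decode)
```

In both variants the labels are mixed identically; only the way the mixed input is constructed differs. With a nonlinear encoder and decoder, the decoded latent mixture generally does not coincide with the pixel-wise interpolation.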
Field | Value
---|---
Item Type | Article
Additional Information | This work has been partly supported by the funding received from DST, Govt of India, through the IMPRINT program (IMP/2019/000250). We also acknowledge IIT-Hyderabad and JICA for provision of GPU servers for the work.
Uncontrolled Keywords | Calibration; Common corruptions; Mixup; Robustness; VRM
Subjects | Computer science
Divisions | Department of Computer Science & Engineering
Depositing User | . LibTrainee 2021
Date Deposited | 12 Sep 2022 08:46
Last Modified | 12 Sep 2022 08:46
URI | http://raiithold.iith.ac.in/id/eprint/10539
Publisher URL | http://doi.org/10.1016/j.patrec.2021.10.016
OA policy | https://v2.sherpa.ac.uk/id/publication/11448