Clark Hall, Room 110
“Does the Data Induce Capacity Control in Deep Learning?”
Pratik Chaudhari, University of Pennsylvania
Abstract: Accepted statistical wisdom suggests that the larger the model class, the more likely it is to overfit the training data. And yet, deep networks generalize extremely well. The larger the deep network, the better its accuracy on new data. This talk seeks to shed light upon this apparent paradox.
We will argue that deep networks are successful because of a characteristic structure in the space of learning tasks. The input correlation matrix for typical tasks has a peculiar (“sloppy”) eigenspectrum where, in addition to a few large eigenvalues (salient features), there are a large number of small eigenvalues that are distributed uniformly over exponentially large ranges. This structure in the input data is strongly mirrored in the representation learned by the network. A number of quantities, such as the Hessian and the Fisher Information Matrix, as well as others such as activation correlations and Jacobians, are also sloppy. Even if the model class for deep networks is very large, there is an exponentially small subset of models (in the number of data) that fit such sloppy tasks. This talk will demonstrate the first analytical non-vacuous generalization bound for deep networks that does not use compression. We will also discuss an application of these concepts that develops new algorithms for semi-supervised learning.
1. Does the data induce capacity control in deep learning? Rubing Yang, Jialin Mao, and Pratik Chaudhari. [ICML ’22] https://arxiv.org/abs/2110.14163
2. Deep Reference Priors: What is the best way to pretrain a model? Yansong Gao, Rahul Ramesh, and Pratik Chaudhari. [ICML ’22] https://arxiv.org/abs/2202.00187
Biography: Pratik Chaudhari is an Assistant Professor in Electrical and Systems Engineering and Computer and Information Science at the University of Pennsylvania. He is a member of the GRASP Laboratory. From 2018-19, he was a Senior Applied Scientist at Amazon Web Services and a Postdoctoral Scholar in Computing and Mathematical Sciences at Caltech. Pratik received his PhD (2018) in Computer Science from UCLA, and his Master’s (2012) and Engineer’s (2014) degrees in Aeronautics and Astronautics from MIT. He was a part of NuTonomy Inc. (now Hyundai-Aptiv Motional) from 2014-16. He received the NSF CAREER award and the Intel Rising Star Faculty Award in 2022.