November 20, 2018

12:15 pm to 1:15 pm


Clark Hall, Room 316

Title: "On Expressiveness and Optimization in Deep Learning"
Speaker: Nadav Cohen, PhD
Date: Tuesday, November 20, 2018
Time: 12:15 pm
Location: Kavli NDI North, 316 Clark Hall, Homewood Campus, Johns Hopkins University

"On Expressiveness and Optimization in Deep Learning"

Nadav Cohen, PhD

Postdoctoral Research Scholar
School of Mathematics
Institute for Advanced Study
Princeton, New Jersey
Abstract: Understanding deep learning calls for addressing three fundamental questions: expressiveness, optimization and generalization. Expressiveness refers to the ability of compactly sized deep neural networks to represent functions capable of solving real-world problems. Optimization concerns the effectiveness of simple gradient-based algorithms in solving non-convex neural network training programs. Generalization treats the phenomenon of deep learning models not overfitting despite having many more parameters than examples to learn from. This talk will describe a series of works aimed at unraveling some of the mysteries behind expressiveness and optimization. I will begin by establishing an equivalence between convolutional and recurrent networks (the most successful deep learning architectures to date) and hierarchical tensor decompositions. The equivalence will be used to answer various questions concerning expressiveness, and in addition, to provide new tools for deep network design. I will then turn to discuss a recent line of work analyzing optimization of deep linear networks. It shows that gradient descent converges to a global minimum under mild assumptions on initialization. Moreover, in stark contrast with conventional wisdom, the rate of convergence sometimes exceeds that obtained with a linear model. In other words, depth can accelerate optimization, even without any gain in expressiveness, and despite introducing non-convexity to a formerly convex problem.

Works covered in this talk were in collaboration with Sanjeev Arora, Noah Golowich, Wei Hu, Elad Hazan, Yoav Levine, Or Sharir, Amnon Shashua, Ronen Tamari and David Yakira.

Bio: Nadav Cohen is a postdoctoral member at the School of Mathematics in the Institute for Advanced Study. His research focuses on the theoretical and algorithmic foundations of deep learning. In particular, he is interested in mathematically analyzing aspects of expressiveness, optimization and generalization, with the goal of deriving theoretically founded procedures and algorithms that will improve practical performance. Nadav earned his PhD at the School of Computer Science and Engineering in the Hebrew University of Jerusalem, under the supervision of Prof. Amnon Shashua. Prior to that, he obtained a BSc in electrical engineering and a BSc in mathematics (both summa cum laude) at the Technion Excellence Program for Distinguished Undergraduates. For his contributions to the theoretical understanding of deep learning, Nadav received a number of awards, including the Google Doctoral Fellowship in Machine Learning, the Rothschild Postdoctoral Fellowship, and the Zuckerman Postdoctoral Fellowship.