February 16, 2021

12:00 pm – 1:00 pm

Venue

Zoom: https://wse.zoom.us/j/98294205026?pwd=SE95N2xqUC9iWFdZWmhueWI5eWZtdz09

Recorded Seminar:

https://wse.zoom.us/rec/play/AzCj5WDi8m6tngJHJeFh4tlmpSUJ4trB1Mf8ETMzvWxgq0H55QvEJCi2klBEq0HGstl4-sJQrCa9S8TG.MXEycajPh3jAX0Dl?autoplay=true&startTime=1613494025000


Andrej Risteski, PhD

Assistant Professor
Machine Learning Department
Carnegie Mellon University

“Representational aspects of depth and conditioning in normalizing flows”

Abstract: Normalizing flows are among the most popular paradigms in generative modeling, especially for images, primarily because we can efficiently evaluate the likelihood of a data point, allowing likelihood training via gradient descent. However, training normalizing flows comes with difficulties as well: models which produce good samples typically need to be extremely deep and they are often poorly conditioned.

In our paper, we tackle representational aspects around depth and conditioning of normalizing flows: both for general invertible architectures, and for a particular common architecture, affine couplings. We prove that affine coupling layers suffice to exactly represent a permutation or 1×1 convolution, as used in GLOW, showing that representationally the choice of partition is not a bottleneck for depth. We also show that shallow affine coupling networks are universal approximators in Wasserstein distance if ill-conditioning is allowed, and experimentally investigate related phenomena involving padding. Finally, we show a depth lower bound for general flow architectures with few neurons per layer and bounded Lipschitz constant.

Joint work with Fred Koehler and Viraj Mehta.
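For context (not part of the talk itself), below is a minimal sketch of a single affine coupling layer of the kind referenced in the abstract, illustrating why the likelihood of a data point can be evaluated efficiently: the layer's Jacobian is triangular, so its log-determinant is simply the sum of the log-scales. The coordinate partition, toy weight matrices, and standard-normal base distribution here are hypothetical choices for illustration, not the construction from the paper.

# A minimal sketch of one affine coupling layer (RealNVP/GLOW-style):
# half of the coordinates are scaled and shifted by functions of the other
# half, so the Jacobian is triangular and log|det J| = sum of log-scales.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy "networks" producing scale and shift from the untouched half.
W_s = rng.normal(size=(2, 2)) * 0.1
W_t = rng.normal(size=(2, 2)) * 0.1

def coupling_forward(x):
    """Map x -> z through one affine coupling layer; return z and log|det J|."""
    x1, x2 = x[:2], x[2:]              # fixed partition of the 4-d input
    log_s = np.tanh(W_s @ x1)          # log-scales depend only on x1
    t = W_t @ x1                       # shifts depend only on x1
    z1 = x1                            # first half passes through unchanged
    z2 = x2 * np.exp(log_s) + t        # second half is scaled and shifted
    return np.concatenate([z1, z2]), log_s.sum()

def log_likelihood(x):
    """Exact log-density of x under a standard normal base distribution."""
    z, log_det = coupling_forward(x)
    log_base = -0.5 * (z @ z) - 0.5 * len(z) * np.log(2 * np.pi)
    return log_base + log_det          # change-of-variables formula

x = rng.normal(size=4)
print("log p(x) =", log_likelihood(x))

Because the first half of the coordinates passes through unchanged, deep stacks of such layers (with alternating partitions) are needed to mix all coordinates, which connects to the depth and conditioning questions studied in the talk.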

Bio: Andrej Risteski is an Assistant Professor in the Machine Learning Department at Carnegie Mellon University. He received his PhD in the Computer Science Department at Princeton University under the advisement of Sanjeev Arora. His research interests lie in machine learning and statistics, spanning topics like representation learning, generative models, word embeddings, variational inference and MCMC, and non-convex optimization. The broad goal of his research is a principled and mathematical understanding of statistical and algorithmic phenomena and problems arising in modern machine learning.