Posts by Collection

portfolio

publications

Analysis of normal-form algorithms for solving systems of polynomial equations

Published in Journal of Computational and Applied Mathematics, 2022

Analysis of eigenvalue methods for multivariate numerical rootfinding.

ReLU Neural Networks with Linear Layers are Biased Towards Single- and Multi-Index Models

Published in arXiv, 2023

Connects the representation cost of neural networks with 1 ReLU layer and many linear layers to the spectrum of the expected gradient outer product matrix (EGOP), showing that this architecture is biased towards single- and multi-index models.

Depth Separation in Norm-Bounded Infinite-Width Neural Networks

Published in 37th Annual Conference on Learning Theory (COLT), 2024

Establishes a separation in the representation cost and sample complexity needed to approximate functions with two vs. three layer neural networks.

talks

Finding Low-Rank Functions Using Linear Layers in Neural Networks

Published: February 14, 2023

A fundamental question in the theory of neural networks is the role of depth. Empirically it is widely known that deeper networks tend to perform better than shallow ones. However, the reasoning behind this phenomenon is not well understood. In this talk I will discuss the role of depth in the simplified case where most of the layers have a linear activation. Specifically, the regularization associated with training a neural network with many linear layers followed by a single ReLu layer using weight decay is equivalent to a function-space penalty that encourages the network to select a low-rank function, i.e. one with small active subspace.

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015