2024 FFT

Time TBD

A Universal Law in Deep Learning: from MLP to Transformer

Weijie Su (U.Penn)

Abstract: In this talk, we introduce a universal phenomenon that governs the inner workings of a wide range of neural network architectures, including multilayer perceptrons, convolutional neural networks, transformers, and Mamba. Through extensive computational experiments, we demonstrate that deep neural networks tend to improve their processing of data uniformly from layer to layer. We conclude the talk by discussing how this universal law offers useful insights for practice.