2024 FFT
Time TBD
A Universal Law in Deep Learning: from MLP to Transformer
Weijie Su (U.Penn)
Abstract: In this talk, we introduce a universal phenomenon that governs the inner workings of a wide range of neural network architectures, including multilayer perceptrons, convolutional neural networks, transformers, and Mamba. Through extensive computational experiments, we demonstrate that deep neural networks tend to improve their representations of the data at a uniform rate across layers. We conclude by discussing how this universal law yields useful insights for practice.
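The abstract does not specify how layerwise "uniform improvement" is quantified. Below is a minimal, hedged sketch that assumes one reasonable choice: train a small MLP on synthetic labeled data, evaluate a class-separation statistic (within-class scatter relative to between-class scatter) after each layer, and check whether that statistic shrinks by a roughly constant factor per layer. The dataset, architecture, and metric are all illustrative assumptions, not the speaker's setup.

```python
# Hedged sketch: measures whether each layer of a trained MLP improves
# class separation by a roughly constant factor. The separation metric,
# data, and model below are assumptions for illustration only.
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)
np.random.seed(0)

# Synthetic 3-class Gaussian data (hypothetical stand-in for a real dataset).
n_per_class, dim, n_classes = 200, 20, 3
means = np.random.randn(n_classes, dim) * 2.0
X = np.vstack([m + np.random.randn(n_per_class, dim) for m in means])
y = np.repeat(np.arange(n_classes), n_per_class)
X_t = torch.tensor(X, dtype=torch.float32)
y_t = torch.tensor(y)

# A plain MLP; depth and widths are arbitrary choices for illustration.
layers, d_in = [], dim
for h in [64, 64, 64, 64, 64]:
    layers += [nn.Linear(d_in, h), nn.ReLU()]
    d_in = h
model = nn.Sequential(*layers, nn.Linear(d_in, n_classes))

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for _ in range(2000):  # full-batch training on the toy data
    opt.zero_grad()
    loss_fn(model(X_t), y_t).backward()
    opt.step()

def separation(feats, labels):
    """Within-class scatter relative to between-class scatter:
    smaller = better separated (one common choice; an assumption here)."""
    mu = feats.mean(axis=0)
    d = feats.shape[1]
    sw, sb = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(labels):
        fc = feats[labels == c]
        mc = fc.mean(axis=0)
        sw += (fc - mc).T @ (fc - mc)
        sb += len(fc) * np.outer(mc - mu, mc - mu)
    return np.trace(sw @ np.linalg.pinv(sb))

# Evaluate the statistic on the raw input and after each ReLU block.
with torch.no_grad():
    h, vals = X_t, [separation(X, y)]
    for layer in model[:-1]:  # skip the final classification head
        h = layer(h)
        if isinstance(layer, nn.ReLU):
            vals.append(separation(h.numpy(), y))

for i, (a, b) in enumerate(zip(vals, vals[1:])):
    print(f"layer {i} -> {i+1}: separation {a:.4f} -> {b:.4f}, ratio {b/a:.3f}")
# If improvement is uniform, the printed ratios should be roughly constant.
```

Under this reading, "uniform improvement" would show up as near-constant per-layer ratios, i.e. geometric decay of the separation statistic with depth; the talk may use a different metric or datasets.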