Seminar: Implicit Regularization of SGD in High-dimensional Linear Regression
Description
Speaker: Cong Fang, Researcher at Peking University
What will the talk cover?
Stochastic Gradient Descent (SGD) is one of the most widely used algorithms in modern machine learning. In high-dimensional learning problems, the number of SGD iterations is often smaller than the number of model parameters, and the implicit regularization induced by the algorithm plays a key role in ensuring strong generalization performance.
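To make this regime concrete, here is a minimal sketch (not material from the talk) of one-pass SGD on an overparameterized linear regression problem: the number of updates equals the sample size n and stays below the dimension d, yet the final iterate already improves on the null predictor. The problem sizes, step size, and noise level are arbitrary choices for the illustration.

```python
import numpy as np

# Minimal illustration: single-pass SGD on high-dimensional linear regression,
# with fewer iterations (n) than parameters (d). All constants are assumptions
# made for this example, not settings from the talk.
rng = np.random.default_rng(0)
n, d = 200, 1000                        # n iterations < d parameters
X = rng.normal(size=(n, d)) / np.sqrt(d)  # rows have roughly unit norm
w_star = rng.normal(size=d)
y = X @ w_star + 0.1 * rng.normal(size=n)

w = np.zeros(d)                          # start at the origin
lr = 0.5                                 # constant step size (assumed)
for i in range(n):                       # one pass: one update per sample
    grad = (X[i] @ w - y[i]) * X[i]      # stochastic gradient of squared loss
    w -= lr * grad

# Compare test error of the SGD iterate against the null predictor w = 0;
# the one-pass iterate improves on it despite never seeing d >= n samples.
X_test = rng.normal(size=(2000, d)) / np.sqrt(d)
y_test = X_test @ w_star
print("SGD  test MSE:", np.mean((X_test @ w - y_test) ** 2))
print("null test MSE:", np.mean(y_test ** 2))
```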
In this seminar, we will:
🔵 Analyze the generalization behavior of SGD across different learning scenarios;
🔵 Compare learning efficiency across scaling regimes determined by data size and dimensionality;
🔵 Discuss the effects of covariate shift;
🔵 Present theoretical insights that inspire memory-efficient training algorithms for large language models (e.g., GPT-2).