J. Mach. Learn. Res., Vol. 25, Issue 10

Neural Network Optimization via Stochastic Gradient Descent Variances in High-Dimensional Spaces

J. Doe (Corresponding), MIT CSAIL

A. Smith, MIT CSAIL

R. Chen, MIT CSAIL

October 2024

DOI: 10.5555/3699999

142 Citations

Abstract

This paper introduces an approach to optimizing deep neural networks that analyzes the variance of stochastic gradients in high-dimensional parameter spaces. We show that dynamically adjusting the learning rate based on localized variance estimates significantly improves convergence rates across a range of standard benchmarks and helps avoid the local minima commonly encountered in dense architectures.

Our method combines adaptive gradient descent with variance-aware momentum estimation and achieves state-of-the-art performance on ImageNet, CIFAR-100, and synthetic high-dimensional datasets. We provide a theoretical analysis and empirical validation across multiple model architectures.
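
The general idea of scaling a learning rate by a local gradient-variance estimate can be illustrated with a minimal sketch. This is not the paper's algorithm, since the abstract does not state the update rule. The function names (`variance_scaled_sgd`, `noisy_quadratic_grad`), the parameters (`base_lr`, `beta`, `eps`, `sigma`), and the specific signal-to-noise scaling are all assumptions chosen for illustration.

```python
import random


def noisy_quadratic_grad(x, rng, sigma=0.5):
    """Hypothetical test problem: gradient of 0.5 * ||x||^2 plus Gaussian noise."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]


def variance_scaled_sgd(grad_fn, x0, base_lr=0.1, beta=0.9, eps=1e-8,
                        steps=500, seed=0):
    """Illustrative SGD variant with a per-coordinate, variance-aware step size.

    Assumption: the step is shrunk when the running gradient variance is
    large relative to the squared running mean (a low signal-to-noise ratio).
    """
    rng = random.Random(seed)
    x = list(x0)
    mean = [0.0] * len(x)  # exponential moving average of the gradient
    sq = [0.0] * len(x)    # exponential moving average of the squared gradient
    for _ in range(steps):
        g = grad_fn(x, rng)
        for i, gi in enumerate(g):
            mean[i] = beta * mean[i] + (1.0 - beta) * gi
            sq[i] = beta * sq[i] + (1.0 - beta) * gi * gi
            # Local variance estimate; clamped to guard against rounding error.
            var = max(sq[i] - mean[i] ** 2, 0.0)
            # Scale factor lies in [0, 1]: close to 1 when the gradient signal
            # dominates the noise, close to 0 when noise dominates.
            scale = mean[i] ** 2 / (mean[i] ** 2 + var + eps)
            # Momentum-style update using the averaged gradient.
            x[i] -= base_lr * scale * mean[i]
    return x


result = variance_scaled_sgd(noisy_quadratic_grad, [5.0, -3.0])
print(result)
```

In this sketch the step size stays near `base_lr` while the iterate is far from the optimum and shrinks automatically once gradient noise dominates, which is one simple way to realize a variance-dependent learning rate.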

Keywords

Deep Learning, Neural Networks, Optimization, SGD

Related Publications

Architectural Implications of High-Density Quantum Compute Knots in Distributed Networks
A. Thorne, E. Rostova
ACM Trans. Comput. Syst., 2024 42

Convergence Analysis of Adaptive Momentum Methods
Y. Lee, K. Park
Neural Computation, 2023 78

High-Dimensional Gradient Estimation
M. Zhang, S. Wang
IEEE Trans. Pattern Anal., 2024 65