Neural Network Optimization via Stochastic Gradient Descent Variances in High-Dimensional Spaces

A. Smith
R. Chen

Abstract

This paper introduces a novel approach to optimizing deep neural networks that analyzes the variance of stochastic gradients in very high-dimensional parameter spaces. We show that dynamically adjusting the learning rate based on local variance estimates significantly improves convergence rates across a range of standard benchmarks and helps avoid the local minima commonly encountered in dense architectures.

Our method combines adaptive gradient descent with variance-aware momentum estimation and achieves state-of-the-art performance on ImageNet, CIFAR-100, and synthetic high-dimensional datasets. We provide a theoretical analysis and empirical validation across multiple model architectures.
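
To make the idea of a variance-dependent learning rate concrete, the following is a minimal sketch and not the authors' algorithm, which the abstract does not specify. It assumes a per-coordinate variance estimate built from exponential moving averages of the gradient and its square, and a step size of the form `base_lr / (1 + sqrt(variance))`. The function names, the scaling rule, and the noisy quadratic test problem are all hypothetical choices made for illustration.

```python
import random


def variance_scaled_sgd(grad_fn, x0, base_lr=0.1, beta=0.9, steps=500, seed=0):
    """Illustrative sketch: shrink each coordinate's step where the
    estimated gradient variance is high. Not the paper's method."""
    rng = random.Random(seed)
    x = list(x0)
    mean = [0.0] * len(x)     # moving average of gradients (momentum-like)
    sq_mean = [0.0] * len(x)  # moving average of squared gradients
    for t in range(1, steps + 1):
        g = grad_fn(x, rng)
        corr = 1.0 - beta ** t  # bias correction for the zero-initialized averages
        for i, gi in enumerate(g):
            mean[i] = beta * mean[i] + (1.0 - beta) * gi
            sq_mean[i] = beta * sq_mean[i] + (1.0 - beta) * gi * gi
            m_hat = mean[i] / corr
            s_hat = sq_mean[i] / corr
            # Variance estimate E[g^2] - E[g]^2, clipped at zero for safety.
            var = max(s_hat - m_hat * m_hat, 0.0)
            # Assumed scaling rule: a noisier coordinate gets a smaller step.
            lr = base_lr / (1.0 + var ** 0.5)
            x[i] -= lr * m_hat
    return x


def noisy_quadratic_grad(x, rng, noise=0.5):
    """Gradient of 0.5 * ||x||^2 with additive Gaussian noise (toy problem)."""
    return [xi + rng.gauss(0.0, noise) for xi in x]


x_final = variance_scaled_sgd(noisy_quadratic_grad, [5.0, -3.0, 2.0])
print(x_final)
```

On this toy problem the iterates settle near the minimizer at the origin. The `1 + sqrt(variance)` denominator keeps the step bounded by `base_lr` when the variance estimate is close to zero, which is one simple way to avoid the very large steps that a plain `1 / sqrt(variance)` rule could produce.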

