StackedML
723. SGD Non-Convergence to Precise Minimum
Difficulty: medium
Mini-batch SGD with a fixed learning rate does not converge to a precise minimum, even given enough iterations. Why?
A. The persistent noise in gradient estimates from random mini-batches causes parameters to fluctuate around the minimum rather than settling into it
B. The mini-batch gradient is always biased toward the majority class, preventing convergence to the true optimal parameters
C. The random sampling introduces autocorrelation between successive updates that prevents the optimizer from reaching stationarity
D. The mini-batch size is never large enough to produce an unbiased gradient estimate for complex loss surfaces