StackedML
11. Adam's Two Moment Estimates (easy)
Adam maintains two exponential moving averages during training. What do its first and second moment estimates represent?
A. The first moment is the gradient norm; the second moment is the gradient direction normalized to unit length at each step
B. The first moment is the sum of all past gradients; the second moment is the maximum squared gradient observed across all training steps
C. The first moment is the current gradient; the second moment is the gradient from the previous time step accumulated across all prior updates
D. The first moment is the moving average of the gradient; the second moment is the moving average of the squared gradients
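For reference, the two moment estimates and how Adam uses them can be sketched in a few lines of Python. This is a minimal single-parameter illustration, not a production optimizer; the function name `adam_step` and the hyperparameter defaults shown are illustrative choices.

```python
import math

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter (minimal sketch).

    m: first moment  -- exponential moving average of the gradient
    v: second moment -- exponential moving average of the squared gradient
    t: 1-based step count, used for bias correction
    """
    m = beta1 * m + (1 - beta1) * grad       # update first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2  # update second (raw) moment estimate
    m_hat = m / (1 - beta1 ** t)             # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)             # bias-corrected second moment
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Example: a few steps minimizing f(x) = x^2, whose gradient is 2x
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.1)
```

Because the step is scaled by `m_hat / sqrt(v_hat)`, each parameter gets an effective per-coordinate learning rate: directions with consistently large squared gradients take smaller steps.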