StackedML
14. Adaptive Optimizer Key Idea (easy)
What is the key idea behind adaptive optimizers like Adam?
A. They maintain per-parameter learning rates that scale inversely with the magnitude of recent gradients, giving smaller steps to frequently updated parameters
B. They maintain per-parameter momentum terms that scale proportionally with the magnitude of recent gradients, adjusting updates over time to improve stability
C. They maintain per-layer learning rates that are tuned automatically using the validation loss after each epoch
D. They maintain a global learning rate that decreases monotonically over training based on the total number of gradient steps taken
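For reference, the mechanism the question asks about can be sketched directly. Below is a minimal single-step Adam update in NumPy with the standard defaults (β₁ = 0.9, β₂ = 0.999, ε = 1e-8); the function name `adam_step` and the toy usage are illustrative, not from any particular library. Note how the step is divided elementwise by the square root of the running squared-gradient estimate, so each parameter gets its own effective learning rate.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a parameter array theta.

    m, v are the running first- and second-moment estimates;
    t is the 1-based step count (needed for bias correction).
    """
    m = beta1 * m + (1 - beta1) * grad        # EMA of gradients (first moment)
    v = beta2 * v + (1 - beta2) * grad ** 2   # EMA of squared gradients (second moment)
    m_hat = m / (1 - beta1 ** t)              # bias-corrected estimates
    v_hat = v / (1 - beta2 ** t)
    # Per-parameter step: dividing by sqrt(v_hat) shrinks the step for
    # parameters whose recent gradients have been large, and enlarges it
    # for parameters with small or infrequent gradients.
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta = np.array([1.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 101):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
```

After these 100 steps, `theta` has moved from 1.0 toward the minimum at 0, with each coordinate's step size set by its own gradient history rather than a single global rate.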