335. Gradient Descent Oscillation
medium

While a model is trained with gradient descent, its loss oscillates between high and low values and never converges. What is the most likely cause?
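
The symptom can be reproduced in a toy setting. The sketch below is an illustration under stated assumptions, not a diagnosis of any particular model: the loss f(w) = |w|, the helper `run_gd`, the starting point 0.25, and the two step sizes are all hypothetical choices made for this example. It shows one common way the described behavior can arise, namely a fixed step size large enough that each update overshoots the minimum.

```python
def run_gd(w0, lr, steps):
    """Plain gradient descent on the toy loss f(w) = |w|.

    Returns the loss at the starting point and after every update.
    All names and values here are illustrative assumptions.
    """
    w = w0
    losses = [abs(w)]
    for _ in range(steps):
        # Subgradient of |w|: sign(w), taken as 0 at w == 0.
        grad = (w > 0) - (w < 0)
        w = w - lr * grad
        losses.append(abs(w))
    return losses


# Step size larger than the distance to the minimum: every update overshoots.
large_step = run_gd(w0=0.25, lr=1.0, steps=6)

# Smaller step size on the same problem: the iterate reaches the minimum.
small_step = run_gd(w0=0.25, lr=0.125, steps=6)

print(large_step)  # [0.25, 0.75, 0.25, 0.75, 0.25, 0.75, 0.25]
print(small_step)  # [0.25, 0.125, 0.0, 0.0, 0.0, 0.0, 0.0]
```

With the larger step the loss alternates between 0.25 and 0.75 indefinitely, while the smaller step settles at the minimum. Real training runs involve other factors as well (mini-batch noise, momentum, poorly scaled features), so this toy case should be read only as one plausible mechanism.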