716. Second-Order vs First-Order Optimization
hard

Second-order optimization methods like Newton's method use curvature information from the Hessian matrix. What advantage do they have over first-order methods like gradient descent?
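A minimal sketch of the key advantage, using an illustrative ill-conditioned quadratic (the matrix `A` and step size here are chosen for demonstration, not from the question): gradient descent's step size is bounded by the largest curvature, so it crawls along low-curvature directions, while a Newton step rescales the gradient by the inverse Hessian and reaches the minimum of a quadratic in one step.

```python
import numpy as np

# Ill-conditioned quadratic f(x) = 0.5 * x^T A x, minimum at the origin.
# Curvature is 1 along the first axis and 100 along the second.
A = np.diag([1.0, 100.0])

def grad(x):
    return A @ x

def hessian(x):
    return A  # constant for a quadratic

x0 = np.array([1.0, 1.0])

# Gradient descent: the step size must stay below 2/100 to avoid
# diverging along the steep axis, so progress along the shallow
# axis shrinks by only 1% per step.
x_gd = x0.copy()
for _ in range(100):
    x_gd = x_gd - 0.01 * grad(x_gd)

# Newton's method: solve H d = -g and take the step.
# For a quadratic this lands exactly on the minimum in one iteration.
x_newton = x0 - np.linalg.solve(hessian(x0), grad(x0))

print(np.linalg.norm(x_gd))      # still noticeably far from the optimum
print(np.linalg.norm(x_newton))  # essentially zero
```

The contrast shows why Newton's method is locally quadratically convergent and invariant to the conditioning of the problem, at the cost of forming and solving a linear system with the Hessian at each step.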