338. Gradient Flow and Layer Learning
easy
Gradient flow through a network determines which layers learn during training. In early deep networks, why did only the last few layers learn effectively?
Gradient flow through a network determines which layers learn during training. In early deep networks, why did only the last few layers learn effectively?