StackedML
787. Transformer Key Innovation (easy)
What is the key architectural innovation that transformers introduced over RNNs?
A. Residual connections, which allow gradients to flow directly from the output layer to the input layer during backpropagation
B. Self-attention, which allows each position in a sequence to directly attend to all other positions regardless of distance
C. Positional encodings, which inject sequential order information that RNNs encode implicitly through their recurrent structure
D. Layer normalization, which stabilizes training of deep networks more effectively than the batch normalization used in RNNs
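For intuition on the self-attention mechanism named in option B, here is a minimal sketch of single-head scaled dot-product self-attention in NumPy. It omits the learned query/key/value projections and multiple heads of a full Transformer layer; the point is that the attention weight matrix connects every position to every other position in one step, with no dependence on their distance in the sequence.

```python
import numpy as np

def self_attention(X):
    """Simplified single-head self-attention (no learned projections).

    X: array of shape (seq_len, d). Returns (output, weights), where
    weights[i, j] is how much position i attends to position j.
    """
    d = X.shape[-1]
    # All pairwise interaction scores in one matrix multiply: (seq, seq).
    scores = X @ X.T / np.sqrt(d)
    # Softmax over positions (numerically stabilized).
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    # Each output row is a weighted mix of ALL positions.
    return w @ X, w

# Toy sequence: 4 positions with 3-dimensional embeddings.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
out, w = self_attention(X)
```

Note that `w[0, 3]` is computed directly, so the first position reaches the last in a single layer; an RNN would need to propagate that information through every intermediate timestep.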