The transformer encoder produces contextual representations. How do these differ from static word embeddings?
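A minimal sketch can make the contrast concrete (assuming the Hugging Face `transformers` library and the `bert-base-uncased` checkpoint, neither of which is named in the original): a static embedding table maps the word "bank" to one fixed vector regardless of context, while a transformer encoder produces a different vector for "bank" in each sentence it appears in.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumed model choice for illustration; any transformer encoder would do.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentences = [
    "I deposited cash at the bank.",      # financial sense
    "We sat on the bank of the river.",   # riverside sense
]

bank_vectors = []
for text in sentences:
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # last_hidden_state: (batch, seq_len, hidden); take the one sequence
        hidden = model(**inputs).last_hidden_state[0]
    # Find the position of the "bank" token in this sentence
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    idx = tokens.index("bank")
    bank_vectors.append(hidden[idx])

sim = torch.nn.functional.cosine_similarity(
    bank_vectors[0], bank_vectors[1], dim=0
)
# A static embedding (word2vec, GloVe) would give similarity exactly 1.0,
# since "bank" maps to the same row of the embedding table both times.
# The contextual vectors differ because self-attention mixes in the
# surrounding words, so the similarity is well below 1.0.
print(f"cosine similarity between the two 'bank' vectors: {sim:.3f}")
```

The key point the sketch illustrates: a static embedding is a lookup table indexed by word identity alone, so polysemous words collapse to one point, whereas the encoder's output for a token is a function of the entire input sequence.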