Transformers are a state-of-the-art architecture for sequence-to-sequence (seq2seq) tasks.
Similarity: dot product
Next: Normalise the weights
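
The two steps above (score by dot product, then normalise) can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the notes do not say which normalisation is used, so softmax is assumed here, and the names `dot`, `softmax`, `attention_weights`, and the toy vectors are all hypothetical.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors (the similarity score)."""
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    """Normalise raw scores into positive weights that sum to 1.

    Assumption: softmax is the normalisation; the notes do not name one.
    """
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """Score each key against the query by dot product, then normalise."""
    scores = [dot(query, k) for k in keys]
    return softmax(scores)

# Hypothetical toy vectors, chosen only for illustration.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = attention_weights(query, keys)
```

With these toy inputs the first and third keys have the same dot product with the query, so they receive equal weight, and the second key (orthogonal to the query) receives less.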