Attention in transformers, visually explained | Chapter 6, Deep Learning