Transformer (Part 3: Transformer Architecture)

Encoder & Decoder

The Transformer consists of two main parts: an encoder and a decoder. The two are connected by cross-attention, through which the decoder attends to the encoder's output.

  • Encoder: Processes the input sequence using multiple layers of self-attention and feed-forward networks.
  • Decoder: Takes the encoder’s output and generates the target sequence using self-attention and cross-attention mechanisms (see the sketch below).
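
As a minimal sketch of how these two parts fit together, the PyTorch snippet below wires a 6-layer encoder to a 6-layer decoder using the library's built-in nn.Transformer module. The tensor shapes and hyperparameters here are illustrative only (d_model = 512 and 6 layers match the original paper, but the batch size and sequence lengths are arbitrary).

```python
import torch
import torch.nn as nn

d_model = 512  # model width; 512 is the value used in the original paper

# A stack of encoder layers (self-attention + feed-forward) and decoder layers
# (self-attention + cross-attention + feed-forward), wired together so that the
# decoder's cross-attention reads the encoder's output.
model = nn.Transformer(
    d_model=d_model,
    nhead=8,
    num_encoder_layers=6,
    num_decoder_layers=6,
    batch_first=True,
)

src = torch.randn(2, 10, d_model)  # (batch, source length, d_model)
tgt = torch.randn(2, 7, d_model)   # (batch, target length, d_model)

memory = model.encoder(src)        # encoder output, consumed by cross-attention
out = model.decoder(tgt, memory)   # decoder attends to itself and to `memory`
print(out.shape)                   # torch.Size([2, 7, 512])
```

In practice the decoder is also given a causal mask so each position can only attend to earlier target positions, and the inputs would be token embeddings plus positional encodings rather than random tensors.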

The full Transformer architecture diagram appears in the original paper, "Attention Is All You Need": https://arxiv.org/pdf/1706.03762
