@algorithmicsimplicity
Algorithmic Simplicity | Transformer Neural Networks Derived from Scratch @algorithmicsimplicity | Uploaded August 2023 | Updated October 2024, 1 hour ago.
#transformers #chatgpt #SoME3 #deeplearning

Join me on a deep dive to understand the most successful neural network ever invented: the transformer. Transformers, originally invented for natural language translation, are now everywhere. They have rapidly taken over the world of machine learning (and the world more generally) and are now used for almost every application, not the least of which is ChatGPT.

In this video I take a more constructive approach to explaining the transformer: starting from a simple convolutional neural network, I step through each change that needs to be made, along with the motivation behind it.

*By "from scratch" I mean "from a comprehensive mastery of the intricacies of convolutional neural network training dynamics". Here is a refresher on CNNs: youtube.com/watch?v=8iIdWHjleIs

Chapters:
00:00 Intro
01:13 CNNs for text
05:28 Pairwise Convolutions
07:54 Self-Attention
13:39 Optimizations