How large language models work, a visual intro to transformers | Chapter 5, Deep Learning
3Blue1Brown, 2024-04-01 | Breaking down how Large Language Models work. Instead of sponsored ad reads, these lessons are funded directly by viewers: 3b1b.co/support
If you want a conceptual understanding of language models from the ground up, @vcubingx just started a short series of videos on the topic: youtu.be/1il-s4mgNdI?si=XaVxj6bsdy3VkgEX
If you're interested in the herculean task of interpreting what these large networks might actually be doing, the Transformer Circuits posts by Anthropic are great. In particular, it was only after reading one of these that I started thinking of the combination of the value and output matrices as being a combined low-rank map from the embedding space to itself, which, at least in my mind, made things much clearer than other sources. https://transformer-circuits.pub/2021/framework/index.html
0:00 - Predict, sample, repeat
3:03 - Inside a transformer
6:36 - Chapter layout
7:20 - The premise of Deep Learning
12:27 - Word embeddings
18:25 - Embeddings beyond words
20:22 - Unembedding
22:22 - Softmax with temperature
26:03 - Up next
How I animate 3Blue1Brown | A Manim demo with Ben Sparks
3Blue1Brown, 2024-10-12 | A behind-the-scenes look at how I animate videos.
Code for all the videos: github.com/3b1b/videos
Manim: github.com/3b1b/manim
Community edition: github.com/ManimCommunity/manim
Example scenes shown near the end: github.com/3b1b/manim/blob/master/example_scenes.py
These lessons are funded directly by viewers: 3b1b.co/support
Timestamps:
0:00 - Intro
2:39 - Hello World
10:32 - Coding up a Lorenz attractor
23:46 - Add some tracking points
28:52 - The globals().update(locals()) hack
32:57 - Final styling on the scene
41:42 - Rendering the scene
44:35 - Adding equations
48:43 - Where to start
3blue1brown is a channel about animating math, in all senses of the word animate. If you're reading the bottom of a video description, I'm guessing you're more interested than the average viewer in lessons here. It would mean a lot to me if you chose to stay up to date on new ones, either by subscribing here on YouTube or otherwise following on whichever platform below you check most regularly.
Hologram credits:
The Microscope is by Walter Spierings, 1984
Donations Hologram by Cherry Optical Holography
Lucy in a Tin Hat is by Patrick Keown Boyd, 1988
The Star Wars-themed Direct-Write Digital Holograms were produced by Zebra Imaging.
The 'Shakespeare' embossed animated integral hologram was made by Applied Holographics.
Walter Spierings, who did the microscope, is from Dutch Holographic Laboratory. He wanted me to let you know that anyone should feel free to approach them when it comes to producing holograms; they do a lot of innovative things with the medium: www.holoprint.com
Thanks to everyone who helped with this project:
Paul Dancstep, for help writing, and for all the 3d modeling
Craig Newswanger and Sally Weber, for making the central hologram shown
Kurt Bruns, for the artwork of Dennis Gabor
Phoebe Tooke, Wayne Grim, and Rick Danielson, for filming at the Exploratorium
Quinn Brodsky and Mithuna Yoganathan, for footage of lasers through diffraction gratings
Vince Rubinetti, for writing the music
Cliff Stoll, for the Klein Bottle
Small correction: After the algebra in the end, I say "We don't even make assumptions about R", but that's not quite true. To treat |R^2| as some scaling factor in the expression |R^2| * O, it matters that the amplitude of R is approximately constant around a given point.
Timestamps:
0:00 - What is a Hologram?
3:28 - The recording process
11:45 - The simplest hologram
17:12 - Diffraction gratings
25:15 - Reconstructing the simplest hologram
28:24 - Conjugate image
31:11 - More complex scenes
35:58 - The bigger picture of holography
38:27 - The formal explanation
See Matt Parker's video for more: youtu.be/ga9Qk38FaHM
How might LLMs store facts | Chapter 7, Deep Learning
3Blue1Brown, 2024-08-31 | Unpacking the multilayer perceptrons in a transformer, and how they may store facts. Instead of sponsored ad reads, these lessons are funded directly by viewers: 3b1b.co/support An equally valuable form of support is to share the videos.
Anthropic posts about superposition referenced near the end: https://transformer-circuits.pub/2022/toy_model/index.html https://transformer-circuits.pub/2023/monosemantic-features
Some added resources for those interested in learning more about mechanistic interpretability, offered by Neel Nanda
Coding tutorials for mechanistic interpretability (made by ARENA) https://arena3-chapter1-transformer-interp.streamlit.app/
Sections:
0:00 - Where facts in LLMs live
2:15 - Quick refresher on transformers
4:39 - Assumptions for our toy example
6:07 - Inside a multilayer perceptron
15:38 - Counting parameters
17:04 - Superposition
21:37 - Up next
Timestamps:
0:00 - End of Harriet Nembhard's introduction
0:45 - The cliché
2:28 - The shifting goal
5:57 - Action precedes motivation
7:02 - Timing
10:47 - Know your influence
12:05 - Anticipate change
Demystifying self-attention, multiple heads, and cross-attention. Instead of sponsored ad reads, these lessons are funded directly by viewers: 3b1b.co/support
The first pass for the translated subtitles here is machine-generated, and therefore notably imperfect. To contribute edits or fixes, visit translate.3blue1brown.com
And yes, at 22:00 (and elsewhere), "breaks" is a typo.
If you want a conceptual understanding of language models from the ground up, @vcubingx just started a short series of videos on the topic: youtu.be/1il-s4mgNdI?si=XaVxj6bsdy3VkgEX
If you're interested in the herculean task of interpreting what these large networks might actually be doing, the Transformer Circuits posts by Anthropic are great. In particular, it was only after reading one of these that I started thinking of the combination of the value and output matrices as being a combined low-rank map from the embedding space to itself, which, at least in my mind, made things much clearer than other sources. https://transformer-circuits.pub/2021/framework/index.html
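To make the "combined low-rank map" idea a little more concrete, here is a minimal NumPy sketch. The dimensions and the names `W_V`, `W_O`, and `W_OV` are illustrative assumptions, not taken from any particular model: a value matrix maps the embedding space down to a small per-head space, an output matrix maps back up, and their product is a square matrix on the embedding space whose rank can be no larger than the head dimension.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 64  # embedding dimension (assumed, far smaller than a real model)
d_head = 8    # per-head dimension (assumed)

# Hypothetical weights for a single attention head, filled with random values.
W_V = rng.normal(size=(d_head, d_model))  # embedding space -> head space
W_O = rng.normal(size=(d_model, d_head))  # head space -> embedding space

# The combined map from the embedding space to itself.
W_OV = W_O @ W_V

x = rng.normal(size=d_model)  # a stand-in embedding vector

# Applying the two matrices in sequence matches applying the combined matrix.
two_step = W_O @ (W_V @ x)
one_step = W_OV @ x

# The combined matrix is d_model x d_model, but its rank is at most d_head.
rank = np.linalg.matrix_rank(W_OV)
print(W_OV.shape, rank)
```

The point of the sketch is only that a 64 x 64 map built this way is constrained to pass through an 8-dimensional bottleneck, which is what "low-rank" means here.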
Thanks to these viewers for their contributions to translations
Bulgarian: Martin Grozdanov
French: GiveMeChocolate, Yoyodotpy
German: Josh, dlatikay
Hebrew: Omer Tuchfeld
Hindi: rajeshwar-pandey
Spanish: Marcelo Lynch
A challenging puzzle about subset sums
3Blue1Brown, 2024-01-22 | A link to the full video answering this is at the bottom of the screen. Or, for reference: youtu.be/bOXCLR3Wric
Thanks to these viewers for their contributions to translations
French: GiveMeChocolate
Hindi: rajeshwar-pandey
Spanish: Yago Iglesias
Ellipses have multiple definitions, how are these the same?
3Blue1Brown, 2024-01-19 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/pQa_tWZmlGs
The full video this comes from proves why slicing a cone gives the same shape as the two-thumbtacks-and-string construction, which is beautiful.
Editing from long-form to short by Dawid Kołodziej
Three levels of understanding Bayes theorem
3Blue1Brown, 2024-01-17 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/HZGCoVF3YvM
Editing from long-form to short by Dawid Kołodziej
The medical test paradox (well paradox)
3Blue1Brown, 2024-01-15 | A link to the full video about Bayesian thinking is at the bottom of the screen. Or, for reference: youtu.be/lG4VkPoG3ko
Long-to-short editing by Dawid Kołodziej
Positioned as the hardest question on a Putnam exam (#6, 1992)
3Blue1Brown, 2024-01-12 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/OkmNXy7er84
Editing from the original video into this short by Dawid Kołodziej
Why does light slowing imply a bend? (Beyond the tank/car analogy)
3Blue1Brown, 2024-01-11 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/Cz4Q4QOuoo8
That video answers various viewer questions about the index of refraction.
Editing from long-form to short by Dawid Kołodziej
The cube shadow puzzle
3Blue1Brown, 2024-01-09 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/ltLUadnCyi0
Thanks to these viewers for their contributions to translations
Chinese: ZstringX
French: GiveMeChocolate, Yoyodotpy
German: Josh, dlatikay
Hindi: VaMErYT, rajeshwar-pandey
Korean: tebaioioo
What does it mean that light slows down in glass?
3Blue1Brown, 2024-01-08 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/KTzGBJPuJwM
That video unpacks the mechanism behind how light slows down in passing through a medium, and why the slow-down rate would depend on color.
Editing from long-form to short by Dawid Kołodziej
A beautiful international math olympiad problem
3Blue1Brown, 2024-01-03 | The link to the full video is at the bottom of the screen. For reference, here it is: youtu.be/M64HUIJFTZM
Definition of a bit, in information theory
3Blue1Brown, 2024-01-02 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/v68zYyaEmEA
That video describes using information theory to write a bot that plays Wordle
Editing from long-form to short by Dawid Kołodziej
The Newton art puzzle
3Blue1Brown, 2023-12-29 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/-RdOwhmqP5s
Thanks to these viewers for their contributions to translations
French: PyStL
Spanish: Yago Iglesias
What is a group?
3Blue1Brown, 2023-12-27 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/mH0oCDa74tE
That video introduces group theory and the monster group.
Editing from long-form to short by Dawid Kołodziej
How to derive a formula for π
3Blue1Brown, 2023-12-25 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/NaL_Cb42WyY
That video explores how this question leads to a quandary on prime numbers, and how a pattern in primes allows for a clean final answer.
Editing from long-form to short by Dawid Kołodziej
The limit of limiting arguments
3Blue1Brown, 2023-12-23 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/VYQVlVoWoPY
That video gives multiple examples of lying with visual proofs
Editing from the original video into this short by Dawid Kołodziej
For anyone who might not know how links in shorts work
3Blue1Brown, 2023-12-21 | YouTube disabled links in descriptions and comments, but we can add links to videos on the shorts player itself. For reference, the one at the bottom of this screen is youtu.be/KTzGBJPuJwM
Infinite Lighthouses and π
3Blue1Brown, 2023-12-21 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/d-o3eB9sfls
Can you even imagine 2^256?
3Blue1Brown, 2023-12-16 | Originally written as a supplement to an explanation of the cryptography behind Bitcoin: youtu.be/bBC-nXj3Ng4 (An active link is on the bottom of the video player)
Order from chaos
3Blue1Brown, 2023-12-15 | A link to the full video on the Central Limit Theorem is at the bottom of the screen. Or, for reference: youtu.be/zeJD6dqJ5lo
Thanks to Dawid Kołodziej for long-to-short editing
The surface area of a sphere
3Blue1Brown, 2023-12-14 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/GNcFjFmqEc8
A short on shorts
3Blue1Brown, 2023-12-13 | Animations taken from this video: youtu.be/-RdOwhmqP5s And this one: youtu.be/LqbZpur38nw (Description links are not active in the shorts player, but you can follow the link at the bottom of the video screen itself)
A pretty way to add weighted dice
3Blue1Brown, 2023-12-12 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/IaSGqQa5O-M
It describes convolutions in probability, extending to the continuous case
Editing from long-form to short by Dawid Kołodziej
A simple image convolution
3Blue1Brown, 2023-12-12 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/KuXjwB4LzSA
That video introduces convolutions, as used in image processing, probability, and signal processing.
Editing from long-form to short by Dawid Kołodziej
These integrals all equal π, until...
3Blue1Brown, 2023-12-11 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/851U557j6HE
These are known as Borwein integrals
Editing from long-form to short by Dawid Kołodziej
The split necklace puzzle (with a surprise topological solution)
3Blue1Brown, 2023-12-11 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/yuVqxCSsE7c
A puzzle about stolen necklaces, from a video about the Borsuk Ulam theorem in topology
Editing from long-form to short by Dawid Kołodziej
Error correction is incredible
3Blue1Brown, 2023-12-10 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/X8jsijhllIA
Editing from long-form to short by Dawid Kołodziej
The chessboard and coins puzzle
3Blue1Brown, 2023-12-10 | Video with the solution: youtu.be/as7Gkm7Y7h4
3b1b on a meta-puzzle: youtu.be/wTJI_WuZSwE
(These description links aren't active in the shorts player, but you can follow the link on the bottom of the video screen itself)
This comes from a collaboration I did with Stand-up Maths, where on his channel we covered the solution, and here on 3blue1brown we analyze a meta-puzzle.
Editing from long-form to short by Dawid Kołodziej
Prime spirals
3Blue1Brown, 2023-12-09 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/EK32jo7i5LQ
Thanks to Dawid Kołodziej for editing this short from the original.
The barber pole effect
3Blue1Brown, 2023-12-09 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/QCX62YJCmGk
Filming by Quinn Brodsky
Editing from long-form to short by Dawid Kołodziej
Fourier series
3Blue1Brown, 2023-12-08 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/r6sGWTCMz2k
That video tells the story of how this concept was originally invented to solve the heat equation.
Thanks to Dawid Kołodziej for editing together this short
Seeing with sound
3Blue1Brown, 2023-12-08 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/3s7h2MHQtxc
Editing from the original video into this short by Dawid Kołodziej
How prisms work (full video linked above)
3Blue1Brown, 2023-12-07 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/KTzGBJPuJwM
There, I wanted to dig deeper to understand why light slows down, and why this would depend on the color.
I'm still astounded this is true
3Blue1Brown, 2023-12-07 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/jsYwFizhncE
Sliding blocks on a frictionless plane, count their collisions, and...
Thanks to Dawid Kołodziej for editing together this short
Don't let it fool you! (Link above explains what's happening)
3Blue1Brown, 2023-12-07 | A link to the full video is at the bottom of the screen. Or, for reference: youtu.be/YtkIWDE36qU
Thanks to Dawid Kołodziej for editing together this short