Jeremy Howard (youtube account)
This is a section from the Modular launch video. The full video, docs, and details are here: modular.com
We also discuss three important new papers that have been released in the last week, which improve inference performance by over 10x, and allow any photo to be “edited” by just describing what the new picture should show.
The second half of the lesson begins the “from the foundations” stage of the course, developing a basic matrix class and random number generator from scratch, as well as discussing the use of iterators in Python.
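The "from the foundations" pieces described above fit in a few lines of pure Python. This is an illustrative sketch only, not the lesson notebook's actual code; in particular, the classic LCG shown here simply stands in for whichever generator the lesson builds:

```python
class Matrix:
    "A minimal matrix class wrapping a list of lists."
    def __init__(self, rows): self.rows = rows
    def __getitem__(self, i): return self.rows[i]   # m[0][1]
    def __iter__(self): return iter(self.rows)      # iterator support: for row in m
    @property
    def shape(self): return (len(self.rows), len(self.rows[0]))

def lcg(seed, a=1664525, c=1013904223, m=2**32):
    "A from-scratch pseudo-random number generator (linear congruential)."
    while True:
        seed = (a * seed + c) % m
        yield seed / m  # scale into [0, 1)

m = Matrix([[1, 2], [3, 4]])
row_sums = [sum(row) for row in m]   # uses __iter__
g = lcg(42)
samples = [next(g) for _ in range(3)]
```

Defining `__iter__` is what lets a plain class plug into Python's `for` loops and comprehensions, which is the iterator point the lesson makes.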
You can discuss this lesson, and access links to all notebooks and resources from it, at this forum topic: forums.fast.ai/t/lesson-10-part-2-preview/101337
Additional Links:
- Progressive Distillation for Fast Sampling of Diffusion Models - arxiv.org/abs/2202.00512
- On Distillation of Guided Diffusion Models - arxiv.org/abs/2210.03142
- Imagic: Text-Based Real Image Editing with Diffusion Models - arxiv.org/abs/2210.09276
0:00 - Introduction
0:35 - Showing students’ work over the past week.
6:04 - Recap Lesson 9
12:55 - Explaining “Progressive Distillation for Fast Sampling of Diffusion Models” & “On Distillation of Guided Diffusion Models”
26:53 - Explaining “Imagic: Text-Based Real Image Editing with Diffusion Models”
33:53 - Stable diffusion pipeline code walkthrough
41:19 - Scaling random noise to ensure variance
50:21 - Recommended homework for the week
53:42 - What are the foundations of stable diffusion? Notebook deep dive
1:06:30 - Numpy arrays and PyTorch Tensors from scratch
1:28:28 - History of tensor programming
1:37:00 - Random numbers from scratch
1:42:41 - Important tip on random numbers via process forking
Thanks to fmussari for the transcript, and to Raymond-Wu (on forums.fast.ai) for the timestamps.
Lesson 9A (Deep dive): youtu.be/0_BBRNYInx8
Wasim, Tanishq, and Jeremy walk through the math of diffusion models from the ground up. The lesson assumes no prerequisite knowledge beyond what you covered in high school. We walk through the insight underlying the key equations in the work of Sohl-Dickstein et al., which originally introduced diffusion models.
By the end of the lesson you'll have some understanding of the following key concepts and you'll know how to recognize and interpret their symbols in research papers: probability density function (pdf), data distribution, forward process, reverse process, Markov process, Gaussian distribution, log likelihood, and evidence lower bound (ELBO).
We also touch on the more recent breakthroughs of Ho et al. and Song et al., both of which enabled even simpler and more powerful diffusion models.
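For reference, in the standard DDPM notation (which may differ slightly from the symbols used on screen), the forward process adds Gaussian noise one Markov step at a time, and noising from the clean image has a closed form:

```latex
q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\ \sqrt{1-\beta_t}\, x_{t-1},\ \beta_t \mathbf{I}\big),
\qquad
q(x_t \mid x_0) = \mathcal{N}\big(x_t;\ \sqrt{\bar\alpha_t}\, x_0,\ (1-\bar\alpha_t) \mathbf{I}\big),
\quad \bar\alpha_t = \prod_{s=1}^{t} (1-\beta_s)
```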
You can discuss this lesson, and access links to all notebooks and resources from it, at this forum topic: forums.fast.ai/t/math-of-stable-diffusion/101077.
Additional links:
- Sohl-Dickstein et al. Deep Unsupervised Learning using Nonequilibrium Thermodynamics - arxiv.org/abs/1503.03585
- Ho et al. Denoising Diffusion Probabilistic Models - arxiv.org/abs/2006.11239
- Song et al. Denoising Diffusion Implicit Models - arxiv.org/abs/2010.02502
0:00 - Introduction
2:19 - Data distribution
6:38 - Math behind lesson 9’s “Magic API”
18:50 - CLIP (Contrastive Language–Image Pre-training)
27:04 - Forward diffusion (Markov process with Gaussian transitions)
36:11 - Likelihood vs log likelihood
42:16 - Denoising diffusion probabilistic model (DDPM)
48:04 - Conclusion
Thanks to raymond-wu on forums.fast.ai for the timestamps.
This was made as a companion to lesson 9 of the FastAI 2022 course by Jonathan Whitaker (his channel: youtube.com/channel/UCP6gT9X2oXYcssfZu05RV2g).
00:00 - Introduction
00:40 - Replicating the sampling loop
01:17 - The Auto-Encoder
03:55 - Adding Noise and image-to-image
08:43 - The Text Encoding Process
15:15 - Textual Inversion
18:36 - The UNET and classifier free guidance
24:41 - Sampling explanation
36:30 - Additional guidance
Errata: the unet demo in cell 49 (19 minutes in) is missing some scaling of the model inputs; see scheduler.scale_model_input in all the loops for the missing code. Also, in the autoencoder part the 'compression' isn't exactly 64 times, since the latent representation has 4 channels while the input has only 3.
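The corrected compression figure is easy to check, assuming the usual 512×512 RGB input and 64×64×4 latent shapes discussed in the video:

```python
# Stable Diffusion's autoencoder maps a 512x512 RGB image to a 64x64x4 latent.
input_numbers  = 512 * 512 * 3   # 786432 values in
latent_numbers = 64 * 64 * 4     # 16384 values out
compression = input_numbers / latent_numbers
# 48x, not 64x: the spatial reduction is 8x8 = 64, but channels go 3 -> 4
```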
Lesson 9A (Deep dive): youtu.be/0_BBRNYInx8
Lesson 9B (Math of diffusion): youtu.be/mYpjmM7O-30
This lesson starts with a tutorial on how to use pipelines in the Diffusers library to generate images. Diffusers is (in our opinion!) the best library available at the moment for image generation. It has many features and is very flexible. We explain how to use these features, and discuss options for accessing the GPU resources needed to use the library.
We talk about some of the nifty tweaks available when using Stable Diffusion in Diffusers, and show how to use them: guidance scale (for varying the amount the prompt is used), negative prompts (for removing concepts from an image), image initialisation (for starting with an existing image), textual inversion (for adding your own concepts to generated images), and Dreambooth (an alternative approach to textual inversion).
The second half of the lesson covers the key concepts involved in Stable Diffusion:
- CLIP embeddings
- The VAE (variational autoencoder)
- Predicting noise with the unet
- Removing noise with schedulers.
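How these components fit together can be sketched schematically. The functions below are stand-ins to show the shape of the loop, not the actual diffusers API or the lesson's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def text_encoder(prompt):                 # stand-in for the CLIP text encoder
    return rng.standard_normal(4)

def unet(latents, t, emb):                # stand-in for the unet noise predictor
    return 0.1 * latents                  # pretend 10% of the latent is noise

def scheduler_step(latents, noise_pred):  # stand-in for a scheduler update
    return latents - noise_pred           # remove the predicted noise

emb = text_encoder("an astronaut riding a horse")
latents = rng.standard_normal((4, 8, 8))  # start from pure noise
start = latents.copy()
for t in range(50):                       # the iterative denoising loop
    noise_pred = unet(latents, t, emb)
    latents = scheduler_step(latents, noise_pred)
# the VAE decoder would then turn `latents` back into pixels
```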
You can discuss this lesson, and access links to all notebooks and resources from it, at this forum topic: forums.fast.ai/t/lesson-9-part-2-preview/101336
0:00 - Introduction
6:38 - This course vs DALL-E 2
10:38 - How to take full advantage of this course
12:14 - Cloud computing options
14:58 - Getting started (Github, notebooks to play with, resources)
20:48 - Diffusion notebook from Hugging Face
26:59 - How stable diffusion works
30:06 - Diffusion notebook (guidance scale, negative prompts, init image, textual inversion, Dreambooth)
45:00 - Stable diffusion explained
53:04 - Math notation correction
1:14:37 - Creating a neural network to predict noise in an image
1:27:46 - Working with images and compressing the data with autoencoders
1:40:12 - Explaining latents that will be input into the unet
1:43:54 - Adding text as a one-hot encoded input to the noise and drawing (aka guidance)
1:47:06 - How to represent numbers vs text embeddings in our model with CLIP encoders
1:53:13 - CLIP encoder loss function
2:00:55 - Caveat regarding "time steps"
2:07:04 - Why don’t we do this all in one step?
Thanks to fmussari for the transcript, and to Raymond-Wu (on forums.fast.ai) for the timestamps.
Discuss this session here: forums.fast.ai/t/fast-ai-apl-study-session-17/98231
Discuss this session here: forums.fast.ai/t/fast-ai-apl-study-session-16/98202
Discuss this session here: forums.fast.ai/t/fast-ai-apl-study-session-15/98107
Discuss this session here: forums.fast.ai/t/fast-ai-apl-study-session-14/98070
Discuss this session here: forums.fast.ai/t/fast-ai-apl-study-session-13/98031
Discuss this session here: forums.fast.ai/t/fast-ai-apl-study-session-11/97949
(Another version of this video is available with the silent sections sped up using the `unsilence` python library - it's 27 mins shorter: youtu.be/67FdzLSt4aA)
00:00 - Welcome
04:39 - Turn a github repo into a nbdev repo
06:48 - What are nbdev commands?
08:17 - How to find out the docs of a nbdev command
08:37 - What does nbdev_new give us?
10:54 - How should you name ipynbs for your library?
15:32 - What do heading 1 and heading 2 do?
18:02 - Create a Card module
19:01 - Creating suits
21:23 - Creating a class
26:54 - How to overwrite the __str__ and __repr__
29:03 - How to write docs for input arguments
33:13 - Create tests for the class
37:06 - How to define equality
38:32 - How to define a function outside of a class
42:18 - How to export your Card py file with nbdev_export
48:33 - How to preview your documentation
55:02 - How to run tests locally
56:52 - How to do debugging in real life
1:00:51 - Which cells should be exported with #|export
1:02:09 - Creating Deck module
1:06:41 - How to overwrite the __len__ and __contains__
1:15:22 - Automatic links
1:18:48 - Creating a function
1:21:50 - Pushing back to github
1:27:21 - Run nbdev_docs to put the homepage inside README.md
1:28:47 - Release Your Library
1:30:16 - How nbdev makes PRs easier for all
Timestamps based on notes by Daniel 深度碎片 on forums.fast.ai.
J.J. Allaire is the founder and CEO of RStudio and creator of the RStudio IDE. RStudio develops free and open source software for R and Python, and enterprise-ready professional products that help teams who use open source data science tools scale and share their work. Quarto® is an open-source scientific and technical publishing system built on Pandoc. In recent years J.J. has focused on tools for reproducible research and interoperability, including Quarto (tools for scientific communication) and R Markdown, as well as the reticulate, htmlwidgets, sparklyr, tensorflow, and keras R packages.
Jeremy Howard is a founding researcher at fast.ai and an honorary professor at the University of Queensland. fast.ai is a research and teaching lab dedicated to making deep learning more accessible. Jeremy has built a number of successful startups and published numerous high-impact research papers and open source software products.
01:00 - Installing timm persistently in Paperspace
04:00 - Fixing broken symlink
06:30 - Navigating around files in vim
16:40 - Improving a model for a kaggle competition
24:00 - Saving a trained model
34:30 - Test time augmentation
39:00 - Prepare file for kaggle submission
45:00 - Compare new predictions with previous
46:00 - Submit new predictions
49:00 - How to create an ensemble model
54:30 - Change probability of affine transforms
57:00 - Discussion about improving accuracy
00:00 - Setting up Paperspace - Clone fastai/paperspace-setup
02:30 - pipi fastai & pipi -U fastai
03:43 - Installing universal-ctags: mambai universal-ctags
05:00 - Next step: Adding a normalization to TIMM models
06:06 - Oh! First lets fix pre-run.sh
07:35 - Normalization in vision_learner (with same pretrained model statistics)
09:40 - Adding TIMM models
13:10 - model.default_cfg() to get TIMM model statistics
16:00 - Let’s go to _add_norm()… adding _timm_norm()
20:30 - Test and debugging
28:40 - Doing some redesign
32:23 - Applying redesign for TIMM
36:20 - create_timm_model and TimmBody
38:12 - Check default config from a TIMM model
39:05 - Making create_unet_model work with TIMM
40:20 - Basic idea of U-nets
41:25 - Dynamic U-net
00:00 - Intro
02:10 - Demo of text file manipulation using Vim
19:30 - Creating Youtube video markers using Vim
25:55 - Control + Z and fg for job management
27:57 - split vim into multiple windows
28:30 - Control + W to move between windows
28:40 - Fixing the pre-run.sh script in Paperspace
32:48 - Tips for learning Vim
41:09 - Configuring Vim with a vimrc file
45:00 - Using ctags to navigate a repository
00:00 - Recap on Paddy Competition
04:30 - Tips on getting votes for Kaggle notebooks
07:30 - Gist uploading question
10:30 - Weights and Biases Sweep
14:40 - Tracking GPU metrics
16:40 - fastgpu
20:00 - Using .gitconfig
21:00 - Analysis notebook
26:00 - Parallel coordinates chart on wandb
31:30 - Brute force hyperparameter optimisation vs human approach
37:30 - Learning rate finder
40:00 - Debugging port issues with ps
42:00 - Background sessions in tmux
46:20 - Strategy for iterating between notebooks
49:00 - Cell All Output toggle for overview
50:50 - Final transform for vit models
52:05 - swinv2 fixed resolution models
53:00 - Building an ensemble - appending predictions
55:50 - Model stacking
57:00 - Keeping track of submission notebooks
00:00 Create a notebook
04:13 Symlink from persistence storage
19:24 Create pre-run.sh from scratch
33:15 Create SSH keys from scratch
Topics covered:
- Setting up a paperspace server from scratch
- Paperspace persistent storage details
- pip vs conda/mamba
- Creating a new bash script
- #! script headers
- chmod permissions / octal masks
- Uploading and testing existing ssh keys
- How pre-run.sh works
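The `#!` header and octal-permission items above can be illustrated with a tiny generic example (not the actual pre-run.sh from the walkthrough):

```shell
# A minimal script with a "#!" (shebang) header
cat > hello.sh <<'EOF'
#!/usr/bin/env bash
echo "hello from $0"
EOF

chmod 755 hello.sh   # 755 = rwxr-xr-x (owner: read/write/execute; others: read/execute)
./hello.sh
```

Without the execute bit set by chmod, the kernel refuses to run the script directly, which is why the permissions step matters.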
05:10 - Running a kaggle notebook on your local computer
09:10 - Setting up to run on your own GPU server
15:40 - Get back to where we left off in walkthru 7
16:00 - Get file sizes the slow way
17:00 - Using parallel processing to speed things up
23:00 - Selecting a different image model from timm
29:00 - Start fine tuning model
30:50 - Description of fine_tune
35:50 - Discussion of fit one cycle
48:30 - Applying fine tuned model to test set
49:00 - Reviewing docs for test data loader for inference
52:00 - Preparing file for kaggle submission
1:00:00 - Visually check results and submission file
1:02:10 - Submit entry to kaggle from command line
1:06:00 - Check leaderboard on kaggle site for problems
1:07:00 - Fix order of results file and resubmit
1:10:00 - Questions
00:00 People intro and questions to address later
03:27 Switch users in Linux
04:48 Introduction to git and Github
08:32 Build a website with github
19:00 Setting up and using ssh keys
39:01 Using tmux for better terminal productivity
48:54 Create a notebook in jupyter lab
54:26 Committing and pushing to git
58:55 Fork a repo
1:01:31 Installing fastai and fastbook
00:00 - Intros
03:26 - Jeremy's way of doing stuff
04:48 - What is a terminal/shell
06:07 - Setup your mind
06:52 - Guests expectations
14:07 - How to use a terminal/shell
19:18 - Installing python the right way
22:02 - Clean your home directory
31:59 - Using mamba/conda to install python libraries
40:52 - How to make and run a shell script
57:05 - Install other packages with mamba
1:02:04 - Install pytorch
1:09:12 - Install Jupyterlab
1:10:11 - Create the first notebook
00:00 - Questions
02:00 - Running notebooks in the background from scripts
04:00 - Connecting to your server using Xrdp
06:30 - Installing Xrdp
07:00 - Can you connect to Paperspace machines remotely?
13:30 - Dealing with startup issue in Paperspace
16:20 - Native windows in tmux with tmux -CC a
18:30 - Getting mouse support working in tmux
20:00 - Experimenting with closing notebooks while fine tuning
24:30 - Progressive resizing recap
26:00 - Building a weighted model
29:00 - Picking out the images that are difficult to classify
33:30 - Weighting with 1/np.sqrt(value_counts())
36:00 - Merging the weights with the image description dataframe
38:00 - Building a datablock
40:20 - How we want weighted dataloaders to work
41:30 - Datasets for indexing into to get an x,y pair
44:00 - Sorting list of weights and image files to match them up
47:30 - Batch transforms not applied because datasets didn’t have this method
49:00 - Reviewing errors
55:00 - Python debugger pdb %debug
56:50 - List comprehension to assign weight to images
59:00 - Use set_index instead of sort_values
59:30 - Review weighted dataloaders with show_batch
1:00:40 - Review of how weighted sampling will work
1:03:30 - Updating the fastai library to make weighted sampling easier
1:04:47 - Is WeightedDL a callback?
1:06:20 - modifying weighted_dataloaders function
1:10:00 - fixing tests for weighted_dataloaders
1:13:40 - editable install of pip in fastai
1:16:15 - modifying PETS notebooks to work on splitter
1:18:07 - How do splitters work in DataBlock
1:20:25 - modifying weighted dataloader by using weights from Dataset
1:23:36 - running tests in fastai and creating github issues
1:24:33 - fixing the failing test
1:29:40 - creating a commit to fix the issue
1:31:30 - nbdev hook to clean notebooks
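The 1/np.sqrt(value_counts()) weighting discussed in this walkthrough amounts to oversampling rare classes. A minimal sketch in plain NumPy (illustrative only, not fastai's WeightedDL):

```python
import numpy as np

labels = np.array(["normal"] * 90 + ["rare"] * 10)     # imbalanced dataset
classes, counts = np.unique(labels, return_counts=True)
class_w = 1 / np.sqrt(counts)                          # rarer class, larger weight
weights = class_w[np.searchsorted(classes, labels)]    # one weight per item
probs = weights / weights.sum()

rng = np.random.default_rng(0)
sample = rng.choice(labels, size=10_000, p=probs)      # weighted sampling, with replacement
rare_frac = (sample == "rare").mean()
# 1/sqrt weighting lifts the rare class from 10% of the data to ~25% of draws,
# since the total class weights are sqrt(90) : sqrt(10), exactly 3 : 1
```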
00:00 Catch up Questions from last session
06:11 `settings.ini` and fastbook setup (more advanced)
08:19 The `$PATH` environment variable
12:22 Creating and using a conda environment
18:27 Creating a Paperspace notebook
33:12 The python debugger
43:08 Installing pip packages into your home directory
49:21 Persistent storage, mounted drives, and symlinks
56:27 Paperspace has different Python environments by default
1:09:34 Creating a Paperspace notebook with everything set up automatically
1:16:35 Copying SSH keys to Paperspace to communicate with github
00:00 - Review of best vision models for fine tuning
10:50 - Learn export file pth format
12:30 - Multi-head deep learning model setup
16:00 - Getting a sense of the error rate
20:00 - Looking inside the model
22:30 - Shape of the model
23:40 - Last layer at the end of the head
26:00 - Changing the last layer
29:00 - Creating a DiseaseAndTypeClassifier subclass
38:00 - Debugging the plumbing of the new subclass
46:00 - Testing the new learner
49:00 - Create a new loss function
52:00 - Getting predictions for two different targets
56:00 - Create new error function
00:00 - Introduction and Questions
05:15 - MultiTask Classification Notebook
07:40 - Good fastai tutorials
08:30 - DataBlock API
12:35 - How do ImageBlock and get_image_files work
15:15 - How does aug_transforms work
17:30 - Converting ImageDataLoaders to DataBlock
22:08 - In PyTorch DataLoaders, what happens at the last batch?
23:23 - Step 2: Make DataBlock spit out three things
27:30 - Modifying get_y to return two values
32:00 - Looking into Dataset objects
33:50 - Can we have multiple get_items?
35:20 - Hacky notebook using data frames for creating DataBlock
39:40 - How do TransformBlock and ImageBlock work
49:30 - Looking at the source code of TransformBlock, DataBlock code
54:10 - Dataset, Dataloaders discussion
58:30 - Defining DataBlock for Multi-task classification notebook
1:05:05 - Sneak peek into how the multi-task model is trained
6:36 - Creating a persistent environment in Paperspace
13:08 - Conda install mamba with -p
13:30 - Install universal-ctags using micromamba
14:50 - Clean up conda directory
18:30 - Fixing path to universal-ctags and mamba
20:20 - Create a bash.local file in /storage
23:30 - Install micromamba into conda folder
24:00 - Remove mamba and move conda folder into storage
24:40 - Edit pre-run.sh file with symlinks to conda
25:20 - Preserving .bash_history file
30:00 - Test setup on new machine
34:30 - Clone forked copy of fastbook
42:30 - Adding git config file to persistent storage
45:00 - Discussion about making contributions to repos with pull requests
48:00 - Comparing different versions with nbdime on Paperspace
48:20 - Start fastbook chapter 1
51:20 - "__all__" is pronounced "dunder all"
52:50 - A nifty trick for navigating source files
57:30 - Optimising storage use on Paperspace
59:40 - Move fastai config.ini into storage and symlink
1:05:45 - The Path.BASE_PATH variable trick
1:09:00 - The fastai L class: a drop-in replacement for a list
00:00 - Start
01:04 - About Weighting (WeightedDL)
01:50 - Curriculum Learning / Top Losses
03:08 - Distribution of the test set vs training set
03:35 - Is Curriculum Learning related to Boosting?
04:25 - Focusing on examples that the model is getting wrong
04:38 - Are the labels ever wrong? By accident, or intentionally?
06:40 - Image annotation issues: Paddy Kaggle discussion
08:23 - UNIFESP X-ray Body Part Classifier Competition
10:20 - Medical images / DICOM Images
10:57 - fastai for medical imaging
11:40 - JPEG 2000 Compression
12:40 - ConvNet Paper
13:50 - On the research field
15:30 - When is a paper worth reading?
17:14 - Quoc V. Le
17:50 - When to stop iterating on a model? - Using the right data.
20:10 - Taking advantage of Semi-Supervised Learning, Transfer Learning
21:33 - Not enough data in a certain category: binary sigmoid instead of softmax
23:50 - Question about submitting to Kaggle
25:33 - Public and private leaderboard on Kaggle
29:30 - Where did we get to in the last lesson?
31:20 - GradientAccumulation on Jeremy’s Road to the Top, Part 3
37:20 - “Save & Run” a Kaggle notebook
38:55 - Next: What the outputs and inputs of a model look like
40:55 - Next: What the “middle” (convnet) of a model looks like
41:32 - Part 2: Outputs of a hidden layer
42:53 - The Ethical Side
44:30 - fastai1/courses/dl1/excel/
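The sigmoid-vs-softmax point (21:33) is easy to see numerically: softmax forces the class probabilities to sum to 1 even when no class fits, while independent sigmoids can all stay low. A small NumPy illustration (not code from the session):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract max for numerical stability
    return e / e.sum()

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

logits = np.array([-3.0, -3.0, -3.0])  # the model is unsure about every class
sm = softmax(logits)   # [0.333, 0.333, 0.333]: forced to pick something
sg = sigmoid(logits)   # [0.047, 0.047, 0.047]: free to say "none of the above"
```

This is why a binary sigmoid per class can be preferable when a category has little data or when "none of the above" is a valid answer.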
0:42 - Background for Kaggle Competitions
10:00 - Setting up for Kaggle competitions on your local machine
14:30 - Create API token for Kaggle
18:00 - Download kaggle competition zip file
25:00 - Using the pipe output to head
28:55 - Back to Paperspace
29:30 - Remove pip from /storage
30:00 - Install kaggle and update symlinks
32:00 - Upload kaggle API json
33:20 - Download kaggle competition file to Paperspace
35:00 - Install unzip to persistent conda env
36:45 - Unzipping kaggle file in notebooks is too slow
40:00 - Unzip kaggle file in home directory for speed
41:20 - Create an executable script for unzipping kaggle file
43:10 - Create a notebook to explore kaggle data
48:00 - Browse image files
51:00 - Review image metadata
53:00 - Image data loaders and labelling function
56:30 - Create a learner
57:00 - Monitor training with nvidia-smi dmon
1:02:00 - Summary
00:00 - Questions
00:05 - About the concept/capability of early stopping
04:00 - Different models, which one to use
05:25 - Gradient Boosting Machine with different model predictions
07:25 - AutoML tools
07:50 - Kaggle winners approaches, ensemble
09:00 - Test Time Augmentation (TTA)
11:00 - Training loss vs validation loss
12:30 - Averaging a few augmented versions
13:50 - Unbalanced dataset and augmentation
15:00 - On balancing datasets
15:40 - WeightedDL, Weighted DataLoader
17:55 - Weighted sampling on Diabetic Retinopathy competition
19:40 - Let’s try something…
21:40 - Setting an environment variable when having multiple GPUs
21:55 - Multi target model
23:00 - Debugging
27:04 - Revise transforms to 128x128 and 5 epochs
28:00 - Progressive resizing
29:16 - Fine tuning again but on larger 160x160 images
34:30 - Oops, small bug, restart (without creating a new learner)
37:30 - Re-run second fine-tuning
40:00 - How did you come up with the idea of progressive resizing?
41:00 - Changing things during training
42:30 - On the paper Fixing the train-test resolution discrepancy
44:15 - Fine tuning again but on larger 192x192 images
46:11 - A detour about paper reference management
48:27 - Final fine-tuning 256x192
49:30 - Looking at WeightedDL, WeightedDataLoader
57:08 - Back to the results of fine-tuning 256x192
58:20 - Question leading to look at callbacks
59:18 - About SaveModelCallback
01:00:56 - Contributing, Documentation, and looking at “Docments”
01:03:50 - Final questions: lr_find()
01:04:50 - Final questions: Training for longer, decreasing validation loss, epochs, error rate
01:06:15 - Final questions: Progressive resizing and reinitialization
01:08:00 - Final questions: Resolution independent models
00:00 - Questions
06:00 - Steps for Entering a Standard Image Recognition Competition on Kaggle
08:40 - The best models for fine tuning image recognition
12:00 - Thomas Capelle script to run experiments
14:00 - Github Gist
16:00 - Weights and Biases API
17:00 - Automating Gist generation
20:30 - Summarising and ranking models for fine tuning
23:00 - Scatter plot of performance by model family
25:40 - Best models for images that don't look like Imagenet
33:00 - Pretrained models - Model Zoo, Papers With Code, Huggingface
37:30 - Applying learning on Paddy notebook with small models
46:00 - Applying learning on large models
47:00 - Gradient accumulation to prevent out of memory
52:50 - Majority vote
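Gradient accumulation (47:00) exploits the fact that the gradient of a sum is the sum of the gradients, so a large batch can be processed in small chunks that fit in memory. A NumPy sketch of the idea (illustrative, not fastai's GradientAccumulation callback):

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.standard_normal((32, 4)), rng.standard_normal(32)
w = np.zeros(4)

def grad(w, Xb, yb):
    "Gradient of summed squared error for a linear model on one (micro-)batch."
    return 2 * Xb.T @ (Xb @ w - yb)

g_full = grad(w, X, y)          # one big batch of 32...

g_accum = np.zeros(4)           # ...or four micro-batches of 8, accumulated
for i in range(0, 32, 8):
    g_accum += grad(w, X[i:i+8], y[i:i+8])
# g_accum matches g_full, so the weight update is identical while peak
# memory corresponds to a batch of 8 rather than 32
```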
Discuss this session here: forums.fast.ai/t/fast-ai-apl-study-session-10/97557
4:50 Describes rank. ⍤
5:28 Creates 2 layer, 3 row, 4 column array.
APL: a ← 2 3 4 ⍴ ⍳ 24 ⍝ 2 layer, 3 row, 4 column
5:40 Definition of +/ is that it sums the rows
6:00 Summing over the last axis is always the rows.
6:20 ⌿ (slash bar) - always sums over first/leading axis
6:40 Limit to rank 2. Result is identical to that without
7:45 Rank 0. Gives unmodified array
8:05 + ⌿ explanation. Treats it as a whole array
8:35 (+ ⌿ ⍤ 2)a
8:56 (+ ⌿ ⍤ 1)a ⍝ Equivalent to +/
9:20 Using rank 1 with + ⌿ is same as removing the bar to be +/
9:40 Function to compute average
10:00 m ← 2 3 ⍴ 6
10:40 Using ⌿ calculates correctly. One average per column
11:05 Average over the rows
11:35 Think of the bar (horizontal) as representing the rows
11:50 Horizontal, last axis reverse, transpose: ⌽, ⊖, ⍉
12:20 3 4 = 3 5 ⍝ Implied rank 0
13:10 Rank operator ⍤ is entirely general purpose
14:00 Debugging trick. {… ⋄ …}
14:15 New statement symbol: ⋄
15:25 Create our own trace operator
16:00 Use trick to create monadic version of trace: ⊢ identity
21:20 Can say operators have long left scope
23:00 Discussion about left-to-right and right-to-left
31:05 What’s proper APL? APL isn’t very opinionated.
32:00 Discussion about writing performant APL
32:55 Make functions leading axis oriented
33:10 Keep the code flat. Don’t use nested arrays.
35:00 Nested arrays aren’t contiguous.
35:10 Bottleneck is often memory throughput
36:10 Trick: Use boolean masks as much as you can
37:50 Use boolean
38:00 APL will squeeze arrays. stored as 1 bit booleans
39:25 Summary of good APL principles
40:45 Each operator: ¨
41:20 Don’t want loops, want array operations because of CPU support
49:50 Describing the definition of Each ¨
52:55 Borrow/down arrow ↓ (drop)
53:44 Never really any reason to use down arrow (↓)
55:25 Can’t use Each ¨ on rows but you can use Rank ⍤
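For those more familiar with NumPy, the last-axis vs leading-axis distinction above (+/ vs +⌿) maps onto the axis argument. A rough analogy, not an exact correspondence:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)  # like a ← 2 3 4 ⍴ ⍳ 24, but 0-based

last  = a.sum(axis=-1)  # +/a : reduce over the last axis (the rows); shape (2, 3)
first = a.sum(axis=0)   # +⌿a : reduce over the first/leading axis; shape (3, 4)
```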
We cover topics such as how to:
- Build and train deep learning, random forest, and regression models
- Deploy models
- Apply deep learning to computer vision, natural language processing, tabular analysis, and collaborative filtering problems
- Use PyTorch, the world’s fastest growing deep learning software, together with popular libraries such as fastai, Hugging Face Transformers, and gradio
You don’t need any special hardware or software — we’ll show you how to use free resources for both building and deploying models. You don’t need any university math either — we’ll teach you the calculus and linear algebra you need during the course.
00:00 - Introduction
00:25 - What has changed since 2015
01:20 - Is it a bird
02:09 - Images are made of numbers
03:29 - Downloading images
04:25 - Creating a DataBlock and Learner
05:18 - Training the model and making a prediction
07:20 - What can deep learning do now
10:33 - Pathways Language Model (PaLM)
15:40 - How the course will be taught: top-down learning
19:25 - Jeremy Howard’s qualifications
22:38 - Comparison between modern deep learning and 2012 machine learning practices
24:31 - Visualizing layers of a trained neural network
27:40 - Image classification applied to audio
28:08 - Image classification applied to time series and fraud
30:16 - PyTorch vs TensorFlow
31:43 - Example of how fastai builds off PyTorch (AdamW optimizer)
35:18 - Using cloud servers to run your notebooks (Kaggle)
38:45 - Bird or not bird? & explaining some Kaggle features
40:15 - How to import libraries like Fastai in Python
40:42 - Best practice - viewing your data between steps
42:00 - Datablocks API overarching explanation
44:40 - Datablocks API parameters explanation
48:40 - Where to find fastai documentation
49:54 - Fastai’s learner (combines model & data)
50:40 - Fastai’s available pretrained models
52:02 - What’s a pretrained model?
53:48 - Testing your model with predict method
55:08 - Other applications of computer vision. Segmentation
56:48 - Segmentation code explanation
58:32 - Tabular analysis with fastai
59:42 - show_batch method explanation
1:01:25 - Collaborative filtering (recommendation system) example
1:05:08 - How to turn your notebooks into a presentation tool (RISE)
1:05:45 - What else can you make with notebooks?
1:08:06 - What can deep learning do presently?
1:10:33 - The first neural network - Mark I Perceptron (1957)
1:12:38 - Machine learning models at a high level
1:18:27 - Homework
Thanks to bencoman, mike.moloch, amr.malik, and gagan on forums.fast.ai for creating the transcript.
Thanks to Raymond-Wu on forums.fast.ai for help with chapter titles.
02:09 TwoR model
04:43 How to create a decision tree
07:02 Gini
10:54 Making a submission
15:52 Bagging
19:06 Random forest introduction
20:09 Creating a random forest
22:38 Feature importance
26:37 Adding trees
29:32 What is OOB
32:08 Model interpretation
35:47 Removing the redundant features
35:59 What does Partial dependence do
39:22 Can you explain why a particular prediction is made
46:07 Can you overfit a random forest
49:03 What is gradient boosting
51:56 Introducing walkthrus
54:28 What does fastkaggle do
1:02:52 fastcore.parallel
1:04:12 item_tfms=Resize(480, method='squish')
1:06:20 Fine-tuning project
1:07:22 Criteria for evaluating models
1:10:22 Should we submit as soon as we can
1:15:15 How to automate the process of sharing kaggle notebooks
1:20:17 AutoML
1:24:16 Why the first model runs so slowly on Kaggle GPUs
1:27:53 How much can a novel architecture improve the accuracy
1:28:33 Convnext
1:31:10 How to iterate the model with padding
1:32:01 What does our data augmentation do to images
1:34:12 How to iterate the model with larger images
1:36:08 pandas indexing
1:38:16 What data-augmentation does tta use?
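The Gini score (07:02) and binary splits (1:30:56) covered above can be computed in a few lines. A generic sketch, not the notebook's exact code:

```python
import numpy as np

def gini(y):
    "Gini impurity for binary labels: 0 when pure, max 0.5 at a 50/50 split."
    p = y.mean()
    return 1 - p**2 - (1 - p)**2

def best_binary_split(x, y):
    "Try every threshold on one column; return the one with lowest weighted impurity."
    best = (None, float("inf"))
    for t in np.unique(x):
        left, right = y[x <= t], y[x > t]
        if len(left) == 0 or len(right) == 0:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best[1]:
            best = (t, score)
    return best

x = np.array([1, 2, 3, 4, 5, 6])
y = np.array([0, 0, 0, 1, 1, 1])
thresh, score = best_binary_split(x, y)   # perfect split at x <= 3, impurity 0
```

A decision tree is just this search applied recursively to each resulting subset, and a random forest bags many such trees.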
Transcript thanks to fmussari, gagan, bencoman, mike.moloch on forums.fast.ai
Timestamps based on notes by daniel on forums.fast.ai
00:55 - Reminder to use the fastai book as a companion to the course
02:06 - aiquizzes.com for quizzes on the book
02:36 - Reminder to use fastai forums for links, notebooks, questions, etc.
03:42 - How to efficiently read the forum with summarizations
04:13 - Showing what students have made since last week
06:45 - Putting models into production
08:10 - Jupyter Notebook extensions
09:49 - Gathering images with Bing/DuckDuckGo
11:10 - How to find information & source code on Python/fastai functions
12:45 - Cleaning the data that we gathered by training a model
13:37 - Explaining various resizing methods
14:50 - RandomResizedCrop explanation
15:50 - Data augmentation
16:57 - Question: Does fastai's data augmentation copy the image multiple times?
18:30 - Training a model so you can clean your data
19:00 - Confusion matrix explanation
20:33 - plot_top_losses explanation
22:10 - ImageClassifierCleaner demonstration
25:28 - CPU RAM vs GPU RAM (VRAM)
27:18 - Putting your model into production
30:20 - Git & Github desktop
31:30 - For Windows users
37:00 - Deploying your deep learning model
37:38 - Dog/cat classifier on Kaggle
38:55 - Exporting your model with learn.export
39:40 - Downloading your model on Kaggle
41:30 - How to take a model you trained to make predictions
43:30 - learn.predict and timing
44:22 - Shaping the data to deploy to Gradio
45:47 - Creating a Gradio interface
48:25 - Creating a Python script from your notebook with #|export
50:47 - Hugging Face deployed model
52:12 - How many epochs do you train for?
53:16 - How to export and download your model in Google Colab
54:25 - Getting Python, Jupyter notebooks, and fastai running on your local machine
1:00:50 - Comparing deployment platforms: Hugging Face, Gradio, Streamlit
1:02:13 - Hugging Face API
1:05:00 - Jeremy's deployed website example - tinypets
1:08:23 - Get to know your pet example by aabdalla
1:09:44 - Source code explanation
1:11:08 - Github Pages
Thanks to bencoman, mike.moloch, amr.malik, gagan, fmussari, kurianbenoy, and heylara on forums.fast.ai for creating the transcript.
Thanks to Raymond-Wu on forums.fast.ai for creating the timestamps.
00:01:59 - Linear model and neural net from scratch
00:07:30 - Cleaning the data
00:26:46 - Setting up a linear model
00:38:48 - Creating functions
00:39:39 - Doing a gradient descent step
00:42:15 - Training the linear model
00:46:05 - Measuring accuracy
00:48:10 - Using sigmoid
00:56:09 - Submitting to Kaggle
00:58:25 - Using matrix product
01:03:31 - A neural network
01:09:20 - Deep learning
01:12:10 - Linear model final thoughts
01:15:30 - Why you should use a framework
01:16:33 - Prep the data
01:19:38 - Train the model
01:21:34 - Submit to Kaggle
01:23:22 - Ensembling
01:25:08 - Framework final thoughts
01:26:44 - How random forests really work
01:28:57 - Data preprocessing
01:30:56 - Binary splits
01:41:34 - Final Roundup
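The "linear model from scratch" steps above (gradient descent step, sigmoid, measuring accuracy) look roughly like this compact NumPy sketch, which is not the lesson notebook itself:

```python
import numpy as np

rng = np.random.default_rng(42)
n, d = 200, 3
X = rng.standard_normal((n, d))
true_w = np.array([2.0, -1.0, 0.5])
y = (X @ true_w + 0.1 * rng.standard_normal(n) > 0).astype(float)  # binary target

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

w, lr = np.zeros(d), 0.5
for epoch in range(100):
    preds = sigmoid(X @ w)           # forward pass
    g = X.T @ (preds - y) / n        # gradient of binary cross-entropy
    w -= lr * g                      # the gradient descent step

acc = ((sigmoid(X @ w) > 0.5) == y).mean()  # measuring (training) accuracy
```

A neural net replaces the single `X @ w` with two or more such layers with a nonlinearity in between, which is the lesson's next step.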
Timestamps thanks to RogerS49 on forums.fast.ai.
Transcript thanks to azaidi06, fmussari, wyquek, heylara on forums.fast.ai.
01:36 "Lesson 0" How to fast.ai
02:25 How to do a fastai lesson
04:28 How not to self-study
05:28 Highest voted student work
07:56 Pets breeds detector
08:52 Paperspace
10:16 JupyterLab
12:11 Make a better pet detector
13:47 Comparison of all (image) models
15:49 Try out new models
19:22 Get the categories of a model
20:40 What’s in the model
21:23 What does model architecture look like
22:15 Parameters of a model
23:36 Create a general quadratic function
27:20 Fitting a function by hand and eye
30:58 Loss functions
33:39 Automate the search of parameters for better loss
42:45 The mathematical functions
43:18 ReLU: Rectified linear function
45:17 Infinitely complex function
49:21 A chart of all image models compared
52:11 Do I have enough data?
54:56 Interpreting gradients in units
56:23 Learning rate
1:00:14 Matrix multiplication
1:04:22 Build a regression model in a spreadsheet
1:16:18 Build a neural net by adding two regression models
1:18:31 Matrix multiplication makes training faster
1:21:01 Watch out! It's chapter 4
1:22:31 Create dummy variables of 3 classes
1:23:34 A taste of NLP
1:27:29 fastai NLP library vs Hugging Face library
1:28:54 Homework to prepare you for the next lesson
Many thanks to bencoman, wyquek, Raymond Wu, and fmussari on forums.fast.ai for writing the transcript.
Timestamps thanks to "Daniel 深度碎片" on forums.fast.ai.
04:46 - Parameters in PyTorch
07:42 - Embedding from scratch
12:21 - Embedding interpretation
18:06 - Collaborative filtering in fastai
22:11 - Embedding distance
24:22 - Collaborative filtering with DL
30:25 - Embeddings for NLP
34:56 - Embeddings for tabular
44:33 - Convolutions
57:07 - Optimizing convolutions
58:00 - Pooling
1:05:12 - Convolutions as matrix products
1:08:21 - Dropout
1:14:27 - Activation functions
1:20:41 - Jeremy AMA
1:20:57 - How do you stay motivated?
1:23:38 - Skew towards big expensive models
1:26:25 - How do you homeschool children?
1:28:26 - Walk-through as a separate course
1:29:59 - How do you turn a model into a business?
1:32:46 - Jeremy's productivity hacks
1:36:03 - Final words
Transcript thanks to fmussari and bencoman from forums.fast.ai
Timestamps based on notes by Daniel from forums.fast.ai
02:47 - What are the benefits of using larger models
05:58 - Understanding GPU memory usage
08:04 - What is GradientAccumulation?
20:52 - How to run all the models with specifications
22:55 - Ensembling
37:51 - Multi-target models
41:24 - What does `F.cross_entropy` do
45:43 - When to use softmax and when not to
46:15 - Cross-entropy loss
49:53 - How to calculate binary cross-entropy
52:19 - Two versions of cross-entropy in PyTorch
54:24 - How to create a learner for predicting two targets
1:02:00 - Collaborative filtering deep dive
1:08:55 - What are latent factors?
1:11:28 - Dot product model
1:18:37 - What is embedding
1:22:18 - How do you choose the number of latent factors
1:27:13 - How to build a collaborative filtering model from scratch
1:29:57 - How to understand the `forward` function
1:32:47 - Adding a bias term
1:34:29 - Model interpretation
1:39:06 - What is weight decay and how does it help?
1:43:47 - What is regularization
Transcript thanks to nikem, fmussari, wyquek, bencoman, and gagan from forums.fast.ai
Timestamps based on notes by Daniel from forums.fast.ai
00:03:24 - Finetuning pretrained model
00:05:14 - ULMFiT
00:09:15 - Transformer
00:10:52 - Zeiler & Fergus
00:14:47 - US Patent Phrase to Phrase Matching Kaggle competition
00:16:10 - NLP Classification
00:20:56 - Kaggle configs, inserting Python in bash, reading the competition website
00:24:51 - pandas, NumPy, matplotlib, & PyTorch
00:29:26 - Tokenization
00:33:20 - Hugging Face model hub
00:36:40 - Examples of tokenized sentences
00:38:47 - Numericalization
00:41:13 - Question: rationale behind how input data was formatted
00:43:20 - ULMFiT fits large documents easily
00:45:55 - Overfitting & underfitting
00:50:45 - Splitting the dataset
00:52:31 - Creating a good validation set
00:57:13 - Test set
00:59:00 - Metric vs loss
01:01:27 - The problem with metrics
01:04:10 - Pearson correlation
01:10:27 - Correlation is sensitive to outliers
01:14:00 - Training a model
01:19:20 - Question: when is it ok to remove outliers?
01:22:10 - Predictions
01:25:30 - Opportunities for research and startups
01:26:16 - Misusing NLP
01:33:00 - Question: isn’t the target categorical in this case?
Transcript thanks to wyquek, jmp, bencoman, fmussari, mike.moloch, amr.malik, kurianbenoy, gagan, and Raymond Wu on forums.fast.ai.
Timestamps thanks to RogerS49 and Wyquek on forums.fast.ai.
Discuss this session here: forums.fast.ai/t/fast-ai-apl-study-session-10/97530
Discuss this session here: forums.fast.ai/t/fast-ai-apl-study-session-9/97510
Discuss this session here: youtu.be/bRr7V38Oa7o
Discuss this session here: forums.fast.ai/t/fast-ai-apl-study-session-7/97417
Discuss this session here: forums.fast.ai/t/fast-ai-apl-study-session-6/97398
Discuss this session here: forums.fast.ai/t/apl-study-session-5/97379
arraycast.com/episodes/episode31-jeremy-howard
00:01:15 Dyalog Problem-Solving Contest contest.dyalog.com/?goto=welcome
00:02:40 Jeremy Howard en.wikipedia.org/wiki/Jeremy_Howard_(entrepreneur)
00:04:30 APL Study Group forums.fast.ai/t/apl-array-programming/97188
00:10:20 AT Kearney en.wikipedia.org/wiki/AT_Kearney
00:12:33 MKL (Intel) en.wikipedia.org/wiki/Math_Kernel_Library
00:13:00 BLAS netlib.org/blas
00:13:11 Perl BQN mlochbaum.github.io/BQN/running.html
00:14:06 Raku en.wikipedia.org/wiki/Raku_%28programming_language%29
00:15:45 kaggle kaggle.com
00:16:52 R en.wikipedia.org/wiki/R_(programming_language)
00:18:50 Neural Networks en.wikipedia.org/wiki/Artificial_neural_network
00:19:50 Enlitic enlitic.com
00:20:01 Fast.ai fast.ai
00:21:02 Numpy numpy.org
00:21:26 Leading Axis Theory aplwiki.com/wiki/Leading_axis_theory
00:21:31 Rank Conjunction code.jsoftware.com/wiki/Vocabulary/quote
00:21:40 Einstein notation en.wikipedia.org/wiki/Einstein_notation
00:22:55 CUDA en.wikipedia.org/wiki/CUDA
00:28:51 Numpy Another Iverson Ghost dev.to/bakerjd99/numpy-another-iverson-ghost-9mc
00:30:11 Pivot Tables en.wikipedia.org/wiki/Pivot_table
00:30:36 SQL en.wikipedia.org/wiki/SQL
00:31:25 Larry Wall "The three chief virtues of a programmer are: Laziness, Impatience and Hubris."
00:32:00 Python python.org
00:36:25 Regular Expressions en.wikipedia.org/wiki/Regular_expression
00:36:50 PyTorch pytorch.org
00:37:39 Notation as Tool of Thought jsoftware.com/papers/tot.htm
00:37:55 Aaron Hsu codfns https://scholarworks.iu.edu/dspace/handle/2022/24749
00:38:40 J jsoftware.com/#
00:39:06 Eric Iverson on Array Cast arraycast.com/episodes/episode10-eric-iverson
00:40:18 Triangulation Jeremy Howard youtube.com/watch?v=hxB-rEQvBeM
00:41:48 Google Brain en.wikipedia.org/wiki/Google_Brain
00:42:30 RAPIDS rapids.ai
00:43:40 Julia julialang.org
00:43:50 llvm llvm.org
00:44:07 JAX jax.readthedocs.io/en/latest/notebooks/quickstart.html
00:44:21 XLA tensorflow.org/xla
00:44:32 MLIR tensorflow.org/mlir
00:44:42 Chris Lattner en.wikipedia.org/wiki/Chris_Lattner
00:44:53 Tensorflow tensorflow.org
00:49:33 torchscript pytorch.org/tutorials/beginner/Intro_to_TorchScript_tutorial.html
00:50:09 Scheme en.wikipedia.org/wiki/Scheme_(programming_language)
00:50:28 Swift en.wikipedia.org/wiki/Swift_(programming_language)
00:51:10 DragonBox Algebra dragonbox.com/products/algebra-12
00:52:47 APL Glyphs aplwiki.com/wiki/Glyph
00:53:24 Dyalog APL dyalog.com
00:54:24 Jupyter jupyter.org
00:55:44 Jeremy's tweet of Meta Math twitter.com/jeremyphoward/status/1543738953391800320
00:56:37 Power function aplwiki.com/wiki/Power_(function)
01:03:06 Reshape ⍴ aplwiki.com/wiki/Reshape
01:03:40 Stallman 'Rho, rho, rho' stallman.org/doggerel.html#APL
01:04:20 APLcart aplcart.info
01:06:12 J for C programmers jsoftware.com/help/jforc/contents.htm
01:07:54 Transpose episode arraycast.com/episodes/episode29-transpose
01:10:00 APLcart video youtube.com/watch?v=r3owA7tfKE8
01:12:28 Functional Programming en.wikipedia.org/wiki/Functional_programming
01:13:00 List Comprehensions docs.python.org/3/tutorial/datastructures.html#list-comprehensions
01:13:30 BQN to J mlochbaum.github.io/BQN/doc/fromJ.html
01:18:15 Einops cgarciae.github.io/einops/1-einops-basics
01:19:30 April Fools APL ci.tc39.es/preview/tc39/ecma262/sha/efb411f2f2a6f0e242849a8cc8d7e21bbcdff543/#sec-apl-expression-rules
01:20:35 Flask library flask.palletsprojects.com/en/2.1.x
01:21:22 JuliaCon 2022 juliacon.org/2022
01:28:05 Myelination en.wikipedia.org/wiki/Myelin
01:29:15 Sanyam Bhutani interview youtube.com/watch?v=g_6nQBsE4pU&t=2150s
01:31:27 Jo Boaler Growth Mindset youcubed.org/resource/growth-mindset
01:33:45 Discovery Learning en.wikipedia.org/wiki/Discovery_learning
01:37:05 Iverson Bracket en.wikipedia.org/wiki/Iverson_bracket
01:39:14 Radek Osmulski Meta Learning rosmulski.gumroad.com/l/learn_machine_learning
01:40:12 Top Down Learning medium.com/@jacksonbull1987/top-down-learning-4743f16d63d3
01:41:20 Anki apps.ankiweb.net
01:43:50 Lex Fridman Interview youtube.com/watch?v=J6XcP4JOHmk
Many thanks to Bob Therriault, Rodrigo Girão Serrão and Adám Brudzewsky for gathering these links.
Discuss this session here: forums.fast.ai/t/apl-study-session-4/97324
Discuss this session here: forums.fast.ai/t/apl-study-session-3/97307
Discuss this session here: forums.fast.ai/t/apl-study-session-2/97284