@ArtOfTheProblem
Art of the Problem | Reinforcement Learning | A history from Tic-Tac-Toe to Humanoids @ArtOfTheProblem | Uploaded 3 weeks ago | Updated 11 hours ago
How did AI systems learn to act and "feel"? I follow the history of reinforcement learning and the development of value functions, Q functions, policy functions, and TD learning. The story starts with learning games (tic-tac-toe, checkers, backgammon) and moves on to physical problems: cart and pole, walking, and grasping (OpenAI's dexterous robotic hand). I found the history a bit of a mess, so I tried to clean it up. OpenAI o1

Thanks to Jane Street for sponsoring this video. They are hiring people interested in ML! To learn more about their work and open roles (and to support me), visit their website: jane-st.co/ml

I also follow the process of transferring simulated skills to the real world (domain randomization) and witness the emergence of human-like behaviors in AI agents. It leaves us with a provocative question: where is the line between actions and words? What is the role of a GPT for actions?

Featuring insights from:
Claude Shannon
Arthur Samuel
Gerald Tesauro
Richard Sutton
David Silver
DeepMind/OpenAI etc.

00:00 - Introduction
00:32 - Learning Tic Tac Toe
02:00 - Learning Cart and Pole
04:20 - Shannon & Chess
06:50 - Samuel's Checkers
09:25 - TD Gammon (Gerald Tesauro)
11:00 - TD Learning
14:30 - Learning Atari (DQN)
17:28 - Direct Policy Gradient
19:40 - Domain Randomization
