Convolutional Network Demo from 1989
Yann LeCun, 2014-06-02 | This is a demo of "LeNet 1", the first convolutional network that could recognize handwritten digits with good speed and accuracy.
It was developed in early 1989 in the Adaptive System Research Department, headed by Larry Jackel, at Bell Labs in Holmdel, NJ.
This "real time" demo ran on a DSP card sitting in a 486 PC with a video camera and frame grabber card. The DSP card had an AT&T DSP32C chip, which was the first 32-bit floating-point DSP and could reach an amazing 12.5 million multiply-accumulate operations per second.
The network was trained using the SN environment (a Lisp-based neural net simulator, the predecessor of Lush, itself a kind of ancestor to Torch7, itself the ancestor of PyTorch). We wrote a kind of "compiler" in SN that produced a self-contained piece of C code that could run the network. The network weights were array literals inside the C source code.
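The generated C code carried the trained weights along with it, baked in as array literals. As an illustrative sketch of that weight-embedding step (the function name and formatting here are invented; this is not the actual SN compiler output), a dumper might look like:

```python
# Hypothetical sketch: render a trained weight vector as a C array
# literal, the way the SN "compiler" embedded weights directly in the
# generated C source. Names and formatting are invented for illustration.

def emit_c_weights(name, weights):
    """Render a list of floats as a self-contained C array literal."""
    body = ", ".join(f"{w:.6f}f" for w in weights)
    return f"static const float {name}[{len(weights)}] = {{{body}}};"

print(emit_c_weights("layer1_w", [0.125, -0.5, 0.03125]))
# -> static const float layer1_w[3] = {0.125000f, -0.500000f, 0.031250f};
```

Emitting one such literal per layer, plus a plain-C forward pass, yields a network runner with no external dependencies, which is what made it easy to drop onto the DSP card.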
The network architecture was a ConvNet with two layers of 5x5 convolutions with stride 2, and two fully-connected layers on top. There was no separate pooling layer (it was too expensive). It had 9,760 parameters and 64,660 connections.
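The stride-2 convolutions do the subsampling themselves, which is why no separate pooling layer was needed. A generic single-channel NumPy sketch (not the network's actual connectivity, weights, or multi-map structure) shows the spatial sizes working out:

```python
import numpy as np

def conv2d(x, w, stride=2, pad=2):
    # Single-channel strided "convolution" (cross-correlation) with
    # zero-padding; a generic illustration, not LeNet-1's exact maps.
    k = w.shape[0]
    x = np.pad(x, pad)  # zero-pad all four sides
    oh = (x.shape[0] - k) // stride + 1
    ow = (x.shape[1] - k) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i*stride:i*stride+k,
                                 j*stride:j*stride+k] * w)
    return out

img = np.random.randn(16, 16)                      # 16x16 input patch
h1 = np.tanh(conv2d(img, np.random.randn(5, 5)))   # -> 8x8 feature map
h2 = np.tanh(conv2d(h1, np.random.randn(5, 5)))    # -> 4x4 feature map
```

With 5x5 kernels, stride 2, and 2 pixels of padding, a 16x16 input shrinks to 8x8 and then 4x4, so each convolution halves the resolution without any pooling stage.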
Shortly after this demo was put together, we started working with a development group and a product group at NCR (then a subsidiary of AT&T). NCR soon deployed ATM machines that could read the numerical amounts on checks, initially in Europe and then in the US. The ConvNet was running on the DSP32C card sitting in a PC inside the ATM. Later, NCR deployed a similar system in large check reading machines that banks use in their back offices. At some point in the late 90's these machines were processing 10 to 20% of all the checks in the US.
The network shown in this demo is described in our NIPS 1989 paper "Handwritten digit recognition with a back-propagation network". https://direct.mit.edu/neco/article-abstract/1/4/541/5515/Backpropagation-Applied-to-Handwritten-Zip-Code
The check reading system is described in our 1998 Proc. IEEE paper "Gradient-Based Learning Applied to Document Recognition" and in various shorter papers before that.
Thanks to Larry Jackel for digitizing and editing the old VHS tape (and for holding the camera). There are cameo appearances by Donnie Henderson (who put together much of the demo) and Rich Howard, our lab director.
Talk is here: youtu.be/DokLw1tILlw

Yann LeCun: A Path Towards Autonomous AI, Baidu 2022-02-22
Yann LeCun, 2022-02-25 | Technical talk by Yann LeCun: "A Path Towards Autonomous AI", hosted virtually by Baidu on 2022-02-22.
TL;DR:
- autonomous AI requires predictive world models
- world models must be able to perform multimodal predictions
- solution: Joint Embedding Predictive Architecture (JEPA)
- JEPA makes predictions in representation space, and can choose to ignore irrelevant or hard-to-predict details
- JEPA can be trained non-contrastively by (1) making the representations of the inputs maximally informative, (2) making the representations predictable from each other, (3) regularizing the latent variables necessary for prediction
- JEPAs can be stacked to make long-term/long-range predictions in more abstract representation spaces
- Hierarchical JEPAs can be used for hierarchical planning
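Ingredients (1) and (2) of this non-contrastive recipe correspond to the variance/covariance and invariance terms of the VICReg criterion (ingredient (3), latent-variable regularization, has no direct analogue here). A minimal NumPy sketch, with illustrative loss weights and epsilon rather than prescribed values:

```python
import numpy as np

def vicreg_loss(za, zb, sim_w=25.0, var_w=25.0, cov_w=1.0, eps=1e-4):
    """VICReg-style criterion on two batches of embeddings, shape (n, d)."""
    n, d = za.shape
    # (2) invariance: embeddings of the two views should match
    inv = np.mean((za - zb) ** 2)
    # (1a) variance: keep each dimension informative
    #      (hinge pushing the per-dimension std above 1)
    def variance(z):
        std = np.sqrt(z.var(axis=0) + eps)
        return np.mean(np.maximum(0.0, 1.0 - std))
    # (1b) covariance: decorrelate dimensions by penalizing
    #      off-diagonal entries of the covariance matrix
    def covariance(z):
        zc = z - z.mean(axis=0)
        c = (zc.T @ zc) / (n - 1)
        off = c - np.diag(np.diag(c))
        return np.sum(off ** 2) / d
    return (sim_w * inv
            + var_w * (variance(za) + variance(zb))
            + cov_w * (covariance(za) + covariance(zb)))

np.random.seed(0)
za = np.random.randn(32, 16)            # embeddings of view A
zb = za + 0.1 * np.random.randn(32, 16) # embeddings of view B
loss = vicreg_loss(za, zb)
```

Because all three terms depend only on the embedding statistics of a single batch, no negative pairs are needed, which is what makes the method non-contrastive.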
Topics:
- How to get machines to learn like humans and animals?
- Challenges in AI: self-supervised learning, reasoning, hierarchical planning
- Learning models of the world
- Architecture for autonomous AI: world model, cost, actor, perception, configurator, short-term memory
- Perception-action cycle: Mode-1 (reactive) and Mode-2 (planning)
- Intrinsic Cost and Trainable Cost modules
- Building and training a world model
- Self-supervised learning (SSL)
- Energy-Based Models: contrastive and regularized training methods
- EBM architectures: Joint Embedding Predictive Architecture (JEPA)
- Contrastive methods for training JEPA (bad)
- Regularized (non-contrastive) methods for training JEPA (good)
- VICReg: Variance-Invariance-Covariance Regularization
- Hierarchical JEPA for world models
- Hierarchical planning under uncertainty with hierarchical JEPA

Real-Time Object Recognition with Convolutional Net (2008)
Yann LeCun, 2022-01-11 | A demo from 2008 of a convolutional network performing object detection and recognition in real time on a laptop. This is a regular 2008 laptop (with no GPU) and a USB camera.
The ConvNet was trained on the NORB dataset, which has 5 categories (animal, human, car, truck, airplane) and 5 object instances per category (5 toy airplanes, etc.), painted with a uniform color. There are many images of each object under different viewpoints, lightings, and backgrounds. https://cs.nyu.edu/~ylclab/data/norb-v1.0/

Yann's 60th Birthday Video from His Padawans
Yann LeCun, 2021-07-09 | A video of current and former students and postdocs of Yann LeCun wishing him a happy 60th birthday on July 8th, 2020.
Put together by Aishwarya Kamath.

Face Detector Demo with ConvNet (NEC Labs 2003)
Yann LeCun, 2015-02-14 | Demo of the convolutional network face detector built at NEC Labs in 2003 by Rita Osadchy, Matt Miller, and Yann LeCun.
M. Osadchy, Y. LeCun and M. Miller: Synergistic Face Detection and Pose Estimation with Energy-Based Models, Journal of Machine Learning Research, 8:1197-1215, May 2007
http://yann.lecun.com/exdb/publis/index.html#osadchy-07

DrLIM: Learning an Embedding with Siamese Nets (2006)
Yann LeCun, 2014-06-21 | Raia Hadsell, Sumit Chopra and Yann LeCun: Dimensionality Reduction by Learning an Invariant Mapping, Proc. Computer Vision and Pattern Recognition Conference (CVPR'06), IEEE Press, 2006

Off-Road Robot Navigation with Convolutional Networks (LAGR Project, 2008)
Yann LeCun, 2014-06-21 | Raia Hadsell, Pierre Sermanet, Marco Scoffier, Ayse Erkan, Koray Kavukcuoglu, Urs Muller and Yann LeCun: Learning Long-Range Vision for Autonomous Off-Road Driving, Journal of Field Robotics, 26(2):120-144, February 2009

Semantic Segmentation (8 categories)
Yann LeCun, 2014-06-21 | Clement Farabet, Camille Couprie, Laurent Najman and Yann LeCun: Learning Hierarchical Features for Scene Labeling, IEEE Transactions on Pattern Analysis and Machine Intelligence, August 2013.

Semantic Segmentation (33 categories)
Yann LeCun, 2014-06-21 | Clement Farabet, Camille Couprie, Laurent Najman and Yann LeCun: Learning Hierarchical Features for Scene Labeling, IEEE Transactions on Pattern Analysis and Machine Intelligence, August 2013.

Pedestrian Detection with Convolutional Networks, part 1 (CVPR 2013)
Yann LeCun, 2014-06-21 | Demo of ConvNet-based pedestrian detection as described in the paper: "Pedestrian Detection with Unsupervised Multi-stage Feature Learning", Pierre Sermanet, Koray Kavukcuoglu, Soumith Chintala, Yann LeCun; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 3626-3633
openaccess.thecvf.com/content_cvpr_2013/html/Sermanet_Pedestrian_Detection_with_2013_CVPR_paper.html

Pedestrian Detection with Convolutional Networks, part 2 (CVPR 2013)
Yann LeCun, 2014-06-21 | Demo of ConvNet-based pedestrian detection as described in the paper: "Pedestrian Detection with Unsupervised Multi-stage Feature Learning", Pierre Sermanet, Koray Kavukcuoglu, Soumith Chintala, Yann LeCun; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013, pp. 3626-3633