CS 547 / ECE 547: Neural Networks (Fall 2000)
Text, Lectures and such
Instructor: Professor Barak A. Pearlmutter
Office: MechE 436.
Office hours: I'm always in. Officially: W 2-3.
Lectures: MW 5:30-6:45, TAPY 217
Text: Neural Networks for Pattern Recognition by
Christopher M. Bishop, Oxford University Press, ISBN 0-19-853864-2.
Syllabus and Notes
 Mon Aug 21

Three paths to Neural Networks:
 biology / neuroscience
 psychology / cognitive science
 engineering / statistics
Neuroscience:
 Real neurons are very slow but the brain is fast.
 Other interesting properties of the nervous system.
 Neurons come in many shapes but share many common features.
 Wed Aug 23

Cognitive Science:
 reaction against AI's failed "symbol system hypothesis"
 out of cartoon models emerge appropriate properties
 bistable Necker cube perception
 verb learning (Rumelhart & McClelland)
 relearning after damage: time course & reacquisition of
unrelated material
 NETtalk
Engineering / Statistics
 probabilistic formulations
 generalization
 spinoffs: EM, working speech recognition, OCR
 applications
 too many to list
 credit card fraud
 network routing
 backgammon (TD-Gammon)
 cold-rolled steel mill control
 implantable heart defibrillator control
 tokamak magnetic confinement plasma fusion reactor
 remainder of course
 Mon Aug 28
 Notes, thanks to Ben Jones.
 Kinds of learning: supervised, unsupervised, reinforcement.
 Widrow-Hoff online LMS.
 first application of neural networks (radar beam forming)
 weight space w, error surface E(w).
 deterministic (batch) vs. online / stochastic gradient descent
 forward reference: analysis of convergence
 Perceptrons and the perceptron learning rule.
 linear separability
 convergence proof and conditions
 implementation of threshold as bias input
 solution via linear programming
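The perceptron learning rule above can be sketched in a few lines, including the threshold-as-bias-input trick; a minimal illustration, with function name and AND-gate data chosen here for the example rather than taken from the lecture:

```python
def perceptron_train(data, eta=1.0, max_epochs=100):
    """Perceptron learning rule; targets in {-1, +1}.  The threshold is
    implemented as a negative bias via a constant +1 input."""
    n = len(data[0][0]) + 1                 # +1 for the bias input
    w = [0.0] * n
    for _ in range(max_epochs):
        mistakes = 0
        for x, t in data:
            xb = list(x) + [1.0]            # append the constant bias input
            y = 1 if sum(wi * xi for wi, xi in zip(w, xb)) > 0 else -1
            if y != t:                      # update only on mistakes
                w = [wi + eta * t * xi for wi, xi in zip(w, xb)]
                mistakes += 1
        if mistakes == 0:                   # every point classified: done
            break
    return w

# AND function with -1/+1 targets: linearly separable, so the
# convergence theorem guarantees the mistake-driven updates terminate.
and_data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w = perceptron_train(and_data)
```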
 Wed Aug 30

 Comparison of LMS and Perceptron Learning Rule
 both adjust weights by adding global quantity times input
 both can take big steps if zero error point exists
 only LMS converges if no zero error point exists (needs small steps)
 both can use the "replace threshold by negative bias" trick
 both compute weighted sum y
 both use difference between output and target to scale weight change
 Q: when will linear decision surface be optimal?
 A: requires probabilistic framework
 Bayes Rule
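The question above (when is a linear decision surface optimal?) is answered in the probabilistic framework via Bayes' rule, which in standard class-posterior notation (symbols chosen here, not taken from the notes) reads:

```latex
P(C_k \mid \mathbf{x}) \;=\; \frac{p(\mathbf{x} \mid C_k)\,P(C_k)}{p(\mathbf{x})},
\qquad
p(\mathbf{x}) \;=\; \sum_j p(\mathbf{x} \mid C_j)\,P(C_j)
```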
 Wed Sep 6

 Bayes Rule continued
 Gaussian density function
 Quadratic forms
 Decision surface for equivariant Gaussian class densities
 On programming style
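A sketch of the equivariant-Gaussian decision surface result, in standard notation (symbols are the conventional ones, not necessarily the notes'): when both class-conditional densities share a covariance matrix, the quadratic terms cancel and the optimal boundary is linear.

```latex
p(\mathbf{x} \mid C_k)
  = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}
    \exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_k)^\top
                \Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu}_k)\right)
```

Setting \(P(C_1 \mid \mathbf{x}) = P(C_2 \mid \mathbf{x})\) and taking logs, the \(\mathbf{x}^\top \Sigma^{-1} \mathbf{x}\) terms cancel, leaving the linear surface

```latex
\mathbf{w}^\top \mathbf{x} + w_0 = 0,
\qquad
\mathbf{w} = \Sigma^{-1}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2),
\qquad
w_0 = -\tfrac{1}{2}\boldsymbol{\mu}_1^\top \Sigma^{-1} \boldsymbol{\mu}_1
      + \tfrac{1}{2}\boldsymbol{\mu}_2^\top \Sigma^{-1} \boldsymbol{\mu}_2
      + \ln\frac{P(C_1)}{P(C_2)}
```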

Many assignments will use the MNIST Database
of Handwritten Digits, a version of the NIST digits database
massaged by Yann
LeCun. (A copy of the dataset will be available on the CS
machines, and maybe another on the CIRT machines.)
 Assignment 1 (due Fri Sep 22)
 Mon Sep 11
 Intro to generalization (guest lecturer: Fernando Lozano)
 Wed Sep 13
 Generalization continued (guest lecturer: Fernando Lozano)
 Mon Sep 18
 Decision surface for equivariant Gaussians (continued).
 Wed Sep 20
 Posterior probability for equivariant Gaussians: sigmoid.
Gradient descent for single-layer sigmoidal network.
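A minimal sketch of gradient descent for a single-layer sigmoidal network, here using the cross-entropy error (for which the gradient reduces to the simple form (y - t)x); the function names and OR-gate data are illustrative choices, not from the lecture:

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train_logistic(data, eta=0.5, epochs=2000):
    """Batch gradient descent for a single sigmoid unit y = sigmoid(w.x + b)
    with cross-entropy error, whose gradient is simply (y - t) * x."""
    n = len(data[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * n, 0.0
        for x, t in data:                   # accumulate batch gradient
            y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            gw = [gi + (y - t) * xi for gi, xi in zip(gw, x)]
            gb += y - t
        w = [wi - eta * gi for wi, gi in zip(w, gw)]
        b -= eta * gb
    return w, b

# OR function with 0/1 targets: linearly separable.
or_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b = train_logistic(or_data)
```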
 Mon Sep 25
 The method of backward propagation of errors.
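The method can be sketched for one hidden layer of sigmoid units feeding a linear output with squared error; all names, hyperparameters, and the XOR data below are illustrative assumptions, not the course's code:

```python
import math, random

def forward(x, W1, b1, W2, b2):
    """Forward pass: sigmoid hidden layer, linear output."""
    h = [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
         for row, b in zip(W1, b1)]
    y = sum(w * hi for w, hi in zip(W2, h)) + b2
    return h, y

def train(data, n_hidden=3, eta=0.2, epochs=500, seed=0):
    """Stochastic gradient descent, propagating the error backward
    through the layers, for E = 0.5 * (y - t)^2."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, t in data:
            h, y = forward(x, W1, b1, W2, b2)
            dy = y - t                                 # dE/dy at the output
            dh = [dy * w * hi * (1.0 - hi)             # chain rule through
                  for w, hi in zip(W2, h)]             # sigmoid': h(1 - h)
            W2 = [w - eta * dy * hi for w, hi in zip(W2, h)]
            b2 -= eta * dy
            W1 = [[w - eta * dj * xi for w, xi in zip(row, x)]
                  for row, dj in zip(W1, dh)]
            b1 = [b - eta * dj for b, dj in zip(b1, dh)]
    return W1, b1, W2, b2

xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
```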
 Wed Sep 27
 What Entropy means to me.
 Mon Oct 2

 Maximum likelihood density estimation,
 squared error,
 estimation of Gaussians from data,
 mixture models, and
 the EM algorithm.
 Wed Oct 4

 EM (continued)
 competitive learning and biological plausibility
 Assignment: implement batch EM of means for a mixture of
one-dimensional equivariant Gaussians.
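A minimal batch-EM sketch in the spirit of the assignment, under the stated assumptions (shared fixed variance, equal mixing priors, only the means updated); function and variable names are illustrative, and this is a starting point rather than a reference solution:

```python
import math

def em_means(data, means, sigma=1.0, iters=50):
    """Batch EM updating only the means of a mixture of one-dimensional
    Gaussians with shared fixed variance sigma^2 and equal priors."""
    means = list(means)
    for _ in range(iters):
        # E step: responsibility of each component k for each point x
        resp = []
        for x in data:
            lik = [math.exp(-(x - m) ** 2 / (2.0 * sigma ** 2)) for m in means]
            z = sum(lik)
            resp.append([l / z for l in lik])
        # M step: each mean becomes the responsibility-weighted data average
        means = [sum(r[k] * x for r, x in zip(resp, data)) /
                 sum(r[k] for r in resp)
                 for k in range(len(means))]
    return means
```

With two well-separated clusters the responsibilities go essentially hard, and each mean converges to its cluster's average.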
 Mon Oct 16

Asymptotic analysis of deterministic simple gradient descent
 Wed Oct 18

Deterministic gradient descent (continued)
Assignment 3
 Mon Oct 23
Optimization.
Approximate line search using the Hessian-vector product via R{backprop}.
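R{backprop} delivers the Hessian-vector product Hv exactly, at roughly the cost of one extra gradient evaluation and without ever forming H. The sketch below illustrates the quantity itself on a toy quadratic error via the finite-difference approximation Hv = (grad E(w + eps v) - grad E(w)) / eps + O(eps); the toy error and function names are assumptions for the example:

```python
def grad_E(w):
    """Gradient of a toy quadratic error
    E(w) = 1.5*w0^2 + w0*w1 + 2*w1^2, so the Hessian is [[3, 1], [1, 4]]."""
    return [3.0 * w[0] + w[1], w[0] + 4.0 * w[1]]

def hessian_vector(w, v, eps=1e-6):
    """Hv via a finite difference of the gradient; R{backprop} computes
    the same quantity exactly with one extra backward pass."""
    g0 = grad_E(w)
    g1 = grad_E([wi + eps * vi for wi, vi in zip(w, v)])
    return [(a - b) / eps for a, b in zip(g1, g0)]
```

For v = (1, 0) this picks out the first column of the Hessian, (3, 1), which is what an approximate line search would use to estimate curvature along a search direction.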
 Wed Oct 25

Convolutional networks: local group invariances, weight sharing.
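The weight sharing at the heart of convolutional networks can be sketched in one function: every output unit reuses the same small kernel and bias, so shifting the input shifts the feature map, which is the source of the local invariances discussed in lecture. The function name is an illustrative choice:

```python
def conv1d_valid(x, w, b=0.0):
    """'Valid' 1-D convolutional layer (pre-nonlinearity): one shared
    kernel w and bias b are slid across the input, so the layer has
    len(w) + 1 parameters no matter how long the input is."""
    k = len(w)
    return [sum(w[i] * x[j + i] for i in range(k)) + b
            for j in range(len(x) - k + 1)]
```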
 Mon Oct 30

Tangent distance.
Tangent prop.
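A one-sided, single-tangent-vector sketch of tangent distance: instead of the plain Euclidean distance between x and y, minimize over small transformed versions x + alpha*t of one pattern, where t is a tangent vector of the transformation. With a single tangent this has a closed form (projection onto the tangent line); names are illustrative:

```python
def tangent_distance_1(x, y, t):
    """One-sided tangent distance with a single tangent vector t:
    min over alpha of ||x + alpha*t - y||, i.e. the distance from y to
    the line through x spanned by t."""
    diff = [yi - xi for xi, yi in zip(x, y)]
    alpha = (sum(ti * di for ti, di in zip(t, diff)) /
             sum(ti * ti for ti in t))                  # projection coefficient
    resid = [di - alpha * ti for ti, di in zip(t, diff)]
    return sum(r * r for r in resid) ** 0.5
```

For x = (0,0), y = (1,1) and horizontal tangent t = (1,0), the tangent distance is 1 while the Euclidean distance is sqrt(2): moving along the allowed transformation costs nothing.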
 Wed Nov 15

The ADOL-C library for automatic differentiation ("backpropagification").
 Mon Nov 20

Boltzmann Machines
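The stochastic dynamics of a Boltzmann machine can be sketched as Gibbs sampling: a randomly chosen unit turns on with probability sigmoid of its net input over the temperature, with symmetric, zero-diagonal weights. A minimal illustration with assumed names, not the course's code:

```python
import math, random

def gibbs_sample(W, s, steps, rng, T=1.0):
    """Sequential Gibbs sampling for a Boltzmann machine with 0/1 units:
    a randomly chosen unit i turns on with probability
    sigmoid((sum_j W[i][j] * s[j]) / T).
    W must be symmetric with zero diagonal."""
    s = list(s)
    for _ in range(steps):
        i = rng.randrange(len(s))
        net = sum(W[i][j] * s[j] for j in range(len(s)))
        s[i] = 1 if rng.random() < 1.0 / (1.0 + math.exp(-net / T)) else 0
    return s
```

With a strong excitatory weight between two units, the low-energy configuration (both on) dominates the equilibrium distribution at T = 1.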
 Wed Nov 22

Relearning in a backprop net: Assignment 4.
Barak Pearlmutter
<bap@cs.unm.edu>