Oct 27, 2003
-------------
- Perceptron = single neuron
  - threshold perceptron
  - sigmoid perceptron
- Learning rules for perceptrons (see the first sketch after these notes)
  - threshold perceptron rule
  - sigmoid perceptron rule
- Properties of learning rules
  - threshold rule converges only when the data are linearly separable
  - sigmoid rule converges regardless of separability
    (i.e., even for XOR, to an "in between" value)
- Derivation of sigmoid rule (worked out below)
  - formulation of error function
  - differentiation of E w.r.t. weights
  - need for learning rate
- Why do we need a learning rate?
  - to take small steps toward the goal, rather than big leaps
- General structure of learning algorithm (a runnable version is below)
  - while (not converged) do
      for each input do
        present input
        find error at output
        slightly nudge weights
      end for
    end while
- Learning in multi-layer networks: backpropagation
  - how do we assign blame to hidden nodes?
  - use the chain rule of differential calculus
    (one reason why you need differentiable units)
- Adapting perceptron equations for multi-layer networks
  - notion of "blame" (from the output layer)
  - gets apportioned into blame for the hidden layers
- Derivation of a simple rule for a small multi-layer network
  (sketched at the end of these notes)
  - not guaranteed to converge, though
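
A few sketches follow to make the outline concrete. All are illustrative
Python, with function names and the learning rate eta chosen for the
example rather than fixed by the notes. First, the two single-neuron
update rules:

    import math

    def threshold_update(w, x, t, eta=0.1):
        # threshold perceptron rule: o is the hard 0/1 output,
        # and each weight moves by eta * (t - o) * x_i
        o = 1.0 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0.0
        return [wi + eta * (t - o) * xi for wi, xi in zip(w, x)]

    def sigmoid_update(w, x, t, eta=0.1):
        # sigmoid perceptron rule: same shape, but with the extra
        # o * (1 - o) factor that falls out of differentiating the error
        o = 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
        return [wi + eta * (t - o) * o * (1.0 - o) * xi for wi, xi in zip(w, x)]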
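
The derivation of the sigmoid rule, worked out for a single unit with
squared error (the notation here is an assumption; the notes don't fix
one):

    E = \frac{1}{2}(t - o)^2, \quad o = \sigma(net), \quad
    net = \sum_i w_i x_i, \quad \sigma(z) = \frac{1}{1 + e^{-z}}

    \frac{\partial E}{\partial w_i}
      = \frac{\partial E}{\partial o}
        \frac{\partial o}{\partial net}
        \frac{\partial net}{\partial w_i}
      = -(t - o)\, o(1 - o)\, x_i

    \Delta w_i = -\eta \frac{\partial E}{\partial w_i}
               = \eta\, (t - o)\, o(1 - o)\, x_i

The learning rate \eta scales each step: keeping it small means many
small moves toward the goal rather than one big leap.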
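
A runnable version of the generic learning loop, training a single
sigmoid unit on AND (the dataset, learning rate, epoch cap, and
tolerance are all illustrative assumptions):

    import math

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    # AND truth table; the first input component is a constant 1,
    # so w[0] acts as the bias weight
    data = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]
    w = [0.0, 0.0, 0.0]
    eta = 0.5

    for epoch in range(10000):                # while (not converged) do
        total_error = 0.0
        for x, t in data:                     #   for each input do
            o = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))  # present input
            err = t - o                       #     find error at output
            total_error += err * err
            for i in range(len(w)):           #     slightly nudge weights
                w[i] += eta * err * o * (1 - o) * x[i]
        if total_error < 0.05:                # crude convergence test
            break

    print(epoch, w)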
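
The "blame" story in equations, for one sigmoid output unit o fed by
hidden units h_j (again, notation assumed):

    \delta_o = (t - o)\, o(1 - o)

    \delta_{h_j} = \delta_o\, w_{oj}\, h_j(1 - h_j)

    \Delta w_{oj} = \eta\, \delta_o\, h_j, \qquad
    \Delta w_{ji} = \eta\, \delta_{h_j}\, x_i

where w_{oj} is the output unit's weight on hidden unit j. The chain
rule is what lets the output blame \delta_o be apportioned to each
hidden node, weighted by the connection carrying it; this is one reason
the units must be differentiable.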
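
Finally, a minimal backpropagation sketch for a small 2-2-1 sigmoid
network on XOR. The architecture, random seed, and hyperparameters are
assumptions, and, as noted above, convergence is not guaranteed:

    import math, random

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def forward(W_h, W_o, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(Wj, x))) for Wj in W_h]
        hb = [1.0] + h                 # prepend bias to hidden activations
        o = sigmoid(sum(w * hi for w, hi in zip(W_o, hb)))
        return h, hb, o

    random.seed(0)
    # two hidden units, each taking (bias, x1, x2); one output unit
    # taking (bias, h1, h2)
    W_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
    W_o = [random.uniform(-1, 1) for _ in range(3)]
    eta = 0.5
    data = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 0)]

    for epoch in range(20000):
        for x, t in data:
            h, hb, o = forward(W_h, W_o, x)
            delta_o = (t - o) * o * (1 - o)        # blame at the output
            delta_h = [delta_o * W_o[j + 1] * h[j] * (1 - h[j])
                       for j in range(2)]          # blame apportioned to hidden nodes
            for i in range(3):                     # nudge output weights
                W_o[i] += eta * delta_o * hb[i]
            for j in range(2):                     # nudge hidden weights
                for i in range(3):
                    W_h[j][i] += eta * delta_h[j] * x[i]

    for x, t in data:
        print(x[1:], t, round(forward(W_h, W_o, x)[2], 2))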