Oct 27, 2003
-------------
- Perceptron = single neuron
  - threshold perceptron
  - sigmoid perceptron
- Learning rules for perceptrons (see the first sketch after these notes)
  - threshold perceptron rule
  - sigmoid perceptron rule
- Properties of learning rules
  - threshold rule converges only when the data are linearly separable
  - sigmoid rule converges regardless of separability
    (i.e., even for XOR, to an "in between" value)
- Derivation of sigmoid rule (worked out below)
  - formulation of error function
  - differentiation of E w.r.t. weights
  - need for learning rate
- Why do we need a learning rate?
  - to take small steps toward the goal, rather than big leaps
- General structure of learning algorithm (a runnable version is below)
  - while (not converged) do
      for each input do
        present input
        find error at output
        slightly nudge weights
      end for
    end while
- Learning in multi-layer networks: backpropagation
  - how do we assign blame to hidden nodes?
  - use the chain rule of differential calculus
    (one reason why you need differentiable units)
- Adapting perceptron equations for multi-layer networks
  - notion of "blame" (from the output layer)
  - gets apportioned into blame for the hidden layers
- Derivation of a simple rule for a small multi-layer network
  (sketched at the end of these notes)
  - not guaranteed to converge, though
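
A few sketches follow to make the outline concrete. All are illustrative
Python, with function names and the learning rate eta chosen for the
example rather than fixed by the notes. First, the two single-neuron
update rules:

    import math

    def threshold_update(w, x, t, eta=0.1):
        # threshold perceptron rule: o is the hard 0/1 output,
        # and each weight moves by eta * (t - o) * x_i
        o = 1.0 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0.0
        return [wi + eta * (t - o) * xi for wi, xi in zip(w, x)]

    def sigmoid_update(w, x, t, eta=0.1):
        # sigmoid perceptron rule: same shape, but with the extra
        # o * (1 - o) factor that falls out of differentiating the error
        o = 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
        return [wi + eta * (t - o) * o * (1.0 - o) * xi for wi, xi in zip(w, x)]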
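
The derivation of the sigmoid rule, worked out for a single unit with
squared error (the notation here is an assumption; the notes don't fix
one):

    E = \frac{1}{2}(t - o)^2, \quad o = \sigma(net), \quad
    net = \sum_i w_i x_i, \quad \sigma(z) = \frac{1}{1 + e^{-z}}

    \frac{\partial E}{\partial w_i}
      = \frac{\partial E}{\partial o}
        \frac{\partial o}{\partial net}
        \frac{\partial net}{\partial w_i}
      = -(t - o)\, o(1 - o)\, x_i

    \Delta w_i = -\eta \frac{\partial E}{\partial w_i}
               = \eta\, (t - o)\, o(1 - o)\, x_i

The learning rate \eta scales each step: keeping it small means many
small moves toward the goal rather than one big leap.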
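
A runnable version of the generic learning loop, training a single
sigmoid unit on AND (the dataset, learning rate, epoch cap, and
tolerance are all illustrative assumptions):

    import math

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    # AND truth table; the first input component is a constant 1,
    # so w[0] acts as the bias weight
    data = [([1, 0, 0], 0), ([1, 0, 1], 0), ([1, 1, 0], 0), ([1, 1, 1], 1)]
    w = [0.0, 0.0, 0.0]
    eta = 0.5

    for epoch in range(10000):                # while (not converged) do
        total_error = 0.0
        for x, t in data:                     #   for each input do
            o = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))  # present input
            err = t - o                       #     find error at output
            total_error += err * err
            for i in range(len(w)):           #     slightly nudge weights
                w[i] += eta * err * o * (1 - o) * x[i]
        if total_error < 0.05:                # crude convergence test
            break

    print(epoch, w)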
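
The "blame" story in equations, for one sigmoid output unit o fed by
hidden units h_j (again, notation assumed):

    \delta_o = (t - o)\, o(1 - o)

    \delta_{h_j} = \delta_o\, w_{oj}\, h_j(1 - h_j)

    \Delta w_{oj} = \eta\, \delta_o\, h_j, \qquad
    \Delta w_{ji} = \eta\, \delta_{h_j}\, x_i

where w_{oj} is the output unit's weight on hidden unit j. The chain
rule is what lets the output blame \delta_o be apportioned to each
hidden node, weighted by the connection carrying it; this is one reason
the units must be differentiable.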
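
Finally, a minimal backpropagation sketch for a small 2-2-1 sigmoid
network on XOR. The architecture, random seed, and hyperparameters are
assumptions, and, as noted above, convergence is not guaranteed:

    import math, random

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def forward(W_h, W_o, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(Wj, x))) for Wj in W_h]
        hb = [1.0] + h                 # prepend bias to hidden activations
        o = sigmoid(sum(w * hi for w, hi in zip(W_o, hb)))
        return h, hb, o

    random.seed(0)
    # two hidden units, each taking (bias, x1, x2); one output unit
    # taking (bias, h1, h2)
    W_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
    W_o = [random.uniform(-1, 1) for _ in range(3)]
    eta = 0.5
    data = [([1, 0, 0], 0), ([1, 0, 1], 1), ([1, 1, 0], 1), ([1, 1, 1], 0)]

    for epoch in range(20000):
        for x, t in data:
            h, hb, o = forward(W_h, W_o, x)
            delta_o = (t - o) * o * (1 - o)        # blame at the output
            delta_h = [delta_o * W_o[j + 1] * h[j] * (1 - h[j])
                       for j in range(2)]          # blame apportioned to hidden nodes
            for i in range(3):                     # nudge output weights
                W_o[i] += eta * delta_o * hb[i]
            for j in range(2):                     # nudge hidden weights
                for i in range(3):
                    W_h[j][i] += eta * delta_h[j] * x[i]

    for x, t in data:
        print(x[1:], t, round(forward(W_h, W_o, x)[2], 2))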