CS 4804 Homework #6
Date Assigned: October 31, 2003
Date Due: November 10, 2003, in class, before class starts
- (20 points) Consider a perceptron that takes inputs (x1, x2, x3, x4)
and has weights (w1, w2, w3, w4) for these inputs respectively. In addition,
it has a "threshold"/constant input always set to 1 with weight "b".
Assume that the action of the perceptron is:
- output zero if (x1*w1 + x2*w2 + x3*w3 + x4*w4 < -b)
- output one if (x1*w1 + x2*w2 + x3*w3 + x4*w4 > b)
- output (1/(2b))*(x1*w1 + x2*w2 + x3*w3 + x4*w4 + b) otherwise
In other words, instead of the usual threshold or sigmoid nonlinearity,
we have a "ramp" function. Derive the learning rule for this
perceptron. Does the weight space for this perceptron have a single
minimum (global) or does it have multiple local minima?
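As a starting point, one plausible reading of the learning rule is gradient descent on E = (1/2)(t - o)^2: in the linear region the ramp has slope 1/(2b), so the delta rule becomes Δw_i = η(t - o)·x_i/(2b), and the update is zero in the flat regions. The sketch below (with b held fixed for simplicity; the update formula for b itself is left as part of the derivation) illustrates this, not a definitive answer:

```python
import numpy as np

def ramp(s, b):
    """Ramp nonlinearity from the problem: 0 below -b, 1 above b,
    linear in between."""
    if s < -b:
        return 0.0
    if s > b:
        return 1.0
    return (s + b) / (2.0 * b)

def ramp_deriv(s, b):
    """Derivative of the ramp w.r.t. the net input s:
    1/(2b) in the linear region, 0 in the flat regions."""
    return 1.0 / (2.0 * b) if -b <= s <= b else 0.0

def update(w, b, x, target, eta=0.1):
    """One gradient step on E = (1/2)(target - out)^2 for the
    input weights w (b is held fixed in this sketch)."""
    s = float(np.dot(w, x))
    out = ramp(s, b)
    delta = (target - out) * ramp_deriv(s, b)
    w = w + eta * delta * x   # delta rule: -dE/dw_i = (t - o) f'(s) x_i
    return w, out
```

Note that the error surface question hinges on the flat regions of the ramp, where the gradient vanishes.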
- (20 points) Consider the cascade network shown below. There
are three inputs to the network (x1, x2, and x3). Sigmoid unit number
1 receives all of these inputs as well as a threshold input (set to 1).
Sigmoid unit number 2 has four inputs: the output of sigmoid unit 1
as well as the three original inputs. All links are weighted
with adjustable weights. Derive an appropriate backpropagation-style
incremental weight adjustment procedure for this network based on
minimizing the sum-of-squared error between the network's output
and the labels of a set of training input vectors. Then apply this
algorithm on the data given in cascadedata
and report the learned weights (the first three columns in this file
are x1, x2, and x3
respectively; the last column is the target output). Present your answer as a closed-form expression
relating the output of the network to the inputs.
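The forward pass and one possible incremental update can be sketched as follows. Errors at the output unit reach unit 1 both directly (through the cascade link) and via its own input weights; the bias on unit 2 below is an assumption, since the problem statement only lists four inputs for unit 2:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w1, w2):
    """Forward pass of the cascade network.
    w1: 4 weights for unit 1 (x1..x3 plus threshold input 1).
    w2: 5 weights for unit 2 (x1..x3, h1, plus an assumed threshold)."""
    a1 = np.append(x, 1.0)                # unit 1 sees the inputs + threshold
    h1 = sigmoid(np.dot(w1, a1))
    a2 = np.concatenate([x, [h1, 1.0]])   # unit 2 sees inputs, h1, threshold
    out = sigmoid(np.dot(w2, a2))
    return h1, a1, a2, out

def backprop_step(x, t, w1, w2, eta=0.5):
    """One incremental update minimizing (1/2)(t - out)^2."""
    h1, a1, a2, out = forward(x, w1, w2)
    d2 = (t - out) * out * (1.0 - out)    # output unit's error signal
    d1 = d2 * w2[3] * h1 * (1.0 - h1)     # back through the cascade link (w2[3] weights h1)
    w2 = w2 + eta * d2 * a2
    w1 = w1 + eta * d1 * a1
    return w1, w2, out
```

Writing out `forward` symbolically gives the closed-form expression the problem asks for, with the trained weight values substituted in.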
- (60 points) In this problem, you will train a regular neural network on
the Iris
dataset (the only files relevant are iris.data and iris.names). First take
a look at iris.names and familiarize yourself with the format of the
data in iris.data. Notice that there are four input variables (which
are continuous) and one class variable (that can take on three possible
values).
- Use at most one hidden layer of nodes. You have to decide for yourself
how many nodes you will need to use.
- Pick either a distributed or a local encoding for the output layer,
and explain the reason for your choice.
- Code up a backpropagation algorithm in a language of your choice.
Split the given data into 2/3 training and 1/3 test. Make sure that the
class distribution in both the training and test sets matches the original
distribution (i.e., the class entropy is preserved).
- For each iteration of the backpropagation algorithm (i.e., one
sweep through all of the training data), compute the
sum-of-squared errors on both the training and test datasets, and track these
metrics.
For full credit, include an explanation of all design decisions you made,
a printout of your code, graphs displaying the performance, and a list of
observations.
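A minimal skeleton for the split and the per-epoch bookkeeping might look like the following. It is a sketch, not a full solution: it omits bias inputs and the actual loading of iris.data, and it assumes a local (one-of-three) target encoding purely for illustration; the network size, encoding, and learning rate are yours to choose and justify:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stratified_split(y, frac=2.0 / 3.0, seed=0):
    """Return train/test index arrays that keep each class's
    proportion (and hence the class entropy) the same in both."""
    rng = np.random.default_rng(seed)
    train, test = [], []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        rng.shuffle(idx)
        k = int(round(frac * len(idx)))
        train.extend(idx[:k])
        test.extend(idx[k:])
    return np.array(train), np.array(test)

def total_sse(W1, W2, X, T):
    """Sum-of-squared errors of a one-hidden-layer sigmoid net."""
    O = sigmoid(sigmoid(X @ W1) @ W2)
    return float(np.sum((T - O) ** 2))

def train(X, T, Xte, Tte, n_hidden=4, eta=0.1, epochs=50, seed=0):
    """Incremental backprop; records (train SSE, test SSE) each epoch."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1], n_hidden))
    W2 = rng.normal(0, 0.5, (n_hidden, T.shape[1]))
    history = []
    for _ in range(epochs):
        for x, t in zip(X, T):
            h = sigmoid(x @ W1)
            o = sigmoid(h @ W2)
            d_out = (t - o) * o * (1 - o)         # output-layer error signals
            d_hid = (W2 @ d_out) * h * (1 - h)    # backpropagated to hidden layer
            W2 += eta * np.outer(h, d_out)
            W1 += eta * np.outer(x, d_hid)
        history.append((total_sse(W1, W2, X, T),
                        total_sse(W1, W2, Xte, Tte)))
    return W1, W2, history
```

The `history` list gives exactly the two curves (training and test SSE per epoch) that the graphs should display.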