CS 4804 Homework #5
Date Assigned: November 5, 2007
Date Due: November 14, 2007, in class, before class starts
- (100 points)
Consider the "heart disease" dataset from the UCI machine learning
repository. Use only the Cleveland portion of this dataset (read the
".names" file) and only 14 of the 76 attributes (again, read the
".names" file to see which ones). Use the "processed"
version of the Cleveland dataset, not the raw version. Divide the 303 entries
in this dataset into training, test, and validation sets, taking care to ensure
that the distribution of classes is approximately the same across the new
datasets (note that there are five classes).
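
One way to keep the class distribution roughly equal across the three sets
is to split each class separately and then pool the pieces. The sketch below
is only an illustration: the function name stratified_split, the 60/20/20
fractions, and the toy label list are assumptions, not requirements of this
assignment, and the label counts are made up rather than taken from the
Cleveland data.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return (train, validation, test) lists of row indices.

    Each class is shuffled and split on its own, so every set receives
    about the same share of each class. The fractions are an assumed
    60/20/20 split; pick whatever proportions you can justify.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, validation, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train.extend(indices[:n_train])
        validation.extend(indices[n_train:n_train + n_val])
        test.extend(indices[n_train + n_val:])  # whatever remains
    return train, validation, test


# Toy labels for five classes (made-up counts, NOT the real dataset).
toy_labels = [0] * 50 + [1] * 20 + [2] * 10 + [3] * 10 + [4] * 10
train_idx, val_idx, test_idx = stratified_split(toy_labels)
```

With very small classes, rounding can leave a set with few or no examples of
a class, so it is worth printing the per-class counts of each set after
splitting.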
Design a neural network to learn to classify patients. For full credit,
explain your choice of the number of neurons in each layer, describe
how you encoded the data as inputs/outputs to your network, and include
plots of performance as the network is trained, along with the final
performance on the test set. Make qualitative observations about the
performance of your neural
network and compare it to information in the ".names" file.
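
As a hint on the encoding question, two common choices are to rescale
numeric attributes to the range [0, 1] and to represent categorical
attributes and the class label as one-of-N (one-hot) vectors. The sketch
below shows both under stated assumptions: the helper names min_max_scale
and one_hot and the sample values are hypothetical, and other encodings are
equally acceptable if you explain them.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly so min -> 0.0 and max -> 1.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def one_hot(value, categories):
    """Encode value as a vector with a single 1.0 at its category."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


# Hypothetical numeric attribute values (not taken from the dataset).
scaled = min_max_scale([29, 53, 77])

# Five classes give five output neurons, one per class.
target = one_hot(3, [0, 1, 2, 3, 4])
```

With a one-of-N output layer, the predicted class is the index of the output
neuron with the largest activation.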