CS 5014 Homework 1
Due October 29
The purpose of these exercises is to help you review the material
in Chapters 12-23 of Jain. You may work together on these problems, but
each student should turn in their own solution.
- Suppose we collect data from 32 VT CS students on the number of times per
semester they stay up all night working.
Here is the raw data:
13 18 16 14 19 17 23 24
20 19 18 23 15 19 19 18
25 16 20 17 25 23 15 19
16 21 15 21 21 16 15 17
- Construct a histogram of this data. Use your own judgement in choosing
the cell size so that the histogram gives a useful indication of
the apparent distribution of the data.
- Construct a normal quantile-quantile plot for the data. Does the
distribution appear to be normal?
- Compute the sample mean, variance and standard deviation.
- Compute a 95% confidence interval for the mean.
- What sample size would be needed to estimate the mean with an
accuracy of 3% and a confidence level of 95%?
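The computations in this problem can be sketched in a few lines of Python using only the standard library. This is a minimal sketch, not a model solution. It makes three assumptions: the cell width of 2 is an arbitrary choice, the confidence interval uses a normal quantile because the sample has more than 30 observations, and "accuracy of 3%" is read as a half-width of 3% of the sample mean. The names `cell_width`, `qq_pairs`, and `n_needed` are illustrative.

```python
import math
from collections import Counter
from statistics import NormalDist, mean, stdev, variance

data = [13, 18, 16, 14, 19, 17, 23, 24,
        20, 19, 18, 23, 15, 19, 19, 18,
        25, 16, 20, 17, 25, 23, 15, 19,
        16, 21, 15, 21, 21, 16, 15, 17]
n = len(data)

# Text histogram. ASSUMPTION: a cell width of 2 is only one reasonable choice.
cell_width = 2
low = min(data)
cells = Counter((x - low) // cell_width for x in data)
for k in sorted(cells):
    start = low + k * cell_width
    print(f"{start:2d}-{start + cell_width - 1:2d} | {'*' * cells[k]}")

# Normal Q-Q pairs: the i-th smallest observation is paired with the
# standard normal quantile at q_i = (i - 0.5) / n. Plot x against z;
# a roughly straight line suggests normality.
std_normal = NormalDist()
qq_pairs = [(std_normal.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(sorted(data), start=1)]

# Sample statistics (variance and stdev use the n - 1 divisor).
x_bar = mean(data)
s2 = variance(data)
s = stdev(data)

# 95% two-sided confidence interval for the mean.
# ASSUMPTION: n > 30, so a normal quantile is used instead of a t quantile.
z = std_normal.inv_cdf(0.975)
half_width = z * s / math.sqrt(n)
ci_95 = (x_bar - half_width, x_bar + half_width)

# Sample size for +/- r percent accuracy: n = (100 * z * s / (r * x_bar))^2.
r_percent = 3
n_needed = math.ceil((100 * z * s / (r_percent * x_bar)) ** 2)

print(f"mean={x_bar:.3f} variance={s2:.3f} stdev={s:.3f}")
print(f"95% CI=({ci_95[0]:.2f}, {ci_95[1]:.2f}) n_needed={n_needed}")
```

The pairs in `qq_pairs` can be passed to any plotting tool, with the normal quantile on the horizontal axis and the observation on the vertical axis.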
- Consider the following data set that records the pain level of
CS 1044 GTAs as a function of lines-of-code in students' programs:
|30 || 73|
|20 || 50|
|40 || 87|
|30 || 69|
- Fit a simple linear regression model to this data, i.e., compute
$b_0$ and $b_1$ such that
$\hat{y} = b_0 + b_1 x$
is the best-fit linear model to this data, in the least-squares sense.
Report the coefficient of determination, $R^2$, for your model.
- Prepare two plots to help evaluate the goodness of your model:
- A scatter plot of the data and the model (e.g., Figure 14.2 in Jain).
- A plot of residuals vs. predicted response (e.g., Figure 14.7 in Jain).
- Compute a 90% confidence interval for the mean pain level (taken over
many future observations) for a GTA grading a 50-line program.
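The regression steps in this problem can be sketched as follows. This is a minimal sketch under a loudly stated assumption: the table does not label its columns, so the code takes the first column as lines of code (x) and the second as pain level (y). Swap the two lists if the intended reading is the reverse. The helper names `fit_line` and `mean_response_ci` are hypothetical, and the t quantile is taken from a table (t[0.95; 2] = 2.920) rather than computed.

```python
import math

# ASSUMPTION: column 1 is lines of code (x) and column 2 is pain level (y).
loc = [30, 20, 40, 30]
pain = [73, 50, 87, 69]


def fit_line(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns a dict of summary values."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    predicted = [b0 + b1 * xi for xi in x]
    residuals = [yi - pi for yi, pi in zip(y, predicted)]
    sse = sum(e ** 2 for e in residuals)       # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)   # total variation
    return {
        "n": n, "x_bar": x_bar, "sxx": sxx, "b0": b0, "b1": b1,
        "r2": (sst - sse) / sst,               # coefficient of determination
        "s_e": math.sqrt(sse / (n - 2)),       # std. deviation of errors
        "predicted": predicted, "residuals": residuals,
    }


def mean_response_ci(fit, x_p, t_value):
    """CI for the mean response at x_p, averaged over many future observations.

    With many future observations the 1/m term drops out of the standard
    deviation of the predicted mean, leaving 1/n + (x_p - x_bar)^2 / Sxx.
    """
    y_p = fit["b0"] + fit["b1"] * x_p
    s_y = fit["s_e"] * math.sqrt(
        1 / fit["n"] + (x_p - fit["x_bar"]) ** 2 / fit["sxx"])
    return y_p - t_value * s_y, y_p + t_value * s_y


T_95_DF2 = 2.920  # t[0.95; n - 2 = 2] from a t table, for a 90% two-sided CI
fit = fit_line(loc, pain)
ci_90 = mean_response_ci(fit, x_p=50, t_value=T_95_DF2)

print(f"b0={fit['b0']:.3f} b1={fit['b1']:.3f} R^2={fit['r2']:.4f}")
print(f"90% CI for mean pain at 50 lines: ({ci_90[0]:.2f}, {ci_90[1]:.2f})")
```

The lists `fit["predicted"]` and `fit["residuals"]` give the points for the residual plot, and `loc`, `pain`, and the fitted line give the scatter plot.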
- In an effort to predict the productivity of CS Department faculty
members, the data shown here
[Table: Factor A -- Salary by Factor B -- Ofc. Space]
records productivity (on a mysterious 100 point scale) for 16
different faculty members, grouped into four groups
depending on their salary level (low or high) and their
office size (small or big).
- Analyze this data by computing a simple model
(as in Chapter 18 of Jain). Compute the effects and the allocation
of variation (as in Example 18.3).
- Compute 90% confidence intervals for the four effects in this model.
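The effects, allocation of variation, and confidence intervals for a two-factor, two-level design with replications can be sketched as below. This is a minimal sketch: the helper name `analyze_2x2` is hypothetical, the numbers in `placeholder` are illustrative values with three replications per group and are not the homework data (which has four per group), and the t quantile is read from a table and passed in as a parameter.

```python
import math


def analyze_2x2(y, t_value):
    """Analyze a 2x2 design with r replications per cell.

    y maps (a, b) level codes, each -1 (low) or +1 (high), to a list of
    r observations. t_value is the t quantile with 4 * (r - 1) degrees
    of freedom for the desired confidence level.
    """
    r = len(next(iter(y.values())))
    means = {k: sum(v) / r for k, v in y.items()}

    # Effects from the sign table, applied to the cell means.
    effects = {
        "0": sum(means.values()) / 4,
        "A": sum(a * m for (a, b), m in means.items()) / 4,
        "B": sum(b * m for (a, b), m in means.items()) / 4,
        "AB": sum(a * b * m for (a, b), m in means.items()) / 4,
    }

    # Allocation of variation: SST = SSA + SSB + SSAB + SSE.
    sse = sum((obs - means[k]) ** 2 for k, v in y.items() for obs in v)
    ss = {name: 4 * r * effects[name] ** 2 for name in ("A", "B", "AB")}
    ss["E"] = sse
    sst = sum(ss.values())
    percent = {name: 100 * value / sst for name, value in ss.items()}

    # Every effect has the same standard deviation s_q = s_e / sqrt(4 r).
    s_e = math.sqrt(sse / (4 * (r - 1)))
    s_q = s_e / math.sqrt(4 * r)
    ci = {name: (q - t_value * s_q, q + t_value * s_q)
          for name, q in effects.items()}
    return effects, ss, sst, percent, s_q, ci


# PLACEHOLDER values for illustration only, NOT the homework data.
placeholder = {
    (-1, -1): [15, 18, 12],
    (+1, -1): [45, 48, 51],
    (-1, +1): [25, 28, 19],
    (+1, +1): [75, 75, 81],
}
T_95_DF8 = 1.860  # t[0.95; 4 * (3 - 1) = 8] from a t table, 90% two-sided
effects, ss, sst, percent, s_q, ci = analyze_2x2(placeholder, T_95_DF8)
print(effects)
print({name: round(p, 1) for name, p in percent.items()})
```

For the homework data, each group has four observations, so the error degrees of freedom are 4 * (4 - 1) = 12 and the matching table value is t[0.95; 12] = 1.782.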