Due October 29

The purpose of these exercises is to help you review the material in Chapters 12-23 of Jain. You may work together on these problems, but each student should turn in a solution.

- Suppose we collect data from 32 VT CS students on the number of times per
semester they stay up all night working.
Here is the raw data:
13 18 16 14 19 17 23 24 20 19 18 23 15 19 19 18 25 16 20 17 25 23 15 19 16 21 15 21 21 16 15 17

- Construct a histogram of this data. Use your own judgement in choosing
the cell size so that the histogram gives a useful indication of
the apparent distribution of the data.
- Construct a normal quantile-quantile plot for the data. Does the
distribution appear to be normal?
- Compute the sample mean, variance and standard deviation.
- Compute a 95% confidence interval for the mean.
- What sample size would be needed to estimate the mean with an accuracy of 3% and a confidence level of 95%?
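
One way to organize the computations asked for above is sketched below, using only the Python standard library. This is a rough illustration, not a model solution. The cell width of 2 is a hypothetical choice. The confidence interval uses a normal quantile instead of a t quantile, on the assumption that n = 32 counts as a large sample. The sample-size step assumes that "accuracy of 3%" means a half-width of 3% of the sample mean. All variable names are invented for this example.

```python
import math
from statistics import NormalDist, mean, stdev, variance

# Raw data from the problem statement (all-nighters per semester, n = 32).
data = [13, 18, 16, 14, 19, 17, 23, 24, 20, 19, 18, 23, 15, 19, 19, 18,
        25, 16, 20, 17, 25, 23, 15, 19, 16, 21, 15, 21, 21, 16, 15, 17]
n = len(data)

# Text histogram. cell_width = 2 is a hypothetical choice; try others.
cell_width = 2
low = min(data)
cells = {}
for x in data:
    start = low + ((x - low) // cell_width) * cell_width
    cells[start] = cells.get(start, 0) + 1
for start in sorted(cells):
    print(f"{start:2d}-{start + cell_width - 1:2d} | {'*' * cells[start]}")

# Normal Q-Q pairs: the i-th smallest observation is paired with the
# standard normal quantile at q_i = (i - 0.5) / n. Plot these pairs and
# look for an approximately straight line.
std_normal = NormalDist()
qq_pairs = [(std_normal.inv_cdf((i - 0.5) / n), y)
            for i, y in enumerate(sorted(data), start=1)]

# Sample statistics (variance and stdev use the n - 1 divisor).
x_bar = mean(data)
s2 = variance(data)
s = stdev(data)

# 95% confidence interval for the mean.
# Assumption: normal quantile used because n >= 30.
z_975 = std_normal.inv_cdf(0.975)
half_width = z_975 * s / math.sqrt(n)
ci = (x_bar - half_width, x_bar + half_width)

# Sample size for +/- r percent accuracy at the same confidence level:
# n = (100 * z * s / (r * x_bar))^2, rounded up.
r = 3
n_required = math.ceil((100 * z_975 * s / (r * x_bar)) ** 2)

print(f"mean={x_bar:.3f} var={s2:.3f} sd={s:.3f}")
print(f"95% CI=({ci[0]:.3f}, {ci[1]:.3f}) required n={n_required}")
```

The Q-Q pairs are left as a list so they can be passed to whatever plotting tool you prefer.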

- Consider the following data set that records pain level of
CS 1044 GTAs as a function of lines of code (L.O.C.) in students' programs:

| L.O.C. | Pain |
|--------|------|
| 30 | 73 |
| 20 | 50 |
| 60 | 128 |
| 80 | 170 |
| 40 | 87 |
| 50 | 108 |
| 60 | 135 |
| 30 | 69 |
| 70 | 148 |
| 60 | 132 |

- Fit a simple linear regression model to this data, i.e., compute
the parameters $b_0$ and $b_1$ such that $\hat{y} = b_0 + b_1 x$
is the best-fit linear model to this data, in the least-squares sense. Report the coefficient of determination, $R^2$, for your model.

- Prepare two plots to help evaluate the goodness of your model:
- A scatter plot of the data and the model (e.g., Figure 14.2 in Jain).
- A plot of residuals vs. predicted response (e.g., Figure 14.7 in Jain).

- Compute a 90% confidence interval for the mean pain level (taken over many future observations) for a GTA grading a 50-line program.
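
The regression steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the official solution. The names `b0`, `b1`, `s_xx`, and `s_xy` are chosen for this example. The t quantile `T_95_8` (two-sided 90% level, 8 degrees of freedom) is hard-coded from a standard t table, because the standard library has no t distribution. The interval is for the mean response over many future observations, so the standard deviation of the prediction omits the extra 1/m term.

```python
import math

# Data from the problem statement: lines of code and GTA pain level.
loc = [30, 20, 60, 80, 40, 50, 60, 30, 70, 60]
pain = [73, 50, 128, 170, 87, 108, 135, 69, 148, 132]
n = len(loc)

x_bar = sum(loc) / n
y_bar = sum(pain) / n
s_xx = sum(x * x for x in loc) - n * x_bar ** 2
s_xy = sum(x * y for x, y in zip(loc, pain)) - n * x_bar * y_bar

# Least-squares estimates of slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Predicted responses and residuals (the inputs for the two plots).
predicted = [b0 + b1 * x for x in loc]
residuals = [y - p for y, p in zip(pain, predicted)]

# Coefficient of determination: R^2 = (SST - SSE) / SST.
sse = sum(e * e for e in residuals)
sst = sum((y - y_bar) ** 2 for y in pain)
r_squared = (sst - sse) / sst

# 90% confidence interval for the mean response at x_p = 50.
# Assumption: T_95_8 is t[0.95; n - 2 = 8] taken from a t table.
T_95_8 = 1.860
x_p = 50
s_e = math.sqrt(sse / (n - 2))
s_yp = s_e * math.sqrt(1 / n + (x_p - x_bar) ** 2 / s_xx)
y_p = b0 + b1 * x_p
ci = (y_p - T_95_8 * s_yp, y_p + T_95_8 * s_yp)

print(f"b0={b0:.3f} b1={b1:.3f} R^2={r_squared:.4f}")
print(f"90% CI at x={x_p}: ({ci[0]:.2f}, {ci[1]:.2f})")
```

Plotting `pain` against `loc` together with the fitted line gives the scatter plot, and plotting `residuals` against `predicted` gives the residual plot.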

- In an effort to predict the productivity of CS Department faculty
members, the data shown here
was collected:

| Factor A -- Salary | Factor B -- Ofc. Space: Small | Factor B -- Ofc. Space: Big |
|--------------------|-------------------------------|-----------------------------|
| Low | (52,47,44,53) | (69,63,70,70) |
| High | (70,77,71,68) | (69,74,76,82) |

It records *productivity* (on a mysterious 100 point scale) for 16 different faculty members, grouped into four groups depending on their salary level (low or high) and their office size (small or big).

- Analyze this data by computing a simple model
(as in Chapter 18 of Jain). Compute the effects and the allocation
of variation (as in Example 18.3).
- Compute 90% confidence intervals for the four effects in this model.
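
The analysis above can be sketched as a two-factor, two-level design with four replications per cell. This is a minimal illustration, not the official solution. It assumes the coding -1 for Low salary and Small office and +1 for High salary and Big office. It also assumes the t quantile `T_95_12` (two-sided 90% level, 12 degrees of freedom) is read from a standard t table. All variable names are invented for this example.

```python
import math

# (salary level, office size) -> replicated productivity observations.
# Assumed coding: -1 = Low / Small, +1 = High / Big.
cells = {
    (-1, -1): [52, 47, 44, 53],
    (-1, +1): [69, 63, 70, 70],
    (+1, -1): [70, 77, 71, 68],
    (+1, +1): [69, 74, 76, 82],
}
r = 4  # replications per cell
means = {k: sum(v) / r for k, v in cells.items()}

# Effects from the sign table applied to the cell means.
q0 = sum(means.values()) / 4
qa = sum(a * m for (a, b), m in means.items()) / 4
qb = sum(b * m for (a, b), m in means.items()) / 4
qab = sum(a * b * m for (a, b), m in means.items()) / 4

# Allocation of variation: SS for each effect is 2^2 * r * q^2,
# and SSE is the sum of squared deviations from the cell means.
ss = {"A": 4 * r * qa ** 2, "B": 4 * r * qb ** 2, "AB": 4 * r * qab ** 2}
ss["error"] = sum((y - means[k]) ** 2 for k, ys in cells.items() for y in ys)
sst = sum(ss.values())
percent = {name: 100 * value / sst for name, value in ss.items()}

# 90% confidence intervals for the effects.
# Assumption: T_95_12 is t[0.95; 2^2 * (r - 1) = 12] from a t table.
T_95_12 = 1.782
s_e = math.sqrt(ss["error"] / (4 * (r - 1)))
s_q = s_e / math.sqrt(4 * r)
effects = {"q0": q0, "qA": qa, "qB": qb, "qAB": qab}
intervals = {name: (q - T_95_12 * s_q, q + T_95_12 * s_q)
             for name, q in effects.items()}

for name, q in effects.items():
    lo, hi = intervals[name]
    print(f"{name}={q:.4f}  90% CI=({lo:.3f}, {hi:.3f})")
for name, p in percent.items():
    print(f"variation explained by {name}: {p:.1f}%")
```

An effect whose interval includes zero would not be judged statistically significant at this confidence level.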
