Consider the following data from an experiment, where some quantity y was measured at intervals of one second, starting at time x = 1.0.
      x        y
     1.0     5.0291
     2.0     6.5099
     3.0     5.3666
     4.0     4.1272
     5.0     4.2948
     6.0     6.1261
     7.0    12.5140
     8.0    10.0502
     9.0     9.1614
    10.0     7.5677
    11.0     7.2920
    12.0    10.0357
    13.0    11.0708
    14.0    13.4045
    15.0    12.8415
    16.0    11.9666
    17.0    11.0765
    18.0    11.7774
    19.0    14.5701
    20.0    17.0440
    21.0    17.0398
    22.0    15.9069
    23.0    15.4850
    24.0    15.5112
    25.0    17.6572
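For the computations that follow, it is convenient to have the data in arrays. A minimal sketch, assuming NumPy is available (any array library would do):

    import numpy as np

    # The measured data from the table above.
    x = np.arange(1.0, 26.0)        # times 1.0, 2.0, ..., 25.0
    y = np.array([ 5.0291,  6.5099,  5.3666,  4.1272,  4.2948,
                   6.1261, 12.5140, 10.0502,  9.1614,  7.5677,
                   7.2920, 10.0357, 11.0708, 13.4045, 12.8415,
                  11.9666, 11.0765, 11.7774, 14.5701, 17.0440,
                  17.0398, 15.9069, 15.4850, 15.5112, 17.6572])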
If we do a simple plot of this data, a general trend can be seen by eye. But how can we know that we have found a good fit to the data? What does `good' even mean in this instance? One easy way to answer these questions is to try various models for the data and simply evaluate the norm (size) of the residual vector. Recall that the i-th residual is just r_i = y_i - F(x_i), where (x_i, y_i) is the given data point and F(x_i) is our approximation at that point.
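As a sketch of this comparison, using the arrays defined above, one could fit polynomials of increasing degree with NumPy's polyfit (an illustrative choice of model, not the only possible basis) and compare the 2-norms of the resulting residual vectors; the smaller the norm, the closer the fit in the least-squares sense:

    # Fit polynomials of increasing degree by least squares and compare
    # the sizes of the residual vectors.
    for degree in (1, 2, 3):
        coeffs = np.polyfit(x, y, degree)      # least-squares fit
        r = y - np.polyval(coeffs, x)          # r_i = y_i - F(x_i)
        print(degree, np.linalg.norm(r))       # 2-norm of the residual vector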
Now suppose we suspect some of these points should be thrown out due to experimental error. How can we identify these bad points? An easy approach to answering this question is as follows. If we are doing a good job of fitting the data, statistical theory tells us that about 95% of the `scaled residuals' should lie in the interval [-2, 2]. The i-th scaled residual is just r_i/s, where s is a scaling factor defined by
    s = || r ||_2 / sqrt(m - n),

where r is the residual vector, m is the number of data points, and n is the number of basis functions used to define F(x).
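Putting this together, here is a sketch of the outlier check, again assuming NumPy and, purely for illustration, a straight-line model F(x) = c0 + c1*x (so n = 2 basis functions, 1 and x):

    # Compute scaled residuals for a straight-line fit and flag points
    # whose scaled residual falls outside [-2, 2].
    coeffs = np.polyfit(x, y, 1)               # F(x) = c0 + c1*x
    r = y - np.polyval(coeffs, x)              # residual vector
    m, n = len(x), 2                           # m data points, n basis functions
    s = np.linalg.norm(r) / np.sqrt(m - n)     # scaling factor
    scaled = r / s                             # scaled residuals

    suspect = x[np.abs(scaled) > 2.0]          # x values of suspect points
    print("suspect points:", suspect)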