### Statistics: Summarizing Measured Data

#### See Jain, Chapter 12

#### What Is Statistics?

- You wish to learn about a *universal population*.
- You observe a *sample* of the universal population.
- Statistics is a collection of techniques to make *inferences* about the universal population from the sample.

#### Example:

*Question:* What is the mean time required to complete HW1 in
CS5014 this semester?

*Answer:* Pick 10 students (the sample). Record their
completion times. Compute the *sample mean*, which is an
*estimate* of the mean for the entire population (all CS5014
students).
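
The example above can be sketched in a few lines of Python. This is a minimal illustration, not course data: the class size of 40, the range of completion times, and the names `population` and `sample` are all assumptions made up for the sketch.

```python
import random
import statistics

# Hypothetical completion times (hours) for a class of 40 students.
# These values are invented for illustration only.
rng = random.Random(5014)
population = [round(rng.uniform(2.0, 12.0), 1) for _ in range(40)]

# Pick 10 students at random, without replacement (the sample).
sample = rng.sample(population, 10)

# The population mean is usually unknown; it is computed here only
# because the whole population is simulated.
population_mean = statistics.mean(population)

# The sample mean is the estimate we can actually compute.
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.2f} hours")
print(f"sample mean:     {sample_mean:.2f} hours")
```

Re-running with a different seed gives a different sample and hence a different sample mean, which is why the sample mean is called an estimate.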

#### A Statistic

A statistic is a number that summarizes data. Jain asks [p. 177] ...

- How can the data be reduced to a single number?
- How should you report variability of the data?
- Should you be confident in the data if variability is high?
- How many data points (measurements) are needed for a desired level of statistical significance?
- How should you summarize the results of an experiment?
- How should you compare two systems using experimental data?
- What model best describes the relationships among components of a system?

#### What should I know?

- Sample vs. population and distribution
- Mean vs. sample mean
- Variance vs. sample variance
- Two famous distributions: uniform, normal
- Mean vs. median vs. mode
- Quantile, percentile
- Coefficient of variation
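
As a rough companion to the list above, the following sketch computes several of these summaries with Python's standard `statistics` module. The `data` values are made up for illustration. Note that `statistics.quantiles` uses one particular interpolation rule (its default "exclusive" method), which may not match the definition of a quantile in Jain exactly.

```python
import statistics

# Hypothetical measurements (e.g., completion times in hours).
data = [3.5, 4.0, 4.0, 5.5, 6.0, 7.5, 8.0, 9.0, 10.5, 12.0]

mean = statistics.mean(data)      # arithmetic average
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequently occurring value

# Sample variance divides by n - 1; population variance divides by n.
sample_var = statistics.variance(data)
pop_var = statistics.pvariance(data)

# Coefficient of variation: standard deviation relative to the mean.
cov = statistics.stdev(data) / mean

# Quartiles are the 0.25-, 0.50-, and 0.75-quantiles
# (equivalently the 25th, 50th, and 75th percentiles).
quartiles = statistics.quantiles(data, n=4)

print(f"mean={mean}, median={median}, mode={mode}")
print(f"sample variance={sample_var:.3f}, population variance={pop_var:.3f}")
print(f"coefficient of variation={cov:.3f}")
print(f"quartiles={quartiles}")
```

Treating the same ten numbers as a sample or as the whole population changes the variance, which is the point of the "variance vs. sample variance" item.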

#### Can you explain ...

- The quote at the top of p. 182, "An alpha-quantile of a unit normal ...
has a N(0,1) distribution."
- Example 12.4
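
One way to explore the idea of an alpha-quantile of a unit normal numerically is sketched below. It is not a reproduction of the passage in Jain or of Example 12.4. The choices `alpha = 0.95`, `mu = 10.0`, and `sigma = 2.0` are arbitrary assumptions for illustration, and `NormalDist` requires Python 3.8 or later.

```python
from statistics import NormalDist

alpha = 0.95  # arbitrary choice for illustration

# Alpha-quantile of the unit normal N(0, 1): the value z such that
# a unit normal variate falls at or below z with probability alpha.
unit_normal = NormalDist(mu=0.0, sigma=1.0)
z_alpha = unit_normal.inv_cdf(alpha)

# Alpha-quantile of a hypothetical normal with mean 10 and
# standard deviation 2.
mu, sigma = 10.0, 2.0
x_alpha = NormalDist(mu, sigma).inv_cdf(alpha)

# Standardizing that quantile recovers the unit normal quantile.
standardized = (x_alpha - mu) / sigma

print(f"z_alpha      = {z_alpha:.4f}")
print(f"x_alpha      = {x_alpha:.4f}")
print(f"standardized = {standardized:.4f}")
```

The last two printed values show that subtracting the mean and dividing by the standard deviation maps quantiles of a general normal distribution onto quantiles of the unit normal.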

CS 5014,
C. J. Ribbens,
09/17/2001