Introduction to Experimentation, Measurement, and Performance Evaluation
in CS Research
What role do experimental approaches play in CS research?
- Essential in many subdisciplines, e.g., architecture, operating
systems, networking, HCI, software engineering.
- Careful experimental practices are not always followed.
An overview of Jain (what the text teaches)
- Select appropriate evaluation techniques, performance metrics, and
workloads for a system.
- Conduct performance measurements correctly.
- Use proper statistical techniques to compare several alternatives.
- Design measurement and simulation experiments to provide the most
information with the least effort.
- Perform simulations correctly.
- Use simple queueing models to analyze the performance of systems.
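As an illustration of the "compare several alternatives" goal above, here is a
minimal sketch of one common approach: a confidence interval for the mean of
paired differences between two systems run on the same workloads. The function
name, the measurements, and the constant T_CRIT_95_DF4 are hypothetical and not
taken from Jain; the critical value is the standard t-table entry for a 95%
interval with 4 degrees of freedom.

```python
import math
import statistics


def paired_difference_ci(a, b, t_crit):
    """Return (mean difference, lower bound, upper bound) for paired samples.

    a and b hold measurements of two systems on the same workloads, in the
    same order. t_crit is the t-distribution critical value chosen by the
    caller for the desired confidence level and len(a) - 1 degrees of freedom.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    half_width = t_crit * std_err
    return mean_diff, mean_diff - half_width, mean_diff + half_width


# Hypothetical response times (seconds) on five shared workloads.
system_a = [10.0, 12.0, 9.0, 11.0, 13.0]
system_b = [8.0, 11.0, 9.0, 9.0, 10.0]

T_CRIT_95_DF4 = 2.776  # t-table value: 95% two-sided, 4 degrees of freedom

mean_diff, low, high = paired_difference_ci(system_a, system_b, T_CRIT_95_DF4)
print(f"mean difference = {mean_diff:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
if low > 0 or high < 0:
    print("The interval excludes zero: the systems differ at this confidence level.")
else:
    print("The interval includes zero: no significant difference was shown.")
```

If the interval includes zero, the measurements do not support a claim that
one system is better than the other at the chosen confidence level.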
What is performance? What do we want to measure?
There are many possible answers.
For example (see Jain, 3.3):
- Which user interface is most usable?
- Which system works faster (e.g., has a lower response time)?
- Which system has the highest productivity (e.g., has the highest
rate of work completion, or throughput)?
- Which system best utilizes resources (e.g., keeps the CPU busy enough
to justify its cost, but not so busy that users must wait)?
- Which system is most reliable (e.g., is least likely to make an
error)?
- Which system is available most (e.g., has the least down-time)?
- Which system gives the best performance for the price?
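Three of the measures named above (response time, throughput, utilization) can
be computed from a simple request log. The sketch below is a hypothetical
example, not from Jain: it assumes a single server observed for a fixed period,
with each request recorded as (arrival time, completion time, service time) in
seconds.

```python
def summarize(requests, period):
    """Compute basic performance metrics for one observation period.

    requests: list of (arrival, completion, service_time) tuples, in seconds.
    period:   length of the observation period, in seconds.
    """
    if not requests or period <= 0:
        raise ValueError("need at least one request and a positive period")
    response_times = [completion - arrival for arrival, completion, _ in requests]
    busy_time = sum(service for _, _, service in requests)
    return {
        # Mean time from a request's arrival to its completion.
        "mean_response_time": sum(response_times) / len(response_times),
        # Completed requests per second.
        "throughput": len(requests) / period,
        # Fraction of the period the server spent doing work.
        "utilization": busy_time / period,
    }


# Hypothetical log: three requests observed over 10 seconds.
log = [(0.0, 2.0, 2.0), (1.0, 4.0, 2.0), (5.0, 6.0, 1.0)]
metrics = summarize(log, period=10.0)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Note that the second request has a response time (3 s) larger than its service
time (2 s) because it waited for the first one to finish; response time
includes queueing delay.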
Common mistakes [Jain, 2.1]
- No goals
- Biased goals
- Unsystematic approach
- Analysis without understanding the problem
- Incorrect measures
- Unrepresentative workload
- Wrong evaluation technique
- Overlooking important parameters
- Ignoring significant factors
- Inappropriate experiment design
- Inappropriate level of detail
- No analysis
- Erroneous analysis
- No sensitivity analysis
- Ignoring errors in input
- Improper treatment of outliers
- Assuming no change in the future
- Ignoring variability
- Too complex analysis
- Improper presentation of results
- Ignoring social aspects
- Omitting assumptions and limitations
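Two of the mistakes listed above, ignoring variability and improper treatment
of outliers, are easy to demonstrate. The sketch below uses made-up
measurements (not from Jain) in which a single extreme value dominates the
mean. It only reports the numbers side by side; whether the outlier should be
kept or discarded depends on understanding what caused it.

```python
import statistics

# Hypothetical response times in milliseconds; the last run is an outlier.
samples = [10, 11, 9, 10, 50]

mean = statistics.mean(samples)
median = statistics.median(samples)
stdev = statistics.stdev(samples)  # sample standard deviation
cov = stdev / mean                 # coefficient of variation

print(f"mean   = {mean:.1f} ms")
print(f"median = {median:.1f} ms")
print(f"stdev  = {stdev:.1f} ms")
print(f"coefficient of variation = {cov:.2f}")
```

Reporting only the mean here would suggest a typical response time of 18 ms,
although four of the five runs took about 10 ms.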
A systematic approach [Jain, 2.2]
- State goals; define the system
- List services and outcomes.
What does the system do, and what are the system's responses?
- Select metrics
- List parameters
- Select factors to study
(a factor is a parameter that will be varied)
- Select evaluation technique, e.g., simulation, analysis, measurement
- Select workload
- Design experiments
- Analyze and interpret data
- Present results
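The "select factors" and "design experiments" steps above can be sketched with
a full factorial design, in which every combination of factor levels is run
once. This is only one possible design, and the factor names and levels below
are hypothetical.

```python
import itertools

# Hypothetical factors (parameters that will be varied) and their levels.
factors = {
    "cpu_cores": [2, 4],
    "memory_gb": [8, 16],
    "workload": ["read-heavy", "write-heavy", "mixed"],
}

# Full factorial design: one experiment per combination of levels.
names = list(factors)
experiments = [
    dict(zip(names, levels))
    for levels in itertools.product(*(factors[name] for name in names))
]

print(f"{len(experiments)} experiments")
for number, config in enumerate(experiments, start=1):
    print(number, config)
```

The number of experiments is the product of the number of levels per factor
(here 2 x 2 x 3 = 12), which grows quickly as factors are added. That growth is
one reason to limit the factors studied.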
C. J. Ribbens,