CS 3724 - Human Computer Interaction - Summer 2005 -- Pardha S. Pyla
Project 5: Formative Usability Evaluation
Due as per the course schedule.
This step in
your usability engineering process is the real payoff. It's where you
find out how good the usability is and where you identify the usability problems.
Most students think this is the most fun and most rewarding part of the
project. The essence of this assignment is to carry out a full set of
formative usability evaluations on your computer-based prototype, following
the detailed techniques described in the class notes and discussed and
practiced in class.
What To Do
Preparing for the Data Collection Session
- Pilot test your high-fidelity
prototype some more.
- Select user participants.
- Decide team roles. Prepare
blank data collection forms and a QUIS-style user questionnaire.
Data Collection Session
- Get ready and have participants
from the appropriate user classes perform benchmark tasks using your prototype.
- Take quantitative and
qualitative usability data. Then finish up with each participant.
Data Analysis
- Do usability data analysis.
- Compile a list of usability
problems identified in your testing.
- Do cost-importance analysis
to decide which problems to fix.
- Consider what you learned
in this formative evaluation process.
How To Do It
- The first things to do
before you start any usability testing are to pilot test, pilot
test, and pilot test. Get at least one person outside your team
(it's OK to use people from other teams, but just for the pilot test)
to go through your testing routine as a participant. The primary goal
for pilot testing is to make sure the prototype is complete and bug-free
and that it will support usability evaluation without breaking. Also,
from just this quick and simple pre-evaluation activity, you will
learn a lot and it will help you tune and calibrate your formative
evaluation setup. As a result you may make changes to your usability
specifications, benchmark tasks, testing procedures, etc. Make sure
to do enough pilot testing to shake down the benchmark tasks and usability
specifications you have chosen for evaluation. Don't yet fix usability
problems found in pilot testing.
- Select user participants.
Enlist at least 4 representative users from your client organization.
Your user participants should cover all of your defined user classes.
Use more participants as necessary for your own situation. The rule
of thumb is 3 to 5 users. This number is 'per user class', because
each user class potentially requires a new test, with different benchmark
tasks and different usability specifications. For this class you can
use as few as 2 participants per user class and not get penalized.
Sometimes, when even 2 participants are not available for a user class,
you should use what you can get. You should explain and justify this
kind of exception.
- Decide team roles. One
person should be the evaluation leader or facilitator, to meet and
greet the participants, to keep the evaluation sessions moving, and
to interact with participants. Another person should be responsible
for taking quantitative data (e.g. time on task and error counts).
Everyone else should record (write down) qualitative data (e.g. critical
incidents and comments by the participants).
- Gather up the 5 or more benchmark tasks you created in Project 4 and
print each one on a separate sheet of paper. At this point you should
also prepare several blank data
collection forms. You'll need at least one form per benchmark
task per user participant.
- Before each participant
arrives, get your prototype "booted" up and ready to go. Then
greet the participant. You will meet each participant at a predetermined
time and place, often at the client site. Explain the procedure; give
the participant any general written instructions about the overall
process. Show them any setup you have and answer any questions they
might have. If you are using an informed consent form, have them sign it.
- Give the participant the written benchmark task description for the
first task, appearing alone on a single sheet of paper. Let them read
it, and answer any questions. (Reading time is not part of task performance
time.) If you plan to collect verbal protocol data, ask the user
to "think aloud" while performing the task. Tell them when
to start, so you can start timing, etc. Do not instruct or coach the
participant in details of how to perform a task as they are working.
If they get completely stuck and frustrated, then give them a hint,
but avoid telling them specifics. The idea is not to get them through
the tasks but to discover usability problems in your interaction design.
Repeat the same for all the benchmark tasks. You are expected to run
all the benchmark tasks you developed in Project 4 with real participants.
- During the usability evaluation
session the team member selected for the role should take quantitative
data (e.g. time on task, error counts; see the first sketch after this
list). Others should take extensive
notes on critical incidents and verbal protocol data; the
Data Collection Form is a good place to put these notes. Also,
after performance measurement is completed (so you don't interfere
with measurement), talk with the participant about the task, etc.,
as you saw in the video in class. Also have the participant do some
"free play" and generally interact with the design and take some verbal
protocol. This can catch some problems not seen in your benchmark
tasks, which are by nature more limited in scope. At the end of each
session, have the participant fill out the questionnaire (that you
created in Project 4) for your selected questions. Then thank the
participant and answer any final questions they may have.
- After testing is over,
do a complete work-up and analysis of both the quantitative and the
qualitative data, following the techniques described in the class
notes and in-class activities. Compute the average values for your
quantitative measures and compare them with your usability specifications
(see the second sketch after this list),
as described in the deliverables below. It is normal for a design
to not meet your usability specifications in the first round of usability
testing. That is why it is an iterative process. If you meet most
or all your usability specifications, we will be suspicious that the
design was too simple, the benchmark tasks were too easy, or you set
the target levels too leniently.
- From your notes, the critical
incident data, and the verbal protocol data, compile a complete
list of usability problems discovered across all your usability evaluation
sessions. You should easily find a dozen critical incidents (indicating
usability problems) in your usability evaluation; more is not unusual.
If you don't, I would be suspicious of something -- your design is
not rich enough, your benchmark tasks are too easy, or there is something
wrong with how you did the process (e.g., not seeing all the critical
incidents). In such a case, reconsider and revise something and do
more formative evaluation until you have enough critical incidents.
- Do a complete cost-importance
analysis, as we did in class. Use a cost-importance
table to show, for each usability problem, your estimated importance
to fix and cost to fix and your priority ratios and rankings. Combine
related usability problems, as we did in class, representing the combination
as a single new problem in the table. Show the original separate problems
and the group problem together, as we did in the "Grouping Example"
in the class notes. Do the cost-importance analysis (ranking, etc.)
based on the grouped problems, not the original individual problems
that comprise a group. Pull out Must Fix problems, if any, and put
at the top. Sort the rest by descending values of priority ratio.
Assign yourself a fictitious number of person-hours as the available
resources to fix problems and draw a line in the cost-importance table
representing where you will run out of resources (i.e., the line between
"can fix" and "put off"). Show your final resolution for each problem
(e.g., must do, will do, do if time, next version, probably never, etc.).
A sketch of this arithmetic appears after this list (the third one).
- As a team, discuss process
problems, lessons learned, and what you got out of this process. Take
notes for reporting in the deliverables.
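The three sketches below are optional Python illustrations of the
quantitative parts of this process. Everything in them (names, numbers,
ratings, hour limits) is invented for illustration; substitute your own
data. First, referenced from the data-collection step above, a minimal
session-logging aid for the quantitative-data role. A stopwatch and the
paper data collection forms work just as well.

    # Minimal per-task session logger (a hypothetical aid).
    import time

    class TaskLog:
        def __init__(self, participant, task):
            self.participant = participant
            self.task = task
            self.errors = 0
            self.start = None

        def begin(self):
            # Start timing only after the participant has finished reading
            # the task sheet (reading time is not task performance time).
            self.start = time.monotonic()

        def error(self):
            # Call once for each user error you observe.
            self.errors += 1

        def end(self):
            elapsed = time.monotonic() - self.start
            print(f"{self.participant} / {self.task}: "
                  f"{elapsed:.1f} s, {self.errors} errors")

    # Usage: one TaskLog per benchmark task, per participant.
    log = TaskLog("P1", "benchmark task 1")
    log.begin()
    log.error()   # an observed error during the task
    log.end()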
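Second, referenced from the data-analysis step above, a sketch of the
quantitative work-up: compute the observed mean for each measure and
compare it against the target level from your usability specification
table. All attribute names, observed values, and targets here are made up.

    # Sketch of the quantitative work-up; all names and numbers are invented.
    from statistics import mean

    # Observed values per usability attribute, pooled across participants.
    observed = {
        "time on task 1 (seconds)": [95, 110, 102, 88],
        "errors on task 1":         [2, 1, 3, 2],
        "questionnaire average":    [6.1, 5.4, 5.8, 6.5],
    }

    # (target level, True if higher observed values are better), taken
    # from your usability specification table.
    targets = {
        "time on task 1 (seconds)": (120, False),
        "errors on task 1":         (1,   False),
        "questionnaire average":    (6.0, True),
    }

    for attr, values in observed.items():
        avg = mean(values)
        target, higher_is_better = targets[attr]
        met = avg >= target if higher_is_better else avg <= target
        print(f"{attr}: observed mean = {avg:.2f}, target = {target}, "
              f"{'met' if met else 'NOT met'}")

Note that two of the three invented measures fail to meet their targets;
as stated above, that is the normal outcome of a first round of testing.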
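Third, referenced from the cost-importance step above, a sketch of the
cost-importance arithmetic. It assumes one common formulation, priority
ratio = importance / cost-to-fix (scaled by 1000 for readability), with
Must Fix problems forced to the top; check the class notes for the exact
formulation we used. The problems, ratings, costs, and hour limit are
all invented.

    # Cost-importance sketch; every entry below is invented.
    MUST_FIX = "M"

    problems = [
        # (problem name, importance rating, cost to fix in person-hours)
        ("Search-screen labels unclear (grouped)", 4, 2),
        ("No feedback after Save", MUST_FIX, 3),
        ("Route back to home page is hidden", 3, 6),
        ("Results-list font too small", 2, 1),
    ]

    def priority_ratio(importance, cost):
        # Must Fix outranks everything; otherwise importance per unit cost.
        return float("inf") if importance == MUST_FIX else importance * 1000 / cost

    budget = 10  # fictitious person-hours available to fix problems
    ranked = sorted(problems,
                    key=lambda p: priority_ratio(p[1], p[2]), reverse=True)

    spent = 0
    for name, importance, cost in ranked:
        spent += cost  # the "cutoff" line falls where this passes budget
        if importance == MUST_FIX:
            resolution = "must fix"
        elif spent <= budget:
            resolution = "can fix"
        else:
            resolution = "put off"
        print(f"{name}: importance={importance}, cost={cost}h, "
              f"ratio={priority_ratio(importance, cost):.0f}, {resolution}")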
What To Deliver
In your project binder, create a "tabbed" section labeled "Project 5". Add this section
to the front of your team binder. This way, your binder becomes
a cumulative record of your whole project, with the most recent parts
first. This section should start with its own separate cover page containing
the following (mostly the same as on the front of the binder):
- "Project 5: Formative
- Team number
- Project name
- Name of client organization
- One-line description
- Team member names
- "CS 3724– <current
label your items per this list:
- Begin after the tab for
this section, with a blank printed grading
form for this deliverable.
- Then include a Table of
Contents for this particular deliverable (not the whole folder).
- Then follow with these
items, numbered as they are here:
1. To make this report
a stand-alone document, repeat the latest version of your product
concept statement, as a synopsis of your project.
2. Describe your pilot
test, the results, what you learned, and what (if anything) it
led you to change in your prototype, benchmark tasks, or your
formative evaluation plans. If any of the usability specifications
you finally used are changed from the ones you reported in your
last deliverable, that is normal. Just say what has changed and
why. Do not include quantitative data from the pilot test in the
final results (even though this can be tempting if they turn out well).
3. Describe your participants
for formative evaluation, how you selected them (very briefly),
and how they "cover" your user classes.
4. Describe the team
member roles you decided on for the usability evaluation sessions.
5. Make any brief comments
you would like to about how the process of setting up, greeting
the participants, and having the participants perform the tasks went.
6. Very briefly describe
your method(s) for taking quantitative data. Very briefly describe
the verbal protocol taking process that you did and how it worked
for you. Very briefly describe the critical incident taking process
that you did. This is to describe how you collected critical incident
data and how it worked for you. (The place for describing problems
found is coming up below.) Include a blank copy of the questionnaire
you used to gather subjective data.
7. Give an executive
summary statement highlighting the overall results of the formative
evaluation activity, saying which usability specifications were
met and which were not and including any high-level conclusions
you might have. Next
show a more detailed summary of evaluation results in the form
of the usability specification table with a column added at the
right, containing the summary figures for observed data. Putting
the results together with the usability specifications allows
direct comparison. For objective measures, give a brief description
of the corresponding benchmark task as context for results (e.g.,
listed just below the table). For subjective measures, show the
corresponding questionnaire questions as context.
8. In a cost-importance
table, list each usability problem discovered. The problem list should
contain at least a dozen usability problems and should be sorted by
ascending values of priority rank, with any Must Fix problems at the
top. If you found a great deal more than that, list the dozen or so
most significant (or most interesting) ones, showing this summary information:
- Problem (name and brief description)
- Importance rating
(determined by your team)
- Proposed solutions
and the estimated cost-to-fix of each, in person-hours
- Priority ratio and priority ranking
Show any groupings you made, combining individual related usability problems
into one broader problem. List only the final merged problem in the
table. Indicate the number of person-hours you have allowed yourself
as a hypothetical limit and show it on the cost-importance table as
a "cutoff" line. Then show your final resolution (to fix or not) of
each problem, based on where each usability problem falls in the rankings
with respect to that limit.
9. Conclude the report with a brief statement reflecting on how the process
worked (or didn't) for your team, any important lessons learned, the
kinds of problems encountered by you and/or the participants, what the
participants liked and disliked, and reflections on the prototyping
and formative evaluation processes.