CS 3724
Homework 3: Sample Data
Here are two sample data sets. You will need to adjust them
to get them to work with the tools. Feel free to use other
data as well.
Below is a data file derived from some chat logs taken from the
Virtual School project. There are 8 entries per line, separated by
spaces, with each line representing a single message. The entries are
as follows:
- time (time of message in seconds since Jan 1, 1970)
- personid (a unique ID for each person, order has no meaning)
- chatid (a unique ID for each thread, order has no meaning)
- numchars (the number of characters in the message)
The final four values are booleans derived from the messages that
indicate the use of punctuation in each message.
- ques (whether there is a question mark in the message)
- mques (whether there are multiple question marks)
- excl (whether there is an exclamation point)
- mexcl (whether there are multiple exclamation points)
Grab the data (in plain text format) here.
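The eight-entry layout described above can be read with a few lines of Python. This is only a sketch under stated assumptions: the class and function names are made up for illustration, the two exclamation-point flags are called excl and mexcl here (an assumed naming), the booleans are assumed to be written as 1 and 0, and the sample line is invented rather than taken from the data file.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class ChatMessage(NamedTuple):
    time: int        # seconds since Jan 1, 1970
    personid: str    # kept as text, since the order of IDs has no meaning
    chatid: str      # kept as text for the same reason
    numchars: int    # number of characters in the message
    ques: bool       # question mark present
    mques: bool      # multiple question marks present
    excl: bool       # exclamation point present (assumed field name)
    mexcl: bool      # multiple exclamation points present (assumed field name)


def parse_flag(token: str) -> bool:
    # Assumption: flags are stored as "1" or "0"; adjust if the file differs.
    if token not in ("0", "1"):
        raise ValueError(f"unexpected boolean value: {token!r}")
    return token == "1"


def parse_line(line: str) -> ChatMessage:
    # Entries are separated by spaces, one message per line.
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 entries, found {len(fields)}")
    return ChatMessage(
        time=int(fields[0]),
        personid=fields[1],
        chatid=fields[2],
        numchars=int(fields[3]),
        ques=parse_flag(fields[4]),
        mques=parse_flag(fields[5]),
        excl=parse_flag(fields[6]),
        mexcl=parse_flag(fields[7]),
    )


def sent_at(msg: ChatMessage) -> datetime:
    # Convert the epoch seconds into a timezone-aware UTC datetime.
    return datetime.fromtimestamp(msg.time, tz=timezone.utc)


# Invented sample line, not taken from the real file.
msg = parse_line("946684800 12 7 42 1 0 0 0")
print(msg.numchars, msg.ques, sent_at(msg).isoformat())
```

Rejecting lines that do not have exactly 8 entries makes it easier to spot rows that still need adjusting before they are fed to a visualization tool.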
Below is a data file from the experiment that many of you participated in
last semester. There were 61 participants, and a lot of information
was collected and calculated for each. The entries are as follows:
- experiment condition (used text only, the BT visualization tool,
or the inset visualization tool)
- total score on the 5 questions
- total score on the 2 procedural questions
- total score on the 3 conceptual questions
- the next 5 columns are the individual questions (1=correct, 0=incorrect)
- the remaining columns are personal data entered by each participant
Grab the data (in Excel format) here.
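Because the score columns are related to one another, a quick consistency check can catch rows that were damaged while adjusting the file. The sketch below assumes each row has already been exported from Excel into a Python list in the column order listed above, that every score is a count of correct answers, and that the total equals the procedural score plus the conceptual score. The function name check_scores and the sample rows are hypothetical.

```python
def check_scores(row):
    """Return a list of problems found in one participant's row.

    Assumed layout: row[0] is the experiment condition, row[1:4] are the
    total, procedural, and conceptual scores, and row[4:9] are the five
    individual questions (1=correct, 0=incorrect). Later columns are ignored.
    """
    total, procedural, conceptual = (int(float(v)) for v in row[1:4])
    questions = [int(float(v)) for v in row[4:9]]

    problems = []
    if len(questions) != 5:
        problems.append("fewer than 5 question columns")
    if any(q not in (0, 1) for q in questions):
        problems.append("question value other than 0 or 1")
    if sum(questions) != total:
        problems.append("total does not match the individual questions")
    if procedural + conceptual != total:
        problems.append("procedural + conceptual does not match the total")
    if not 0 <= procedural <= 2:
        problems.append("procedural score outside 0..2")
    if not 0 <= conceptual <= 3:
        problems.append("conceptual score outside 0..3")
    return problems


# Invented rows for illustration only.
good_row = ["text only", 3, 1, 2, 1, 0, 1, 1, 0, "personal data"]
bad_row = ["BT", 4, 1, 2, 1, 0, 1, 1, 0]

print(check_scores(good_row))
print(check_scores(bad_row))
```

The source does not say which of the five questions are procedural and which are conceptual, so the check only compares the sub-scores against the total and does not try to recompute them from individual questions.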
Other good examples are census data, sports data, and stock market data.
Do a Web search and you will find many good sources.
Or, use data from another source -- there's plenty of data in the world!
Contact Information:
Scott McCrickard
mccricks@cs.vt.edu