CS2104: Introduction to Problem Solving
Homework Assignment 12

Due at 8:00AM on Friday, November 11
50 Points

See the General Guidelines for homework assignments.

This assignment may optionally be done with a partner. You are strongly advised to use a partner. If you do use a partner, then:

There are many situations you encounter where something can happen with a probability 1/P, interpreted as the likelihood of one success in P tries. You can repeat the action, and each trial is independent (that is, success or failure the last time does not affect success or failure the next time). Examples of this include the lottery, and many computer games. For example, in a computer game you might try hitting something or making something, with a probability of one in five of succeeding.

Assuming that you can repeat the action, and that the events are independent, then the expected number of tries needed to succeed is P. (For a proof of this, see the discussion of Bernoulli random variables in PSNotes_math.pdf.) But that does not mean that it always takes P tries. You might succeed the first time, or it might take many, many times to succeed. Many people are easily mislead by superstitions related to independent random events. Like believing that if you just failed five times in a row, that the next trial has a different chance of succeeding or failing. The typical person also underestimates the probability that something will happen multiple times in a row. In fact, most people fundamentally misunderstand independent random events. So it is useful to get some feeling for how the probability distribution actually works.

Your task for this homework is to design and code a program to simulate this problem. The program can be written in any programming or scripting language, such as Fortran, C, C++, Java, awk, Perl, Python, Mathematica, Matlab, etc. Your program needs to be sufficently well commented, with good style such as meaningful variable names, indentation, etc., so that it is easy to read and understand.

Specifically, your program will take as input two values: P, the value for the expected number of tries needed to succeed, and N, the maximum number of tries-to-success that we want information about. For example, you might be given P as five, and N as ten. This means that the event is expected to succeed one time in five, and we want to know the probability for it succeeding in one try, two tries, three tries, ..., with the last line of the table showing the probability of needing 10 or more tries.

Your program will print out a series of N lines with each line containing the number of trials and the probability of success in exactly that many trials. That is, the first line shows the probability of succeeding in one try, the second line shows the probability of succeeding in two tries (one failure and then one success), and so on. The Nth line will state the probability that it requires N or more tries.

A key issue that you must face is how many times you need to run random numbers to get a reliable result. This might depend on the values of P and N.

Your homework submission will include the source code for the program (simply copy this into your homework submission document), along with the probability list for the values P = 5 and N = 10. Explain in your homework submission how you determined how many iterations you need to get a good answer, and what that number of iterations will be for these particular values of P and N.

Here is an example to illustrate what you should do. Consider the following series of events. A value of 0 means the event failed, while a value of 1 means that the event succeeded.

0 0 1 1 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1

(Note that the actual random numbers might be 1 to 5, where 1 is success, and 2 through 5 is failure. I am just showing here success and failure outcomes, not the actual random numbers.)

In this example, first it took three tries to get the first success. Then one more try to get the second success, 6 tries to reach the third success, 2 tries to reach the 4th success, 3 tries to the 5th success, 5 tries to reach the 6th success, 1 try for the 7th success, 2 tries for the 8th and 9th successes, 15 tries to reach the 10th success, 2 tries for the 11th success, and finally 4 tries to get the 12th success.

If the value of N is set to 10, then a table showing the number of attempts versus number of successes for this example would look like:

  1: 2
  2: 4
  3: 2
  4: 1
  5: 1
  6: 1
  7: 0
  8: 0
  9: 0
10+: 1

Your program should not output counts, but rather should output probabilities. So your output might look something like:

  1: 16.67
  2: 33.33
  3: 16.67
  4:  8.33
  5:  8.33
  6:  8.33
  7:  0.00
  8:  0.00
  9:  0.00
10+:  8.33