CS 2204: Homework #5

What to turn in: A legible paper copy giving your answers

A common medical condition is the so-called "Megaloheimer's syndrome". People with Megaloheimer's tend to forget everything about everybody except themselves, and often remember good things about themselves that never actually happened to them but happened to other people. The intellectual ability of these patients is usually below average, although in the early stages of the disease they can pull the wool over everybody's eyes to conceal this fact. Often, with the help of influential and wealthy friends, people with this condition attain positions of high power in society.

Biomedical research reveals that the DNA of people with Megaloheimer's syndrome is likely to contain a relatively high proportion (more than 0.25% of the total DNA content, counted "by letters") of the following sub-sequence: ACCTT, spaced at regular intervals along the patient's DNA. In this homework you will write a series of short Perl programs and try to answer the following question: is patient XX likely to develop Megaloheimer's? The patient's DNA fragment is here: Patient XX. DNA sequence from locus 17NI5 promoter region. Admitted for diagnostic tests April 1 2006.
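As a rough illustration of the 0.25% threshold, the sketch below computes the percentage of letters that belong to occurrences of a motif. It assumes that "counted by letters" means (number of occurrences x 5 letters) divided by the total number of letters; that reading is an assumption, so check it against your own interpretation of the text above. The helper name motif_percent is hypothetical, and the toy sequence is not the patient's DNA.

```perl
use strict;
use warnings;

# Hypothetical helper: percentage of the letters in $dna that belong to
# occurrences of $motif. Assumes "by letters" means
# (occurrences * motif length) / (total length).
sub motif_percent {
    my ($dna, $motif) = @_;
    return 0 unless length $dna;               # avoid dividing by zero
    my $count = () = $dna =~ /\Q$motif\E/g;    # count non-overlapping matches
    return 100 * $count * length($motif) / length($dna);
}

# Toy sequence: 24 letters, 2 occurrences of ACCTT, so 10 of 24 letters.
my $toy = "ACCTTGCGATGAACCTTGATGCCG";
printf "%.2f%%\n", motif_percent($toy, "ACCTT");   # prints 41.67%
```

Because ACCTT cannot overlap with itself, counting non-overlapping matches loses nothing here.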

To approach the problem, write two separate Perl programs to read the DNA sequence above and answer the questions below. Write your answer next to each question and provide your Perl code (on a separate sheet, if necessary). Note that some of the ACCTT strings may be split between two lines in the DNA file you have just downloaded.
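One way to handle occurrences split between two lines is to join all the lines into a single string before searching. The sketch below is a minimal illustration, assuming the file contains only sequence letters and whitespace (no header lines); the helper name join_sequence is hypothetical, and the two in-memory lines stand in for lines you would read from the downloaded file.

```perl
use strict;
use warnings;

# Hypothetical helper: glue the lines of a sequence into one string.
# Assumes every line holds only sequence letters plus whitespace.
sub join_sequence {
    my @lines = @_;
    my $dna = '';
    for my $line (@lines) {
        (my $clean = $line) =~ s/\s+//g;   # copy the line, strip newlines and spaces
        $dna .= uc $clean;                 # normalize to upper case
    }
    return $dna;
}

# Toy input: ACCTT is split as "...AC" / "CTT..." across the line break.
my @lines = ("GCGAC\n", "CTTGA\n");
my $dna = join_sequence(@lines);
print "$dna\n";                            # prints GCGACCTTGA

# A line-by-line search would miss the motif; the joined string contains it.
my $count = () = $dna =~ /ACCTT/g;
print "$count\n";                          # prints 1
```

In a real program the lines would come from a filehandle opened on the downloaded file rather than from a literal list.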

  1. (4 points) Read in the patient's DNA sequence and count the number of occurrences of the sub-sequence ACCTT in it.


  2. (4 points) Read in the patient's DNA sequence and print the indexes (positions) of every occurrence of the sub-sequence ACCTT in it. For example, the sequence:
          ACCTTGCGATGAACCTTGATGCCG
    contains the sub-sequence ACCTT two times at indexes (positions) 0 and 12.


  3. (4 points) So, is this patient likely to end up in a high place in society? Present your arguments based on your findings in 1) and 2) and the description of the disease given above.
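The worked example in question 2 can be reproduced with Perl's built-in index function, which returns the 0-based position of a substring or -1 when there is none. This is only a sketch on the example sequence from question 2, not on the patient's data; the helper names motif_positions and gaps are hypothetical. The gaps between consecutive positions are one possible way to judge whether the occurrences are "spaced at regular intervals".

```perl
use strict;
use warnings;

# Hypothetical helper: 0-based positions of every occurrence of $motif in $dna.
sub motif_positions {
    my ($dna, $motif) = @_;
    my @positions;
    my $from = 0;
    # index() returns -1 once there are no more matches at or after $from.
    while ((my $at = index($dna, $motif, $from)) != -1) {
        push @positions, $at;
        $from = $at + 1;                   # resume the search just past this hit
    }
    return @positions;
}

# Hypothetical helper: distances between consecutive positions.
sub gaps {
    my @p = @_;
    return map { $p[$_] - $p[$_ - 1] } 1 .. $#p;
}

my @found = motif_positions("ACCTTGCGATGAACCTTGATGCCG", "ACCTT");
print "@found\n";                          # prints 0 12

my @spacing = gaps(@found);
print "@spacing\n";                        # prints 12
```

If all the gaps come out equal (or nearly so), the occurrences are regularly spaced; how much variation to tolerate is a judgment call to argue in your answer.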