Program Assignment 1

100 Points
Due: 2/14 at the start of class

At some point in your schooling, you may have encountered the concept of diagramming sentences. A diagram of a sentence is a tree-like drawing representing the grammatical structure of the sentence, including parts such as the subject, verb, and object.

Here is an extended BNF grammar for some simple English sentences that we will use for the purposes of diagramming.

    <sentence>    -->  <subject> <verb_phrase> <object>
    <subject>     -->  <noun_phrase>
    <verb_phrase> -->  <verb> | <verb> <adv>
    <object>      -->  <noun_phrase>
    <verb>        -->  lifted | saw | found
    <adv>         -->  quickly | carefully | brilliantly
    <noun_phrase> -->  [<adj_phrase>] <noun> [<prep_phrase>]
    <noun>        -->  cow | alice | book
    <adj_phrase>  -->  <adj> | <adj> <adj_phrase>
    <adj>         -->  green | lean | mean
    <prep_phrase> -->  <prep> <noun_phrase>
    <prep>        -->  of | at | with

This grammar can generate an infinite number of sentences. One sample is:

    mean cow saw carefully green alice with book

(For simplicity, we ignore articles, punctuation, and capitalization, including Alice's name or the first word of the sentence.)

For this assignment, your program must be written in Pascal. Your solution may only use standard Pascal code, not extensions or special library routines provided with your compiler. If your Pascal compiler includes object-oriented features, you may not use them for this assignment. Your program must adhere to proper use of the Pascal language as well as good programming style, including embedded routines, scoping, parameter passing, identifier naming, routine length, etc.

This assignment will also give you more experience using BNF. Decide what the basic operations are that you need to implement, how those will be written as functions or procedures, and then how they can be used to build higher level functions/procedures.

An easy way to organize your solution is to construct one (parameterless) procedure for each nonterminal in the grammar. See Chapter 4 in Sebesta (in 4th ed., pp. 123-125 of Chapter 3) for details on recursive descent parsing. For example, the following procedure parses the non-terminal <adj>:

    procedure adj;
        { The global variable "next_token" holds the next
          token in the input stream.  See Sebesta for details }
    begin
        write ('(');
        if ((next_token = 'green') or
                   (next_token = 'lean' ) or
                   (next_token = 'mean' )) then
            begin
                lexical;       { A function you write to advance to
                                 the next token }
                write (')')
            end
        else
            error ('Input is not a sentence'); { A function you write
                                                 to handle errors }
    end;

Your program should read "candidate sentences", one per line, from its standard input. For each line of input, your program should attempt to interpret the line's contents (a whitespace-separated list of words) as a sentence. You will need to read each entire line in as a string. This string will have to be split into tokens by a lexical analyzer, which will then feed them to your parsing routine.

Each input line is guaranteed to fit into a String variable in FreePascal, and each input line is guaranteed to end in a newline. If you are developing your solution under Unix, be sure that your code works correctly whether the standard input stream uses Unix (a line feed) or Windows (a carriage return/line feed pair) line ending conventions. The best way to address this problem is simply to use the readln() operation to read input a line at a time. Alternatively, if you choose to read input one character at a time, you can use the eol function to detect line endings and then use readln() to advance over the line terminato, all in an OS-independent manner.

The output of your program should be produced on standard output (in the Pascal sense of the term). Your program should produce a "diagrammed" version of the input string, which means a sentence in properly parenthesized form. "Properly parenthesized" means that each nonterminal appearing in the input string now has parentheses around it (omitting all redundant parentheses). For instance, the input string:

    "alice found mean green book"

would be parenthesized as

    (("alice") ("found") (("mean" "green") "book"))

A more complicated example is

    "mean cow saw carefully green alice with book"

which returns

    ((("mean") "cow") ("saw" "carefully") (("green") "alice" ("with" ("book"))))

In addition, there are two distinct error conditions that your program must recognize. First, if a given string does not consist of valid tokens, then respond with this message:

    "Input has invalid tokens."

Second, if the parameter is a string consisting of valid tokens but it is not a legitimate sentence according to the grammar, then respond with the message:

    "Input is not a sentence."

Note that the "invalid tokens" message takes priority over the second message; the "not a sentence" message can only be issued if the input string consists entirely of valid tokens.

In addition to writing your program, you will also need to write a set of test cases that demonstrate that your program works correctly. The Curator-assessed portion of your program grade will depend on all three of these factors:

Review the test case input guidelines to see the format for your test cases. They should all be combined into a single ASCII file that will be submitted to the Curator along with your program. In addition, you can use the tddpas.pl tool to run test cases on your own machine.

Here are some random bits of Pascal information that you may or may not find useful as you implement this assignment:

Your Pascal implementation is to be organized in a single file. All documentation and identifying information should be placed in comments within the single source file. The comments at the beginning of the file should identify the assignment and give your full name. Every Pascal procedure/function should be preceded by comments explaining its purpose, the meaning of each parameter, and the general strategy of its implementation if applicable.

In addition to your Pascal source file, you will also need to submit a plain ASCII file containing your test cases.

Your program and test case file will be both be submitted electronically through the Web-CAT Curator for automatic grading. See the program guidelines for information on submitting. The Web-CAT Curator will grade one half of your program score (50 points), as well as assess any deductions based on the late policy described in the syllabus. The TA will assess the remaining 50 points based on the program grading criteria. There is no need to print or turn in a hard copy version of your program; the TA will mark up a PDF version of your program and e-mail it back to you.