Home  |   Notes  |   Languages  |   Programs  |   Homework
Program Assignment 1

80 Points
Due: 2/19/01 at the start of class

 Problem

At some point in your schooling, you may have encountered the concept of diagramming sentences. A diagram of a sentence is a tree-like drawing representing the grammatical structure of the sentence, including parts such as the subject, verb, and object.

Here is an extended BNF grammar for some simple English sentences that we will use for the purposes of diagramming.

    <sentence>    -->  <subject> <verb_phrase> <object>
    <subject>     -->  <noun_phrase>
    <verb_phrase> -->  <verb> | <verb> <adv>
    <object>      -->  <noun_phrase>
    <verb>        -->  lifted | saw | found
    <adv>         -->  quickly | carefully | brilliantly
    <noun_phrase> -->  [<adj_phrase>] <noun> [<prep_phrase>]
    <noun>        -->  cow | Alice | book
    <adj_phrase>  -->  <adj> | <adj> <adj_phrase>
    <adj>         -->  green | lean | mean
    <prep_phrase> -->  <prep> <noun_phrase>
    <prep>        -->  of | at | with

This grammar can generate an infinite number of sentences. One sample is:

    mean cow saw carefully green Alice with book

(For simplicity, we ignore articles, punctuation, and capitalizing the first word of a sentence.)

 Implementation Language

The language for implementation is Pascal. Structured design using procedures and functions should be used. If your Pascal compiler includes object-oriented features you are not to use them. You will also gain more experience using BNF. Decide what the basic operations are that you need to implement, how those will be written as functions/procedures, and then how they can be used to build higher level functions/procedures, etc. An easy way to organize your solution is to construct one (parameterless) procedure for each nonterminal in the grammar. See Sebesta pp123-125 for details on recursive descent parsing. For example, the following procedure parses the non-terminal <adj>:

    procedure adj;
        { The global variable "next_token" holds the next
          token in the input stream.  See Sebesta p.124 for details }
    begin
        write ('(');
        if ((next_token = 'green') or
                   (next_token = 'lean' ) or
                   (next_token = 'mean' )) then
            begin
                lexical;       { A function you write to advance to
                                 the next token }
                write (')')
            end
        else
            error ('Input is not a sentence'); { A function you write
                                                 to handle errors }
    end;

 Input Format

Your program should read "candidate sentences", one per line, from its standard input (in the sense that this exists in Pascal). For each line of input, your program should attempt to interpret the line's contents (a whitespace-separated list of words) as a sentence. You will need to read each entire line in as a string. This string will have to be split into tokens by a lexical analyzer, which will then feed them to your parsing routine.

 Output Format

The output of your program should be produced on standard output (in the Pascal sense of the term). Your program should produce a "diagrammed" version of the input string, which means a sentence in properly parenthesized form. "Properly parenthesized" means that the entire sentence has a pair of parentheses around it, and every phrase within the sentence is also surrounded by parentheses (i.e., every nonterminal ending in _phrase). The only exception is <adj_phrase>, where a single pair of parentheses are placed around the entire list of adjectives, rather than around each individual recursive application of the rule. For instance, the input string:

    "Alice found mean green book"

would be parenthesized as

    (("Alice") ("found") (("mean" "green") "book"))

A more complicated example is

    "mean cow saw carefully green Alice with book"

which returns

    ((("mean") "cow") ("saw" "carefully") (("green") "Alice" ("with" ("book"))))

In addition, there are two distinct error conditions that your program must recognize. First, if given a string does not consist of valid tokens, then respond with this message:

    Input has invalid tokens.

Second, if the parameter is a string consisting of valid tokens but it is not a legitimate sentence according to the grammar, then respond with the message:

    Input is not a sentence.

 Submitting Your Program
Your Pascal implementation is to be organized in a single file. All documentation and identifying information should be placed in comments within the single source file. The comments at the beginning of the file should identify the assignment and give your full name. Every Pascal procedure/function should be preceded by comments explaining its purpose, calling sequence, and implementation.

Your program will be submitted electronically through the Curator for automatic grading against a randomly generated test suite. To submit your program, login to the Curator using your PID and password. Be sure to select the 11374-CS 3304 11:15 MWF Edwards section when logging in. Click on "Submit", choose Pascal1 from the project drop-down list, click "Browse" to specify your solution, and then click "Upload."

In addition, you must also turn in a printed version of your program source in class according to the rules specified in the syllabus, which also describes the late policy.
Home  |   Notes  |   Languages  |   Programs  |   Homework

copyright © 2001 Virginia Tech, ALL RIGHTS RESERVED
Last modified: April 23, 2001, 15:39:01 EDT, by Stephen H. Edwards <edwards@cs.vt.edu>