TDD for Prolog

TDD Script for Prolog
Download:	tddplg.pl (requires renaming)
Latest Version:	1.5

tddplg.pl is a test-driven development tool for Prolog predicates that take arguments and produce results as variable bindings. It works well for SWI-Prolog programs that do not perform file I/O. tddplg.pl is written in Perl and is intended to work on most operating systems.

Version 1.5 has been posted. It correctly handles arbitrary orderings of variable bindings in expected output.
Version 1.4 has been posted. It correctly prints out [true] or [fail] when displaying expected or actual results, instead of the less intuitive values of [[]] or [].
Version 1.3 has been posted. It now properly handles single-quoted string values in expected output sections.
Version 1.2 has been posted. It now properly deletes temporary files that are unneeded.
Version 1.1 has been posted. It now properly flags test cases as erroneous if the expected output contains free variables that are not present in the input goal.
Version 1.0 has been posted.

You must have Perl installed on your machine (preferably v5.6 or above). If you do not have Perl, ActiveState provides a freely downloadable version that is highly recommended and easy to install/upgrade.
Download the script, place it on your path, and rename it to tddplg.pl by replacing the .txt extension (needed for plain downloading on our server). For Windows NT/2000/XP users who have installed ActiveState Perl, the .pl extension will already be associated with Perl and the script can be run as-is. Unix users should chmod +x the script (you can optionally rename it without the .pl extension if desired). Users of other operating systems may need to invoke Perl directly to use the script (e.g., perl tddplg.pl ...args... instead of simply tddplg.pl ...args...).
Edit the script with a plain text editor to be sure that it reflects your Prolog installation correctly. In particular, check the definition of the variable $prolog in the User-configurable paths section near the top of the script. This variable should hold the path to the Prolog interpreter executable, wherever it is installed on your system. The default value in the script is the default SWI-Prolog installation directory under Windows 2000/XP. Unix users will definitely need to change this path (perhaps to just pl, if pl is on your path by default).

The tddplg.pl script expects two command line arguments: the name of the Prolog source file containing the predicate(s) to test, and the name of a file of test cases. The test cases are plain text in a stylized format. Each test case is made up of two main parts: zero or more lines listing the predicate and arguments to test, and a corresponding set of zero or more solutions representing the values of variables bound on successful return.

The test file format uses "//" at the beginning of a line to identify lines with special meaning to the test script. Specifically, any line that starts with the character sequence "//==" denotes the start of a test case. Any text on the remainder of the line serves as a "name" or label for the test case (for the purposes of identification if that test case fails). Lines following this marker are input lines. A later line starting with "//--" marks the end of the series of input lines and the start of the corresponding output lines. Any other lines starting with "//" are treated as comments and are ignored by the test script.
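
To make the marker rules concrete, here is a minimal Python sketch of how a file in this format could be split into test cases. It is only an illustration: tddplg.pl itself is written in Perl, and the names Case and parse_test_file are hypothetical. The sketch assumes that text after "//--" on the marker line is ignored, that blank lines are skipped, and that anything before the first "//==" line is dropped.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """One test case: a label, the goal lines, and the expected-output lines."""
    name: str
    goal_lines: list = field(default_factory=list)
    expected_lines: list = field(default_factory=list)


def parse_test_file(text):
    """Split test-file text into cases using the //== and //-- markers."""
    cases = []
    current = None       # case being filled, or None before the first //==
    in_output = False    # True once //-- has been seen for the current case
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.startswith("//=="):
            current = Case(name=line[4:].strip())
            cases.append(current)
            in_output = False
        elif line.startswith("//--"):
            in_output = True   # assumption: the rest of this line is ignored
        elif line.startswith("//"):
            continue           # any other // line is a comment
        elif not line.strip() or current is None:
            continue           # assumption: skip blanks and text before any case
        elif in_output:
            current.expected_lines.append(line)
        else:
            current.goal_lines.append(line)
    return cases


SAMPLE = """\
//== Testing [a, b, c] + [d, e, f]
append( [a, b, c], [d, e, f], Result )
//-- This is the expected output:
[
    Result = [a, b, c, d, e, f]
]
// an ordinary comment

//== Testing [a] + [a] = [a, b, c]
append( [a], [a], [a, b, c] )
//--
fail
"""

if __name__ == "__main__":
    for case in parse_test_file(SAMPLE):
        print(case.name, "->", len(case.goal_lines), "goal line(s),",
              len(case.expected_lines), "expected line(s)")
```

The sketch keeps the goal and the expected output as raw lines; turning them into Prolog terms is left to Prolog.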

For example, suppose you want to test the Prolog predicate append/3, which concatenates two lists. Here is a simple test case for append/3:

//== Testing [a, b, c] + [d, e, f]
append( [a, b, c], [d, e, f], Result )
//-- This is the expected output:
[
    Result = [a, b, c, d, e, f]
]

The input section of the test case is the Prolog goal to satisfy (which can span several lines if you want). The output section contains a list of one or more solutions separated by commas (in this example, there is only one solution). Each solution is a list of Variable = Value pairs giving the value for each unbound variable listed in the Prolog goal. Here, the only unbound variable is Result. Note that neither the order of the solutions in the output section nor the order of the variables within a solution (if the goal contains more than one unbound variable) matters in the comparison. If the goal has no solution, simply list fail as the expected output. If the goal contains no unbound variables but should still succeed, list true as the expected output. An empty output section will be interpreted the same as fail. Some more example test cases are:

//== Testing [a] + [a] = [a, b, c]
append( [a], [a], [a, b, c] )
//--
fail
//== Testing [a] + [b, c] = [a, b, c]
append( [a], [b, c], [a, b, c] )
//--
true
//== Testing A + B = [a, b, c]
append( A, B, [a, b, c] )
//-- 4 possible solutions, separated by commas
[ A = [],
  B = [a, b, c]
],
[ A = [a],
  B = [b, c]
],
[ A = [a, b],
  B = [c]
],
[ A = [a, b, c],
  B = []
]
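
The order-insensitive comparison described above can be pictured with a short Python sketch. This is not the script's own code (the real comparison happens inside the generated Prolog program); the function names are hypothetical, and values are kept as plain strings for simplicity. Following the version 1.4 note, the sketch assumes that true corresponds to a single solution with no bindings and fail to an empty list of solutions.

```python
def as_solution_set(solutions):
    """Turn a list of solutions into a set of frozensets of (variable, value) pairs."""
    return {frozenset(solution) for solution in solutions}


def same_solutions(expected, actual):
    """Compare two solution lists, ignoring solution order and variable order."""
    return as_solution_set(expected) == as_solution_set(actual)


def keyword_solutions(text):
    """Map the keyword forms of an output section to solution lists."""
    word = text.strip()
    if word in ("", "fail"):
        return []      # no solutions; an empty section means the same as fail
    if word == "true":
        return [[]]    # assumption: one solution with no variable bindings
    raise ValueError("not a keyword output section: " + word)


expected = [
    [("A", "[]"), ("B", "[a, b, c]")],
    [("A", "[a]"), ("B", "[b, c]")],
]
actual = [
    [("B", "[b, c]"), ("A", "[a]")],      # variables listed in a different order
    [("A", "[]"), ("B", "[a, b, c]")],    # solutions listed in a different order
]

if __name__ == "__main__":
    print(same_solutions(expected, actual))
    print(same_solutions(keyword_solutions("true"), keyword_solutions("fail")))
```

Representing each solution as a frozenset is what makes both kinds of reordering irrelevant to the comparison.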

In general, it is best to keep individual test cases as small and focused as possible--having lots of small test cases is preferred over having a few very large, complicated test cases. With a large test case, it is often hard to figure out exactly where or why a failure occurred. Further, you usually cannot run such large test cases until the entire program is complete--but smaller test cases can often be run sooner.

Your test case file can have as many test cases as you like. Just place them one after another. Unlike in our other TDD scripts, with tddplg.pl, blank lines are not significant.

Suppose you are testing append/3 in the source file my-append.plg, with your test data stored in the file append-tests.txt. You run the script like this:

    tddplg.pl my-append.plg append-tests.txt

The script runs tests using the following procedure:

It parses the test case file, stripping out all the comments and producing a new Prolog source file with some boilerplate definitions. Each of your test cases turns into a call to a testcase predicate, with your input section and your output section provided as string literals. The testcase predicate will parse, evaluate, and compare, producing its results in a temporary output file. This new Prolog file includes error handling to detect most kinds of errors and print them out instead of your predicate's results.
It runs this new input file using Prolog.
It walks through the actual output to count up the number of failures.
It prints out the results, including a summary of the number of test cases run, the number that failed, and the number of runtime errors that occurred.
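
The final counting step can be sketched in Python as follows. This is a hedged illustration, not the script's Perl code: the outcome labels and the function name are hypothetical, and the percentage is assumed to be the share of test cases that neither failed nor raised an error, which is consistent with the sample summaries shown on this page.

```python
from collections import Counter


def summarize(outcomes):
    """Build a summary line from a list of 'pass', 'fail', and 'error' labels."""
    counts = Counter(outcomes)
    run = len(outcomes)
    errors = counts["error"]
    failures = counts["fail"]
    passed = run - errors - failures
    # Assumption: the percentage is passed / run; guard against an empty test file.
    percent = 100.0 * passed / run if run else 0.0
    return (f"Tests Run: {run}, Errors: {errors}, "
            f"Failures: {failures} ({percent:.1f}%)")


if __name__ == "__main__":
    print(summarize(["fail", "pass", "pass", "pass"]))
```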

Output from a successful test run appears this way:

tddplg.pl v1.3   (c) 2003 Virginia Tech. All rights reserved.
Testing my-append.plg using append-tests.txt

........................................

Tests Run: 40, Errors: 0, Failures: 0 (100.0%)

Suppose that the four sample append/3 test cases shown above were in append-tests.txt. Now, suppose we change the first test case to incorrectly expect a solution of [Result = [] ]. Rerunning the tests produces the following:

tddplg.pl v1.3   (c) 2003 Virginia Tech. All rights reserved.
Testing my-append.plg using append-tests.txt

F
case 1 FAILED: Testing [a, b, c] + [d, e, f]
  Expected: [[Result=[]]]
       Got: [[Result=[a, b, c, d, e, f]]]
...

Tests Run: 4, Errors: 0, Failures: 1 (75.0%)
Output has been saved in 1848.out.

If the set of all possible solutions produced by your goal is not identical to the set of all possible solutions listed in the expected output section of that test case, the entire test case will be considered a failure and will be identified as such (the label from the //== line will be used in the message). In addition, the temporary file containing the actual output of the test run will be retained for your reference (it has a temporary name based on the process id).

The tddplg.pl script does not match the console output of your predicates against the expected output section of the corresponding test case. Instead, it generates a temporary Prolog program that runs each test case as a separate goal. Each test case is executed in a way that forces all solutions to be generated through backtracking. For each solution that is produced, the corresponding bindings of any free variables are kept. The set of distinct solutions that are produced is then compared with the set of solutions you have provided as the expected output in your test case. Unlike with the Pascal TDD script, the failures reported for a test case in this script are always caused by execution of that individual test case (not by accidental output contamination produced by an earlier test case).
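
As a rough analogy for readers less familiar with backtracking, the Python sketch below uses a generator to stand in for a Prolog goal: each value it yields plays the role of one solution found on backtracking, and draining the generator corresponds to forcing all solutions. The real script does this inside a generated Prolog program; the function names here are hypothetical.

```python
def append_goal(whole):
    """Yield one set of bindings per way to split `whole`, like append(A, B, whole)."""
    for i in range(len(whole) + 1):
        yield (("A", tuple(whole[:i])), ("B", tuple(whole[i:])))


def all_distinct_solutions(goal):
    """Exhaust the generator, keeping each distinct set of bindings only once."""
    return {frozenset(bindings) for bindings in goal}


if __name__ == "__main__":
    solutions = all_distinct_solutions(append_goal(["a", "b", "c"]))
    for bindings in sorted(solutions, key=lambda s: len(dict(s)["A"])):
        print(dict(bindings))
```

Because the result is a set, a goal that produces the same bindings more than once contributes only one solution to the comparison.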

In practice, this usually does not matter much if you are following a TDD practice. That is because you will be writing test cases one at a time, and adding code a little at a time (just enough to implement the features of that new test case). That means that, in general, all of your test cases except the newest one will be working. If all of your test cases were working, and suddenly you get failures in many test cases, then your latest modification introduced a bug that broke something. Fortunately, if you test often (every time you add a little bit of code), then you know exactly where the bug is without having to search for it. It has to be in the portion of the code you were just working in.

That is one big benefit of TDD. Being able to run the tests often, and doing so after each small piece of behavior you add, gives you confidence in whether or not the code so far works correctly. Combining that with the practice of adding a test case for each and every capability before you write the code gives you a big leg up on completing a working solution.

Send any bugs or questions regarding tddplg.pl to Dr. Edwards.