A Brief Introduction to Prolog

Because of its origins in logic and formal grammars, Prolog has a very simple abstract syntax which has been given a variety of different concrete realizations over the years. In this description, I will use the concrete syntax in the current draft ISO standard for Prolog, and show the correspondence of that syntax to the original Marseille syntax described in The Birth of Prolog.

Program Structure

A Prolog program is a sequence of clauses. A clause gives a condition under which certain entities, represented by terms are in a certain relation, represented by a predicate. These concepts are illustrated with the following program:

ancestor(Old,Young) :- parent(Old, Young).      (1)

ancestor(Old,Young) :- parent(Old, Middle), ancestor(Middle,Young).       (2)

parent (bertrand, kate).      (3)

parent (bertrand, john).      (4)

parent (katherine, bertrand).      (5)

Each of the numbered lines is a clause, terminated by a period followed by blank space. For instance, clause (1) states that a condition for the ancestor predicate to hold between two arbitrary entities, denoted by the variables 01d and Young, is that the parent predicate holds of the same two entities. In other words, someone's parent is one of that person's ancestors. The infix operator :- can thus be read as "if". Note also the names for parts of clauses, which will be used in what follows. Clause (2) states that if parent holds of arbitrary entities named 01d and Middle and ancestor holds of Midd1e and Young, ancestor holds of 01d and Young. In other words, a parent of one of someone's ancestors is another of that person's ancestors. Variables are local to clauses, need not be declared, and can hold any kind of value. For instance, the variable 01d in clause (1) and the variable of the same name in (2) are distinct. Finally, clauses (3)-(5) state unconditionally that parent holds of particular entities, named by atoms (symbolic constants) such as kate and bertrand. Unconditional clauses such as those are also called unit clauses.

In the original Marseille Prolog, clause (2) would have been written

+ANCESTOR(*OLD,*YOUNG) -PARENT(*OLD,*MIDDLE) -ANCESTOR(*MIDDLE,*YOUNG).

Informal Semantics

The logical foundation of Prolog is apparent in the reading of each clause of above program as an implication from a conjunction of conditions (goals) to a conclusion (the clause's head). Operationally, a predicate p is interpreted as a procedure, a goal p(…) as call of that procedure, and the clauses with head p(…) as alternative ways of executing the call. Those clauses are considered in textual order. A pattern-matching procedure, unification, is used to try to find the first clause head that can be made identical to the goal by substituting appropriate terms for the variables in the goal and the clause. If this succeeds, the substitutions found are added to the current set of substitutions, and the goals in the clause body, if any, are executed left to right under the new substitution set. If no matching clause is found, the goal fails, and Prolog backtracks to the most recent successful goal. Its clause choice and unification substitution are undone, and the next remaining clause for that goal is considered. If no more clauses match, that goal in turn fails and Prolog backtracks again. Executing the goal ancestor(katherine, Desc) with respect to the above program generates the following goals and substitutions:

 Goal Clause Substitutions ancestor(katherine, Desc) parent(katherine, Desc) (1) (5) 0ld1 = katherine, Young1 = Desc Desc = bertrand

giving the first answer to the goal Desc = bertrand. Clauses (3) and (4) were skipped over for the second goal because the two constants katherine and bertrand cannot be unified. If alternative solutions were requested, the second goal would fail, because no further clauses are available for it, and the choice of clause (1) for the first goal would be undone. Clause (2) would then be used for that goal, with appropriate substitutions' leading to a call to parent and a recursive call to ancestor, and eventually additional solutions.

Data Structures

Terms represent entities in Prolog. Atoms represent particular atomic entities (like symbols in Lisp), and variables represent as yet unspecified entities. Complex data structures are represented in Prolog by compound terms consisting of a functor (symbolic function) applied to some arguments. Functors are analogous to datatype constructors in other programming languages. In the following program, which defines the leaves relation between a binary tree and the list of its leaves, the binary functor tree represents a tree node, the unary functor leaf a labeled tree leaf, the binary functor cons a list element and the atom nil the empty list.

leaves(leaf(Leaf),cons(Leaf,Tail),Tail).       (6)

leaves(tree(Left,Right),Front,Back) :-

leaves(Left,Front,Middle), leaves(Right,Middle,Back).       (7)

For example, the goal leaves(tree(leaf(a),leaf(b)),List,nil) would suceeed with the substitution List = cons(a,cons(b,nil)), representing the sequence (a,b). Clause (6) for a tree leaf prepends the leaf label referred to by Leaf to the list Tail yielding the longer list cons(Leaf,Tai1). Clause (7) prepends the leaves of the left subtree Left to Middle giving list Front, and then prepends the leaves of the right subtree Right yielding list Middle. Thus, before the second goal is called, Front is bound to a term representing a list with its tail not yet specified (the variable Middle) and only during the execution of the second goal is Middle filled in. This ability to compute with partially specified data structures (the logical variable) is one of the most distinctive characteristics of Prolog.

Fernando Pereira, AT&T Bell Laboratories, Murray Hill, New Jersey

From: ACM SIGPLAN Notices, V28, N3, 1993.