# COMPILERS AND TRANSLATORS

Expression Analysis

With expressions that contain variables obviously the sub-expressions cannot be evaluated but the compiler still needs to determine the order in which they will be evaluated.

One solution is to fully parenthesize the expression so as to identify the sub-expressions.

ALGORITHM FOR EXPRESSIONS NOT CONTAINING PARENTHESES

Operator
Precedence
Complement
^
4
0
* or /
3
1
~ (unary)
2
2
1
3
=
0
4

1. Scan the expression and associate Precedence values with the operators;

2. Surround each operator with back-to-back parentheses equal in number to the Complement of its associated Precedence value;

3. Sum the total of the Complementary Precedence values in the expression;

4. Add parentheses around the whole expression, equal in number to the sum of the Complementary Precedence values plus 1.

At this point the expression is "over-parenthesized" but that is not a problem for the compiler or interpeter. However, if it is needed that the parenthesis count be minimized, then:

1. Remove matching parentheses around each operand;

2. Remove all but one set of parentheses around each subexpression.

EXAMPLE

Given the expression:

A +1 7 ^4 C *3 2

Complement the precedence values:

A +3 7 ^0 C *1 2

A )))+3((( 7 ^0 C )*1( 2

The sum of the modified precedences is 4; so add 1 plus this number of parentheses at the ends:

(((((A )))+3((( 7 ^0 C )*1( 2)))))

While the computer can handle this readily, we need to cancell out parentheses surrounding each operand (but never reducing the parentheses around a group to less than one):

(A +(( 7 ^ C )* 2))

MODIFIED ALGORITHM INCLUDING PARENTHESES

1. Using the hierarchy table as before, determine the precedence levels of the operators (using the parenthetical additions as necessary).

2. Find the largest hierarchy value (L) in the expression, and replace all hierarchy values (D) with (L-D), i.e. their complement with respect to the maximum hierarchy value.

3. Sum the total of the modified hierachies in the expression.

4. Surround each operator with back-to-back parentheses equal in number to the modified hierachy value.

5. Put parentheses (facing inwards) on each end of the expression equal in number to the sum of the modified hierarchies.

DRAWBACK: These parenthesizing algorithms have one drawback - that when there are several operators at the same hierarchial (precedence) level in an expression the left-to-right associativity is not applied. Thus an expression such as ~a+b^c-d is parenthesized to ((~a)+(b^c)-d), or in hierarchial form:

```
(    +     -d)

(~a) (b^c)
```

where at the top level the two operators are not distinguished by the parenthesizing. Other methodologies of analysis can overcome this failing.

Last updated 2001/05/01