CS 1104 Introduction to Computer Science


The Basic Structure of a Compiler

The five stages of a compiler combine to translate a program written in a high-level language into a lower-level language, generally one closer to the language of the target computer. Each stage, or sub-process, fulfills a single task and has one or more classic techniques for its implementation.

Lexical Analyzer

Task:
  Analyzes the Source Code
  Removes "white space" and comments
  Formats it for easy access (creates tokens)
  Tags language elements with type information
  Begins to fill in information in the SYMBOL TABLE **

Classic techniques:
  Regular Expressions
  Finite State Machines
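
As an illustration of the lexical stage, here is a minimal sketch of a tokenizer driven by regular expressions. The token classes, the tiny expression language, and the `first_offset` attribute are assumptions made for this example and do not come from the course notes; a production lexer would differ.

```python
import re

# Hypothetical token classes for a tiny expression language (assumed for this sketch).
TOKEN_SPEC = [
    ("COMMENT", r"#[^\n]*"),          # discarded
    ("SPACE",   r"\s+"),              # discarded
    ("NUMBER",  r"\d+(?:\.\d+)?"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("OP",      r"[+\-*/=()]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return a list of (type, text) tokens and a symbol table keyed by identifier."""
    tokens, symbols = [], {}
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("SPACE", "COMMENT"):
            continue                          # white space and comments are removed
        tokens.append((kind, text))           # each token is tagged with its type
        if kind == "IDENT":
            # Begin filling in the symbol table: record where the name first appears.
            symbols.setdefault(text, {"first_offset": match.start()})
    return tokens, symbols
```

Calling `tokenize("x = y + 42  # note")` yields five tokens and a symbol table with entries for `x` and `y`; the comment and the spaces leave no trace in the output.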

Syntactic Analyzer

Task:
  Analyzes the Tokenized Code for structure
  Amalgamates symbols into syntactic groups
  Tags groups with type information

Classic techniques:
  Backus-Naur Form
  Top-down analyzers
  Bottom-up analyzers
  Expression analyzers
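
To make the syntactic stage concrete, the following is a rough sketch of a top-down (recursive-descent) analyzer for arithmetic expressions. The grammar in the comment, the `(type, text)` token format, and the nested-tuple tree are all assumptions chosen for this example, not a description of any particular compiler.

```python
# Assumed toy grammar, in BNF style with { } marking repetition:
#   <expr>   ::= <term>   { ("+" | "-") <term> }
#   <term>   ::= <factor> { ("*" | "/") <factor> }
#   <factor> ::= NUMBER | IDENT | "(" <expr> ")"


def parse(tokens):
    """Group a list of (type, text) tokens into a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        node = term()
        while peek() in (("OP", "+"), ("OP", "-")):
            op = advance()[1]
            node = (op, node, term())         # left-associative grouping
        return node

    def term():
        node = factor()
        while peek() in (("OP", "*"), ("OP", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():
        kind, text = advance()
        if kind in ("NUMBER", "IDENT"):
            return (kind, text)               # leaves keep their token type as a tag
        if (kind, text) == ("OP", "("):
            node = expr()
            if advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree
```

Each function mirrors one grammar rule, which is why this style is called top-down: the analysis starts from `<expr>` and works toward the individual tokens. Because `<term>` sits below `<expr>` in the grammar, multiplication groups more tightly than addition without any extra precedence logic.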

Semantic Analyzer

Task:
  Analyzes the Parsed Code for meaning
  Fills in assumed or missing information
  Tags groups with meaning information

Classic techniques:
  Attribute Grammars
  Ad hoc analyzers
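
One way to picture the semantic stage is as a small ad hoc analyzer that walks the parsed tree, works out a type for every group, and inserts a conversion the programmer left implicit. The sketch below assumes a nested-tuple tree, a language with only the two types "int" and "float", and a hypothetical `int_to_float` node; none of these come from the course notes.

```python
def annotate(node, symbols):
    """Return (checked_node, type) for a nested-tuple expression tree.

    symbols maps each declared identifier to "int" or "float" (assumed format).
    """
    tag = node[0]
    if tag == "NUMBER":
        return node, ("float" if "." in node[1] else "int")
    if tag == "IDENT":
        if node[1] not in symbols:
            raise NameError(f"undeclared identifier {node[1]!r}")
        return node, symbols[node[1]]
    # Otherwise the node is (operator, left, right).
    left, left_type = annotate(node[1], symbols)
    right, right_type = annotate(node[2], symbols)
    if left_type != right_type:
        # Fill in the missing information: widen the int operand to float.
        if left_type == "int":
            left = ("int_to_float", left)
        else:
            right = ("int_to_float", right)
        left_type = "float"
    return (tag, left, right), left_type
```

For the tree of `n + 1.5` with `n` declared as "int", the result type is "float" and the left operand comes back wrapped in an `int_to_float` node, so later stages no longer have to guess what the mixed expression means.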

Code Generator

Task:
  Linearizes the Qualified Code and produces the equivalent Object Code

Classic techniques:
  Generally completed by hand-written code
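
The linearizing step can be sketched as a post-order walk that flattens an expression tree into instructions for a stack machine. The instruction names (`PUSH`, `LOAD`, `ADD`, and so on) and the nested-tuple tree format are hypothetical, chosen only for illustration.

```python
# Hypothetical stack-machine instruction names (assumed for this sketch).
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def generate(node):
    """Flatten a nested-tuple expression tree into a linear instruction list."""
    tag = node[0]
    if tag == "NUMBER":
        return [f"PUSH {node[1]}"]            # constant operand
    if tag == "IDENT":
        return [f"LOAD {node[1]}"]            # variable operand
    # (operator, left, right): emit both operands first, then the operation.
    return generate(node[1]) + generate(node[2]) + [OPCODES[tag]]
```

For the tree of `a + 2 * b`, this produces `LOAD a`, `PUSH 2`, `LOAD b`, `MUL`, `ADD`: the two-dimensional tree has become a one-dimensional sequence that a simple machine can execute in order.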

Optimizer

Task:
  Examines the Object Code to determine whether there are more efficient means of execution

Classic techniques:
  Common-subexpression elimination
  Loop unrolling
  Operator (strength) reduction
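
As an example of one of these techniques, here is a simplified sketch of common-subexpression elimination over straight-line code. It assumes a hypothetical instruction format `(dest, op, left, right)`, that every name is assigned at most once, and that an eliminated name is a compiler temporary used only by later instructions in the same list. Real optimizers must handle reassignment and control flow, which this sketch ignores.

```python
def eliminate_common_subexpressions(code):
    """Drop instructions that recompute a value already held by an earlier name."""
    seen = {}       # (op, left, right) -> name that already holds the value
    alias = {}      # eliminated name -> surviving name
    result = []
    for dest, op, left, right in code:
        # Rewrite operands that refer to an eliminated temporary.
        left, right = alias.get(left, left), alias.get(right, right)
        key = (op, left, right)
        if key in seen:
            alias[dest] = seen[key]           # reuse the earlier result
        else:
            seen[key] = dest
            result.append((dest, op, left, right))
    return result
```

Given `t1 = a * b`, `t2 = a * b`, `t3 = t1 + t2`, the second multiplication is removed and the addition becomes `t3 = t1 + t1`, so the product is computed only once.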

** The Symbol Table is the data structure that all elements of the compiler use to collect and share information about symbols and groups of symbols in the program being translated.
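
A symbol table of this kind can be sketched as a mapping from each name to a record of attributes that successive stages extend. The class and method names below (`SymbolTable`, `declare`, `lookup`) and the attribute names are assumptions made for this example.

```python
class SymbolTable:
    """Maps each symbol name to a dictionary of attributes gathered by the stages."""

    def __init__(self):
        self._entries = {}

    def declare(self, name, **attributes):
        """Create the entry if needed and merge in newly learned attributes."""
        entry = self._entries.setdefault(name, {})
        entry.update(attributes)
        return entry

    def lookup(self, name):
        """Return the entry for name, or None if the symbol is unknown."""
        return self._entries.get(name)


table = SymbolTable()
table.declare("count", line=3)        # e.g. recorded by the lexical analyzer
table.declare("count", type="int")    # e.g. added later by the semantic analyzer
```

After both calls, `table.lookup("count")` holds the line number and the type together, which is the sense in which the stages share information through one structure.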




The textbook in use with this course, and several other elementary texts, promulgate the mistaken concept that a compiler translates a program in a high-level language to assembly language and thence to machine language. While this is sometimes the case, it is by no means the most common arrangement. Programming languages are frequently accompanied by well-defined intermediate languages that serve as the target of compilation, for example "p-code" for Pascal and "byte-code" for Java. Thus the only requirement for a compiler is that it translate from one programming language to another. Generally this translation is from a high-level language to a lower-level language.
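
The same idea can be observed in a language not mentioned above: the standard CPython implementation of Python compiles source text into its own byte-code rather than into machine language. The short sketch below uses the built-in `compile` function and the standard `dis` module; the expression and the variable names `price` and `quantity` are made up for the example.

```python
import dis

# compile() turns source text into a code object that holds byte-code.
code = compile("price * quantity + 1", "<example>", "eval")

# Instruction names differ between CPython versions, so no listing is shown here.
for instruction in dis.get_instructions(code):
    print(instruction.opname, instruction.argrepr)

# The byte-code is executed by the interpreter, not directly by the hardware.
print(eval(code, {"price": 3, "quantity": 4}))
```

The target of this compilation is a lower-level language for a virtual machine, which fits the definition of a compiler given above even though no assembly language is involved.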


© J.A.N. LEE
Last Updated 2001/04/09