Suppose for a moment that you were given the following list of instructions to perform:

  1. 0001 0011 0011 1011
  2. 1101 0111 0001 1001
  3. 1111 0001 1101 1111
  4. 0000 1100 0101 1101
  5. 0001 0011 0011 1011

Of course, these instructions have no real meaning to you, but they are exactly the kind of instructions that a computer expects. Instructions like these are called "machine code" and each one represents a typical operation that a computer might perform. You can immediately see the difficulty with this language. While it might be very appropriate for a computer, it is extremely confusing for a computer programmer.

As computers have developed over the past few decades, new generations of programming languages have also been developed to bridge the gap between programmers and machine code. An early step toward simplifying programming was the use of hexadecimal notation to represent machine instructions. Each group of four binary digits is replaced by a single hexadecimal digit, so the first instruction would be written as 133B. While this representation is certainly easier to remember than a string of ones and zeros, it still fails to give us any idea of the purpose of the instruction.
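The conversion from binary to hexadecimal can be sketched in a few lines of Python. This is only an illustration: the function name `to_hex` is made up for this example, and Python is used purely for convenience.

```python
def to_hex(bits: str) -> str:
    """Convert an instruction written as 4-bit groups into hexadecimal."""
    groups = bits.split()  # e.g. ["0001", "0011", "0011", "1011"]
    # Each 4-bit group maps to exactly one hexadecimal digit.
    return "".join(format(int(group, 2), "X") for group in groups)


# The five instructions from the list at the start of this section.
instructions = [
    "0001 0011 0011 1011",
    "1101 0111 0001 1001",
    "1111 0001 1101 1111",
    "0000 1100 0101 1101",
    "0001 0011 0011 1011",
]

hex_codes = [to_hex(instruction) for instruction in instructions]
for code in hex_codes:
    print(code)  # the first line printed is 133B
```

Notice that the hexadecimal form is shorter, but it is just another spelling of the same number. It says nothing about what the instruction does.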

Assembly language was the first programming language to address the problem of assigning meaningful names to computer instructions. This language used simple names like "LOAD", "ADD", or "STORE" to represent machine instructions. A program written with these names was then converted into machine code using an assembler, a program that translated the names into the binary instructions understood by the computer. Assembly languages are now referred to as "second generation" languages, while machine code is considered to be a "first generation" language.
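To make the assembler's job concrete, here is a minimal sketch of a toy assembler in Python. The opcode numbers and the 16-bit layout (a 4-bit opcode followed by a 12-bit address) are assumptions invented for this example. They do not describe any real processor, and the resulting bit patterns are unrelated to the instructions listed earlier.

```python
# Hypothetical opcodes for an imaginary machine; every real processor
# defines its own encoding.
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "STORE": 0b0011}


def assemble(line: str) -> str:
    """Translate 'NAME address' into a 16-bit instruction string."""
    mnemonic, operand = line.split()
    opcode = OPCODES[mnemonic.upper()]
    address = int(operand)
    if not 0 <= address < 2**12:
        raise ValueError(f"address out of range: {address}")
    # Assumed layout: top 4 bits hold the opcode, low 12 bits the address.
    word = (opcode << 12) | address
    bits = format(word, "016b")
    # Print in groups of four bits, like the list at the start of the section.
    return " ".join(bits[i:i + 4] for i in range(0, 16, 4))


program = ["LOAD 100", "ADD 101", "STORE 102"]
machine_code = [assemble(line) for line in program]
for source, binary in zip(program, machine_code):
    print(f"{source:<10} -> {binary}")
```

Under these assumptions, "LOAD 100" becomes 0001 0000 0110 0100. A real assembler does much more, such as handling labels and several instruction formats, but the core idea is this kind of table lookup from names to bit patterns.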

The next generation of computer languages further increased the ease of programming by grouping sets of machine instructions together to form common programming constructs. While it might take three or four lines of code to add two numbers using assembly language, this task could be accomplished with a single statement in a "third generation" language. Languages such as Pascal, C, C++, Java, and Ada are all examples of third generation languages that are widely used today. These languages are also known as high-level languages since they abstract away the details of machine code and help programmers to concentrate on problem solving.
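The difference can be seen by writing the same addition both ways. In the sketch below, the assembly version appears only as comments and uses hypothetical instruction names. The high-level version is written in Python, which is not one of the languages listed above but follows the same idea.

```python
# Assembly-style version (hypothetical mnemonics, shown as comments):
#   LOAD  first     ; copy the first number into a register
#   ADD   second    ; add the second number to the register
#   STORE total     ; save the register's value back to memory

first = 20
second = 3

# High-level version: one statement expresses all three steps.
total = first + second
print(total)  # prints 23
```

The programmer no longer needs to think about registers or memory locations. The language translator works out those details.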

The animation below shows a comparison of the three generations of languages. For each language, the code for a simple tax program is shown. Click on the label on the left to view the corresponding code.

In this section on programming languages, you will learn about the five basic concepts of a "third generation" language. These concepts are variables, expressions, control structures, input/output, and abstraction. By the end of this section, you should be able to do the following: