The fundamental unit of data storage in a computer is called a bit or binary digit. A bit is similar to a two-way switch. Just like a switch has two states (off or on), a bit also has two states (0 or 1). Often these two states represent the values false or true and are implemented inside a computer by using a low voltage value or a high voltage value. Since bits provide the foundation for all data storage, it is not surprising that the binary number system is very important to computers. If you are unfamiliar with binary numbers, it would be a good idea to review this topic before continuing. (For more information on binary numbers, see the Number Systems module.)

States of a Bit

Bit value    Logical value    Switch state    Voltage level
0            FALSE            OFF             LOW VOLTAGE
1            TRUE             ON              HIGH VOLTAGE

By themselves, bits are not very interesting or useful. To store more complex forms of data, bits are joined together into larger groups known as bytes. Every byte is made up of eight bits and can be used to encode data such as numbers (integers and reals) or character symbols. The most common scheme used to represent integers is called Two's Complement. Using this scheme, a single byte can represent the integers from -128 to +127. (For more information on Two's Complement, see the Two's Complement lesson in the Number Systems module.)

For real numbers, computers typically use a floating point representation similar to the one illustrated in the diagram below. With only eight bits, the range of real values that can be represented is very limited. To solve this problem, computers use two or more bytes when representing real numbers. Notice that the sixteen bits below are partitioned into three groups: the mantissa, the exponent, and two sign bits. This division allows the computer to represent floating point numbers such as .00947 in the binary equivalent of scientific notation. In this example, the mantissa (1110110011 in binary) corresponds to the decimal number 947, with a zero sign bit indicating a positive value, and the exponent (0010 in binary) corresponds to 2, the power of ten, with a one sign bit indicating a negative value. The stored number is therefore .947 x 10^-2, or .00947. Notice that with 10 bits to represent the mantissa, this scheme only allows for 3 significant digits.
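The floating point example above can be checked with a short Python sketch. The function name decode_toy_float and the way the four fields are passed in are assumptions made for illustration, since the text does not state the order in which the sign bits, mantissa, and exponent are stored. The sketch also assumes the mantissa is read as a three-digit decimal fraction (947 becomes .947), which matches the .00947 example. Exact fractions are used so that the result is not affected by rounding.

```python
from fractions import Fraction

def decode_toy_float(mantissa_sign, mantissa_bits, exponent_sign, exponent_bits):
    """Decode the lesson's toy 16-bit format: value = +/- .DDD x 10^(+/- E).

    Hypothetical helper: the field order and the three-digit scaling of the
    mantissa are assumptions, not something the lesson specifies.
    """
    mantissa = int(mantissa_bits, 2)   # 10 bits, e.g. 1110110011 -> 947
    exponent = int(exponent_bits, 2)   # 4 bits, e.g. 0010 -> 2
    if exponent_sign == 1:             # a one sign bit means a negative exponent
        exponent = -exponent
    value = Fraction(mantissa, 1000) * Fraction(10) ** exponent
    return -value if mantissa_sign == 1 else value

value = decode_toy_float(0, "1110110011", 1, "0010")
print(float(value))  # 0.00947
```

Changing the exponent sign bit to 0 in the same call would give .947 x 10^2, or 94.7.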

We can also assign particular patterns of bits to represent common symbols such as letters, punctuation marks, and numerals. One very common representation of these symbols is ASCII, the American Standard Code for Information Interchange. The applet below shows 16 bits as boxes to represent two bytes of computer memory. The leftmost bit is called the most significant bit, while the rightmost bit is called the least significant bit. To see how integers would be stored in Two's Complement, enter an integer between -32,768 and 32,767 in the "Value" box, select the "Two's complement" radio button, and then press "Store Value." To see how real numbers would be stored, enter a real number with 3 significant digits and an exponent less than 15, select "Floating Point," and press "Store Value." To see how ASCII characters would be stored, enter a character from the keyboard, select "ASCII," and then press "Store Value" again.
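For readers without access to the applet, the following Python sketch prints the bit patterns it would be expected to show. The function names to_twos_complement and ascii_bits are hypothetical, and padding the ASCII code with leading zeros to fill all 16 bits is an assumption about how the applet displays a character.

```python
def to_twos_complement(value, bits=16):
    """Return the two's complement bit pattern of value as a string."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits ({low} to {high})")
    # Masking keeps only the lowest `bits` bits, which wraps negative values.
    return format(value & ((1 << bits) - 1), f"0{bits}b")

def ascii_bits(char, bits=16):
    """Return the ASCII code of char as a zero-padded bit string (assumed layout)."""
    code = ord(char)
    if code > 127:
        raise ValueError(f"{char!r} is not an ASCII character")
    return format(code, f"0{bits}b")

print(to_twos_complement(5))             # 0000000000000101
print(to_twos_complement(-1))            # 1111111111111111
print(to_twos_complement(-32768))        # 1000000000000000
print(to_twos_complement(-128, bits=8))  # 10000000
print(ascii_bits("A"))                   # 0000000001000001
```

In each two's complement pattern, the most significant (leftmost) bit is 1 exactly when the integer is negative.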

The main memory of a computer is composed of millions of storage cells similar to the one illustrated in the applet. The size of the storage cells is known as the word size for the computer. In some computers, the word size is one byte, while in other computers the word size is two, four, or even eight bytes. Each storage cell in main memory has a particular address that the computer can use for storing or retrieving data. This arrangement of cells is somewhat similar to a computer spreadsheet where each box of the spreadsheet can hold various data. Just as the boxes of the spreadsheet are identified by a row and column combination (e.g., A2, C4, etc.), the cells of a computer's main memory are identified by a particular address (e.g., Cell 1, Cell 2, etc.). The addresses begin at 0 and increase by 1 until the end of the main memory is reached. For simplicity, these addresses are shown below in decimal. However, in the computer, addressing is done using binary values.
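The idea of addressable cells can be sketched in Python as follows. The class name MainMemory and its store and retrieve methods are hypothetical, and the sketch assumes a word size of one byte so that each cell holds a value from 0 to 255.

```python
class MainMemory:
    """Toy model of main memory: numbered cells, one byte each (assumed word size)."""

    def __init__(self, cell_count):
        self.cells = bytearray(cell_count)  # every cell starts out holding 0

    def _check(self, address):
        if not 0 <= address < len(self.cells):
            raise IndexError(f"address {address} is outside main memory")

    def store(self, address, value):
        self._check(address)
        self.cells[address] = value         # value must be in the range 0..255

    def retrieve(self, address):
        self._check(address)
        return self.cells[address]

memory = MainMemory(8)                      # addresses run from 0 to 7
memory.store(0, 65)
memory.store(7, 255)
print(memory.retrieve(0), memory.retrieve(7))  # 65 255
print(format(7, "03b"))                        # 111, the highest address in binary
```

Storing to address 7 takes the same single step as storing to address 0; no other cell has to be visited first.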

One important result of "organizing a machine's main memory as small, addressable cells is that each cell can be referenced, accessed, and modified individually. A memory cell with a low address is just as accessible as one with a high address. In turn, data stored in a machine's main memory is often referred to as random access memory (RAM)" [Brookshear 1997]. Because computers have such large amounts of RAM, the size of the main memory is usually measured in megabytes (MB) rather than just bytes. One megabyte is equal to 2^20 bytes or 1,048,576 bytes. Some other common measures for quantities of bytes are listed in the table below.

Unit             Power of 1024 or 2      Exact size                 Approximate size
Kilobyte (KB)    1024 or 2^10 bytes      1,024 bytes                Thousands of bytes
Megabyte (MB)    1024^2 or 2^20 bytes    1,048,576 bytes            Millions of bytes
Gigabyte (GB)    1024^3 or 2^30 bytes    1,073,741,824 bytes        Billions of bytes
Terabyte (TB)    1024^4 or 2^40 bytes    1,099,511,627,776 bytes    Trillions of bytes
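
The exact sizes in the table above follow directly from the powers of 1024, as this short Python sketch shows. The list UNITS and the function bytes_in are hypothetical names used only for illustration.

```python
UNITS = ["KB", "MB", "GB", "TB"]  # each unit is 1024 times the previous one

def bytes_in(unit):
    """Return the number of bytes in one of the given unit."""
    power = UNITS.index(unit) + 1  # KB -> 1, MB -> 2, GB -> 3, TB -> 4
    return 1024 ** power

for power, unit in enumerate(UNITS, start=1):
    print(f"{unit}: 1024^{power} = 2^{10 * power} = {bytes_in(unit):,} bytes")
```

Because 1024 is 2^10, raising it to the power n gives 2^(10 x n), which is why the exponents in the table increase by 10 from one row to the next.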

References