Another part of the operating system is the file manager. While the memory manager is responsible for the maintenance of primary memory, the file manager is responsible for the maintenance of secondary storage (e.g., hard disks). Nutt [1997] describes the responsibility of the file manager and defines the file, the fundamental abstraction of secondary storage:

"Each file is a named collection of data stored in a device. The file manager implements this abstraction and provides directories for organizing files. It also provides a spectrum of commands to read and write the contents of a file, to set the file read/write position, to set and use the protection mechanism, to change the ownership, to list files in a directory, and to remove a file...The file manager provides a protection mechanism to allow machine users to administer how processes executing on behalf of different users can access the information in files. File protection is a fundamental property of files because it allows different people to store their information on a shared computer, with the confidence that the information can be kept confidential."

In addition to these functions, the file manager also provides a logical way for users to organize files in secondary storage. Brookshear [1997] elaborates:

"For the convenience of the machine's users, most file managers allow files to be grouped into a bundle called a directory or folder. This approach allows a user to organize his or her files according to their purpose by placing related files in the same directory. Moreover, by allowing directories to contain other directories, called subdirectories, a hierarchical organization can be constructed. For example, a user may create a directory called Records that contains subdirectories called Financial Records, Medical Records, and Household Records. Within each of these subdirectories could be files that fall within that particular category. A sequence of directories within directories is called a directory path" [Brookshear 1997].

While users may need to store complex data structures in secondary storage, most storage devices, including hard disks, "are capable of storing only linearly addressed blocks of bytes." Thus, the file manager must map user data to storage blocks in secondary storage and back again. These blocks can be managed in at least three different ways: "as a contiguous set of blocks on the secondary storage device, as a list of blocks interconnected with links, or as a collection of blocks interconnected by a file index" [Nutt 1997]. These three methods are commonly referred to as Contiguous Allocation, Linked Allocation, and Indexed Allocation. Each method is summarized below, along with its advantages and disadvantages. These materials were provided courtesy of Haritha Moosani [Moosani 1998], and were developed as part of a research project under Professor Franz Kurfess at the New Jersey Institute of Technology.

Contiguous Allocation Method

Summary:

In this method, the data in a file is stored in one contiguous section of the disk (that is, the file occupies a linear sequence of disk blocks). The directory entry for each file contains the file name, the start block number, and the file size.
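The mapping this implies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DirectoryEntry` class, the `physical_block` helper, the file name, and the block numbers are all hypothetical, and the file size is measured in whole blocks for simplicity.

```python
from dataclasses import dataclass


@dataclass
class DirectoryEntry:
    """Hypothetical directory entry for a contiguously allocated file."""
    name: str
    start_block: int   # first disk block occupied by the file
    size_blocks: int   # file size, measured here in whole blocks (assumption)


def physical_block(entry: DirectoryEntry, logical_block: int) -> int:
    """Map the i-th block of a file to its disk block number."""
    if not 0 <= logical_block < entry.size_blocks:
        raise IndexError("logical block lies outside the file")
    # Because the file is contiguous, the mapping is a single addition.
    return entry.start_block + logical_block


# Illustrative values only: a 3-block file starting at disk block 14.
report = DirectoryEntry(name="report.txt", start_block=14, size_blocks=3)
blocks = [physical_block(report, i) for i in range(report.size_blocks)]
print(blocks)
```

Since any block is found with one addition, reading block 2 directly costs no more than reading the blocks in order.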

Advantages:

  1. Simple to implement
  2. Supports both sequential and direct access with equal ease

Disadvantages:

  1. Does not allow simple expansion of files
  2. Risk of external fragmentation
  3. Solving the fragmentation problem requires compaction, a time-consuming process
  4. The kernel must allocate and reserve contiguous space when the file is first created
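External fragmentation can be illustrated with a small free-block map. The sketch below is hypothetical: the 10-block disk, the `find_contiguous_run` helper, and the first-fit search are assumptions chosen for illustration, not a description of any particular kernel.

```python
def find_contiguous_run(free_map, needed):
    """Return the start of the first run of `needed` free blocks, or None."""
    run_start, run_length = None, 0
    for block, is_free in enumerate(free_map):
        if is_free:
            if run_length == 0:
                run_start = block      # a new run of free blocks begins here
            run_length += 1
            if run_length == needed:
                return run_start
        else:
            run_length = 0             # an allocated block breaks the run
    return None


# True marks a free block on a hypothetical 10-block disk.
free_map = [True, False, True, True, False, True, False, True, True, False]

total_free = sum(free_map)                       # six blocks are free overall
start_for_3 = find_contiguous_run(free_map, 3)   # no run of three exists
start_for_2 = find_contiguous_run(free_map, 2)   # a run of two starts at block 2
print(total_free, start_for_3, start_for_2)
```

Six blocks are free, yet a 3-block file cannot be placed because no three of them are adjacent. Compaction would move the allocated blocks together so that the free blocks form a single run.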

Linked Allocation Method

Summary:

In this method, the data blocks of a file can be scattered anywhere on the disk. The directory entry of a file contains the file name and the start block number. Each data block uses 4 bytes of its space for a pointer to the next block of the file. The chain continues until the last block, whose pointer holds a special end-of-file (EOF) value.
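The chain of pointers can be sketched as follows. Everything concrete here is an assumption made for illustration: the disk is modeled as a Python dictionary, the block numbers and file name are invented, and `-1` stands in for the special EOF value, which the text does not specify.

```python
EOF = -1  # assumed sentinel; the text only says a special EOF value is used

# Hypothetical disk: block number -> (data, pointer to the next block).
disk = {
    9: ("A", 16),
    16: ("B", 1),
    1: ("C", 25),
    25: ("D", EOF),
}
directory = {"notes.txt": 9}  # file name -> start block number


def block_chain(name):
    """List the disk blocks of a file by following its pointers."""
    chain = []
    block = directory[name]
    while block != EOF:
        chain.append(block)
        _, block = disk[block]
    return chain


def read_logical_block(name, index):
    """Return (data, hops) for the index-th block of the file."""
    if index < 0:
        raise IndexError("logical block index must be non-negative")
    block = directory[name]
    hops = 0
    for _ in range(index):
        _, block = disk[block]     # each step is one more block read
        hops += 1
        if block == EOF:
            raise IndexError("logical block lies past the end of the file")
    data, _ = disk[block]
    return data, hops


print(block_chain("notes.txt"))
print(read_logical_block("notes.txt", 3))
```

Reaching the last block of this 4-block file takes three pointer hops, which is why direct access becomes slow as files grow.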

Advantages:

  1. No external fragmentation. Any free block can be used to satisfy a request
  2. A file can expand. There is no need to declare the file size when the file is first created
  3. No need to compact disk space

Disadvantages:

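  1. Direct access is very inefficient, because reaching a given block means following the pointers from the start of the file
  2. The pointers take up some space in every data block
  3. Scattering the pointers all over the disk poses a reliability problem: one damaged pointer makes the rest of the file unreachable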

Indexed Allocation Method

Summary:

In this allocation method, an index block is allocated for each file that is created. The index block of a file contains all the pointers to the data blocks of that file. The directory entry contains the file name and the index block number.
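A minimal sketch of the lookup, using the same kind of hypothetical dictionary-based disk: the block numbers, the file name, and the names `indexed_disk`, `indexed_directory`, and `read_indexed_block` are all assumptions made for illustration.

```python
# Hypothetical disk: block 19 is the index block, the rest hold data.
indexed_disk = {
    19: [9, 16, 1, 25],   # i-th entry is the disk block of the file's i-th block
    9: "A",
    16: "B",
    1: "C",
    25: "D",
}
indexed_directory = {"notes.txt": 19}  # file name -> index block number


def read_indexed_block(name, index):
    """Read the index-th block of a file with a single index lookup."""
    index_block = indexed_disk[indexed_directory[name]]
    if not 0 <= index < len(index_block):
        raise IndexError("logical block lies outside the file")
    return indexed_disk[index_block[index]]


print(read_indexed_block("notes.txt", 3))
```

Whichever block is requested, the work is the same: read the index block, then read the data block it names.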

Advantages:

  1. Grouping all the pointers into one location addresses the reliability problem of scattered pointers.
  2. Direct access is efficient. The i-th entry in the index block points to the i-th block of the file.

Disadvantages:

  1. The index block may waste a lot of space, since an entire disk block must be allocated to hold the pointers even if only a few of them are actually used.
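The size of this overhead can be estimated with simple arithmetic. The figures below are assumptions for illustration: a 512-byte disk block, and the 4-byte pointer size quoted earlier for linked allocation. The helper `index_block_usage` is hypothetical.

```python
BLOCK_SIZE = 512    # bytes per disk block (assumed for illustration)
POINTER_SIZE = 4    # bytes per pointer (the size quoted for linked allocation)


def index_block_usage(file_blocks):
    """Return (bytes used, bytes unused) in a single index block."""
    capacity = BLOCK_SIZE // POINTER_SIZE   # pointers that fit in one index block
    if not 0 <= file_blocks <= capacity:
        raise ValueError("file does not fit in a single index block")
    used = file_blocks * POINTER_SIZE
    return used, BLOCK_SIZE - used


used, wasted = index_block_usage(2)
print(used, wasted)
```

Under these assumptions, a 2-block file uses 8 bytes of its index block and leaves the remaining 504 bytes empty.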

References