Project 3 - Hints & Design Milestone

Design Milestone

For Project 3, we have included a design milestone. We ask that you submit a draft of your design by Wednesday, March 29, 11:59pm.
For students in both sections: No late days can be used. The milestone does not give any credit per se, but not submitting the design milestone will lead to an automatic 15 point deduction, reducing the total number of points you can get in this project to 85.

Before you start on this project, make sure you understand the requirements and the underlying abstractions you will have to deal with. In particular, make sure you understand the relationship between virtual pages, physical page frames, and the swap space. Make sure you understand what lazy loading of virtual pages means and what the semantics of mmap() is.

To successfully implement this project, you will need to design a number of data structures. Of particular importance are container data structures that

  1. keep track of a process's virtual pages -- this is referred to as page table management

  2. keep track of the machine's physical frames -- this is referred to as (user page) frame allocation or frame table management

  3. keep track of the disk sectors that form the swap disk -- this is referred to as swap space allocation or management

  4. keep track of the mmap'ped areas of a process -- this is referred to as mmap or memory region management

For each individual item that is contained in each of those data structures, consider what information the item in question must contain. Specifically, consider what information your page table entries must contain, what information must be stored about each physical page frame, what information must be stored regarding how swap space is used and what information must be stored for each mmap'd region.

Consider how the data structures will reference each other. All of these data structures will be kept in non-pageable kernel memory. This tremendously simplifies your job: you can use direct pointers to link those data structures and you can be sure that those pointers remain valid. (In a real OS, parts of the page table would themselves be allocated in pageable memory, but we do not require that you do that in Pintos.)

The Pintos document refers to items 1. and 2. collectively as "page table management", but understand that they are really two different, albeit related issues.

For item 2., you are only required to keep track of the subset of frames that are used to hold user pages. These page frames form the "user page pool", which should really be called "user page frame pool", since it is physical page frames that are managed by it, not virtual pages. In Pintos, only the user page frame pool is subject to possible eviction. By contrast, those frames that were designated to be part of the kernel pool when they were allocated are never evicted. You have (at least) two choices here: either break open palloc.c and take over the management of user pool frames there, or leave palloc.c untouched and implement a layer on top of it using palloc_get_page(PAL_USER).

When it comes to choosing a container for items 1. through 4., possible choices are bitmaps (check out Pintos's internal bitmap support), hash tables, lists, and simple arrays. More complex data structures may give you additional benefit, but they may also needlessly complicate your implementation.

When choosing the appropriate type of container, you should consider whether the number of elements that will be contained in it stays constant or not, and consider what type of access your container must support most efficiently.

Consider how many copies of a particular container are needed in your design: is it one or more per process, or is it one or more globally. Perform this consideration for each of items 1. through 4.

For item 4., the Pintos specification suggests that it is possible to implement lazy loading for code & data segments as special cases of memory-mapped regions. This is true, and this is how it is done in real OS - however, in my personal opinion, it may be overkill for this project. In particular, the changes to process.c::load() will be more substantial if you choose this route, and you will need to understand more about ELF segments.

For lazy loading & mmap, consider how you will represent partial pages.

Requirements for Design Milestone

Your design should address items 1. through 4. above. You should submit

a) the C declarations of all data structures you will be using in your design.
b) the C declarations of any global or static variables you intend to use.

Include a comment for each global variable, for each struct declaration, and include a comment for each struct field.

For each container data structure, describe

a) how it is populated
b) how many elements it will contain
c) how it is accessed
d) how many copies of that container exist in the system
e) how it is destroyed, if applicable

For part c) with respect to your page table, specifically address how you find out what action needs to be taken based on the fault address when a page fault occurs.

Do not submit code, aside from the requested declarations. Do not submit descriptions of algorithms. All we want to see in this design milestone are your data structures. Your design should fit on one page when printed. The required format is ASCII.

Suggested Implementation Order

After you have completed your design, you will need to implement it. I recommend the following order of implementation, which is a bit different from the order of presentation in the Pintos specification.

First, fix all bugs you may have left over from Project 2. This is required not only to pass the class, but necessary to even have a shot at Projects 3 and 4.

From here, you can implement the following in parallel: After all this works, set a CVS tag. You will be passing the vast majority of test cases now, but don't be deceived: you are only 1/2 done. You should now start implementing your page eviction strategy.

If a frame allocation request arrives, and no free user page frames are available, identify a frame to evict, and write it to swap space if necessary. Implement swapping pages in from swap space. Initially, you could implement a page eviction strategy that evicts a random page.

At this point, you will need to worry about

Finally, implement a eviction strategy such as the clock algorithm.

Additional Comments

Implementing shared pages, such as sharing of code segments or non-private mmap() mappings, is not quite as straightforward as the Pintos handout makes it out to be. Only attempt this is you are really comfortable with the base requirements.

Your design and implementation needs to support parallelism. Specifically, multiple processes should be able to simultaneously page frames in and out. This means, for instance, that a single global lock to protect your frame table is not acceptable. Rather, you will need to synchronize access to each physical frame individually. We advise that you do not try to micro-optimize your design.

I may add additional comments and/or hints here.