Before you start on this project, make sure you understand the requirements and the underlying abstractions you will have to deal with. In particular, make sure you understand the relationship between virtual pages, physical page frames, and the swap space. Make sure you understand what lazy loading of virtual pages means and what the semantics of mmap() is.
To successfully implement this project, you will need to design a number of data structures. Of particular importance are container data structures that
Consider how the data structures will reference each other. All of these data structures will be kept in non-pageable kernel memory. This tremendously simplifies your job: you can use direct pointers to link those data structures and you can be sure that those pointers remain valid. (In a real OS, parts of the page table would themselves be allocated in pageable memory, but we do not require that you do that in Pintos.)
The Pintos document refers to items 1. and 2. collectively as "page table management", but understand that they are really two different, albeit related issues.
For item 2., you are only required to keep track of the subset of frames that are used to hold user pages. These page frames form the "user page pool", which should really be called "user page frame pool", since it is physical page frames that are managed by it, not virtual pages. In Pintos, only the user page frame pool is subject to possible eviction. By contrast, those frames that were designated to be part of the kernel pool when they were allocated are never evicted. You have (at least) two choices here: either break open palloc.c and take over the management of user pool frames there, or leave palloc.c untouched and implement a layer on top of it using palloc_get_page(PAL_USER).
When it comes to choosing a container for items 1. through 4., possible choices are bitmaps (check out Pintos's internal bitmap support), hash tables, lists, and simple arrays. More complex data structures may give you additional benefit, but they may also needlessly complicate your implementation.
When choosing the appropriate type of container, you should consider whether the number of elements that will be contained in it stays constant or not, and consider what type of access your container must support most efficiently.
Consider how many copies of a particular container are needed in your design: is it one or more per process, or is it one or more globally. Perform this consideration for each of items 1. through 4.
For item 4., the Pintos specification suggests that it is possible to implement lazy loading for code & data segments as special cases of memory-mapped regions. This is true, and this is how it is done in real OS - however, in my personal opinion, it may be overkill for this project. In particular, the changes to process.c::load() will be more substantial if you choose this route, and you will need to understand more about ELF segments.
For lazy loading & mmap, consider how you will represent partial pages.
Requirements for Design Milestone
Your design should address items 1. through 4. above.
You should submit
a) the C declarations of all data structures you will be using in your design.
b) the C declarations of any global or static variables you intend to use.
Include a comment for each global variable, for each struct declaration, and include a comment for each struct field.
For each container data structure, describe
a) how it is populated
b) how many elements it will contain
c) how it is accessed
d) how many copies of that container exist in the system
e) how it is destroyed, if applicable
For part c) with respect to your page table, specifically address how you find out what action needs to be taken based on the fault address when a page fault occurs.
Do not submit code, aside from the requested declarations. Do not submit descriptions of algorithms. All we want to see in this design milestone are your data structures. Your design should fit on one page when printed. The required format is ASCII.
First, fix all bugs you may have left over from Project 2. This is required not only to pass the class, but necessary to even have a shot at Projects 3 and 4.
Or, leave palloc.c untouched and implement a layer on top of it: At start time, call palloc_get_page (PAL_USER) repeatedly to put all available frames from the user pool under the control of the frame allocator. Then never call palloc_get_page(PAL_USER) again. Hint: the global variable "ram_pages" contains the total number of physical page frames, which is at least as large as the number of pages in the user pool.
(The Pintos document suggests to use both bitmaps and palloc_get_page(PAL_USER). This is a bit misleading in my opinion: if you keep the abstraction boundary provided by palloc_get_page(), you probably won't use another bitmap. Think it through yourself.)
Do not implement swapping yet. If an attempt is made to allocate a user page frame, and you're out of user frames, panic with an appropriate error message.
If a frame allocation request arrives, and no free user page frames are available, identify a frame to evict, and write it to swap space if necessary. Implement swapping pages in from swap space. Initially, you could implement a page eviction strategy that evicts a random page.
At this point, you will need to worry about
Finally, implement a eviction strategy such as the clock algorithm.
Implementing shared pages, such as sharing of code segments or non-private mmap() mappings, is not quite as straightforward as the Pintos handout makes it out to be. Only attempt this is you are really comfortable with the base requirements.
Your design and implementation needs to support parallelism. Specifically, multiple processes should be able to simultaneously page frames in and out. This means, for instance, that a single global lock to protect your frame table is not acceptable. Rather, you will need to synchronize access to each physical frame individually. We advise that you do not try to micro-optimize your design.
I may add additional comments and/or hints here.