In the previous two assignments, you made extensive use of a
file system without actually worrying about how it was implemented
underneath. For this last assignment, you will fill in the
implementation of the file system. You will be working primarily in
the filesys directory.
You may build project 4 on top of project 2 or project 3. In either
case, all of the functionality needed for project 2 must work in your
filesys submission. If you build on project 3, then all of the project
3 functionality must work also, and you will need to edit
filesys/Make.vars to enable VM functionality. You can receive up
to 5% extra credit if you do enable VM.
Here are some files that are probably new to you. These are in the
filesys directory except where indicated:
fsutil.c
filesys.h
filesys.c
directory.h
directory.c
inode.h
inode.c
file.h
file.c
lib/kernel/bitmap.h
lib/kernel/bitmap.c
Our file system has a Unix-like interface, so you may also wish to
read the Unix man pages for creat, open, close, read, write, lseek,
and unlink. Our file system has calls that are similar, but not
identical, to these. The file system translates these calls into
physical disk operations.
All the basic functionality is there in the code above, so that the file system is usable from the start, as you've seen in the previous two projects. However, it has severe limitations which you will remove.
While most of your work will be in filesys, you should be
prepared for interactions with all previous parts (as usual).
Before you turn in your project, you must copy the
project 4 design document template into your source tree under the name
pintos/src/filesys/DESIGNDOC and fill it in. We recommend that
you read the design document template before you start working on the
project. See section D. Project Documentation, for a sample design document
that goes along with a fictitious project.
The basic file system allocates files as a single extent, making it vulnerable to external fragmentation. Eliminate this problem by modifying the on-disk inode structure. In practice, this probably means using an index structure with direct, indirect, and doubly indirect blocks. (You are welcome to choose a different scheme as long as you explain the rationale for it in your design documentation, and as long as it does not suffer from external fragmentation.)
You can assume that the disk will not be larger than 8 MB. You must support files as large as the disk (minus metadata). Each inode is stored in one disk sector, limiting the number of block pointers that it can contain. Supporting 8 MB files will require you to implement doubly-indirect blocks.
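The arithmetic behind that last claim can be sketched as follows. This is only an illustration: it assumes 512-byte sectors and 4-byte sector numbers (so 128 pointers fit in one index block), and the function name max_file_bytes and its parameters are hypothetical, not part of the Pintos code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512                                    /* Bytes per disk sector. */
#define PTRS_PER_SECTOR (SECTOR_SIZE / sizeof (uint32_t))  /* 128 sector numbers. */

/* Returns the largest file size, in bytes, reachable from one inode
   that holds DIRECT direct pointers, INDIRECT singly indirect
   pointers, and DBL_INDIRECT doubly indirect pointers.
   (Hypothetical helper, for illustration only.) */
static uint64_t
max_file_bytes (size_t direct, size_t indirect, size_t dbl_indirect)
{
  uint64_t sectors = (uint64_t) direct
                     + (uint64_t) indirect * PTRS_PER_SECTOR
                     + (uint64_t) dbl_indirect * PTRS_PER_SECTOR
                       * PTRS_PER_SECTOR;

  assert (sectors <= UINT64_MAX / SECTOR_SIZE);
  return sectors * SECTOR_SIZE;
}
```

Under these assumptions, a single doubly indirect pointer covers 128 * 128 * 512 bytes, which is exactly 8 MB. An inode that reserves a few words for metadata and fills the rest of its sector with, say, 125 singly indirect pointers covers only 8,192,000 bytes, which falls short of 8 MB.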
An extent-based file can only grow if it is followed by empty space, but with indexed inodes file growth is possible whenever free space is available. Implement file growth. In the basic file system, the file size is specified when the file is created. In UNIX and most other file systems, a file is initially created with size 0 and is then expanded every time a write is made off the end of the file. Your file system must allow this.
There should be no predetermined limit on the size of a file, except that a file cannot exceed the size of the disk (minus metadata). This also applies to the root directory file, which should now be allowed to expand beyond its initial limit of 16 files.
The user is allowed to seek beyond the current end-of-file (EOF). The seek itself does not extend the file. Writing at a position past EOF extends the file to the position being written, and any gap between the previous EOF and the start of the write must be filled with zeros. A read starting from a position past EOF returns no bytes.
Writing far beyond EOF can cause many blocks to be entirely zero. Some file systems allocate and write real data blocks for these implicitly zeroed blocks. Other file systems do not allocate these blocks at all until they are explicitly written. The latter file systems are said to support "sparse files." You may adopt either allocation strategy in your file system.
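The seek, write, and read rules above can be illustrated with a toy in-memory model. This is a minimal sketch, not the Pintos inode interface: struct toy_file, toy_write_at, toy_read_at, and the fixed MAX_BYTES capacity are all hypothetical, and the model always stores the zeros explicitly (the non-sparse strategy).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BYTES 4096          /* Capacity of this toy "disk". */

/* Hypothetical in-memory stand-in for a file. */
struct toy_file
  {
    unsigned char data[MAX_BYTES];
    size_t length;              /* Current end-of-file. */
  };

/* Writes SIZE bytes from BUF at offset OFS.  If OFS is past EOF,
   the gap between the old EOF and OFS is filled with zeros.
   Returns the number of bytes actually written. */
static size_t
toy_write_at (struct toy_file *f, const void *buf, size_t size, size_t ofs)
{
  assert (f != NULL && buf != NULL);
  if (ofs >= MAX_BYTES)
    return 0;
  if (size > MAX_BYTES - ofs)
    size = MAX_BYTES - ofs;
  if (size == 0)
    return 0;                   /* An empty write does not extend the file. */

  if (ofs > f->length)
    memset (f->data + f->length, 0, ofs - f->length);
  memcpy (f->data + ofs, buf, size);
  if (ofs + size > f->length)
    f->length = ofs + size;
  return size;
}

/* Reads up to SIZE bytes at offset OFS into BUF.  A read that
   starts at or past EOF returns no bytes. */
static size_t
toy_read_at (const struct toy_file *f, void *buf, size_t size, size_t ofs)
{
  assert (f != NULL && buf != NULL);
  if (ofs >= f->length)
    return 0;
  if (size > f->length - ofs)
    size = f->length - ofs;
  memcpy (buf, f->data + ofs, size);
  return size;
}
```

In this model, writing 3 bytes at offset 0 and then 2 bytes at offset 10 leaves a 12-byte file whose bytes 3 through 9 read back as zeros. A file system with sparse-file support would give the same observable result without allocating blocks that lie wholly inside the gap.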
Implement a hierarchical name space. In the basic file system, all files live in a single directory. Modify this to allow directory entries to point to files or to other directories.
Make sure that directories can expand beyond their original size just as any other file can.
The basic file system has a 14-character limit on file names. You may retain this limit for individual file name components, or may extend it, at your option. You must allow full path names to be much longer than 14 characters.
The current directory is maintained separately for each process. At
startup, the initial process's current directory is the root directory.
When one process starts another with the exec system call, the
child process inherits its parent's current directory. After that, the
two processes' current directories are independent, so that either
changing its own current directory has no effect on the other.
Update the existing system calls so that, anywhere a file name is provided by the caller, an absolute or relative path name may be used.
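One way to handle path names is to walk them one component at a time, starting from the root directory for absolute paths and from the process's current directory for relative ones. The sketch below shows only the splitting step. The names next_component and is_absolute are hypothetical, and it assumes you keep the 14-character limit on individual components.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NAME_MAX_LEN 14   /* Per-component limit kept from the basic file system. */

/* Returns true if PATH should be resolved from the root directory. */
static bool
is_absolute (const char *path)
{
  assert (path != NULL);
  return path[0] == '/';
}

/* Copies the next component of *PATHP into PART and advances *PATHP
   past it.  Runs of consecutive slashes count as one separator.
   Returns 1 if a component was stored, 0 at the end of the path,
   or -1 if the component exceeds NAME_MAX_LEN characters. */
static int
next_component (const char **pathp, char part[NAME_MAX_LEN + 1])
{
  const char *p;
  size_t len;

  assert (pathp != NULL && *pathp != NULL && part != NULL);
  p = *pathp;
  while (*p == '/')
    p++;
  if (*p == '\0')
    {
      *pathp = p;
      return 0;
    }

  len = strcspn (p, "/");
  if (len > NAME_MAX_LEN)
    return -1;
  memcpy (part, p, len);
  part[len] = '\0';
  *pathp = p + len;
  return 1;
}
```

A caller would loop on next_component, looking up each component in the directory found so far, until it returns 0. The full path can be arbitrarily long because only one component is held in the buffer at a time.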
Update the remove system call so that it can delete empty
directories in addition to regular files. Directories can only be
deleted if they do not contain any files or subdirectories.
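The emptiness rule for remove can be sketched as a scan over a directory's entries. This is an illustration under assumptions: struct toy_dir_entry is a hypothetical in-memory view, not the on-disk directory entry format, and the check skips "." and ".." only in case you choose to store them as real entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 14

/* Hypothetical in-memory view of one directory entry. */
struct toy_dir_entry
  {
    bool in_use;                   /* Is this slot occupied? */
    char name[NAME_MAX_LEN + 1];   /* Null-terminated component name. */
  };

/* Returns true if none of the CNT entries names a file or
   subdirectory.  Entries named "." and ".." do not count. */
static bool
toy_dir_is_empty (const struct toy_dir_entry *entries, size_t cnt)
{
  size_t i;

  assert (entries != NULL || cnt == 0);
  for (i = 0; i < cnt; i++)
    {
      const struct toy_dir_entry *e = &entries[i];
      if (!e->in_use)
        continue;
      if (strcmp (e->name, ".") == 0 || strcmp (e->name, "..") == 0)
        continue;
      return false;
    }
  return true;
}

/* Removal policy: regular files can always be removed, directories
   only when they are empty. */
static bool
toy_can_remove (bool is_dir, const struct toy_dir_entry *entries, size_t cnt)
{
  return !is_dir || toy_dir_is_empty (entries, cnt);
}
```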
Implement the following new system calls:
mkdir("/a/b/c") succeeds only if /a/b already exists and /a/b/c does not.
stdout, one per line, in no particular order.
We have provided ls and mkdir user programs, which
are straightforward once the above syscalls are implemented. In Unix,
these are programs rather than built-in shell commands, but
cd is a shell command.
The pintos put and get commands should now
accept full path names, assuming that the directories used in the
paths have already been created. This should not require any extra
effort on your part.
You may support . and .. for a small amount of extra credit.
Modify the file system to keep a cache of file blocks. When a request is made to read or write a block, check to see if it is stored in the cache, and if so, fetch it immediately from the cache without going to disk. Otherwise, fetch the block from disk into cache, evicting an older entry if necessary. You are limited to a cache no greater than 64 sectors in size.
Be sure to choose an intelligent cache replacement algorithm. Experiment to see what combination of accessed, dirty, and other information results in the best performance, as measured by the number of disk accesses. For example, metadata is generally more valuable to cache than data.
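As a starting point for such experiments, here is a minimal sketch of the clock (second-chance) algorithm over a 64-entry table, with counters for disk reads and writes so that policies can be compared. Clock is only one possible choice, not a required one. All names here (cache_get, cache_evict, and so on) are hypothetical, the sketch tracks only bookkeeping and holds no sector data, and it does no locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SIZE 64             /* Upper bound on cached sectors. */
#define NO_SECTOR UINT32_MAX      /* Marks an unused cache entry. */

struct cache_entry
  {
    uint32_t sector;              /* Disk sector held, or NO_SECTOR. */
    bool accessed;                /* Used since the clock hand last passed? */
    bool dirty;                   /* Modified since it was read in? */
  };

static struct cache_entry cache[CACHE_SIZE];
static size_t clock_hand;
static unsigned disk_reads, disk_writes;   /* Simulated disk traffic. */

static void
cache_init (void)
{
  size_t i;
  for (i = 0; i < CACHE_SIZE; i++)
    {
      cache[i].sector = NO_SECTOR;
      cache[i].accessed = cache[i].dirty = false;
    }
  clock_hand = 0;
  disk_reads = disk_writes = 0;
}

/* Returns the entry holding SECTOR, or NULL if it is not cached. */
static struct cache_entry *
cache_lookup (uint32_t sector)
{
  size_t i;
  for (i = 0; i < CACHE_SIZE; i++)
    if (cache[i].sector == sector)
      return &cache[i];
  return NULL;
}

/* Picks a victim: sweeps the hand, clearing accessed bits, until it
   finds an unused entry or one not accessed since the last sweep.
   A dirty victim is written back (counted) before reuse. */
static struct cache_entry *
cache_evict (void)
{
  for (;;)
    {
      struct cache_entry *e = &cache[clock_hand];
      clock_hand = (clock_hand + 1) % CACHE_SIZE;
      if (e->sector != NO_SECTOR && e->accessed)
        {
          e->accessed = false;    /* Second chance. */
          continue;
        }
      if (e->sector != NO_SECTOR && e->dirty)
        disk_writes++;
      e->dirty = false;
      return e;
    }
}

/* Returns the entry for SECTOR, reading it in (counted) on a miss.
   WRITE marks the entry dirty. */
static struct cache_entry *
cache_get (uint32_t sector, bool write)
{
  struct cache_entry *e;

  assert (sector != NO_SECTOR);
  e = cache_lookup (sector);
  if (e == NULL)
    {
      e = cache_evict ();
      e->sector = sector;
      disk_reads++;
    }
  e->accessed = true;
  if (write)
    e->dirty = true;
  return e;
}
```

To favor metadata, one variation worth trying is to give inode and index sectors an extra pass of the clock hand before they become eligible for eviction, then compare the resulting disk_reads and disk_writes counts on the same workload.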
You can keep a cached copy of the free map permanently in memory if you like. It doesn't have to count against the cache size.
The provided inode code uses a "bounce buffer" allocated with
malloc() to translate the disk's sector-by-sector interface into
the system call interface's byte-by-byte interface. You should get rid
of these bounce buffers. Instead, copy data into and out of sectors in
the buffer cache directly.
Your implementation should also include the following features:
filesys_done(), so that halting Pintos flushes the cache.
If you have timer_sleep() from the first project working, this is
an excellent application for it. If you're still using the base
implementation of timer_sleep(), be aware that it busy-waits, which
is not an acceptable solution. If timer_sleep()'s delays seem too
short or too long, reread the explanation of the -r option to
pintos (see section 1.1.4 Debugging versus Testing).
Read-ahead is only really useful when done asynchronously. That means, if a process requests disk block 1 from the file, it should block until disk block 1 is read in, but once that read is complete, control should return to the process immediately. The read-ahead request for disk block 2 should be handled asynchronously, in the background.
We recommend integrating the cache into your design early. In the past, many groups have tried to tack the cache onto a design late in the design process. This is very difficult. These groups have often turned in projects that failed most or all of the tests.
The provided file system requires external synchronization, that is, callers must ensure that only one thread can be running in the file system code at once. Your submission must adopt a finer-grained synchronization strategy that does not require external synchronization. To the extent possible, operations on independent entities should be independent, so that they do not need to wait on each other.
Operations on different cache blocks must be independent. In particular, when I/O is required on a particular block, operations on other blocks that do not require I/O should proceed without having to wait for the I/O to complete.
Multiple processes must be able to access a single file at once.
Multiple reads of a single file must be able to complete without
waiting for one another. When writing to a file does not extend the
file, multiple processes should also be able to write a single file at
once. A read of a file by one process when the file is being written by
another process is allowed to show that none, all, or part of the write
has completed. (However, after the write system call returns to
its caller, all subsequent readers must see the change.) Similarly,
when two processes simultaneously write to the same part of a file,
their data may be interleaved.
On the other hand, extending a file and writing data into the new section must be atomic. Suppose processes A and B both have a given file open and both are positioned at end-of-file. If A reads and B writes the file at the same time, A may read all, part, or none of what B writes. However, A may not read data other than what B writes, e.g. if B's data is all nonzero bytes, A is not allowed to see any zeros.
Operations on different directories should take place concurrently. Operations on the same directory may wait for one another.
Here's a summary of our reference solution, produced by the
diffstat program. The final row gives total lines inserted
and deleted; a changed line counts as both an insertion and a deletion.
This summary is relative to the Pintos base code, but the reference solution for project 4 is based on the reference solution to project 3. Thus, the reference solution runs with virtual memory enabled. See section 5.3 FAQ, for the summary of project 3.
The reference solution represents just one possible solution. Many other solutions are also possible and many of those differ greatly from the reference solution. Some excellent solutions may not modify all the files modified by the reference solution, and some may modify files not modified by the reference solution.
 Makefile.build       |    5
 devices/timer.c      |   42 ++
 filesys/Make.vars    |    6
 filesys/cache.c      |  473 +++++++++++++++++++++++++
 filesys/cache.h      |   23 +
 filesys/directory.c  |   99 ++++-
 filesys/directory.h  |    3
 filesys/file.c       |    4
 filesys/filesys.c    |  194 +++++++++-
 filesys/filesys.h    |    5
 filesys/free-map.c   |   45 +-
 filesys/free-map.h   |    4
 filesys/fsutil.c     |    8
 filesys/inode.c      |  444 ++++++++++++++++++-----
 filesys/inode.h      |   11
 threads/init.c       |    5
 threads/interrupt.c  |    2
 threads/thread.c     |   32 +
 threads/thread.h     |   38 +-
 userprog/exception.c |   12
 userprog/pagedir.c   |   10
 userprog/process.c   |  332 +++++++++++++----
 userprog/syscall.c   |  582 ++++++++++++++++++++++++++++++-
 userprog/syscall.h   |    1
 vm/frame.c           |  161 ++++++++
 vm/frame.h           |   23 +
 vm/page.c            |  297 +++++++++++++++
 vm/page.h            |   50 ++
 vm/swap.c            |   85 ++++
 vm/swap.h            |   11
 30 files changed, 2721 insertions(+), 286 deletions(-)
You may implement Unix-style support for . and .. in
relative paths in your projects.
You may submit with VM enabled.
DISK_SECTOR_SIZE change?
No, DISK_SECTOR_SIZE is fixed at 512. This is a fixed property
of IDE disk hardware.
Forward slash (/).
The disk we create will be 8 MB or smaller. However, individual files will have to be smaller than the disk to accommodate the metadata. You'll need to consider this when deciding your inode organization.
cd a shell command?
The current directory of each process is independent. A cd
program could change its own current directory, but that would have no
effect on the shell. In fact, Unix-like systems don't provide any way
for one process to change another process's current working directory.
struct inode_disk inside struct inode?
The goal of the 64-block limit is to bound the amount of cached file
system data. If you keep a block of disk data--whether file data or
metadata--anywhere in kernel memory then you have to count it against
the 64-block limit. The same rule applies to anything that's
"similar" to a block of disk data, such as a struct inode_disk
without the length or sector_cnt members.
That means you'll have to change the way the inode implementation
accesses its corresponding on-disk inode right now, since it currently
just embeds a struct inode_disk in struct inode and reads the
corresponding sector from disk when it's created. Keeping extra
copies of inodes would subvert the 64-block limitation that we place
on your cache.
You can store a pointer to inode data in struct inode, if you want,
and you can store other information to help you find the inode when you
need it. Similarly, you may store some metadata alongside each of your 64
cache entries.
You can keep a cached copy of the free map permanently in memory if you like. It doesn't have to count against the cache size.
byte_to_sector() in filesys/inode.c uses the struct inode_disk
directly, without first reading that sector from wherever it was in
the storage hierarchy. This will no longer work. You will need to
change byte_to_sector() so that it reads the struct inode_disk
from the storage hierarchy before using it.