CS 2506 Spring 2016 MIPS04 --------------------------------------------------------------------------------- 1. Consider a two-dimensional, 128 x 128 array of values of type double. Suppose we have such three arrays, A, B, and C. Assume that each array is laid out in memory in row-major order, and the beginning of each array is aligned to a page boundary (e.g., A[0][0] is located at the beginning of a page; B[0][0]starts at another page; etc.). Consider the following matrix multiplication (ijk) algorithm: for (int i = 0; i < 128; i++) { for (int j = 0; j < 128; j++) { double sum = 0.0; for (int k = 0; k < 128; k++) { sum += A[ i ][ k ] * B[ k ][ j ]; C[ i ][ j ] = sum; } } Assume that the variable sum is register allocated (i.e., an access to sum does not lead to a memory access, and thus has no virtual-to-physical page translation); Assume the virtual memory system uses 4 KiB pages, and that the number of virtual and physical pages is large enough that no page replacements must be performed. a) [6 points] How many pages are needed to store each array? Answer: Each array needs 2^7 * 2^7 * 2^3 (double) = 2^17 bytes Each page can store 2^12 bytes => 2^17 / 2^12 = 2^5 pages >>> 3 points for identifying the size of an array correctly >>> Another 3 points for correct final answer b) [6 points] By the time the first sum for C[0][0] is calculated and stored, how many page faults will have been triggered 1) by accessing the first row of array A; 2) by accessing the first column of array B; and 3) by performing a write to C[0][0]? Answer: Each row requires 2^7 (# of columns) * 2^3 (double type) bytes of storage. As a page can store 2^12 bytes, it can hold 4 (= 2^12 / 2^10) rows of an array. 1) Accessing the first row of A would result in ONE page fault as the whole row are in the same page. 2) Accessing the first column of B would result in 2^5 page faults based on a) 3) Accessing the first element of C would cause ONE page fault. >>> 2 points each (All or nothing) c) [6 points] How many page faults will have been triggered by array element accesses by the time the multiplication finishes? Answer: The multiplication algorithm accesses all elements of each array which spans 2^5 pages. Therefore, it would result in 2^5 * 3 page faults in total. >>> 3 points for identifying that each array access would cause 2^5 page faults >>> Another 3 points for correct final answer 2. Consider a system that uses 48-bit virtual addresses. a) [5 points] What is the maximum size, in bytes, of the virtual address space for a process, if the system uses 4KiB pages? Express your answer in TiB. Answer: 48-bit address space. The page size doesn't matter. 2^48 = 2^8 * 2^10 * 2^10 * 2^10 * 2^10 = 2^8 TiB >>> All or nothing b) [5 points] What is the maximum size, in bytes, of the virtual address space for a process, if the system uses 2MiB pages? Express your answer in TiB. Answer: 48-bit address space. The page size doesn't matter. 2^48 = 2^8 * 2^10 * 2^10 * 2^10 * 2^10 = 2^8 TiB >>> All or nothing c) [5 points] What is the maximum number of virtual pages a process might have, if the system uses 4KiB pages? Express your answers in terms of powers of 2. Answer: 2^48 bytes of address space. Each page has 2^12 bytes. 2^48 / 2^12 = 2^36 >>> All or nothing d) [5 points] What is the maximum number of virtual pages a process might have, if the system uses 2MiB pages? Express your answers in terms of powers of 2. Answer: 2^48 bytes of address space. Each page has 2^21 bytes. 2^48 / 2^21 = 2^27 >>> All or nothing 3. Answer each of the following questions in one or two sentences. a) [5 points] How do operating systems or hardware find the location of the page table of each process? Answer: By looking up the page table register (a special register). >>> All or nothing b) [5 points] Does a TLB miss always imply a page fault? Justify your answer. Answer: No. A TLB miss implies that the corresponding PTE is not in the TLB cache. The page could be present in a physical memory, but not cached in the TLB. >>> All or nothing c) [6 points] Explain one disadvantage of a physically-addressed cache in which physical addresses are used for cache set lookup and tag comparison. Answer: Performance overhead due to TLB lookup before cache accesses. >>> All or nothing d) [6 points] Explain one disadvantage of a virtually-addressed cache in which virtual addresses are used for cache set lookup and tag comparison. Answer: Complication due to aliasing between processes. This implies that the cache should be flushsed on a context switch or the cache should track a process id (in addition to a tag). >>> All or nothing 4. Suppose a system uses 16-bit physical and virtual addresses, with a page size of 4 KiB. The system use a page table to maintain the mapping of virtual page numbers to physical page numbers. Suppose that, for a particular process, the current page table is: +----------------------------------+ | Valid? | Dirty? | VPN | PPN/Disk | +----------------------------------+ | 1 | 1 | 0 | 6 | Virtual page 0 is in DRAM in physical page 6 +----------------------------------+ | 0 | 0 | 1 | Disk | Virtual page 1 is on disk, not in DRAM +----------------------------------+ | 1 | 0 | 2 | 5 | ... +----------------------------------+ | 0 | 0 | 3 | Disk | +----------------------------------+ | 1 | 0 | 4 | 7 | +----------------------------------+ | 1 | 1 | 5 | 4 | +----------------------------------+ | ... | ... | ... | ... | More, irrelevant PTEs not shown +----------------------------------+ The system also uses a translation lookaside buffer (TLB) to cache recently-accessed page table entries (PTEs); the TLB uses pure LRU replacement; and the TLB can holds up to four PTE entries. Suppose that, for the same particular process, the current TLB is: +----------------------------------------+ | Valid? | Dirty? | VPN | PPN | Accessed | +----------------------------------------+ | 1 | 1 | 0 | 6 | t1 | Virtual page 0 is in physical page 6, last access was time 1 +----------------------------------------+ | 1 | 1 | 5 | 4 | t2 | ... +----------------------------------------+ | 1 | 0 | 2 | 5 | t3 | +----------------------------------------+ | 0 | 0 | 8 | 3 | t0 | This entry is invalid (stale from a previous process). +----------------------------------------+ Assume that if a page must be brought from disk, the next available physical pages would be 8, 9, etc. a) [4 points] Suppose that at time t4 the process performs a write operation to virtual address 0x0123. What would the corresponding physical address be? Answer: The page size of 4 KiB implies that 12 bits are used for offset and most significant 4 bits (16-12=4) are used for PPN/VPN (physical/virtual) page number. Therefore, for virtual address 0x0123, VPN is 0. Due to the page table, VPN 0 is mapped to PPN 6. Therefore, physical address would be 0x6123. >>> 2 points for correct VPN to PPN translation >>> Another 2 points for offset b) [12 points] After the write to virtual address 0x0123 at time t4, show the contents of 1) the page table and 2) the TLB. Then, specify if this memory access results in 3) a TLB miss; and/or 4) a page fault. Answer: The access to VA 0x0123 would result in a TLB hit, thus the page table and TLB remains the same (except the timestamp), and there will be no page fault. 1) The page table remains the same 2) The TLB remains the same except that the accessed time of the first entry becomes t4 +----------------------------------------+ | Valid? | Dirty? | VPN | PPN | Accessed | +----------------------------------------+ | 1 | 1 | 0 | 6 | t4 | +----------------------------------------+ | 1 | 1 | 5 | 4 | t2 | +----------------------------------------+ | 1 | 0 | 2 | 5 | t3 | +----------------------------------------+ | 0 | 0 | 8 | 3 | t0 | +----------------------------------------+ 3) No TLB miss 4) No page fault >>> 3 points each (All or nothing). In total 12 points. c) [12 points] Suppose that at time t5 the process performs a read operation from virtual address 0x4321. Show the contents of 1) the page table and 2) the TLB. Then, specify if this memory access results in 3) a TLB miss and/or 4) a page fault. Answer: VA 0x4321 has VPN 4, which will result in a TLB miss. However, the page table has a mapping VPN 4 to PPN 7, so no page fault. The oldest entry in TLB will be replaced. 1) The page table reamin the same 2) TLB +----------------------------------------+ | Valid? | Dirty? | VPN | PPN | Accessed | +----------------------------------------+ | 1 | 1 | 0 | 6 | t4 | +----------------------------------------+ | 1 | 1 | 5 | 4 | t2 | +----------------------------------------+ | 1 | 0 | 2 | 5 | t3 | +----------------------------------------+ | 1 | 0 | 4 | 7 | t5 | +----------------------------------------+ 3) TLB miss 4) No page fault >>> 3 points each (All or nothing). In total 12 points. e) [12 points] Suppose that at time t6 the process performs a read operation from virtual address 0x3030. Show the contents of 1) the page table and 2) the TLB. Then, specify if this memory access results in 3) a TLB miss. and/or 4) a page fault. Answer: VA 0x3030 has VPN 3, which will result in a TLB miss. The page table does not have a valid entry either, leading to a page fault. The next available physical page 8 will be used for a new mapping. The oldest entry in TLB will be replaced. 1) page table +----------------------------------+ | Valid? | Dirty? | VPN | PPN/Disk | +----------------------------------+ | 1 | 1 | 0 | 6 | +----------------------------------+ | 0 | 0 | 1 | Disk | +----------------------------------+ | 1 | 0 | 2 | 5 | +----------------------------------+ | 1 | 0 | 3 | 8 | New mapping added +----------------------------------+ | 1 | 0 | 4 | 7 | +----------------------------------+ | 1 | 1 | 5 | 4 | +----------------------------------+ | ... | ... | ... | ... | +----------------------------------+ 2) TLB +----------------------------------------+ | Valid? | Dirty? | VPN | PPN | Accessed | +----------------------------------------+ | 1 | 1 | 0 | 6 | t4 | +----------------------------------------+ | 1 | 0 | 3 | 8 | t6 | New PTE added. +----------------------------------------+ | 1 | 0 | 2 | 5 | t3 | +----------------------------------------+ | 1 | 0 | 4 | 7 | t5 | +----------------------------------------+ 3) TLB miss 4) Page fault >>> 3 points each (All or nothing). In total 12 points.