CS 2506 Spring 2016 MIPS02
---------------------------------------------------------------------------------

1. [6 points] Why is the Write register value to the Registers passed through the MEM/WB interstage buffer, instead of being sent directly from the instruction in the ID stage?

Answer: In the pipelined design, register updates are performed by the instruction in the WB stage. The interstage buffers keep each instruction synchronized with its control signals. The MEM/WB interstage buffer holds the destination register number of the instruction in the WB stage, whereas by that cycle the ID stage holds a different, later instruction.

2. [6 points] Suppose we want to place the MUX controlled by the RegDst signal into the ID stage (instead of the EX stage) to save buffer space in the ID/EX interstage buffer. How many bits can we save with this design? (Note that you should modify the control signals accordingly as well.)

Answer: If we place the MUX in the ID stage, then we only need to store one 5-bit destination register number (instead of the two 5-bit fields instr[20:16] and instr[15:11]). Moreover, the 1-bit RegDst control signal doesn't need to be forwarded to the EX stage, since it is used in the ID stage to choose the destination register number. Therefore, we can save 5 + 1 = 6 bits.

3. [8 points] Consider the following sequence of MIPS32 assembly instructions fetched and executed consecutively from cycle 1. Suppose all the interstage buffers were zero initially. Find the control signal values stored in the ID/EX interstage buffer at cycle 5. To be precise, mark “don’t care” if a control signal doesn’t matter. (Ignore the modifications assumed in question 2, and solve based on the above figure.)

Answer: The ID/EX interstage buffer holds the control signals for the instruction being executed in the EX stage. At cycle 5, the sw instruction is in the EX stage (after being fetched at cycle 3). Therefore,

    ID/EX.MemToReg = Don't care
    ID/EX.RegWrite = 0
    ID/EX.Branch   = 0
    ID/EX.MemRead  = 0
    ID/EX.MemWrite = 1
    ID/EX.RegDst   = Don't care
    ID/EX.ALUSrc   = 1

4.
Consider the following sequence of MIPS32 assembly instructions:

    add  $t0, $t0, $t1    # 4.1
    lw   $t1, 0($t0)      # 4.2
    sw   $t1, 8($t0)      # 4.3
    sub  $t2, $t0, $t3    # 4.4
    lw   $t2, 16($t3)     # 4.5
    add  $t0, $t2, $t2    # 4.6

A data dependency occurs when a later instruction requires an input value that is set by an earlier instruction. A data hazard occurs when one instruction writes a value into a register that will be used as input by a later instruction, but that value does not actually appear in the register by the cycle on which the later instruction attempts to read it. Note that a data hazard always implies a data dependency, but some data dependencies do not imply a data hazard. Also remember that this pipeline design does not include any provision for forwarding operands.

a) [6 points] Identify the data dependencies that would NOT prevent the given sequence of instructions from executing correctly on the given hardware design above, even if we do not insert nop instructions. For each such dependency, list the register involved, the writing instruction and the reading instruction.

Answer:

    writer   reader   register
    ----------------------------
    4.1      4.4      $t0

b) [10 points] Identify the data hazards that would prevent the given sequence of instructions from executing correctly on the given hardware design above, unless we inserted one or more nop instructions. For each such dependency, list the register involved, the writing instruction and the reading instruction.

Answer: Remember: unless the reader is at least 3 cycles behind the writer, we have a data hazard.

    writer   reader   register
    ----------------------------
    4.1      4.2      $t0
    4.1      4.3      $t0
    4.2      4.3      $t1   (sw writes $t1 to memory, so it reads $t1)
    4.5      4.6      $t2

c) [10 points] Rewrite the given sequence of instructions adding nop instructions so that the modified sequence would execute correctly on the given hardware design. For full credit, accomplish this with the smallest possible number of nop instructions.
Answer: The simplest way to do this is to start with the first hazard, insert enough nops to resolve it, and then proceed down the list:

    add  $t0, $t0, $t1    # 4.1
    nop                   # MUST put 4.2 3 cycles behind 4.1
    nop
    lw   $t1, 0($t0)      # 4.2
    nop                   # MUST put 4.3 3 cycles behind 4.2
    nop
    sw   $t1, 8($t0)      # 4.3
    sub  $t2, $t0, $t3    # 4.4
    lw   $t2, 16($t3)     # 4.5
    nop                   # MUST put 4.6 3 cycles behind 4.5
    nop
    add  $t0, $t2, $t2    # 4.6

5. [6 points] Why doesn't the Forwarding unit need the value of the destination register number in the EX stage (i.e., output of the MUX controlled by the RegDst signal) labelled 5 in the diagram? Or does it? Be precise.

Answer: The destination register number labelled 5 corresponds to the instruction that's currently in the EX stage of the pipeline. The Forwarding unit detects the need for forwarding by comparing the source register numbers (rs and rt) of the instruction in the EX stage with the destination register numbers of the instructions in the MEM and WB stages (which are one and two cycles ahead, respectively). Therefore, the Forwarding unit does not need the destination register number of the instruction in the EX stage.

6. [8 points] Suppose we have three add instructions to execute. Find one possible example of the second/third add instructions that make the Forwarding unit set the control signals for the MUXes labelled 6 to be 10 and 10 (for the upper one and lower one, respectively) when the second add is in the EX stage; and set 01 and 00 (for the upper one and lower one, respectively) when the third add is in the EX stage.

Answer: The Forwarding unit sets the control signal for a MUX to 10 if there is an EX hazard (a data hazard between adjacent instructions). As both MUXes are set to 10, the second add should use the first add's destination register ($t0 here) as both source operands.

    add  $t3, $t0, $t0    (The dest reg doesn't matter, as long as it is not $t0; otherwise the third add would see an EX hazard on $t0.)

The Forwarding unit sets the control signal for a MUX to 01 if there is a MEM hazard (a data hazard between instructions two cycles apart), and to 00 if there is no data hazard.
Therefore, the third add should use $t0 as the first source operand, and an unrelated register, say $t4, as the second operand.

    add  $t5, $t0, $t4    (The dest reg doesn't matter. The second operand should be neither $t0 nor the dest reg of the previous add instruction.)

7. The Forwarding unit performs the following checks to detect a MEM hazard, in which the data is forwarded from the MUX controlled by the MemtoReg signal.

a) [6 points] Why should the Forwarding unit check the condition #1? Be precise.

Answer: Some instructions do not write registers. We only need to detect a MEM hazard if the instruction is meant to write a register (R-type and load instructions).

b) [6 points] Why should the Forwarding unit check the condition #2? Be precise.

Answer: In MIPS, register 0 ($r0, $zero) is hard-wired to a value of zero and can be used as the target register for any instruction whose result is to be discarded. (E.g., add $r0, $r1, $r2 may be used to check whether $r1+$r2 overflows when we do not want to store the result anywhere.) When an instruction in the pipeline has $r0 as its destination, we want to avoid forwarding its possibly nonzero result value.

c) [6 points] Why should the Forwarding unit check the condition #3? Be precise.

Answer: When a source register of the reading instruction matches both the destination register of the immediately preceding instruction (one cycle ahead) and that of the second preceding instruction (two cycles ahead), we should forward the data from the immediately preceding instruction, because it holds the most recent value in program order.

8. [6 points] Consider the following sequence of MIPS32 assembly instructions fetched and executed consecutively from cycle 1. Suppose all the interstage buffers were zero initially. Find the control signal values stored in the ID/EX interstage buffer at cycle 5. (Note that the instructions are different from question 3.)

Answer: There is a load-use hazard between lw and sw.
At cycle 5, the lw instruction is in the MEM stage; the nop'ed (stalled) copy of the sw instruction is in the EX stage; and the (re-decoded) sw instruction is in the ID stage. Therefore, all the control signals in the ID/EX interstage buffer are set to 0 for the stall.

9. [8 points] Suppose we execute the following lw (only) instructions. If we have the (load-use) Hazard Detection unit as described in the above figure, then do we still need the Forwarding unit? Justify your answer.

    lw   $t1, 0($t0)      # 9.1
    lw   $t2, 0($t1)      # 9.2

Answer: Yes. The load-use Hazard Detection unit would insert one stall between the two load instructions. This implies that when the first lw instruction is in the WB stage, the second lw instruction will be in the EX stage. Without the Forwarding unit, the second lw instruction would use the old $t1 value that was read in the ID stage a cycle earlier. Therefore, we still need the Forwarding unit to forward the data from the MEM/WB interstage buffer to the EX stage.

10. [8 points] Given the (load-use) Hazard Detection unit and the Forwarding unit, how many clock cycles would be required to execute the following sequence of instructions?

    add  $t1, $t0, $t0    #10.1
    lw   $t2, ($t1)       #10.2
    lw   $t3, ($t2)       #10.3
    add  $t4, $t2, $t3    #10.4
    add  $t5, $t3, $t4    #10.5

Answer: The data hazard on $t1 between #10.1 and #10.2 is handled by the Forwarding unit. The (load-use) Hazard Detection unit will insert a stall for each of the load-use hazards: on $t2 between #10.2 and #10.3, and on $t3 between #10.3 and #10.4. The data hazards on $t2 between #10.2 and #10.4, and on $t3 between #10.3 and #10.5, would not require a stall, as there is another instruction between the writer and the reader. The Forwarding unit would detect a MEM hazard and forward accordingly. The data hazard on $t4 between #10.4 and #10.5 is handled by the Forwarding unit. Therefore, it would take 11 cycles (7 slots for the 5 instructions plus two stalls, + 4 tail cycles to drain the pipeline).

    add  $t1, $t0, $t0    #10.1
    lw   $t2, ($t1)       #10.2
    stall
    lw   $t3, ($t2)       #10.3
    stall
    add  $t4, $t2, $t3    #10.4
    add  $t5, $t3, $t4    #10.5
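
The cycle count in question 10 can be cross-checked with a small timing model. The Python sketch below is only an illustration under stated assumptions: a 5-stage pipeline, full forwarding, and a load-use Hazard Detection unit that inserts exactly one stall when the instruction right after a lw reads the register that lw loads. The names Instr, count_cycles and q10 are hypothetical helpers introduced here, not part of the course materials.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str                  # mnemonic, e.g. "lw" or "add"
    dest: Optional[str]      # register written, or None (e.g. for sw)
    srcs: Tuple[str, ...]    # registers read


def count_cycles(program: List[Instr]) -> Tuple[int, int]:
    """Return (total cycles, stalls) for a 5-stage pipeline.

    Assumes full forwarding, so the only stalls are load-use stalls:
    one bubble when the instruction immediately after a lw reads the
    register that the lw writes.
    """
    if not program:
        return 0, 0
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        loads_real_reg = prev.op == "lw" and prev.dest not in (None, "$zero")
        if loads_real_reg and prev.dest in cur.srcs:
            stalls += 1
    # One issue slot per instruction and per stall, plus 4 cycles for the
    # last instruction to pass through the remaining four stages.
    return len(program) + stalls + 4, stalls


# The sequence from question 10 (hypothetical encoding of the listing).
q10 = [
    Instr("add", "$t1", ("$t0", "$t0")),   # 10.1
    Instr("lw",  "$t2", ("$t1",)),         # 10.2
    Instr("lw",  "$t3", ("$t2",)),         # 10.3
    Instr("add", "$t4", ("$t2", "$t3")),   # 10.4
    Instr("add", "$t5", ("$t3", "$t4")),   # 10.5
]

print(count_cycles(q10))  # (11, 2)
```

Checking only adjacent pairs is sufficient in this model because a stall separates a lw from its immediate successor without changing which instructions are neighbors further down the sequence. The same function applied to the two lw instructions of question 9 gives 2 + 1 + 4 = 7 cycles.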