HW 10 - Datapath, Pipeline, and Hazards
Due at 10:30pm on Monday, November 23, 2015

Collaboration: You must complete this assignment independently. You may not discuss your answers with any students in this or the other section of the course.
Submitting your work: Email me your answers with the subject [CSC 211.01] Assignment 10

Problem 1

Consider a pipelined MIPS processor executing the instruction add $10, $1, $2. Assume register $1 holds the value 5 and register $2 has the value 3.

Part A: Which pipeline registers hold the value 8, the result of the addition, at some point during the execution of this instruction?
Part B: Which pipeline registers hold the value of the MemWrite control line, which enables writes to data memory?
Part C: How many bubbles must be added to the pipeline before this processor can execute the next instruction, add $1, $10, $3, assuming this processor does not support forwarding?
Part D: How many bubbles must be added to the pipeline before this processor can execute the next instruction if the processor does support forwarding?

Problem 2

Each pipeline stage has some latency, and passing a value through each pipeline register adds some additional latency. Answer the following questions using the latency values below:

IF	ID	EX	MEM	WB	Each Pipeline Register
200ps	150ps	150ps	300ps	150ps	20ps

Part A: What is the latency of a lw instruction in a non-pipelined processor with these latencies?
Part B: What is the minimum clock cycle time for a non-pipelined processor with these latencies?
Part C: What is the latency of a lw instruction in a pipelined processor with these latencies?
Part D: What is the minimum clock cycle time for a pipelined processor with these latencies?
Part E: Ignoring hazards, what is the speed-up achieved by introducing pipelining?
Part F: If we could split one stage into two stages, each with half the latency of the original stage, which stage would you split?
Part G: What is the minimum clock cycle time for this new pipeline?

Problem 3

Consider the following sequence of instructions executed in a standard five-stage pipelined datapath:

lw  $1, 40($6)
add $2, $3, $1
add $1, $2, $6
sw  $2, 20($4)
and $1, $1, $4

Part A: Assuming there is no forwarding, manually insert the minimum number of nop instructions to eliminate all data hazards.
Part B: Draw a diagram of the execution of this instruction sequence with forwarding, similar to figure 4.59 in the text.

Problem 4 (Adapted from Patterson & Hennessy, exercise 4.24)

Consider the following sequences of branch outcomes. T means the branch is taken; N means the branch is not taken.

Sequence 1: T, T, T, N

Sequence 2: T, N, T, T, N

Part A: What is the accuracy of the always-taken branch predictor for each sequence?
Part B: What is the accuracy of the always-not-taken branch predictor for each sequence?
Part C: What is the accuracy of the 2-bit branch predictor for each sequence? Assume the predictor starts off in state 01 (weak not taken).
Part D: What is the accuracy of the 2-bit branch predictor for each sequence, if the sequence repeats thousands of times?
Part E: Write a new sequence of six branch outcomes that a 2-bit branch predictor will always predict correctly, assuming the predictor starts off in state 01.
Part F: Write a new sequence of six branch outcomes that a 2-bit branch predictor will always predict incorrectly, assuming the predictor starts off in state 01.

Problem 5 (Adapted from Patterson & Hennessy 4.28)

In this exercise we compare the performance of 1-issue and 2-issue processors, taking into account program transformations that can be made to optimize for 2-issue execution.

Consider the following loop:

loop:
  lw   $t0, 0($s0)
  addi $s0, $s0, 4
  sw   $t0, 0($s0)
  addi $s0, $s0, 4
  bne  $s0, $s1, loop
  nop

Part A: Show the schedule for two iterations of this loop on the static two-issue datapath shown in figure 4.69 of the text, assuming the processor has perfect branch prediction and forwarding.
Part B: Unroll this loop so each iteration performs four iterations of the original loop and rearrange instructions to improve the performance on the same static two-issue datapath.
Part C: Draw a pipelined execution diagram (as in Problem 3B) for your revised code on a static two-issue datapath. Show two iterations of the unrolled loop, assuming the processor has perfect branch prediction and forwarding.

License & Acknowledgements

This work is derived from that of Janet Davis, and is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.

HW 10 - Datapath, Pipeline, and HazardsDue at 10:30pm on Monday, November 23, 2015