Section 1
1. In this question, you add the I-type instruction addi, which has the format “addi Rd, Rs1, imm” and performs Reg[Rd] = Reg[Rs1] + Imm. Highlight the datapath used by this instruction on the datapath figure shown below. (10 points)
Hint: the instruction is very similar to LW, except that it writes the ALU output to the register file instead of using it as a memory address, so you can use the LW datapath as a reference for this question. Check the slides on how to highlight the datapath for an instruction.
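To make the operation Reg[Rd] = Reg[Rs1] + Imm concrete, here is a minimal Python sketch of what addi computes. It is only an illustration of the arithmetic, not of the datapath. The function names `sign_extend_12` and `addi` are hypothetical, and the sketch assumes 32-bit registers (RV32) and the standard 12-bit sign-extended I-type immediate.

```python
def sign_extend_12(imm):
    """Interpret the low 12 bits of imm as a two's-complement value."""
    imm &= 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm


def addi(regs, rd, rs1, imm):
    """Model 'addi rd, rs1, imm' on a list of 32 register values (assumed 32-bit)."""
    result = (regs[rs1] + sign_extend_12(imm)) & 0xFFFFFFFF
    if rd != 0:  # x0 is hard-wired to zero in RISC-V
        regs[rd] = result
    return regs


regs = [0] * 32
regs[5] = 10
addi(regs, 10, 5, -2)  # x10 = x5 + (-2)
print(regs[10])        # prints 8
```

Note that, unlike LW, nothing here touches data memory: the sum itself is the value written back to Rd.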
Section 2
Please download the updated Excel file for Section 2, Questions 2 and 3.
The answer sheets provide some example answers for both Question 2 and Question 3. Please fill in the missing blanks and submit the sheet as your solution. Do NOT modify the sheet structure, e.g. do not add or remove any row or column, and do not merge or unmerge cells.
2. CPU instruction execution, data movement in the datapath, and control signals.
Your task is to fill in, in the provided Excel sheet, the values on the datapaths that are relevant to the instruction the CPU is executing, and the control signal settings for each type of instruction. Datapaths are labeled as in the following diagram. If you need references, book chapters 4.4 and 4.5 describe in detail how each type of instruction is executed, which datapaths it uses, and which control signals it sets or resets. Some of the figures from the textbook are already copied into the answer sheet to help you look things up. Check the comments on cells (where present) for details on how a value is calculated. To simplify answering with the sheet, we assume that executing an instruction does not change the actual values in the register file and memory.
Your answers should be in the yellow-colored and brown-colored areas of the sheet.
(50 points in total; each instruction is worth 5 points for the datapath and 5 points for the control signals.)
3. Pipeline execution, RAW (Read-After-Write) data hazards, and control hazards.
The following high-level C code is translated to RISC-V assembly. Instructions are executed on the CPU using the standard 5-stage pipeline (IF, ID, EXE, MEM, and WB). The register file can be read and written in the same cycle, and instruction memory and data memory are separate. The branch outcome is determined at the end of the EXE stage.
for (i=0; i!=n-2; i++) a[i] += a[i+1];
Variable n is stored in register x5 and variable i is in register x9. Array a is an array of integers (one word each).
# Initialize registers for variable i and n-2
li x9, 0 # i=0
addi x10, x5, -2 # x10 now has n-2
# branch check
loop: beq x9, x10, exit # if loop condition is NOT true, exit loop
# load a[i] to register
lw x7, a(x9) # load a[i] to x7
# load a[i+1] to register
addi x12, x9, 1 # x12 now has i+1
lw x8, a(x12) # load a[i+1] to x8
# Do the addition of a[i] + a[i+1] and store the result to a[i]
add x6, x7, x8 # a[i] + a[i+1], result in x6
sw x6, a(x9) # store a[i] + a[i+1] into a[i]
# Increment loop index i and jump to the beginning of the loop
addi x9, x9, 1 # i=i+1
beq x0, x0, loop # back to condition check
# loop exit
exit:
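As a cross-check on what the assembly above is meant to compute, the C loop can be mirrored in a few lines of Python. This is only a reference model of the high-level code. The function name `run_loop` is hypothetical, and the sketch assumes n >= 2 and that a has at least n-1 elements, so that the `i != n-2` condition terminates and a[i+1] stays in range.

```python
def run_loop(a, n):
    """Return a copy of a after: for (i = 0; i != n - 2; i++) a[i] += a[i + 1];"""
    a = list(a)           # work on a copy so the caller's list is unchanged
    i = 0
    while i != n - 2:     # same exit test as 'beq x9, x10, exit'
        a[i] += a[i + 1]  # two loads, one add, one store per iteration
        i += 1
    return a


print(run_loop([1, 2, 3, 4, 5], 5))  # prints [3, 5, 7, 4, 5]
```

Each iteration reads a[i+1] before a later iteration overwrites it, which is why the last two elements are left unchanged in this example.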
a) Fill in the rest of the following table with the RAW data dependencies between pairs of instructions and the register that introduces each dependency. Highlight a row if it is a load-use dependency. (6 points in total)
Instruction that writes the register | Instruction that reads the register | The register
li x9, 0                             | beq x9, x10, exit                   | x9
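A RAW dependency exists when an instruction reads a register whose most recent writer is an earlier instruction. The sketch below shows one way to list such pairs mechanically for a straight-line sequence. It deliberately uses a hypothetical three-instruction example rather than the assignment code. The function name `raw_dependencies` and the tuple layout are assumptions made for illustration, and loop-carried dependencies are not modeled.

```python
def raw_dependencies(instrs):
    """instrs: list of (text, dest_or_None, [source registers]).

    Returns (writer_text, reader_text, register) tuples in program order.
    """
    deps = []
    last_writer = {}  # register name -> index of the latest instruction writing it
    for idx, (text, dest, srcs) in enumerate(instrs):
        for reg in srcs:
            if reg != "x0" and reg in last_writer:
                deps.append((instrs[last_writer[reg]][0], text, reg))
        if dest is not None:
            last_writer[dest] = idx
    return deps


# Hypothetical sequence, not taken from the assignment.
example = [
    ("lw x1, 0(x2)", "x1", ["x2"]),
    ("addi x3, x1, 4", "x3", ["x1"]),
    ("sw x3, 0(x2)", None, ["x3", "x2"]),
]

for writer, reader, reg in raw_dependencies(example):
    print(f"{writer} -> {reader} ({reg})")
```

In this example the first pair is a load-use dependency, because the writer is a load and the reader is the very next instruction.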
Answer the following three questions in the provided Excel sheet (Question 3 tab). Your answers should be in the yellow-colored area of the sheet. Fill cells with red for the stall cycles caused by data and control hazards. The “Examples discussed at the class” tab of the sheet shows how this is done using a more complicated example, with detailed comments to help you work on problem 3.
b) Draw the 5-stage pipeline execution of the first TWO iterations using stage labels (IF, ID, EXE, MEM, and WB), assuming the CPU has no data forwarding. Every RAW dependency between two consecutive instructions (both ALU-use and load-use) causes a 2-cycle delay. BEQ causes a 2-cycle delay in issuing the next instruction. (10 points)
c) Draw the 5-stage pipeline execution of the first TWO iterations using stage labels, assuming the CPU has full data forwarding. With forwarding, an ALU-use RAW dependency between two consecutive instructions causes a 0-cycle delay, and a load-use dependency between two consecutive instructions causes a 1-cycle delay. BEQ still causes a 2-cycle delay in issuing the next instruction. (12 points)
d) For the CPU with full data forwarding, rearrange the instructions using the techniques we discussed in the Chapter 4 lecture to eliminate the stall(s) from load-use hazards. All load-use stall cycles should be completely eliminated. Then draw the 5-stage pipeline execution of the first TWO iterations using stage labels. When you reschedule, you are allowed to change instructions to make sure the code still executes correctly. Hint: you are allowed to use the “SW, a+/-ConstantOffset(base)” format, e.g. “sw x11, a-4(x10)”, meaning store x11 to address “a-4+x10”. (12 points)
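The stall rules stated in parts b) and c) can be summarized in a small Python sketch. It only encodes the delays given above for a producer followed immediately by its consumer, plus the usual cycle count for a 5-stage pipeline (5 cycles for the first instruction, 1 more per additional instruction, plus stalls). The names `stall_cycles`, `BRANCH_DELAY`, and `total_cycles` are hypothetical, and the numbers in the usage lines refer to a made-up three-instruction sequence, not to the assignment code.

```python
BRANCH_DELAY = 2  # cycles lost after each BEQ, with or without forwarding


def stall_cycles(producer_is_load, forwarding):
    """Stalls between a producer and the consumer that immediately follows it."""
    if not forwarding:
        return 2                          # ALU-use and load-use both cost 2 cycles
    return 1 if producer_is_load else 0   # only load-use stalls with forwarding


def total_cycles(n_instrs, stalls):
    """Cycles to finish n_instrs in a 5-stage pipeline with the given total stalls."""
    return 5 + (n_instrs - 1) + stalls


# Hypothetical sequence: a load, an instruction that uses the loaded value,
# and one independent instruction (3 instructions, one load-use pair).
print(total_cycles(3, stall_cycles(True, forwarding=False)))  # prints 9
print(total_cycles(3, stall_cycles(True, forwarding=True)))   # prints 8
```

Rescheduling as in part d) aims to place an independent instruction between a load and its consumer, which brings the load-use stall term down to zero.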