Team members: 陈柏成 123090015; 陈玥彤 123090050

Part 1

A. RISC-V32I Simulator

We made the following changes:

  1. int64_t → int32_t, uint64_t → uint32_t. RV64 uses 64-bit registers, so its variables are typically int64_t or uint64_t. In RV32, registers are only 32 bits wide, so we changed these types to their 32-bit counterparts.

  2. Delete the lwu, ld, and sd cases, the ADDIW, SUBW, SLLIW, and SRAW cases, and case OP_IMM32, because these instructions are specific to RV64 and do not exist in RV32.

  3. %llx, %lld → %x, %d. %llx and %lld print 64-bit values. RV32 works with 32-bit values, so %x and %d match the argument width and avoid printing garbage or truncated values.

  4. Change blockSize from 64 to 32 and the number of blocks from 32 * 1024 / 64 to 32 * 1024 / 32, so the total cache size stays at 32 KB. We chose the smaller block size because RV32 systems typically have smaller cache lines.

  5. Memory loading: return b1 + (b2 << 8) + (b3 << 16) + (b4 << 24). Because RV32 supports memory accesses of at most 32 bits, we only need to combine the first 4 bytes (b1 to b4). The higher bytes (b5 to b8) are used in RV64 for 64-bit loads, which do not apply to RV32.
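
The byte combination in item 5 can be sketched as follows. This is only an illustration, not the simulator's actual code: the helper name combineWord is hypothetical, and b1 is assumed to be the byte at the lowest address (little-endian order).

```cpp
#include <cstdint>

// Hypothetical helper (name not from the simulator): combine four bytes,
// lowest address first, into one 32-bit little-endian word.
// Each byte is widened to uint32_t before shifting so that a top byte
// >= 0x80 is never shifted into the sign bit of a promoted int.
uint32_t combineWord(uint8_t b1, uint8_t b2, uint8_t b3, uint8_t b4) {
  return static_cast<uint32_t>(b1) |
         (static_cast<uint32_t>(b2) << 8) |
         (static_cast<uint32_t>(b3) << 16) |
         (static_cast<uint32_t>(b4) << 24);
}
```

Using bitwise OR instead of + gives the same result here, because the four shifted bytes occupy disjoint bit ranges.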

B. Fused Instructions

We changed Simulator.h and Simulator.cpp to achieve this.

B1. R4 Instruction

  1. Add reg3 in RegId, because R4-type instructions have four operands: rd, rs1, rs2, and rs3. A third source register is required for instructions such as fmadd and fmsub. Adding reg3 allows the decoder and simulator to store and access the rs3 register during execution.
  2. Add case R4 in decode( ), which implements fmadd.i, fmadd.u, fmsub.i, fmsub.u, fnmadd.i, and fnmsub.i. It extracts the operands from all three source registers and uses funct3 and funct2 to determine which specific instruction (fmadd.i, fmsub.u, etc.) is being decoded. It then constructs a readable instruction string, which is used in execute( ).
  3. Add these cases to switch( inst ) in execute( ). Each case first sets writeReg = true so that the result goes into rd. It then computes out from op1, op2, and op3. Finally, it increments cycleCount to simulate the latency of these operations.
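
The decoding and execution steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the bit positions follow the standard RISC-V R4-type layout, the names R4Fields, decodeR4, and fusedMulAdd are hypothetical, and the semantics rd = rs1 * rs2 + rs3 with 32-bit wrap-around is assumed for fmadd.i rather than taken from the assignment text.

```cpp
#include <cstdint>

// Hypothetical container for the fields of an R4-type instruction.
struct R4Fields {
  uint32_t opcode, rd, funct3, rs1, rs2, funct2, rs3;
};

// Extract the fields using the standard R4-type bit layout:
// rs3[31:27] funct2[26:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0]
R4Fields decodeR4(uint32_t inst) {
  R4Fields f;
  f.opcode = inst & 0x7F;
  f.rd     = (inst >> 7) & 0x1F;
  f.funct3 = (inst >> 12) & 0x7;
  f.rs1    = (inst >> 15) & 0x1F;
  f.rs2    = (inst >> 20) & 0x1F;
  f.funct2 = (inst >> 25) & 0x3;
  f.rs3    = (inst >> 27) & 0x1F;
  return f;
}

// Assumed semantics for an integer fused multiply-add: op1 * op2 + op3,
// wrapping modulo 2^32 (unsigned arithmetic makes the wrap well defined).
uint32_t fusedMulAdd(uint32_t op1, uint32_t op2, uint32_t op3) {
  return op1 * op2 + op3;
}
```

In the simulator, the three register indices would be used to read op1, op2, and op3, and funct3 together with funct2 would select which of the six fused variants to execute.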

B2. Disable Data Forwarding

  1. In simulate( ), we determine the stall number (e.g., counting down 2 → 1 → 0). While the stall number is greater than 0, the pipeline is paused (the IF and ID stages are stalled).

  2. In execute( )

    2.1 We check for data hazards only when:

    2.2 If there exists a data hazard, we stall the IF and ID stages for two cycles and insert a bubble at the EX stage.

    The following figure shows the case where a data hazard is detected in the EX stage.

    [Figure: 2b22813d1e8d8bb14b6d92393003dc5.png]

  3. In memoryAccess( )

    3.1 We check for data hazards only when:

    The reasons are the same as for the data hazard checking condition in the execute stage.

    3.2 If there exists a data hazard, we stall the IF and ID stages for one cycle and insert a bubble at the MEM stage.

    The following figure shows the case where a data hazard is detected in the MEM stage.

    [Figure: 3711523e4e92b55ab551f557e7bcfcc.png]
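
The stall bookkeeping described in B2 can be sketched as follows. This is a rough illustration, not the simulator's actual code: HazardStage, stallCyclesFor, and StallState are hypothetical names, and the hazard detection itself is treated as an input because its conditions are not reproduced here. Only the stall lengths (two cycles for a hazard found in EX, one for MEM) come from the text above. Keeping the larger count when a stall is already pending is an assumption of this sketch.

```cpp
#include <cstdint>

// Hypothetical identifiers for the stage in which a hazard was detected.
enum class HazardStage { None, Execute, Memory };

// Stall lengths from the text: two cycles for EX, one cycle for MEM.
int stallCyclesFor(HazardStage s) {
  switch (s) {
    case HazardStage::Execute: return 2;
    case HazardStage::Memory:  return 1;
    default:                   return 0;
  }
}

// Hypothetical stall counter, as simulate( ) might maintain it.
struct StallState {
  int stall = 0;  // remaining cycles during which IF and ID are frozen

  // Record a newly detected hazard. Assumption: if a stall is already
  // pending, keep whichever count is larger.
  void onHazard(HazardStage s) {
    int n = stallCyclesFor(s);
    if (n > stall) stall = n;
  }

  // Call once per simulated cycle. Returns true if IF and ID must hold
  // their current instructions this cycle, and counts the stall down.
  bool tick() {
    if (stall > 0) {
      --stall;
      return true;
    }
    return false;
  }
};
```

A hazard found in EX therefore freezes IF and ID for the next two calls to tick( ), matching the 2 → 1 → 0 countdown, while a hazard found in MEM freezes them for one.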

Part 2: Rearrange

A. Instruction Reordering

To reduce pipeline stalls caused by RAW (Read-After-Write) hazards, the instruction sequence was rearranged:

A1. RAW between mul a4, a1, a2 and add a3, a3, a4