
RISC-V Bus and Pipeline (2): RISC-V CPU Bus Design

1. Learn bus design

The goal of this article is to learn bus design, working from simple to complex. It first analyzes existing buses to explain why a dedicated FII RISC-V bus is needed. The purpose of this bus is to transfer data within the CPU.

1.1. The complex but powerful AXI bus

The AXI (Advanced eXtensible Interface) bus is part of ARM's AMBA (Advanced Microcontroller Bus Architecture) specification. It is characterized by many bus interfaces, comprehensive protocols, complex operation, and flexible control.

These advantages are also its disadvantages. For a CPU designed for embedded IoT, AXI is too large and complex to use and debug. Fully supporting a bus like AXI4 would consume too many logic resources and too much power.

1.2. Simple and easy to control SPI bus

The SPI bus is a synchronous serial bus invented by Motorola in 1979 and is often used to communicate with SD (Secure Digital) cards and LCDs (liquid crystal displays). Its hardware implementation is very simple and it needs few signal lines, generally only four. However, data inside the CPU is generally transferred in parallel, whereas SPI is serial. Using SPI as the internal bus of a RISC-V CPU would require repeated serial-to-parallel and parallel-to-serial conversions.
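
To make the conversion cost concrete, here is a minimal sketch of a serial-to-parallel receiver. It is not part of the FII RISC-V design: the module name `spi_s2p` and all of its ports are hypothetical, and it assumes MSB-first data sampled on the rising edge of the serial clock. The point is only that one 32-bit word takes 32 serial clocks to assemble, where a parallel bus delivers it in one.

```verilog
// Hypothetical illustration only, not taken from the FII RISC-V sources.
// Assumes MSB-first data sampled on the rising edge of i_sclk.
module spi_s2p (
    input             i_sclk,   // serial clock
    input             i_cs_n,   // chip select, active low
    input             i_mosi,   // serial data in
    output reg [31:0] o_word,   // assembled 32-bit word
    output reg        o_done    // high once 32 bits have been shifted in
);
    reg [5:0] bit_cnt;          // number of bits received so far

    always @(posedge i_sclk or posedge i_cs_n)
    if (i_cs_n)
    begin
        // Deselected: clear the receiver state
        o_word  <= 32'b0;
        bit_cnt <= 6'd0;
        o_done  <= 1'b0;
    end
    else
    begin
        o_word  <= {o_word[30:0], i_mosi};  // shift in one bit per clock
        bit_cnt <= bit_cnt + 6'd1;
        o_done  <= (bit_cnt == 6'd31);      // the 32nd bit arrives on this edge
    end
endmodule
```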

 

These two representative buses show that existing buses are either too complicated and difficult to control, or too simple to suit data transfer inside the CPU. Therefore, after weighing performance, power consumption, and resource usage, we choose to implement a small, streamlined bus.

 

2. FII RISC-V bus design

 

Related reference articles:

RISC-V teaching plan

 

The following code block shows the FII RISC-V bus design. The specific details are explained in the detailed comments in the code module.

//RISCV bus design: RIB (riscv internal bus) bus definition

  • 32bit: rib_addr, // address bus
  • 32bit: rib_dout, // data bus
  • 32bit: rib_din, // data bus
  • 1bit: rib_valid, // control bus
  • 2bit: rib_ready, // control bus
  • 4bit: rib_we, // control bus
  • 1bit: rib_rd, // control bus
  • 1bit: rib_op, // control bus
//RISCV master bus design (cpu, etc.): verilog code 
output [31:0] o_rib_maddr,  // master send, 32-bit address 
output [31:0] o_rib_mdout,  // master sends, 32-bit data (write operation) 
input [31:0] i_rib_mdin,    // master receives, 32-bit data (read operation) 
output o_rib_mvalid,        // issued by the master to indicate that there is an operation (read/write) happening 
input [1:0] i_rib_mready,   // master receive, bit[1]=bus error, bit[0] peripheral ready 
output [3:0] o_rib_mwe,     // master send, write operation signal, each bit enables one byte (single edge) 
output o_rib_mrd,           // master send, read operation signal, (single edge) 
output o_rib_mop,           // sent by the master, indicating each operation (single edge)

//RISCV slave bus design (peripherals, etc.): verilog code 
input [31:0] i_rib_saddr,   // slave receive, 32-bit address 
input [31:0] i_rib_sdin,    // slave receive, 32-bit data (write operation) 
output [31:0] o_rib_sdout,  // slave issued, 32-bit data (read operation) 
input i_rib_svalid,         // slave received, used to indicate that there is an operation (read/write) happening 
output [1:0] o_rib_sready,  // slave issued, bit[1]=bus error, bit[0] peripheral ready 
input [3:0] i_rib_swe,      // slave receive, write operation signal, each bit enables one byte (single edge) 
input i_rib_srd,            // slave receive, read operation signal, (single edge) 
input i_rib_sop,            // slave receives, indicating each operation (single edge)
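
As a rough illustration of how a peripheral could sit behind the slave-side ports, the sketch below implements a single 32-bit register that responds in one clock cycle. The module name `rib_reg_slave`, the `clk` port, and the `BASEADDR` parameter are assumptions made for this example and do not come from the FII RISC-V sources. It assumes the slave checks the upper 16 address bits itself and that each `i_rib_swe` bit enables one byte.

```verilog
// Hypothetical single-register peripheral; the module name, clk, and
// BASEADDR are assumptions for illustration only.
module rib_reg_slave #(
    parameter [31:0] BASEADDR = 32'h9000_0000
) (
    input         clk,
    input         rst_n,
    input  [31:0] i_rib_saddr,
    input  [31:0] i_rib_sdin,
    output [31:0] o_rib_sdout,
    input         i_rib_svalid,
    output [1:0]  o_rib_sready,
    input  [3:0]  i_rib_swe,
    input         i_rib_srd,
    input         i_rib_sop      // unused in this minimal sketch
);
    reg [31:0] data_reg;

    // Assumption: requests are broadcast, so the slave checks its own range
    wire sel = i_rib_svalid & (i_rib_saddr[31:16] == BASEADDR[31:16]);

    // Single-cycle response: ready follows valid, bus error is never raised
    assign o_rib_sready = {1'b0, sel};
    assign o_rib_sdout  = (sel & i_rib_srd) ? data_reg : 32'b0;

    always @(posedge clk or negedge rst_n)
    if (!rst_n)
        data_reg <= 32'b0;
    else if (sel)
    begin
        // Each write-enable bit updates one byte of the register
        if (i_rib_swe[0]) data_reg[7:0]   <= i_rib_sdin[7:0];
        if (i_rib_swe[1]) data_reg[15:8]  <= i_rib_sdin[15:8];
        if (i_rib_swe[2]) data_reg[23:16] <= i_rib_sdin[23:16];
        if (i_rib_swe[3]) data_reg[31:24] <= i_rib_sdin[31:24];
    end
endmodule
```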

 

3. Analysis of read and write operations

Next, individual read and write operations are analyzed. The timing diagram of the simplest case, a single clock cycle bus read, is shown in Figure 1. Note:

  • mvalid, mready[0] are generated at the same time
  • The read signal, address, read data and mop operation signals are all single cycle

In the figure, gray waveforms indicate invalid areas. The blue waveform represents the output (relative to the CPU) and the green waveform represents the input (relative to the CPU).

 

Figure 1 Timing diagram of single clock cycle read operation

 

A numerical example is as follows: for address 32'h9000_00xx, the read data (mdin) is 32'h1234_5678. An x marks a byte that the instruction ignores.

 

mrd = 1, 32'hxxxx_xx78, maddr = 32'h9000_0000, lb(u)
mrd = 1, 32'hxxxx_56xx, maddr = 32'h9000_0001, lb(u)
mrd = 1, 32'hxx34_xxxx, maddr = 32'h9000_0002, lb(u)
mrd = 1, 32'h12xx_xxxx, maddr = 32'h9000_0003, lb(u)

mrd = 1, 32'hxxxx_5678, maddr = 32'h9000_0000, lh(u)
mrd = 1, 32'h1234_xxxx, maddr = 32'h9000_0002, lh(u)

mrd = 1, 32'h1234_5678, maddr = 32'h9000_0000, lw
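
The byte and halfword selection in the table above could be implemented on the CPU side roughly as follows. This is a sketch under stated assumptions rather than the author's load unit: the module name `rib_load_align`, the `i_size` encoding, and the `i_unsigned` flag are all hypothetical. With mdin = 32'h1234_5678 and maddr[1:0] = 2'b01, an lbu selects byte 8'h56 and returns 32'h0000_0056.

```verilog
// Hypothetical load-alignment logic; the names and the i_size encoding
// are assumptions for illustration only.
module rib_load_align (
    input      [31:0] i_mdin,      // word returned on the bus
    input      [1:0]  i_addr_lo,   // maddr[1:0]
    input      [1:0]  i_size,      // 0 = byte, 1 = halfword, other = word
    input             i_unsigned,  // 1 for lbu/lhu, 0 for lb/lh
    output reg [31:0] o_rdata      // value written to the register file
);
    reg [7:0]  b;       // selected byte
    reg [15:0] h;       // selected halfword
    reg        sign_b;  // extension bit for a byte load
    reg        sign_h;  // extension bit for a halfword load

    always @(*)
    begin
        // Pick the byte lane addressed by maddr[1:0]
        case (i_addr_lo)
        2'd0:    b = i_mdin[7:0];
        2'd1:    b = i_mdin[15:8];
        2'd2:    b = i_mdin[23:16];
        default: b = i_mdin[31:24];
        endcase

        // Halfwords are selected by maddr[1] (addresses ..0 and ..2)
        h = i_addr_lo[1] ? i_mdin[31:16] : i_mdin[15:0];

        // Signed loads replicate the top bit, unsigned loads extend with zero
        sign_b = ~i_unsigned & b[7];
        sign_h = ~i_unsigned & h[15];

        case (i_size)
        2'd0:    o_rdata = {{24{sign_b}}, b};
        2'd1:    o_rdata = {{16{sign_h}}, h};
        default: o_rdata = i_mdin;
        endcase
    end
endmodule
```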

 

The single clock cycle bus write operation timing diagram is shown in Figure 2. Note that it is similar to a single clock cycle bus read operation:

  • mvalid, mready[0] are generated at the same time
  • Write signal, address, write data, and mop operation signals are all single clock cycle operation

 

Figure 2 Single clock cycle write operation timing diagram

 

Similar to the read data example, the write data example is as follows: for address 32'h9000_00xx, the write data (mdout) is 32'h1234_5678. Each mwe bit enables one byte, and an x marks a byte that is not written.

 

mwe = 4'b0001, 32'hxxxx_xx78, maddr = 32'h9000_0000, sb
mwe = 4'b0010, 32'hxxxx_56xx, maddr = 32'h9000_0001, sb
mwe = 4'b0100, 32'hxx34_xxxx, maddr = 32'h9000_0002, sb
mwe = 4'b1000, 32'h12xx_xxxx, maddr = 32'h9000_0003, sb

mwe = 4'b0011, 32'hxxxx_5678, maddr = 32'h9000_0000, sh
mwe = 4'b1100, 32'h1234_xxxx, maddr = 32'h9000_0002, sh

mwe = 4'b1111, 32'h1234_5678, maddr = 32'h9000_0000, sw
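
The mwe patterns in the table above follow directly from the store size and the low two address bits. The sketch below shows one way to derive them. The module name `rib_store_align` and the `i_size` encoding are hypothetical, and replicating the store data across the byte lanes is an assumption of this example, a common technique rather than something the source confirms.

```verilog
// Hypothetical store-alignment logic; the names and the i_size encoding
// are assumptions for illustration only.
module rib_store_align (
    input      [31:0] i_wdata,     // register value to be stored
    input      [1:0]  i_addr_lo,   // maddr[1:0]
    input      [1:0]  i_size,      // 0 = sb, 1 = sh, other = sw
    output reg [3:0]  o_mwe,       // one write enable per byte
    output reg [31:0] o_mdout      // data driven on the bus
);
    always @(*)
    case (i_size)
    2'd0:       // sb: enable the single byte addressed by maddr[1:0]
    begin
        o_mwe   = 4'b0001 << i_addr_lo;
        o_mdout = {4{i_wdata[7:0]}};    // low byte copied to every lane
    end
    2'd1:       // sh: enable the lower or upper halfword
    begin
        o_mwe   = i_addr_lo[1] ? 4'b1100 : 4'b0011;
        o_mdout = {2{i_wdata[15:0]}};   // low halfword copied to both lanes
    end
    default:    // sw: enable all four bytes
    begin
        o_mwe   = 4'b1111;
        o_mdout = i_wdata;
    end
    endcase
endmodule
```

Because only the enabled bytes are written, the copies in the disabled lanes have no effect on the peripheral.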

 

The timing diagram for continuous (back-to-back) single clock cycle bus reads is shown in Figure 3. As with a single clock cycle read operation:

  • mvalid, mready[0] are generated at the same time
  • Read signal, address, read data, and mop operation signal are all single clock cycle operation signals

 

Figure 3 Timing diagram of continuous read operation in a single clock cycle

 

The timing diagram for continuous single clock cycle bus writes is shown in Figure 4:

  • mvalid, mready[0] are generated at the same time
  • Write signal, address, write data, and mop operation signal are all single clock cycle operation signals

 

Figure 4 Timing diagram of a continuous write operation in a single clock cycle

 

The timing diagram for a multiple clock cycle bus read operation is shown in Figure 5. Multiple cycles mean that ready cannot respond to valid immediately, i.e. the peripheral is slow.

  • While mvalid is high, mready[0] is pulled high only when the read data is ready
  • The read signal and address are held for multiple clock cycles; the read data and mop signals last a single clock cycle

 

Figure 5 Multiple clock cycle read operation timing diagram

 

The timing diagram for a multiple clock cycle bus write operation is shown in Figure 6.

  • While mvalid is high, mready[0] is pulled high only when the peripheral is ready to accept the write data
  • The write signal, address, and write data are held for multiple clock cycles; the mop signal lasts a single clock cycle

 

Figure 6 Multiple clock cycle write operation timing diagram
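
A slow peripheral of the kind described for Figures 5 and 6 could be sketched as follows. This is an illustrative model, not the author's code: the module name `rib_slow_slave`, the `WAIT_CYCLES` parameter, the `clk` port, and the `i_sel` chip-select input are hypothetical. The slave keeps ready low for a fixed number of cycles while valid is held, then returns data or commits the write in the cycle where ready goes high.

```verilog
// Hypothetical slow peripheral; WAIT_CYCLES, clk, and i_sel are assumptions
// for illustration only.
module rib_slow_slave #(
    parameter WAIT_CYCLES = 2       // cycles with ready low before responding
) (
    input         clk,
    input         rst_n,
    input         i_sel,            // assumed chip select for this slave
    input  [31:0] i_rib_sdin,
    output [31:0] o_rib_sdout,
    input         i_rib_svalid,
    output [1:0]  o_rib_sready,
    input  [3:0]  i_rib_swe,
    input         i_rib_srd
);
    reg [31:0] data_reg;
    reg [7:0]  wait_cnt;

    wire req  = i_sel & i_rib_svalid;
    wire done = req & (wait_cnt == WAIT_CYCLES);

    // ready[0] is raised only once the wait has elapsed
    assign o_rib_sready = {1'b0, done};
    assign o_rib_sdout  = (done & i_rib_srd) ? data_reg : 32'b0;

    always @(posedge clk or negedge rst_n)
    if (!rst_n)
    begin
        wait_cnt <= 8'd0;
        data_reg <= 32'b0;
    end
    else
    begin
        // Count while a request is pending, restart when it completes
        if (req & ~done)
            wait_cnt <= wait_cnt + 8'd1;
        else
            wait_cnt <= 8'd0;

        // Commit the write only in the cycle where ready is asserted
        if (done)
        begin
            if (i_rib_swe[0]) data_reg[7:0]   <= i_rib_sdin[7:0];
            if (i_rib_swe[1]) data_reg[15:8]  <= i_rib_sdin[15:8];
            if (i_rib_swe[2]) data_reg[23:16] <= i_rib_sdin[23:16];
            if (i_rib_swe[3]) data_reg[31:24] <= i_rib_sdin[31:24];
        end
    end
endmodule
```

Setting `WAIT_CYCLES` to 0 in this sketch reduces it to the single clock cycle case, where ready follows valid immediately.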

 

Figure 7 shows the timing diagram of continuous bus read operations with multiple clock cycles.

  • While mvalid is high, mready[0] is pulled high only when the read data is ready
  • The read signal and address are held for multiple clock cycles; the read data and mop signals last a single clock cycle

 

Figure 7 Timing diagram of continuous read operation with multiple clock cycles

 

Figure 8 shows the timing diagram of continuous bus write operations with multiple clock cycles.

  • While mvalid is high, mready[0] is pulled high only when the peripheral is ready to accept the write data
  • The write signal, address, and write data are held for multiple clock cycles; the mop signal lasts a single clock cycle

 

Figure 8 Timing diagram of continuous write operation with multiple clock cycles
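
On the master side, holding the address and data for several cycles amounts to stalling until the peripheral answers. The sketch below shows one possible form of that logic. The module name `rib_master_hold` and the signals `i_req`, `o_stall`, and `o_bus_err` are hypothetical, and treating a bus error (mready[1]) as ending the access is an assumption of this example, not something the source specifies.

```verilog
// Hypothetical master-side handshake helper; i_req, o_stall, and o_bus_err
// are illustrative names, not signals from the FII RISC-V core.
module rib_master_hold (
    input        i_req,          // pipeline has a load or store to issue
    input  [1:0] i_rib_mready,   // bit[1] = bus error, bit[0] = peripheral ready
    output       o_rib_mvalid,   // request presented on the bus
    output       o_stall,        // freeze the pipeline so address/data stay stable
    output       o_bus_err       // access ended with a bus error
);
    // Assumption: an error response terminates the access just like ready does
    wire access_end = i_rib_mready[0] | i_rib_mready[1];

    assign o_rib_mvalid = i_req;
    assign o_stall      = i_req & ~access_end;
    assign o_bus_err    = i_req & i_rib_mready[1];
endmodule
```

For a single clock cycle peripheral, ready arrives in the same cycle as valid and `o_stall` never rises, so the same logic covers both the single-cycle and the multi-cycle cases.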

 

The corresponding code modules are shown below, and the specific details are explained in the detailed comments in the code modules.

localparam DEV_NUM = 10;


wire [DEV_NUM - 1:0] s_cs;//Peripheral device chip selection 

assign s_cs[0] = ( i_rib_maddr[31:12] == DBG_BASEADDR[31:12] )  ? 1'b1 : 1'b0;//jtag debug 
assign s_cs[1] = ( i_rib_maddr[31:16] == PLIC_BASEADDR[31:16] ) ? 1'b1 : 1'b0;//external interrupt 
assign s_cs[2] = ( i_rib_maddr[31:16] == CPU_BASEADDR[31:16] )  ? 1'b1 : 1'b0;//CPU
assign s_cs[3] = ( i_rib_maddr[31:16] == MEM_BASEADDR[31:16] )  ? 1'b1 : 1'b0;//memory 
assign s_cs[4] = ( i_rib_maddr[31:16] == TMR_BASEADDR[31:16] )  ? 1'b1 : 1'b0;//timer interrupt 
assign s_cs[5] = ( i_rib_maddr[31:16] == GPIO_BASEADDR[31:16] ) ? 1'b1 : 1'b0;//GPIO
assign s_cs[6] = ( i_rib_maddr[31:16] == UART_BASEADDR[31:16] ) ? 1'b1 : 1'b0;//UART

//There are some unused options that can be extended later
assign s_cs[7] = 1'b0;
assign s_cs[8] = 1'b0;
assign s_cs[9] = 1'b0;

//===============================================================================
always @ ( * )
if(!rst_n)
begin
    o_rib_saddr  = 0;
    o_rib_sdin   = 0;
    o_rib_svalid = 0;
    o_rib_swe    = 0;
    o_rib_srd    = 0;
    o_rib_sop    = 0;
end
else
begin
    //The information sent by the host, each peripheral device can receive (broadcast)
    o_rib_saddr  = i_rib_maddr;
    o_rib_sdin   = i_rib_mdout;
    o_rib_svalid = i_rib_mvalid;
    o_rib_swe    = i_rib_mwe;
    o_rib_srd    = i_rib_mrd;
    o_rib_sop    = i_rib_mop;
end

//===============================================================================

wire bus_err_ack = (i_rib_maddr == i_PC) ? 1'b1 : 1'b0;

always @ ( * )
begin
    //Peripherals cannot send information at the same time
    //If the current chip select is pulled high, the corresponding peripheral will return data and ready
    // form a bus distributor/multiplexer together with host information transfer

    case (s_cs)
    10'b00_0000_0001:   // DBG_BASEADDR
    begin
        o_rib_mdin      = i0_rib_sdout;
        o_rib_mready    = i0_rib_sready; 
    end
    10'b00_0000_0010:   // PLIC_BASEADDR
    begin
        o_rib_mdin      = i1_rib_sdout;
        o_rib_mready    = i1_rib_sready; 
    end
    10'b00_0000_0100:   // CPU_BASEADDR
    begin
        o_rib_mdin      = i2_rib_sdout;
        o_rib_mready    = i2_rib_sready; 
    end
    10'b00_0000_1000:   // MEM_BASEADDR
    begin
        o_rib_mdin      = i3_rib_sdout;
        o_rib_mready    = i3_rib_sready; 
    end
    10'b00_0001_0000:   // TMR_BASEADDR
    begin
        o_rib_mdin      = i4_rib_sdout;
        o_rib_mready    = i4_rib_sready; 
    end
    10'b00_0010_0000:   // GPIO_BASEADDR
    begin
        o_rib_mdin      = i5_rib_sdout;
        o_rib_mready    = i5_rib_sready; 
    end
    10'b00_0100_0000:   // UART_BASEADDR
    begin
        o_rib_mdin      = i6_rib_sdout;
        o_rib_mready    = i6_rib_sready; 
    end
    10'b00_1000_0000:
    begin
        o_rib_mdin      = i7_rib_sdout;
        o_rib_mready    = i7_rib_sready; 
    end
    10'b01_0000_0000:
    begin
        o_rib_mdin      = i8_rib_sdout;
        o_rib_mready    = i8_rib_sready; 
    end
    10'b10_0000_0000:
    begin
        o_rib_mdin      = i9_rib_sdout;
        o_rib_mready    = i9_rib_sready; 
    end
    default:
    begin
        o_rib_mdin      = 0;
        o_rib_mready    = {1'b1, bus_err_ack};
    end
    endcase
end
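
The case statement above only matches one-hot values of s_cs, so two overlapping base addresses would fall through to the default branch and be reported as a bus error. A simulation-only check along the following lines can catch such a configuration mistake early. The module name `rib_cs_check` is hypothetical, and the check is an addition for illustration rather than part of the original design.

```verilog
// Hypothetical simulation-only checker; not part of the original bus module.
module rib_cs_check #(
    parameter DEV_NUM = 10
) (
    input [DEV_NUM - 1:0] s_cs,         // chip selects from the address decoder
    input [31:0]          i_rib_maddr   // address being decoded
);
    // s_cs & (s_cs - 1) clears the lowest set bit, so the result is
    // nonzero only when two or more chip selects are high
    wire multi_sel = |(s_cs & (s_cs - 1'b1));

    always @(*)
        if (multi_sel)
            $display("%0t: RIB overlapping chip selects %b at address %h",
                     $time, s_cs, i_rib_maddr);
endmodule
```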

 
