1. I-type integer register-immediate instructions
Related reference articles:
RISC-V teaching plan
The previous article, "RISC-V instruction set explanation (1): General-purpose registers and assembly instruction classification", introduced the general-purpose registers, the program counter, and the six basic instruction types. This article starts with the I-type integer register-immediate instructions and explains each assembly instruction in detail, with concrete instruction examples.
Figure 1 Machine code formats of the six basic instruction types [1]
Except for the 5-bit immediates used in CSR instructions, immediates are always sign-extended and are generally packed toward the leftmost available bits of the instruction [1]. As shown in Figure 1,
for every instruction type that carries an immediate (I-type, S-type, B-type, U-type and J-type), the sign bit used for sign extension is always bit 31 of the instruction, which is also the highest bit of the immediate (for example, imm[20] in J-type).
The immediates of the I-type instructions discussed in this article are therefore all sign-extended.
Sign extension can be illustrated with a 12-bit immediate:
If the highest bit (bit 11) is 0, the immediate is non-negative.
If the highest bit is 1, the immediate is negative. When a non-negative value is sign-extended to 32 bits, the upper 20 bits are filled with 0; when a negative value is sign-extended, the upper 20 bits are filled with 1. The addition or comparison is then performed on the 32-bit result. With unsigned (zero) extension, by contrast, the upper 20 bits are always filled with 0.
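The rule above can be sketched in a few lines of Python. This is only an illustration: the helper names sign_extend and zero_extend are hypothetical, and values are modeled as 32-bit patterns held in ordinary Python integers.

```python
MASK32 = 0xFFFF_FFFF  # 32-bit register width assumed (RV32I)

def sign_extend(value: int, bits: int = 12) -> int:
    """Sign-extend a `bits`-wide field to a 32-bit pattern."""
    field_mask = (1 << bits) - 1
    value &= field_mask
    if value & (1 << (bits - 1)):      # highest bit of the field is 1: negative
        value |= MASK32 & ~field_mask  # fill the upper bits with 1
    return value                       # otherwise the upper bits stay 0

def zero_extend(value: int, bits: int = 12) -> int:
    """Zero-extend a `bits`-wide field: the upper bits are always 0."""
    return value & ((1 << bits) - 1)

print(f"{sign_extend(0x005):#010x}")  # 0x00000005
print(f"{sign_extend(0xFFF):#010x}")  # 0xffffffff
print(f"{zero_extend(0xFFF):#010x}")  # 0x00000fff
```

The same 12-bit pattern 0xFFF therefore becomes -1 (0xffffffff) under sign extension but 4095 (0x00000fff) under zero extension.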
Most integer arithmetic instructions operate on XLEN-bit values held in the integer registers (in RV32I the integer registers are 32 bits wide). Integer arithmetic instructions use either the I-type format for register-immediate operations or the R-type format for register-register operations.
The opcode used by the I-type register-immediate instructions is named OP-IMM.
Their immediate is fixed at 12 bits and is named the I-immediate, as shown in Figure 2.
Figure 2 Integer register-immediate instruction machine code format [2]
I-type has a total of 15 instructions. This article covers the first 6:
- ADDI
- SLTI
- SLTIU
- ANDI
- ORI
- XORI
1.1. ADDI
The ADDI instruction format is ADDI rd, rs1, immediate. Its operation is x[rd] = x[rs1] + sext(immediate)
For example:
ADDI x13, x12, 5
Add the sign-extended immediate value 5 to the value in the x12 register and place the result in the x13 register.
Its machine code is shown in Figure 3. For this ADDI instruction:
opcode (OP-IMM) is 7'b001_0011
funct3 is 3'b000
immediate is 12'b0000_0000_0101
rs1 is 5'b0_1100
rd is 5'b0_1101
So the machine code corresponding to ADDI x13, x12, 5 is 0000_0000_0101_01100_000_01101_0010011, and the corresponding hexadecimal value is 32'h0056_0693.
Figure 3 ADDI machine encoding format [2]
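The field packing for ADDI x13, x12, 5 can be checked with a short Python sketch. The function name encode_i_type is hypothetical; the bit positions follow the I-type layout described above (imm[11:0] in bits 31:20, rs1 in 19:15, funct3 in 14:12, rd in 11:7, opcode in 6:0).

```python
OP_IMM = 0b0010011  # opcode shared by the register-immediate instructions

def encode_i_type(imm: int, rs1: int, funct3: int, rd: int,
                  opcode: int = OP_IMM) -> int:
    """Pack the I-type fields into one 32-bit instruction word."""
    if not -2048 <= imm <= 2047:
        raise ValueError("I-immediate must fit in 12 signed bits")
    if not (0 <= rs1 < 32 and 0 <= rd < 32):
        raise ValueError("register numbers must be in 0..31")
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

word = encode_i_type(imm=5, rs1=12, funct3=0b000, rd=13)
print(f"{word:#010x}")  # 0x00560693
```

Changing only funct3 selects a different instruction with the same opcode, for example 0b010 for SLTI or 0b111 for ANDI.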
After decoding, whenever the opcode and funct3 fields of the machine code are 001_0011 and 000, the instruction must be ADDI. In ADDI, "ADD" means addition and "I" means immediate, so ADDI stands for "add immediate". The instruction adds the sign-extended immediate to the value in the rs1 register and stores the sum in rd. Arithmetic overflow is ignored: the result is simply the low XLEN bits of the sum. If overflow detection is needed, it can be implemented in software.
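As a rough illustration of the decode step and of the ignored overflow, the following Python sketch splits an instruction word into its I-type fields and models ADDI on 32-bit registers. All function names are hypothetical, and addi_overflows is only one possible software check, written at the Python level rather than as a RISC-V instruction sequence.

```python
MASK32 = 0xFFFF_FFFF

def decode_i_type(word: int):
    """Split a 32-bit word into (opcode, rd, funct3, rs1, imm)."""
    opcode = word & 0x7F
    rd = (word >> 7) & 0x1F
    funct3 = (word >> 12) & 0x7
    rs1 = (word >> 15) & 0x1F
    imm = (word >> 20) & 0xFFF
    if imm & 0x800:          # bit 31 of the word is the sign bit
        imm -= 0x1000
    return opcode, rd, funct3, rs1, imm

def is_addi(word: int) -> bool:
    opcode, _, funct3, _, _ = decode_i_type(word)
    return opcode == 0b0010011 and funct3 == 0b000

def to_signed32(value: int) -> int:
    value &= MASK32
    return value - (1 << 32) if value & 0x8000_0000 else value

def addi(rs1_value: int, imm: int) -> int:
    """x[rd] = x[rs1] + sext(imm); bits above bit 31 are discarded."""
    return (rs1_value + imm) & MASK32

def addi_overflows(rs1_value: int, imm: int) -> bool:
    """Illustrative check: does the exact signed sum fall outside 32 bits?"""
    exact = to_signed32(rs1_value) + imm
    return not -2**31 <= exact <= 2**31 - 1

print(decode_i_type(0x00560693))          # (19, 13, 0, 12, 5)
print(f"{addi(0x7FFF_FFFF, 1):#010x}")    # 0x80000000
print(addi_overflows(0x7FFF_FFFF, 1))     # True
```

Here the largest positive 32-bit value plus 1 wraps to 0x80000000 without any trap, which is the behavior the text describes.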
Here we introduce the concept of a pseudo-instruction. A pseudo-instruction is an assembly mnemonic that does not exist as a separate instruction in the instruction set; the assembler translates it into real instructions. Pseudo-instructions make assembly code more convenient to write and are used frequently.
- For example, moving a value between registers is common in assembly programs, and there is a pseudo-instruction MV (move) for it.
Instruction writing format: MV rd, rs1
This instruction copies the value in rs1 to rd (x86 and MCS-51 have comparable move instructions).
It is actually ADDI rd, rs1, 0: the value in the rs1 register is added to the immediate value 0 and the result is stored in the rd register. Because adding zero does not change the value, MV effectively copies the value of rs1 into rd. The user can write the pseudo-instruction MV in an assembly program. When the program is assembled, the assembler translates it into ADDI rd, rs1, 0, and it is this ADDI instruction that the CPU executes.
- Another frequently used pseudo-instruction is NOP (no operation).
Instruction writing format: NOP
It is actually ADDI x0, x0, 0. As mentioned before, x0 cannot be overwritten; it can only be read and always holds 0. ADDI x0, x0, 0 therefore performs an addition with no visible result: the sum x0 + 0 is written to x0 and discarded, so the only effect is to advance the PC.
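The translation step an assembler performs for MV and NOP can be sketched as a simple text rewrite. This is a toy illustration, not how any particular assembler is implemented; the table PSEUDO and the function expand are hypothetical names.

```python
# Hypothetical expansion table: pseudo-instruction -> real instruction text.
PSEUDO = {
    "mv": lambda rd, rs1: f"addi {rd}, {rs1}, 0",
    "nop": lambda: "addi x0, x0, 0",
}

def expand(line: str) -> str:
    """Rewrite a pseudo-instruction into the real instruction it stands for."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [op.strip() for op in rest.split(",") if op.strip()]
    rule = PSEUDO.get(mnemonic.lower())
    # Lines that are not pseudo-instructions pass through unchanged.
    return rule(*operands) if rule else line.strip()

print(expand("mv x13, x12"))     # addi x13, x12, 0
print(expand("nop"))             # addi x0, x0, 0
print(expand("addi x1, x2, 3"))  # addi x1, x2, 3
```

Only the expanded ADDI text would go on to be encoded into machine code; the CPU never sees a separate MV or NOP encoding.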
1.2. SLTI
The SLTI instruction format is SLTI rd, rs1, immediate. Its operation is x[rd] = x[rs1] < sext(immediate), using a signed comparison.
Its machine code is shown in Figure 4: the opcode (OP-IMM) of SLTI is 001_0011 and funct3 is 010. In SLTI, "S" stands for set: the instruction writes 1 to the rd register if the condition holds and 0 if it does not. The condition is "LT", less than, and "I" stands for immediate, so the test is whether the value of rs1 is less than the sign-extended immediate, with both treated as signed numbers. SLTI stands for "set if less than immediate".
Command example:
SLTI x13, x12, 5
Compare the value in the x12 register with the immediate value 5 (still 5 after sign extension). If the value in x12 is less than 5 (signed comparison), x13 is set to 1; otherwise x13 is set to 0.
Figure 4 SLTI machine encoding format [2]
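The behavior of SLTI can be modeled with a small Python sketch. The names slti and to_signed32 are hypothetical; register values are 32-bit patterns, and the immediate is passed as an ordinary signed integer in the range -2048..2047.

```python
def to_signed32(value: int) -> int:
    """Interpret a 32-bit pattern as a two's-complement signed number."""
    value &= 0xFFFF_FFFF
    return value - (1 << 32) if value & 0x8000_0000 else value

def slti(rs1_value: int, imm: int) -> int:
    """x[rd] = 1 if x[rs1] < sext(imm) as signed numbers, else 0."""
    return 1 if to_signed32(rs1_value) < imm else 0

print(slti(3, 5))            # 1: 3 < 5
print(slti(5, 5))            # 0: 5 is not less than 5
print(slti(0xFFFF_FFFF, 5))  # 1: the pattern is -1 when read as signed
```

The last call shows why the signedness matters: 0xffffffff is a very large unsigned number but is -1 in a signed comparison.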
1.3. SLTIU
The SLTIU instruction format is SLTIU rd, rs1, immediate. Its operation is x[rd] = x[rs1] <u sext(immediate), where <u denotes an unsigned comparison.
Its machine code is shown in Figure 5: the opcode (OP-IMM) of SLTIU is 001_0011 and funct3 is 011. The only difference between SLTIU and SLTI is the "U", which stands for unsigned: the immediate is still sign-extended first, but the comparison then treats both operands as unsigned numbers. Take 8-bit binary numbers as an example, with -1 = 8'b1111_1111 and -2 = 8'b1111_1110. -2 < -1 holds in a signed comparison, and it still holds in an unsigned comparison (254 < 255). But compare -2 with +1 = 8'b0000_0001: the signed comparison gives -2 < 1, whereas the unsigned comparison gives 1111_1110 > 0000_0001 (254 > 1).
Command example:
SLTIU x13, x12, -1
Compare the value in the x12 register with the immediate value -1 (0xffffffff after sign extension). If the value in x12 is less than 0xffffffff (unsigned comparison), x13 is set to 1; otherwise x13 is set to 0.
Figure 5 SLTIU machine encoding format [2]
Pseudo-instruction SEQZ (set if equal to zero): SEQZ rd, rs1 is equivalent to SLTIU rd, rs1, 1. It is a frequently used special case of SLTIU. Among unsigned numbers, the only value smaller than 1 is 0, so SLTIU rd, rs1, 1 sets rd to 1 exactly when rs1 = 0, which is the behavior of SEQZ rd, rs1.
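A minimal Python model of SLTIU and SEQZ, assuming 32-bit registers; the names sltiu and seqz are hypothetical. The immediate is given as a signed integer in -2048..2047, sign-extended to 32 bits, and then both operands are compared as unsigned values.

```python
MASK32 = 0xFFFF_FFFF

def sltiu(rs1_value: int, imm: int) -> int:
    """x[rd] = 1 if x[rs1] < sext(imm) as unsigned numbers, else 0."""
    imm_unsigned = imm & MASK32  # sign-extend, then reinterpret as unsigned
    return 1 if (rs1_value & MASK32) < imm_unsigned else 0

def seqz(rs1_value: int) -> int:
    """SEQZ rd, rs1 modeled as SLTIU rd, rs1, 1."""
    return sltiu(rs1_value, 1)

print(sltiu(12, -1))           # 1: 12 < 0xffffffff
print(sltiu(0xFFFF_FFFF, -1))  # 0: the two values are equal
print(sltiu(0xFFFF_FFFE, 1))   # 0: unsigned 0xfffffffe is far above 1
print(seqz(0), seqz(5))        # 1 0
```

With the immediate -1, SLTIU yields 1 for every rs1 value except 0xffffffff, since 0xffffffff is the largest unsigned 32-bit value.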
1.4. ANDI
The ANDI instruction format is ANDI rd, rs1, immediate. Its operation is x[rd] = x[rs1] & sext(immediate)
Its machine code is shown in Figure 6: the opcode (OP-IMM) of ANDI is 001_0011 and funct3 is 111. The instruction sign-extends the immediate to 32 bits, performs a bitwise AND (&) with rs1, and writes the result to rd.
Command example:
ANDI x13, x12, 5
Bitwise AND the number in the x12 register with the sign-extended immediate value 5, and write the result to the x13 register.
Figure 6 ANDI machine encoding format [2]
1.5. ORI
The ORI instruction format is ORI rd, rs1, immediate. Its operation is x[rd] = x[rs1] | sext(immediate)
Its machine code is shown in Figure 7: the opcode (OP-IMM) of ORI is 001_0011 and funct3 is 110. The instruction sign-extends the immediate to 32 bits, performs a bitwise OR (|) with rs1, and writes the result to rd.
Command example:
ORI x13, x12, 5
Bitwise OR the number in the x12 register with the sign-extended immediate value of 5 and write the result to the x13 register.
Figure 7 ORI machine encoding format [2]
1.6. XORI
The XORI instruction format is XORI rd, rs1, immediate. Its operation is x[rd] = x[rs1] ^ sext(immediate)
Its machine code is shown in Figure 8: the opcode (OP-IMM) of XORI is 001_0011 and funct3 is 100. The instruction sign-extends the immediate to 32 bits, performs a bitwise exclusive OR (^) with rs1, and writes the result to rd.
Pseudo-instruction NOT: NOT rd, rs1 is equivalent to XORI rd, rs1, -1, whose 12-bit immediate field is 12'hfff. NOT is a bitwise inversion: it inverts every bit of the value in rs1 and writes the result to rd. XORing a bit with 1 inverts it (0 xor 1 = 1, 1 xor 1 = 0), and sign-extending 12'hfff gives 32 bits that are all 1, so XORing rs1 with it inverts every bit.
Command example:
XORI x13, x12, 5
XOR the number in the x12 register with the sign-extended immediate value 5 and write the result to the x13 register.
Figure 8 XORI machine encoding format [2]
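The three logical instructions ANDI, ORI and XORI, together with the NOT pseudo-instruction, can be modeled in Python as follows. The function names are hypothetical, registers are assumed to be 32 bits wide, and the immediate is passed as a signed integer in -2048..2047.

```python
MASK32 = 0xFFFF_FFFF

def sext12(imm: int) -> int:
    """32-bit pattern of a 12-bit signed immediate."""
    if not -2048 <= imm <= 2047:
        raise ValueError("I-immediate must fit in 12 signed bits")
    return imm & MASK32

def andi(rs1_value: int, imm: int) -> int:
    return rs1_value & sext12(imm)

def ori(rs1_value: int, imm: int) -> int:
    return (rs1_value | sext12(imm)) & MASK32

def xori(rs1_value: int, imm: int) -> int:
    return (rs1_value ^ sext12(imm)) & MASK32

def not_(rs1_value: int) -> int:
    """NOT rd, rs1 modeled as XORI rd, rs1, -1."""
    return xori(rs1_value, -1)

x12 = 0b1100  # example register value, chosen only for illustration
print(andi(x12, 5), ori(x12, 5), xori(x12, 5))  # 4 13 9
print(f"{not_(x12):#010x}")                     # 0xfffffff3
print(f"{andi(0x1234_5678, -16):#010x}")        # 0x12345670
```

The last line shows a practical consequence of sign extension: the immediate -16 becomes 0xfffffff0, so ANDI clears only the low four bits and leaves the upper 28 bits untouched.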
Notice:
RISC-V instructions are flexible. The machine code formats above do not tie rs1 or rd to fixed registers: when writing an assembly program, the user can choose any of the 32 general-purpose registers as rs1 and rd as needed, and rs1 and rd can even be the same register.
2. References
[1] Riscv.org, 2021. [Online]. Available: https://riscv.org/wp-content/uploads/2019/12/riscv-spec-20191213.pdf. [Accessed: 22-Feb-2021].
[2] D. Patterson and A. Waterman, The RISC-V reader. Berkeley: Strawberry Canyon LLC, 2018.