1. Introduction to assembly language format
Related reference articles:
RISC-V teaching plan
Take asm_run_seg.S as an example:
    .equ BYTE_DELAY, 0x00100000
    .equ GPIO_ADDR, 0xf0000000
    .globl _start
    _start:
        LI t2, BYTE_DELAY;      # set counter
        ADDI t3, x0, 0;         # t3 = 0
        LI t0, GPIO_ADDR;       # set gpio base_address
        ......
    LOOP:
        ADDI t3, t3, 1;         # t3 = t3 + 1
        BNE t3, t2, LOOP;       # if (t3 != t2) goto LOOP
        ADDI t3, x0, 0;         # t3 = 0
- Assembler directives are written as “.” followed by a keyword, for example .equ and .globl
- .globl (spelled this way in the code, rather than .global) declares a global label, which can be accessed from other files. For example, the code above declares _start with .globl, so other files in the project that need to jump to the _start address can reference it directly
- .equ defines a symbolic constant, which can be used in the program once it is defined. For example, after BYTE_DELAY is defined as 0x00100000, BYTE_DELAY can be written wherever 0x00100000 is needed. This gives the constant a meaningful name, makes the code easier to understand, and lets every use be changed in one place. The format is .equ NAME, value, for example .equ GPIO_ADDR, 0xf0000000
- Labels, such as _start, are addresses. They mark a location in the program and serve as targets for jump and branch instructions. By convention here, a label is written either in uppercase letters (LOOP) or as an underscore followed by a lowercase word (_start). When defining a label, follow it with a colon “:”, as in LOOP: and _start:
- General assembly instruction format: instruction + space + register + “, ” + register + “, ” + register (or immediate), such as ADDI t3, x0, 0;. Some instructions take only one register and an immediate, in the format instruction + space + register + “, ” + immediate, such as LI a3, 0x08;
- In this code, each assembly instruction ends with a semicolon “;”, while a constant definition ends with a line break and takes no semicolon. Compare ADDI t3, x0, 0; with .equ GPIO_ADDR, 0xf0000000
- Everything after # is a comment: a description or explanation of the instruction or of this part of the program. The assembler ignores comments. For example, # set gpio base_address
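To make the format rules above concrete, here is a minimal Python sketch that splits one source line into its label, mnemonic (or directive), operands, and comment. It is only an illustration of the conventions listed in this section, not part of any assembler; the function name parse_line is hypothetical.

```python
def parse_line(line):
    """Split one source line into (label, mnemonic, operands, comment).

    Illustrative only: follows the conventions described in this article
    (optional "LABEL:", comma-separated operands, optional trailing ";",
    and "#" starting a comment).
    """
    # Everything after '#' is a comment.
    code, _, comment = line.partition("#")
    code = code.strip()
    comment = comment.strip()

    # A leading "NAME:" defines a label.
    label = None
    if ":" in code:
        label, _, code = code.partition(":")
        label = label.strip()
        code = code.strip()

    # Drop the semicolon that ends an instruction in this article's style.
    code = code.rstrip(";").strip()
    if not code:
        return label, None, [], comment

    # The first word is the mnemonic or directive; operands are comma-separated.
    parts = code.split(None, 1)
    mnemonic = parts[0]
    operands = [op.strip() for op in parts[1].split(",")] if len(parts) > 1 else []
    return label, mnemonic, operands, comment


print(parse_line("LOOP: ADDI t3, t3, 1; # t3 = t3 + 1"))
print(parse_line(".equ GPIO_ADDR, 0xf0000000"))
```

The first call yields the label LOOP, the mnemonic ADDI, three operands, and the comment text. The second yields no label, the directive .equ, and two operands.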
2. ABI
The ABI (Application Binary Interface) is the low-level interface between an application and the operating system, between an application and the libraries (lib) it calls, and between the components of an application. The register names it defines are shown in Figure 1. Here the ABI matters mainly in the following ways:
- Assembly code can refer to a register by the alias defined in the ABI. For example, the x1 register is commonly used to hold the return address, so its ABI name is ra (return address). In assembly language, ra can be written directly in place of x1.
- C compilers generally follow the ABI conventions when using registers. The assembler applies the same convention: if a JAL instruction omits rd, the ABI register ra, that is, x1, is used by default.
- When compiled C code is disassembled, the disassembler shows ABI names. Disassembly is mainly used to understand the generated machine code, and the ABI names are easier to read, so they are the usual choice (the numeric register names can be displayed instead). An example is shown in Figure 2.
Figure 1 Names of registers in ABI
Figure 2 ABI alias registers (left, red) and numeric registers (right, green)
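The alias mapping described in this section can be sketched as a small lookup table. The Python below is an illustration only: the table follows the standard RISC-V integer register naming (x0 to x31), while the names ABI_NAMES, to_numeric, and to_abi are hypothetical helpers introduced for this example.

```python
# ABI names for x0..x31, listed in register-number order.
ABI_NAMES = (
    ["zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2", "s0", "s1"]  # x0-x9
    + [f"a{i}" for i in range(8)]       # x10-x17: a0-a7
    + [f"s{i}" for i in range(2, 12)]   # x18-x27: s2-s11
    + [f"t{i}" for i in range(3, 7)]    # x28-x31: t3-t6
)


def to_numeric(abi_name):
    """Return the numeric register name (e.g. "x1") for an ABI alias."""
    if abi_name == "fp":  # fp is a second alias for s0 (x8)
        abi_name = "s0"
    return f"x{ABI_NAMES.index(abi_name)}"


def to_abi(numeric_name):
    """Return the ABI alias (e.g. "ra") for a numeric name such as "x1"."""
    return ABI_NAMES[int(numeric_name[1:])]


print(to_numeric("ra"))  # x1
print(to_numeric("t3"))  # x28
print(to_abi("x7"))      # t2
```

With this table, the branch BNE t3, t2, LOOP from section 1 can be read as comparing x28 with x7.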
3. Article reference
[1] D. Patterson and A. Waterman, The RISC-V reader. Berkeley: Strawberry Canyon LLC, 2018.