FII RISC-V3.01 FII-PRX100-D (ARTIX-7, XC7A100T) XILINX FPGA Board Coremark Migration Guide

1. Introduction to Coremark


Coremark has been EEMBC’s CPU benchmark since 2009. EEMBC (Embedded Microprocessor Benchmark Consortium) is a non-profit organization whose members include technology companies such as Huawei, Intel, ARM, and Analog Devices. Its benchmarks are an important criterion for evaluating embedded processors and compilers.

Coremark mainly exercises the ALU (arithmetic logic unit), memory references, the pipeline, and branch operations. It is designed so that its results cannot be computed ahead of time (for example by the compiler at build time), which keeps the benchmark fair and impartial. Within the timed portion of the test, Coremark does not allow calls to third-party libraries, so the result depends entirely on compiler optimization and CPU execution time. Because Coremark mainly tests the CPU architecture, the final result is normalized to remove the influence of the clock frequency that a given manufacturing process can reach: the score is divided by the system clock frequency in MHz, and the unit is Coremark/MHz. The main code of Coremark is written in C and includes list operations, state machine processing, matrix operations, and CRC (cyclic redundancy check) calculation.


2. Coremark porting


The first step is to download the C source code folder from the EEMBC official website, or to search for the EEMBC repository on GitHub. Eight files in the source folder need to be copied to the project workspace (FreedomStudio is the platform used here). They are as follows:

  • core_list_join.c
  • core_main.c
  • core_matrix.c
  • core_state.c
  • core_util.c
  • coremark.h
  • core_portme.c
  • core_portme.h

Three of the files in the list need to be changed: “core_portme.h”, “core_portme.c” and “coremark.h” (use the core_portme.h and core_portme.c found under the “simple” folder). The other files can be added directly to the project without modification.


2.1 Porting core_portme.h

There are a total of 14 configuration macros. They are shown in Table 1 as follows:

  • HAS_FLOAT: Defined as 1 if the porting platform supports floating point.
  • HAS_TIME_H: Defines whether the porting platform has the time.h header file and an implementation of its functions.
  • USE_CLOCK: Defines whether the porting platform has the time.h header file and an implementation of its functions.
  • HAS_STDIO: Defined as 1 if the porting platform has stdio.h.
  • HAS_PRINTF: Defined as 1 if the porting platform has stdio.h and implements the printf function.
  • COMPILER_VERSION: Put your compiler version here (e.g. GCC 7.2.0).
  • COMPILER_FLAGS: Put your compiler flags here (e.g. -O3).
  • MEM_LOCATION: Put the memory location used during execution here (e.g. STACK).
  • CORE_TICKS: Defines the return type of the timing functions.
  • SEED_METHOD: Defines the method used to obtain seed values that cannot be computed at compile time.
  • MEM_METHOD: Defines the method used to acquire memory.
  • MULTITHREAD: Defines parallel execution.
  • MAIN_HAS_NOARGC: Required if the porting platform does not support passing arguments to main (this flag only matters if MULTITHREAD is defined as greater than 1).
  • MAIN_HAS_NORETURN: Required if the porting platform does not support returning a value from main.

Table 1 Macro definitions and their descriptions


The above macros should be configured according to the target system and platform. For example, if floating point arithmetic is not supported, HAS_FLOAT should be set to 0. HAS_TIME_H and USE_CLOCK define whether the platform provides the time.h header file and an implementation of its functions. core_portme.c uses time.h to read a timer and so measure the time spent in the iteration loop of the main function. If the platform measures time with a different function, the timing code should be replaced accordingly. HAS_PRINTF defines whether the porting platform uses the standard I/O library for printing output, and it can also be redefined as needed. MEM_LOCATION describes the memory location where the code is executed.

Figure 1 Configuring HAS_FLOAT
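
Since the figure is not reproduced here, the following is a minimal sketch of what such a configuration might look like for a bare-metal target, assuming no floating point unit, no time.h, and a printf that has been retargeted to a serial console. The macro names are those listed in Table 1; every value is an illustrative assumption rather than the setting used for FII RISC-V3.01. The SEED_* and MEM_* constants normally come from coremark.h and are repeated only to keep the sketch self-contained.

```c
/* Hypothetical excerpt of core_portme.h for a bare-metal 32-bit target.
   All values below are assumptions chosen for illustration. */
#include <stdint.h>

/* Normally provided by coremark.h; repeated so this sketch compiles alone. */
#define SEED_ARG      0
#define SEED_FUNC     1
#define SEED_VOLATILE 2
#define MEM_STATIC    0
#define MEM_MALLOC    1
#define MEM_STACK     2

#define HAS_FLOAT   0   /* assumed: no floating point support */
#define HAS_TIME_H  0   /* assumed: no time.h on bare metal */
#define USE_CLOCK   0   /* clock() unavailable, a hardware counter is used */
#define HAS_STDIO   1
#define HAS_PRINTF  1   /* assumed: printf retargeted to a UART */

#define COMPILER_VERSION "GCC 7.2.0"   /* placeholder string */
#define COMPILER_FLAGS   "-O3"         /* placeholder string */
#define MEM_LOCATION     "STACK"       /* placeholder string */

typedef uint32_t CORE_TICKS;           /* return type of the timing functions */

#define SEED_METHOD       SEED_VOLATILE /* seeds read from volatile variables */
#define MEM_METHOD        MEM_STACK     /* benchmark data lives on the stack */
#define MULTITHREAD       1             /* a single execution context */
#define MAIN_HAS_NOARGC   1             /* main() receives no arguments */
#define MAIN_HAS_NORETURN 0             /* main() may return a value */
```

With HAS_TIME_H and USE_CLOCK both set to 0, the timing functions in core_portme.c have to be supplied by the port.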


In addition, the execution mode can be configured. As shown in Figure 2, one of PROFILE_RUN, PERFORMANCE_RUN or VALIDATION_RUN can be selected. TOTAL_DATA_SIZE can be modified in coremark.h.

Figure 2 Configuration execution mode


Some parameters can also be defined here, such as ITERATIONS, as shown in Figure 3.

Figure 3 Parameter configuration
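
The run mode and ITERATIONS are ordinary preprocessor definitions, so they can be set in the header or on the compiler command line (for example with -DPERFORMANCE_RUN=1 -DITERATIONS=2000). The sketch below shows one possible arrangement. The seed values follow what the upstream “simple” port is understood to use for each mode and should be checked against your copy of the sources; the default of 2000 iterations is an arbitrary placeholder.

```c
#include <stdint.h>

typedef int32_t ee_s32;

/* Select a run mode if none was given on the command line.
   PERFORMANCE_RUN is assumed here because it is the mode used for scores. */
#if !defined(PROFILE_RUN) && !defined(PERFORMANCE_RUN) && !defined(VALIDATION_RUN)
#define PERFORMANCE_RUN 1
#endif

/* Iteration count; 2000 is a placeholder, not the article's value. */
#ifndef ITERATIONS
#define ITERATIONS 2000
#endif

/* Seeds are volatile so the compiler cannot fold them into constants. */
#if defined(VALIDATION_RUN) && VALIDATION_RUN
volatile ee_s32 seed1_volatile = 0x3415;
volatile ee_s32 seed2_volatile = 0x3415;
volatile ee_s32 seed3_volatile = 0x66;
#elif defined(PERFORMANCE_RUN) && PERFORMANCE_RUN
volatile ee_s32 seed1_volatile = 0x0;
volatile ee_s32 seed2_volatile = 0x0;
volatile ee_s32 seed3_volatile = 0x66;
#else /* PROFILE_RUN */
volatile ee_s32 seed1_volatile = 0x8;
volatile ee_s32 seed2_volatile = 0x8;
volatile ee_s32 seed3_volatile = 0x8;
#endif
volatile ee_s32 seed4_volatile = ITERATIONS; /* iteration count seen by main */
volatile ee_s32 seed5_volatile = 0;
```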


Finally, note that the header files and libraries required by the chosen configuration (for example stdio.h when HAS_PRINTF is 1) must be added to the project and adapted as needed.


2.2 Porting core_portme.c

“EE_TICKS_PER_SEC” is the number of timer ticks per second, which depends on the system clock. If a different timer is used, it should be changed accordingly. The most important functions in core_portme.c are the timer-related ones: “start_time”, “stop_time” and “get_time”. As mentioned above, these functions can be modified as needed; see Figure 4 for details. The first function called by main in core_main.c is “portable_init”. Platform initialization and debug printing can be implemented there, as shown in Figure 5. Note that any libraries this requires should also be added.

Figure 4 Time-dependent functions
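
As a rough illustration of the timer functions, the sketch below counts CPU cycles, so that EE_TICKS_PER_SEC equals the 50 MHz system clock quoted in Section 3. The helper read_cycle_counter is hypothetical: on the target it would read a free-running hardware counter, while here it returns a variable so that the sketch also runs on a host PC. Integer seconds are used on the assumption that HAS_FLOAT is 0.

```c
#include <stdint.h>

#define CPU_CLOCK_HZ     50000000u     /* 50 MHz system clock */
#define EE_TICKS_PER_SEC CPU_CLOCK_HZ  /* assumption: one tick per CPU cycle */

typedef uint32_t CORE_TICKS;  /* wraps after about 85 s at 50 MHz */
typedef uint32_t secs_ret;    /* whole seconds, no floating point */

/* Hypothetical stand-in for a hardware cycle counter. */
static volatile uint32_t simulated_cycles;
static uint32_t read_cycle_counter(void) { return simulated_cycles; }

static CORE_TICKS start_time_val, stop_time_val;

/* Record the counter just before the timed loop starts. */
void start_time(void) { start_time_val = read_cycle_counter(); }

/* Record the counter just after the timed loop ends. */
void stop_time(void) { stop_time_val = read_cycle_counter(); }

/* Elapsed ticks; unsigned subtraction stays correct across one wrap. */
CORE_TICKS get_time(void) { return stop_time_val - start_time_val; }

/* Convert ticks to whole seconds. */
secs_ret time_in_secs(CORE_TICKS ticks)
{
    return (secs_ret)(ticks / EE_TICKS_PER_SEC);
}
```

A 32-bit tick counter is enough for a run of a little over 10 seconds at this clock rate; a longer run or a faster clock would call for a 64-bit type.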


Figure 5 Initial function
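
A minimal sketch of such an initialization function is given below. It assumes the port only needs to bring up a console and print a banner; board_console_init is a hypothetical hook, and the small structure and type names are redeclared here only so the sketch is self-contained.

```c
#include <stdint.h>
#include <stdio.h>

typedef uint8_t  ee_u8;
typedef uint32_t ee_u32;

/* Stand-in for the small structure handed to portable_init. */
typedef struct {
    ee_u8 portable_id;
} core_portable;

/* Hypothetical board hook: set up the UART (or other console) behind printf. */
static void board_console_init(void)
{
    /* nothing to do when running on a host PC */
}

void portable_init(core_portable *p, int *argc, char *argv[])
{
    (void)argc;  /* unused on a platform without command-line arguments */
    (void)argv;

    board_console_init();

    /* Sanity check: the benchmark relies on ee_u32 being exactly 32 bits. */
    if (sizeof(ee_u32) != 4) {
        printf("ERROR! ee_u32 must be a 32-bit type\n");
    }

    printf("Coremark port initialized\n");  /* debug banner */
    p->portable_id = 1;                      /* mark the port as initialized */
}
```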


2.3 Porting coremark.h

Memory-related functions such as “malloc” and “free” can be modified here, as shown in Figure 6.

Figure 6 Functions related to memory
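
On a platform without a heap, one option is to replace “malloc” and “free” with a small static allocator. The sketch below assumes allocation is routed through functions named portable_malloc and portable_free, as in the upstream sources; the 4 KB pool size is an arbitrary placeholder, and this is only one possible approach, not necessarily the one shown in the figure.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_BYTES 4096u  /* placeholder pool size */
#define POOL_WORDS (POOL_BYTES / 4u)

/* Word-typed storage keeps every returned block 4-byte aligned. */
static uint32_t pool[POOL_WORDS];
static size_t pool_used_words;

/* Hand out the next free chunk of the pool, or NULL if it does not fit. */
void *portable_malloc(size_t size)
{
    size_t words;
    void *block;

    if (size > POOL_BYTES) {
        return NULL;
    }
    words = (size + 3u) / 4u;  /* round up to whole words */
    if (words > POOL_WORDS - pool_used_words) {
        return NULL;
    }
    block = &pool[pool_used_words];
    pool_used_words += words;
    return block;
}

/* Blocks are never reused in this simple scheme, so free does nothing. */
void portable_free(void *p)
{
    (void)p;
}
```

Because Coremark allocates its working memory once at start-up, an allocator that never reclaims memory is usually sufficient for this purpose.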


2.4 Others

The entire program must run for at least 10 seconds, and the number of iterations can be modified to achieve this. In theory, the more iterations you run, the more accurate the results will be. In addition, accompanying documents are included in the source directory, and more information can be found there. For example, as shown in Figure 7, the README.MD lists some rules to ensure that Coremark results are valid.


Figure 7 Coremark README
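
Because a Coremark score is a number of iterations per second, the 10-second rule translates directly into a minimum iteration count. The helper below is a hypothetical illustration: given a rough score estimate and an optional safety margin in percent, it returns the smallest ITERATIONS value that should keep the run long enough.

```c
#include <stdint.h>

/* Smallest iteration count expected to run for at least min_seconds,
   given an estimated score in iterations per second.
   margin_percent adds headroom in case the estimate is too low. */
uint32_t min_iterations(uint32_t iterations_per_sec,
                        uint32_t min_seconds,
                        uint32_t margin_percent)
{
    uint64_t base = (uint64_t)iterations_per_sec * min_seconds;
    uint64_t scaled = base * (100u + margin_percent);
    return (uint32_t)((scaled + 99u) / 100u);  /* round up */
}
```

For a CPU that completes about 169 iterations per second, min_iterations(169, 10, 0) gives 1690, and a 20% margin raises this to 2028.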



3. Evaluation results


The system clock of FII RISC-V3.01 on the FII-PRX100-S (ARTIX-7, XC7A100T) XILINX FPGA board is 50 MHz. The Coremark score shown in Figure 8 is 169, which corresponds to 3.38 Coremark/MHz (169/50).


Figure 8 FII RISC-V3.01 Coremark
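
The normalization can be written out as a short calculation. The sketch below is meant to run on a host PC and is not part of the port; the function names are hypothetical, and the figure of 1690 iterations in 10 seconds is an illustrative run that yields the score of 169 reported above.

```c
#include <stdint.h>

/* Raw Coremark score: iterations completed per second. */
double coremark_score(uint32_t iterations, double seconds)
{
    return (double)iterations / seconds;
}

/* Normalized score: raw score divided by the clock frequency in MHz. */
double coremark_per_mhz(double score, double clock_hz)
{
    return score / (clock_hz / 1.0e6);
}
```

With these definitions, coremark_score(1690, 10.0) is 169, and coremark_per_mhz(169.0, 50.0e6) is approximately 3.38.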


Figure 9 lists the Coremark scores of some other CPUs certified on the EEMBC website.


Figure 9 Coremark scores of some CPUs certified by EEMBC


Figure 10 CPU Coremark comparison chart


FII RISC-V3.01 is a single-core CPU with a mixed 2-stage and 3-stage pipeline. Figure 10 plots the Coremarks of some single-core CPUs certified by EEMBC together with that of FII RISC-V3.01 as a line graph, with FII RISC-V3.01 highlighted in red. As can be seen, among the 15 CPUs listed, the FII RISC-V3.01 Coremark is above average. The three CPUs with significantly higher Coremarks are STMicroelectronics’ STM32H72x/73x rev Z and STM32H7B3 rev Z, and Renesas Electronics’ RX66T (all marked blue). According to the official manual from STMicroelectronics, the STM32H72x/73x rev Z and STM32H7B3 rev Z both use the Cortex-M7, which has a 6-stage superscalar pipeline. The Renesas RX66T uses the RXv3 core, which has an improved 5-stage pipeline and optimized branch prediction.

Because a superscalar pipeline can dispatch multiple instructions for execution in the same clock cycle, the performance of such a CPU will naturally be higher. In addition, FII RISC-V3.01 implements neither branch prediction nor out-of-order execution, which may also be why its Coremark score is lower than that of the RX66T.

The Texas Stellaris Cortex-M3 implements branch prediction and has a 3-stage pipeline, like FII RISC-V3.01. Even so, the Coremark of FII RISC-V3.01 is more than double that of the Texas Stellaris Cortex-M3 (marked in blue). Compared with the Cortex-M0-based Microchip ATSAML21J18B (marked in blue), which also has a 3-stage pipeline and no branch prediction, the Coremark score of FII RISC-V3.01 is still much higher. This may be because the Cortex series chips are based on the ARM architecture, while FII RISC-V3.01 is based on the RISC-V architecture. The RISC-V instruction set is more streamlined than ARM's, so a CPU designed around it may achieve higher performance.

The fact that FII RISC-V3.01, without branch prediction, scores higher than the Texas Stellaris Cortex-M3, which has it, also suggests that the internal logic of FII RISC-V3.01 is more efficient. All in all, the internal implementation of FII RISC-V3.01 gives it a Coremark score above the average of the other CPUs with comparable features (three-stage pipeline, no branch prediction, no superscalar pipeline design).
