Page 27 - 2025S
20 UEC Int’l Mini-Conference No.54
An Implementation and Evaluation of
V-extension Enabled RISC-V System-on-chip
The-Binh NGUYEN*, Cong-Kha PHAM
Department of Computer Network and Engineering
The University of Electro-Communications, Tokyo, Japan
Keywords: RISC-V, System-on-chip, Vector Extension, Out-of-order execution, Single-Instruction-Multiple-Data
1. Introduction
RISC-V is a flexible and modular instruction set architecture
that supports a wide range of use cases, from low-power
embedded systems to high-performance supercomputers. The
V-extension [1] enables high-performance computing through
explicit data parallelism: specialized instructions perform
an operation on multiple data elements with a single
instruction. However, due to its complexity,
there is a shortage of implementations and evaluations of
systems that take advantage of the RISC-V V-extension.
This project aims to implement and evaluate an out-of-order
RISC-V core on a system-on-chip.
2. Overall SoC Architecture
Fig 1 shows the overall SoC architecture with a V-extension
supported RISC-V out-of-order core and common memory-mapped
peripheral interfaces such as UART, SPI and GPIO.
Fig 1. RISC-V Out-of-order core architecture
3. Core Architecture
Fig 2 shows the out-of-order RISC-V core architecture with a
prefetch buffer and a unified issue & reorder buffer. The
scheduling strategy is slightly modified from the Tomasulo
algorithm [2]. Register renaming and scheduling are performed
directly on the reorder buffer, removing the need for a
reservation station. Furthermore, each entry of the reorder
buffer is thought of as a copy of the RISC-V state without the
register files. To support precise traps, all operations are
executed out of order (except for memory-write operations) and
committed in order: an entry can only be committed if it is in
the RETIRE state and is the oldest in the reorder buffer.
Fig 2. RISC-V Out-of-order core architecture
4. Memory load optimization
In general, a memory operation can only be performed when it
is the oldest entry in the reorder buffer, for instance when
the access targets the memory-mapped region. However, a read
from actual memory has no side effects and can be issued out
of order to improve load-use latency. This implementation
checks each memory read for an address alias with the older
memory writes to determine whether it can be issued right away.
5. Coremark benchmark result
6. Conclusion
The current implementation can still be improved in many ways,
such as multi-wide commit and rename & schedule, to achieve
multiple issue and further improve performance.
References
[1] RISC-V International. (2022). RISC-V Vector Extension
(RVV) Specification.
[2] R. M. Tomasulo, "An Efficient Algorithm for Exploiting
Multiple Arithmetic Units," in IBM Journal of Research and
Development, vol. 11, no. 1, pp. 25-33, Jan. 1967, doi:
10.1147/rd.111.0025.
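The address-alias check described in Section 4 can be illustrated with a minimal Python sketch. It is a software model under stated assumptions, not the hardware implementation: the names PendingWrite, overlaps and can_issue_load are hypothetical, as are the byte-range overlap test and the conservative rule that an older write with an unresolved address blocks the load.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PendingWrite:
    """An older, not-yet-committed memory write (hypothetical model)."""
    addr: Optional[int]  # None while the address is still unresolved
    size: int            # access width in bytes


def overlaps(a_addr: int, a_size: int, b_addr: int, b_size: int) -> bool:
    """Return True if the two byte ranges share at least one byte."""
    return a_addr < b_addr + b_size and b_addr < a_addr + a_size


def can_issue_load(load_addr: int, load_size: int,
                   older_writes: Sequence[PendingWrite],
                   is_mmio: bool) -> bool:
    """Decide whether a load may issue before it becomes the oldest entry."""
    if is_mmio:
        # Reads of the memory-mapped region may have side effects,
        # so they wait until they are the oldest operation.
        return False
    for w in older_writes:
        if w.addr is None:
            # Assumption: an unresolved older write is treated as aliasing.
            return False
        if overlaps(load_addr, load_size, w.addr, w.size):
            # The load would read bytes an older write has not stored yet.
            return False
    return True
```

In this model, a 4-byte load at 0x1000 may issue early past an older 4-byte write at 0x2000, but not past one at 0x1002, because the two byte ranges overlap.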
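The in-order commit rule of Section 3 can be sketched in Python as follows. This is only an illustrative model, not the RTL: the Entry class, the try_commit helper and every state name other than RETIRE are hypothetical, and committing at most one entry per call is an assumption.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum, auto
from typing import Deque, Optional


class State(Enum):
    ISSUED = auto()     # hypothetical: waiting for operands or a unit
    EXECUTING = auto()  # hypothetical: running in a functional unit
    RETIRE = auto()     # result ready; the state named in Section 3


@dataclass
class Entry:
    tag: int
    state: State


def try_commit(rob: Deque[Entry]) -> Optional[int]:
    """Commit the oldest entry if it is in RETIRE; return its tag or None."""
    # rob[0] is the oldest entry; younger entries never commit ahead of it.
    if rob and rob[0].state is State.RETIRE:
        return rob.popleft().tag
    return None


# Entry 1 finished first (out-of-order execution), but entry 0 is older.
rob: Deque[Entry] = deque([Entry(0, State.EXECUTING), Entry(1, State.RETIRE)])
first = try_commit(rob)       # nothing commits: the oldest is not in RETIRE
rob[0].state = State.RETIRE   # entry 0 completes
second = try_commit(rob)      # commits entry 0
third = try_commit(rob)       # commits entry 1
```

Because only the head of the queue is ever examined, a trap raised by the oldest entry is taken before any younger entry has changed the committed state, which is what makes the trap precise.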
*The author is supported by (SESS) MEXT Scholarship