I recently attended the GNU Tools Cauldron in Manchester, where Roger Espasa from Esperanto Technologies and I ran a BoF session on GCC support for the RISC-V Vector (V) extension. This is an interesting topic, because the V extension has features that aren’t present in any other supported SIMD / Vector Architecture. This post is a short writeup of the current state of efforts towards supporting the extension in both GCC and LLVM, and some pointers to where things appear to be going.
The RISC-V Vector Extension has some interesting features. Some highlights are:
- A hardware vector length that is not just unknown at compile-time, but can also vary on a frequent basis,
- A vector register file that can be reconfigured for different data types / sizes, and
- Optional support for different data shapes in vector registers – e.g. scalar, vector, matrix.
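To make the first of these points concrete, here is a minimal Python sketch of a strip-mined loop in which the active vector length is requested afresh on every iteration. The names `set_vl`, `vector_add` and `max_vl` are hypothetical, and the "grant the smaller of the request and the hardware maximum" rule is a simplifying assumption for illustration, not something taken from the draft specification.

```python
def set_vl(requested: int, max_vl: int) -> int:
    """Hypothetical model of a vector-length request: grant at most max_vl elements."""
    return min(requested, max_vl)


def vector_add(a, b, max_vl):
    """Add two equal-length lists in strips whose length is chosen at run time."""
    if max_vl < 1:
        raise ValueError("max_vl must be at least 1")
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    out = []
    strips = []  # active vector length used by each strip, recorded for illustration
    i = 0
    while i < len(a):
        # The granted length is only known here, at run time, and may differ
        # from one iteration to the next (typically on the final strip).
        vl = set_vl(len(a) - i, max_vl)
        out.extend(x + y for x, y in zip(a[i:i + vl], b[i:i + vl]))
        strips.append(vl)
        i += vl
    return out, strips
```

With ten elements and a maximum of four elements per strip, `vector_add(list(range(10)), [1] * 10, 4)` processes strips of length 4, 4 and 2. The same code gives the same result for any other `max_vl`, which is the property a compiler has to preserve when the length is not known at compile time.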
A video of one of Roger’s previous tutorials gives a nice introduction and overview of the architecture.
At present the V extension design is not yet finalised: the encodings and some instructions may still change. The current working version of the proposal is kept in the riscv-v-spec GitHub repository.
Robin Kruppe from TU Darmstadt has been experimenting with how LLVM can be adapted to support the V extension, in particular thinking about how to effectively make use of the changing vector length and its implications for the ABI.
An RFC discussion on the llvm-dev mailing list lays out Robin’s initial ideas involving a new IR type for supporting the vector length, and the ensuing discussion received contributions from various developers working on support for Arm’s Scalable Vector Extension (SVE). Since then, another thread, started by Graham Hunter of Arm, has moved towards the idea of using a single unified IR type for both SVE and V extension support.
Robin described his work so far at the recent EuroLLVM conference in Bristol, UK. All of his work is still quite experimental, both because the specification proposal keeps changing and because experimentation is needed to guide the implementation choices, so there is presently no public code repository containing an implementation of V extension support in LLVM.
Robin shared with me (in private correspondence) a summary of the current status of his work, reproduced here (with his permission) in edited form:
- An important divergence from the EuroLLVM presentation (and pre-EuroLLVM emails) is that a single unified IR type is used for both Arm SVE and RISC-V vectors.
- There seems to be consensus that this IR type needs to tackle the problem of the register sizes changing occasionally, but not the frequently-changing “active vector length” or VL register, which is more akin to predication (limiting processing to a subset of the lanes in the register).
- More specifically, Robin’s proposal to use the function boundary as the sole point where vector length may change seems to have found some support:
  - A caller and callee may have different vector register sizes, so vector-type arguments are illegal unless the ABI ensures that the sizes match.
  - However, vectors are never resized during the execution of a function.
  - This trades a bit of freedom in code generation (one can’t generally reconfigure the vector unit if a function contains two independent vectorized loops) for reduced IR complexity (far fewer and simpler changes are required to passes that work on a single function at a time).
- Regarding the active vector length/VL: currently RISC-V-specific intrinsics are used for vector operations controlled by VL, which take the value of VL as an integer parameter. Other approaches (e.g. instead using masks of the form <1, 1, ..., 1, 0, 0, ..., 0>) are conceivable, but more practical experience is needed. The primary concern with representing VL as a mask is instruction selection: how to ensure that one can reliably identify when an instruction is predicated by VL (either solely or by the conjunction of VL and an ordinary mask), and code-generate as such, rather than inefficiently materializing the <1, 1, ..., 1, 0, 0, ..., 0> mask. Robin is expecting to eventually make a proposal in this space.
- For loop vectorization, the new VPlan infrastructure is expected to be useful. A brief background on VPlan is given at the start of Diego Caballero’s talk “Extending LoopVectorize to Support Outer Loop Vectorization Using VPlan” at EuroLLVM this year. This too is common ground for Arm SVE and RISC-V, so there is also collaboration in this area – there will be another presentation on VPlan at the Bay Area LLVM Developers’ Meeting in October, which will include (among many other things) some thoughts about vectorization for scalable vector architectures.
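The two ways of expressing the active vector length mentioned in Robin’s summary can be compared with a small Python model. This is only a sketch: the function names are hypothetical, they do not correspond to any real LLVM intrinsic, and the choice that inactive lanes keep a pass-through value is an assumption made for illustration.

```python
def prefix_mask(vl, lanes):
    """Build the <1, ..., 1, 0, ..., 0> mask with vl leading ones."""
    return [1 if i < vl else 0 for i in range(lanes)]


def conjoin(m1, m2):
    """Lane-wise AND of two masks."""
    return [x & y for x, y in zip(m1, m2)]


def add_with_vl(a, b, passthru, vl, mask=None):
    """VL as an integer operand: a lane is computed only if it is below vl
    and, when an ordinary mask is given, enabled by that mask.
    Assumption: inactive lanes keep the passthru value."""
    lanes = len(a)
    if mask is None:
        mask = [1] * lanes
    return [a[i] + b[i] if (i < vl and mask[i]) else passthru[i]
            for i in range(lanes)]


def add_with_mask(a, b, passthru, mask):
    """Ordinary masked add: lanes with a zero mask bit keep the passthru value."""
    return [a[i] + b[i] if mask[i] else passthru[i] for i in range(len(a))]
```

In this model, `add_with_vl(a, b, p, vl, m)` and `add_with_mask(a, b, p, conjoin(prefix_mask(vl, len(a)), m))` always agree. The instruction selection concern is the reverse direction: given only the masked form, the compiler would have to recognise that the mask is a prefix mask (possibly ANDed with another mask) in order to recover the integer `vl` instead of materializing the mask.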
At present there has been no implementation work or experimentation conducted with GCC, and the GNU Cauldron’s BoF session marks the start of discussions about how this work could be carried out. The session led to a discussion that Richard Henderson summarised in a post to the GCC mailing list. It touches on the following topics:
- The needs of the register allocator.
- The needs of a “SIMD” ABI, in particular:
  - The callee must know how many registers are enabled by vconfig.
  - The callee must know MAXEL.
  - The callee must be able to reset to a previous vconfig.
  - The callee should be able to call-save registers.
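The "reset to a previous vconfig" requirement above can be pictured with a toy Python model in which a callee saves the caller's configuration, installs its own, and restores the original on exit. Everything here is hypothetical: the `VConfig` fields, the `VectorUnit` class and the example values are assumptions for illustration, and do not describe the draft specification or any proposed GCC ABI.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass(frozen=True)
class VConfig:
    """Hypothetical snapshot of the vector unit configuration."""
    enabled_regs: int  # number of vector registers currently enabled
    maxel: int         # maximum number of elements per vector register


class VectorUnit:
    """Toy model of a vector unit that holds one active configuration."""

    def __init__(self, config):
        self.config = config

    @contextmanager
    def reconfigured(self, new_config):
        saved = self.config  # the callee must be able to read the current state
        self.config = new_config
        try:
            yield self
        finally:
            self.config = saved  # reset to the previous configuration on exit


def callee(unit):
    """Use a private configuration without disturbing the caller's."""
    with unit.reconfigured(VConfig(enabled_regs=4, maxel=16)):
        return unit.config.maxel
```

After `callee(unit)` returns, `unit.config` is whatever the caller had set, even if the body raised an exception. A real ABI would additionally have to say how the contents of the registers themselves are preserved, which this model does not attempt to cover.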
Expect to see progress on all fronts:
- Further practical experimentation with LLVM, leading to an open-source repository of code containing experimental modifications and IR types,
- Continuation of discussions about how to develop implementations in both GCC and LLVM, and
- Eventual finalising and standardisation of the RISC-V Vector Extension.
Watch this space!