Is RISC-V Static Call worth implementing?

Fri Sep 13 01:05:33 PDT 2024

Hi folks,

I’m interested in implementing Static Call for RISC-V, and I would like
to know whether it is worth the effort.

In summary, Static Call is a mechanism that works similarly to global
function pointers. A simple use case is as follows:

int func_a(int arg1, int arg2);
int func_b(int arg1, int arg2);
// define a static call named my_name pointing to func_a
DEFINE_STATIC_CALL(my_name, func_a);
// call func_a through static call my_name
static_call(my_name)(arg1, arg2);
// make my_name point to a new function func_b
static_call_update(my_name, &func_b);
// call func_b
static_call(my_name)(arg1, arg2);
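
To make the semantics concrete, here is a minimal userspace sketch that
models the same define/call/update pattern with a plain function pointer.
The names (sc_key, SC_DEFINE, sc_call, sc_update, call_my_name,
retarget_my_name) are hypothetical stand-ins, not the kernel macros, and
the bodies of func_a and func_b are made up for illustration. As far as I
understand, this is roughly how the generic fallback behaves on an
architecture without a native implementation: every call goes through
the pointer, so it is still an indirect call.

```c
/* Hypothetical userspace model; names are illustrative, not kernel API. */
typedef int (*binop_fn)(int, int);

/* Example targets; the bodies are assumptions made for this sketch. */
static int func_a(int arg1, int arg2) { return arg1 + arg2; }
static int func_b(int arg1, int arg2) { return arg1 * arg2; }

/* The key stores the current target, like a global function pointer. */
struct sc_key {
    binop_fn func;
};

#define SC_DEFINE(name, fn) static struct sc_key sc_key_##name = { .func = (fn) }
#define sc_call(name)       (sc_key_##name.func)
#define sc_update(name, fn) (sc_key_##name.func = (fn))

/* Define a key named my_name that initially points to func_a. */
SC_DEFINE(my_name, func_a);

/* Call whatever my_name currently points to (an indirect call here). */
static int call_my_name(int arg1, int arg2)
{
    return sc_call(my_name)(arg1, arg2);
}

/* Retarget my_name; a real static call would patch code instead. */
static void retarget_my_name(binop_fn fn)
{
    sc_update(my_name, fn);
}
```

With these made-up bodies, call_my_name(3, 4) goes to func_a first and
to func_b after retarget_my_name(func_b). A native implementation keeps
the same interface but replaces the pointer load with patched code.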

The advantage of a static call over a function pointer is that static
calls are direct calls whereas function pointers are indirect calls.
On x86, direct calls are much faster than indirect calls when you
consider speculation mitigation options such as retpoline. So Static
Call is meaningful and has already been implemented for x86.

For RISC-V, a typical indirect call through a global function pointer
looks like this:

auipc a5, imm
# load the value of function pointer into a5
ld    a5, imm(a5)
# with the address of target function in a5, we can now jump to it
jalr  ra, 0(a5)

There are two versions of Static Call: out-of-line and inline. The
inline version builds on top of the out-of-line version and is faster.

For an out-of-line static call, the static call first jumps to a
trampoline, which then jumps to the actual function. The best approach
I can come up with is a three-instruction trampoline. Together with the
two instructions (AUIPC + JALR) needed to reach the trampoline, that
makes five instructions.

So it is five instructions with no memory operation versus three
instructions with one memory operation for the indirect call. I'm not
sure which one is faster.

For an inline static call, the static call directly jumps to the
target function. I discussed this with Peter Zijlstra, one of the
maintainers of Static Call. I guess we can use the two regular
instructions, AUIPC and JALR, to jump to the target function.

That is two instructions with no memory operation versus three
instructions with one memory operation, so an inline static call should
be faster than an indirect call.
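
Retargeting an AUIPC + JALR pair means recomputing the two immediates
from the PC-relative offset between the call site and the new target.
The sketch below shows only that arithmetic. The struct and function
names (auipc_jalr_imm, split_pcrel, rejoin_pcrel) are hypothetical, it
assumes the target lies within roughly +/-2 GiB of the call site, and
it says nothing about how the instructions would be patched safely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two immediates of an AUIPC + JALR pair. */
struct auipc_jalr_imm {
    int32_t hi20; /* AUIPC immediate, a signed 20-bit value */
    int32_t lo12; /* JALR immediate, in the range [-2048, 2047] */
};

/*
 * Split offset = target - pc so that (hi20 << 12) + lo12 == offset.
 * JALR sign-extends its 12-bit immediate, so when bit 11 of the offset
 * is set, lo12 becomes negative and hi20 is rounded up to compensate.
 */
static struct auipc_jalr_imm split_pcrel(int64_t offset)
{
    struct auipc_jalr_imm imm;
    /* Sign-extend the low 12 bits of the offset. */
    int64_t lo = (int64_t)((((uint64_t)offset) & 0xfff) ^ 0x800) - 0x800;
    /* offset - lo is an exact multiple of 4096. */
    int64_t hi = (offset - lo) / 4096;

    /* The AUIPC immediate must fit in 20 signed bits. */
    assert(hi >= -(1 << 19) && hi < (1 << 19));

    imm.hi20 = (int32_t)hi;
    imm.lo12 = (int32_t)lo;
    return imm;
}

/* Recombine the immediates the way the hardware would. */
static int64_t rejoin_pcrel(struct auipc_jalr_imm imm)
{
    return (int64_t)imm.hi20 * 4096 + imm.lo12;
}
```

For example, an offset of 0x12345fff splits into hi20 = 0x12346 and
lo12 = -1, not 0x12345 and 0xfff, because of the sign extension.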

Do the aforementioned benefits merit a RISC-V static call
implementation (especially inline)? Or are the benefits so negligible
that it’s simply not worth the effort?

It should be noted that updating a static call is much more troublesome
than updating a function pointer, so static calls are best suited to
replacing function pointers that rarely change. One such scenario is
tracepoints. With inline static calls, RISC-V tracepoint performance
should improve. I'm not sure by how much, though.

Best,
Juhan