[PATCH bpf-next 3/3] riscv, bpf: Add 32 bit atomic operations to RV32 JIT
bot+bpf-ci at kernel.org
Wed Apr 29 06:13:29 PDT 2026
> diff --git a/arch/riscv/net/bpf_jit_comp32.c b/arch/riscv/net/bpf_jit_comp32.c
> index f8509950fed4..710c56b20f98 100644
> --- a/arch/riscv/net/bpf_jit_comp32.c
> +++ b/arch/riscv/net/bpf_jit_comp32.c
[ ... ]
> @@ -902,11 +902,40 @@ static int emit_store_r64(const s8 *dst, const s8 *src, s16 off,
> case BPF_MEM:
> emit(rv_sw(RV_REG_T0, 0, lo(rs)), ctx);
> break;
> - case BPF_ATOMIC: /* Only BPF_ADD supported */
> - emit(rv_amoadd_w(RV_REG_ZERO, lo(rs), RV_REG_T0, 0, 0),
> - ctx);
> + case BPF_ATOMIC:
> + {
> + s8 fetch_reg = (imm & BPF_FETCH) ? lo(rs) : RV_REG_ZERO;
> +
> + switch (imm) {
> + case BPF_ADD:
> + case BPF_ADD | BPF_FETCH:
> + emit(rv_amoadd_w(fetch_reg, lo(rs), RV_REG_T0, 0, 0), ctx);
> + break;
> + case BPF_AND:
> + case BPF_AND | BPF_FETCH:
> + emit(rv_amoand_w(fetch_reg, lo(rs), RV_REG_T0, 0, 0), ctx);
> + break;
> + case BPF_OR:
> + case BPF_OR | BPF_FETCH:
> + emit(rv_amoor_w(fetch_reg, lo(rs), RV_REG_T0, 0, 0), ctx);
> + break;
> + case BPF_XOR:
> + case BPF_XOR | BPF_FETCH:
> + emit(rv_amoxor_w(fetch_reg, lo(rs), RV_REG_T0, 0, 0), ctx);
> + break;
> + case BPF_XCHG:
> + emit(rv_amoswap_w(fetch_reg, lo(rs), RV_REG_T0, 0, 0), ctx);
> + break;
Do the aq=0, rl=0 arguments provide the correct memory ordering for the
FETCH variants and XCHG?
Looking at the BPF specification in include/linux/filter.h, these BPF
atomic operations map to Linux kernel atomic_fetch_* and atomic_xchg
primitives, which require full memory ordering.
The RV64 BPF JIT in arch/riscv/net/bpf_jit_comp64.c emits these same
operations with aq=1, rl=1 for FETCH variants and XCHG:
	case BPF_ADD | BPF_FETCH:
		emit(is64 ? rv_amoadd_d(rs, rs, rd, 1, 1) :
		     rv_amoadd_w(rs, rs, rd, 1, 1), ctx);
		break;
	case BPF_XCHG:
		emit(is64 ? rv_amoswap_d(rs, rs, rd, 1, 1) :
		     rv_amoswap_w(rs, rs, rd, 1, 1), ctx);
		break;
The kernel's arch_atomic_fetch_add in arch/riscv/include/asm/atomic.h also
uses amoadd.w.aqrl (which corresponds to aq=1, rl=1).
With aq=0, rl=0, the atomic operations use relaxed memory ordering, which
could allow loads and stores to move across the atomic. Could this cause
BPF programs using atomic_fetch_*/atomic_xchg on RV32 to observe
memory-ordering violations that wouldn't occur on other architectures?
Note: For the non-FETCH variants of BPF_ADD/AND/OR/XOR (the return-void
operations), aq=0, rl=0 appears correct since those map to relaxed
atomic primitives.
---
AI reviewed your patch. Please fix the bug or reply explaining why it is not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25109627626