[PATCH bpf-next v2 1/2] bpf, arm64: Remove redundant bpf_flush_icache() after pack allocator finalize
Xu Kuohai
xukuohai at huaweicloud.com
Mon Apr 13 18:55:56 PDT 2026
On 4/14/2026 3:11 AM, Puranjay Mohan wrote:
> bpf_flush_icache() calls flush_icache_range() to clean the data cache
> and invalidate the instruction cache for the JITed code region. However,
> since commit 1dad391daef1 ("bpf, arm64: use bpf_prog_pack for memory
> management"), this flush is redundant.
>
> bpf_jit_binary_pack_finalize() copies the JITed instructions to the ROX
> region via bpf_arch_text_copy() -> aarch64_insn_copy() -> __text_poke(),
> and __text_poke() already calls flush_icache_range() on the written
> range. The subsequent bpf_flush_icache() repeats the same cache
> maintenance on an overlapping range, including an unnecessary second
> synchronous IPI to all CPUs via kick_all_cpus_sync().
>
So the icache is flushed twice: once per instruction and again after all
instructions are copied. I think it would be better to remove the
per-instruction flush and keep the single final flush, so that the flush
overhead is not repeated for each instruction.
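
To make the cost difference concrete, here is a rough user-space model of
the two strategies. It is only a sketch: it counts simulated flush calls and
does not touch real caches. Every name in it (flush_stats, model_flush,
copy_flush_each_write, copy_flush_once) is hypothetical and is not a kernel
API. The write granularity is a parameter because the sketch makes no claim
about the granularity actually used by the kernel copy path.

```c
/*
 * User-space model only. It counts how often a cache flush would be
 * issued under two strategies. All names are hypothetical; none of
 * them is a kernel function.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct flush_stats {
	unsigned int flushes;	/* number of simulated flush calls */
	size_t bytes;		/* total bytes covered by those calls */
};

/* Stand-in for a cache maintenance call over 'len' bytes. */
static void model_flush(struct flush_stats *st, size_t len)
{
	st->flushes++;
	st->bytes += len;
}

/*
 * Strategy A: flush after every write of 'unit' bytes, then flush the
 * whole range once more at the end (the duplicated work).
 */
static struct flush_stats copy_flush_each_write(void *dst, const void *src,
						size_t len, size_t unit)
{
	struct flush_stats st = { 0, 0 };
	size_t done = 0;

	assert(unit > 0);
	while (done < len) {
		size_t n = (len - done < unit) ? len - done : unit;

		memcpy((char *)dst + done, (const char *)src + done, n);
		model_flush(&st, n);	/* flush right after this write */
		done += n;
	}
	model_flush(&st, len);		/* additional flush of the full range */
	return st;
}

/* Strategy B: copy everything first, then flush the range once. */
static struct flush_stats copy_flush_once(void *dst, const void *src,
					  size_t len)
{
	struct flush_stats st = { 0, 0 };

	memcpy(dst, src, len);
	model_flush(&st, len);
	return st;
}
```

In this model, copying 10000 bytes with a 4-byte unit gives 2501 flush calls
for strategy A and a single call for strategy B, with both leaving identical
bytes in the destination.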
> Remove the redundant bpf_flush_icache() call and its now-unused
> definition.
>
> Fixes: 1dad391daef1 ("bpf, arm64: use bpf_prog_pack for memory management")
> Acked-by: Song Liu <song at kernel.org>
> Signed-off-by: Puranjay Mohan <puranjay at kernel.org>
> ---
> arch/arm64/net/bpf_jit_comp.c | 12 ------------
> 1 file changed, 12 deletions(-)
>
> diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
> index adf84962d579..7417d24a5b6f 100644
> --- a/arch/arm64/net/bpf_jit_comp.c
> +++ b/arch/arm64/net/bpf_jit_comp.c
> @@ -18,7 +18,6 @@
>
> #include <asm/asm-extable.h>
> #include <asm/byteorder.h>
> -#include <asm/cacheflush.h>
> #include <asm/cpufeature.h>
> #include <asm/debug-monitors.h>
> #include <asm/insn.h>
> @@ -1961,11 +1960,6 @@ static int validate_ctx(struct jit_ctx *ctx)
> return 0;
> }
>
> -static inline void bpf_flush_icache(void *start, void *end)
> -{
> - flush_icache_range((unsigned long)start, (unsigned long)end);
> -}
> -
> static void priv_stack_init_guard(void __percpu *priv_stack_ptr, int alloc_size)
> {
> int cpu, underflow_idx = (alloc_size - PRIV_STACK_GUARD_SZ) >> 3;
> @@ -2204,12 +2198,6 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
> prog = orig_prog;
> goto out_off;
> }
> - /*
> - * The instructions have now been copied to the ROX region from
> - * where they will execute. Now the data cache has to be cleaned to
> - * the PoU and the I-cache has to be invalidated for the VAs.
> - */
> - bpf_flush_icache(ro_header, ctx.ro_image + ctx.idx);
> } else {
> jit_data->ctx = ctx;
> jit_data->ro_image = ro_image_ptr;