[RFC] Implementing the BPF dispatcher on ARM64

Florent Revest revest at chromium.org
Mon Mar 13 17:09:56 PDT 2023


Hey Puranjay! :)

On Mon, Mar 13, 2023 at 1:56 PM Puranjay Mohan <puranjay12 at gmail.com> wrote:
>
> [CC: Florent, KP]
>
> On Mon, Mar 13, 2023 at 7:50 AM Xu Kuohai <xukuohai at huawei.com> wrote:
> >
> > [ cc arm list ]
> >
> > On 3/10/2023 5:33 PM, Puranjay Mohan wrote:
> > > Hi,
> > > I am starting this thread to know if someone is implementing the BPF
> > > dispatcher for ARM64 and if not, what would be needed to make this
> > > happen.

As Alexei said, I've been doing some work on ftrace direct calls on
arm64 (so that the trampolines can get called for tracing programs):

https://lore.kernel.org/all/20230207182135.2671106-1-revest@chromium.org/

It is currently blocked waiting for a review from the ftrace
maintainer. Steven has been quite busy but I regularly nag him to
review it :)

> > > The basic infra + x86 specific code was introduced in [1] by Björn Töpel.
> > >
> > > To make BPF dispatcher work on ARM64, the
> > > arch_prepare_bpf_dispatcher() has to be implemented in
> > > arch/arm64/net/bpf_jit_comp.c.
> > >
> > > As I am not well versed with XDP and the JIT, I have a few questions
> > > regarding this.
> > >
> > > 1. What is the best way to test this? Is there a selftest that will
> > > fail now and will pass once the dispatcher is implemented?
> > > 2. As there is no CONFIG_RETPOLINE in ARM64, will the dispatcher be useful?
> >
> > Hello,
> >
> > I have some thoughts for bpf dispatcher in arm64.
> >
> > bpf dispatcher uses static call to convert indirect call instructions to direct
> > call instructions, to avoid performance penalty introduced by retpoline. Since
> > there is no retpoline or static call in arm64, bpf dispatcher seems useless.

But I agree with Xu here. The reason I did not look into bpf
dispatchers for arm64 is that there is no retpoline cost on arm64.

> > In addition, the range for a direct call instruction in arm64 is +-128MB, but
> > jited bpf image address is outside of +-128MB, so it may not be possible to call
> > a bpf prog with direct call instruction.
>
> So, to summarize all the information about BPF Dispatcher on ARM64:
> 1. The range for the B and BL instructions in arm64 is +-128MB, so we
> can't use direct jump.
> 2. Static Calls are not supported on ARM64 yet.
> 3. bpf_prog_pack allocator for ARM64 is not yet enabled because
> bpf_arch_text_copy() and bpf_arch_text_invalidate() are not implemented.
>
> Even if static calls are implemented the dispatcher can't be
> implemented because of point 1.

And even if dispatchers could be implemented, I don't see what value
they would bring on arm64.
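
For reference, the range limit from point 1 can be sketched in plain C.
This is only an illustration and not kernel code: the helper and macro
names are made up for this mail. B and BL encode a signed 26-bit word
offset, so the reachable byte offset is [-128MB, 128MB) and must be a
multiple of 4.

```c
#include <stdbool.h>
#include <stdint.h>

/* 128MB: reach of a signed 26-bit word offset (hypothetical macro name). */
#define A64_BRANCH_RANGE (INT64_C(1) << 27)

/*
 * Hypothetical helper: returns true if a B/BL placed at 'pc' can reach
 * 'target' directly, i.e. the byte offset is 4-byte aligned and lies in
 * [-128MB, 128MB).
 */
static bool a64_branch_in_range(uint64_t pc, uint64_t target)
{
	int64_t offset = (int64_t)(target - pc);

	if (offset & 3)
		return false;

	return offset >= -A64_BRANCH_RANGE && offset < A64_BRANCH_RANGE;
}
```

A JITed image that lands further away than that from the call site
fails this check, which is the situation Xu describes above.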

> What would be required to implement bpf_arch_text_copy() and
> bpf_arch_text_invalidate()? Enabling the bpf_prog_pack allocator for
> ARM64 would be useful in the JIT as well.

I have not looked into this at all but, out of curiosity, have you
noticed the series for powerpc sent to the list just a few days ago?

https://lore.kernel.org/bpf/20230309180028.180200-1-hbathini@linux.ibm.com/


