[PATCH 2/2] arm64: bpf: add BPF XADD instruction

Will Deacon will.deacon at arm.com
Wed Nov 11 04:38:31 PST 2015


On Wed, Nov 11, 2015 at 01:21:04PM +0100, Daniel Borkmann wrote:
> On 11/11/2015 12:58 PM, Will Deacon wrote:
> >On Wed, Nov 11, 2015 at 11:42:11AM +0100, Daniel Borkmann wrote:
> >>On 11/11/2015 11:24 AM, Will Deacon wrote:
> >>>On Wed, Nov 11, 2015 at 09:49:48AM +0100, Arnd Bergmann wrote:
> >>>>On Tuesday 10 November 2015 18:52:45 Z Lim wrote:
> >>>>>On Tue, Nov 10, 2015 at 4:42 PM, Alexei Starovoitov
> >>>>><alexei.starovoitov at gmail.com> wrote:
> >>>>>>On Tue, Nov 10, 2015 at 04:26:02PM -0800, Shi, Yang wrote:
> >>>>>>>On 11/10/2015 4:08 PM, Eric Dumazet wrote:
> >>>>>>>>On Tue, 2015-11-10 at 14:41 -0800, Yang Shi wrote:
> >>>>>>>>>aarch64 doesn't have native support for XADD instruction, implement it by
> >>>>>>>>>the below instruction sequence:
> >>>>>
> >>>>>aarch64 supports atomic add in ARMv8.1.
> >>>>>For ARMv8(.0), please consider using LDXR/STXR sequence.
> >>>>
> >>>>Is it worth optimizing for the 8.1 case? It would add a bit of complexity
> >>>>to make the code depend on the CPU feature, but it's certainly doable.
> >>>
> >>>What's the atomicity required for? Put another way, what are we racing
> >>>with (I thought bpf was single-threaded)? Do we need to worry about
> >>>memory barriers?
> >>>
> >>>Apologies if these are stupid questions, but all I could find was
> >>>samples/bpf/sock_example.c and it didn't help much :(
> >>
> >>The equivalent code more readable in restricted C syntax (that can be
> >>compiled by llvm) can be found in samples/bpf/sockex1_kern.c. So the
> >>built-in __sync_fetch_and_add() will be translated into a BPF_XADD
> >>insn variant.
> >
> >Yikes, so the memory-model for BPF is based around the deprecated GCC
> >__sync builtins, that inherit their semantics from ia64? Any reason not
> >to use the C11-compatible __atomic builtins[1] as a base?
> 
> Hmm, gcc doesn't have an eBPF compiler backend, so this won't work on
> gcc at all. The eBPF backend in LLVM recognizes the __sync_fetch_and_add()
> keyword and maps that to a BPF_XADD version (BPF_W or BPF_DW). In the
> interpreter (__bpf_prog_run()), as Eric mentioned, this maps to atomic_add()
> and atomic64_add(), respectively. So the struct bpf_insn prog[] you saw
> from sock_example.c can be regarded as one possible equivalent program
> section output from the compiler.

Ok, so if I understand you correctly, then __sync_fetch_and_add() has
different semantics depending on the backend target. That seems counter
to the LLVM atomics Documentation:

  http://llvm.org/docs/Atomics.html

which specifically calls out the __sync_* primitives as being
sequentially-consistent and requiring barriers on ARM (which isn't the
case for atomic[64]_add in the kernel).

If we re-use the __sync_* naming scheme in the source language, I don't
think we can overlay our own semantics in the backend. The
__sync_fetch_and_add primitive is also expected to return the old value,
which doesn't appear to be the case for BPF_XADD.

Will



More information about the linux-arm-kernel mailing list