[PATCH 2/2] arm64: bpf: add BPF XADD instruction
will.deacon at arm.com
Wed Nov 11 03:58:52 PST 2015
On Wed, Nov 11, 2015 at 11:42:11AM +0100, Daniel Borkmann wrote:
> On 11/11/2015 11:24 AM, Will Deacon wrote:
> >On Wed, Nov 11, 2015 at 09:49:48AM +0100, Arnd Bergmann wrote:
> >>On Tuesday 10 November 2015 18:52:45 Z Lim wrote:
> >>>On Tue, Nov 10, 2015 at 4:42 PM, Alexei Starovoitov
> >>><alexei.starovoitov at gmail.com> wrote:
> >>>>On Tue, Nov 10, 2015 at 04:26:02PM -0800, Shi, Yang wrote:
> >>>>>On 11/10/2015 4:08 PM, Eric Dumazet wrote:
> >>>>>>On Tue, 2015-11-10 at 14:41 -0800, Yang Shi wrote:
> >>>>>>>aarch64 doesn't have native support for XADD instruction, implement it by
> >>>>>>>the below instruction sequence:
> >>>aarch64 supports atomic add in ARMv8.1.
> >>>For ARMv8(.0), please consider using LDXR/STXR sequence.
> >>Is it worth optimizing for the 8.1 case? It would add a bit of complexity
> >>to make the code depend on the CPU feature, but it's certainly doable.
> >What's the atomicity required for? Put another way, what are we racing
> >with (I thought bpf was single-threaded)? Do we need to worry about
> >memory barriers?
> >Apologies if these are stupid questions, but all I could find was
> >samples/bpf/sock_example.c and it didn't help much :(
> The equivalent code, in more readable restricted C syntax (that can be
> compiled by llvm), can be found in samples/bpf/sockex1_kern.c. So the
> built-in __sync_fetch_and_add() will be translated into a BPF_XADD
> insn variant.
Yikes, so the memory-model for BPF is based around the deprecated GCC
__sync builtins, that inherit their semantics from ia64? Any reason not
to use the C11-compatible __atomic builtins as a base?
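
As a rough illustration of the difference between the two builtin
families (a minimal sketch in plain user-space C, not BPF code; the
helper names add_sync and add_atomic_relaxed are hypothetical): the
__sync form always implies a full barrier, whereas the __atomic form
takes an explicit memory-order argument, so a plain counter update can
ask for relaxed ordering.

```c
#include <stdint.h>

/*
 * Legacy builtin: documented by GCC as a full barrier, so the caller
 * cannot request anything weaker.
 */
static inline uint64_t add_sync(uint64_t *p, uint64_t v)
{
	return __sync_fetch_and_add(p, v);
}

/*
 * C11-compatible builtin: the memory order is an explicit argument.
 * __ATOMIC_RELAXED guarantees atomicity of the add only, with no
 * ordering against surrounding memory accesses.
 */
static inline uint64_t add_atomic_relaxed(uint64_t *p, uint64_t v)
{
	return __atomic_fetch_add(p, v, __ATOMIC_RELAXED);
}
```

Both helpers return the value held before the addition. Which ordering
BPF_XADD ought to provide is exactly the open question here; the sketch
only shows that the __atomic interface lets that choice be spelled out.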
> What you can race against is that an eBPF map can be _shared_ by
> multiple eBPF programs that are attached somewhere in the system, and
> they could all update a particular entry/counter from the map at the
> same time.
Ok, so it does sound like eBPF needs to define/choose a memory-model and
I worry that riding on the back of __sync isn't necessarily the right
thing to do, particularly as it's fallen out of favour with the compiler
folks. On weakly-ordered architectures, it's also going to result in
heavy-weight barriers for all atomic operations.
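
To make the shared-counter case concrete, here is a user-space sketch
using pthreads. It is only an analogy for several eBPF programs bumping
the same map entry, and worker_arg, worker, run_counters and
MAX_WORKERS are all hypothetical names. Each thread uses a relaxed
atomic add, which is enough when only the final total matters and no
other data is published through the counter.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WORKERS 16	/* arbitrary cap for this sketch */

struct worker_arg {
	uint64_t *counter;	/* shared counter, stands in for a map entry */
	int iters;		/* number of increments per thread */
};

static void *worker(void *p)
{
	struct worker_arg *a = p;

	/* Atomic add with no ordering guarantees beyond atomicity. */
	for (int i = 0; i < a->iters; i++)
		__atomic_fetch_add(a->counter, 1, __ATOMIC_RELAXED);
	return NULL;
}

/* Run up to MAX_WORKERS threads and return the final counter value. */
static uint64_t run_counters(int nthreads, int iters)
{
	pthread_t tids[MAX_WORKERS];
	uint64_t counter = 0;
	struct worker_arg arg = { .counter = &counter, .iters = iters };
	int started = 0;

	if (nthreads > MAX_WORKERS)
		nthreads = MAX_WORKERS;

	for (int i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[i], NULL, worker, &arg) != 0)
			break;
		started++;
	}

	/* pthread_join() orders each thread's updates before the read. */
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	return counter;
}
```

With four threads doing 10000 increments each, run_counters(4, 10000)
yields 40000 as long as every thread starts; replacing the atomic add
with a plain `(*a->counter)++` can lose updates.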