[PATCH bpf-next v2 17/26] rqspinlock: Hardcode cond_acquire loops to asm-generic implementation

Peter Zijlstra peterz at infradead.org
Mon Feb 10 02:03:16 PST 2025


On Mon, Feb 10, 2025 at 10:53:25AM +0100, Peter Zijlstra wrote:
> On Thu, Feb 06, 2025 at 02:54:25AM -0800, Kumar Kartikeya Dwivedi wrote:
> > Currently, for rqspinlock usage, the implementations of
> > smp_cond_load_acquire (and thus, atomic_cond_read_acquire) are
> > susceptible to stalls on arm64, because they do not guarantee that the
> > conditional expression will be repeatedly invoked if the address being
> > loaded from is not written to by other CPUs. When support for
> > event-streams is absent (which unblocks stuck WFE-based loops every
> > ~100us), we may end up being stuck forever.
> > 
> > This causes a problem for us, as we need to repeatedly invoke the
> > RES_CHECK_TIMEOUT in the spin loop to break out when the timeout
> > expires.
> > 
> > Hardcode the implementation to the asm-generic version in rqspinlock.c
> > until support for smp_cond_load_acquire_timewait [0] lands upstream.
> > 
> 
> *sigh*.. this patch should go *before* patch 8. As is that's still
> horribly broken and I was WTF-ing because your 0/n changelog said you
> fixed it.

And since you're doing local copies of things, why not take a local copy
of the smp_cond_load_acquire_timewait() thing?
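
For illustration only, here is a minimal userspace sketch of the property
the changelog asks for: a wait loop that re-evaluates its exit condition on
every iteration, so a timeout check still runs even if no other CPU ever
writes to the watched address. It is not the kernel's asm-generic
implementation and not the proposed smp_cond_load_acquire_timewait(); it
uses C11 atomics and clock_gettime(), and the names now_ns() and
spin_until_zero_or_timeout() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Spin until *lock reads as 0 or timeout_ns has elapsed.
 * Returns true if the lock was observed free, false on timeout.
 *
 * The timeout is checked on every pass through the loop, so the wait
 * ends even when *lock is never written again. A loop that sleeps
 * until the address is written cannot guarantee that on its own.
 */
static bool spin_until_zero_or_timeout(atomic_int *lock, uint64_t timeout_ns)
{
	uint64_t deadline = now_ns() + timeout_ns;

	for (;;) {
		/* Acquire load: later accesses are ordered after seeing 0. */
		if (atomic_load_explicit(lock, memory_order_acquire) == 0)
			return true;
		/* Re-checked unconditionally on each iteration. */
		if (now_ns() >= deadline)
			return false;
	}
}
```

With a word that stays non-zero and a 1ms timeout, the call returns false
once the deadline passes instead of waiting forever.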


