[PATCH v5 5/5] rqspinlock: Use smp_cond_load_acquire_timeout()
Ankur Arora
ankur.a.arora at oracle.com
Fri Sep 12 11:06:45 PDT 2025
Catalin Marinas <catalin.marinas at arm.com> writes:
> On Thu, Sep 11, 2025 at 02:58:22PM -0700, Ankur Arora wrote:
>>
>> Kumar Kartikeya Dwivedi <memxor at gmail.com> writes:
>>
>> > On Thu, 11 Sept 2025 at 16:32, Catalin Marinas <catalin.marinas at arm.com> wrote:
>> >>
>> >> On Wed, Sep 10, 2025 at 08:46:55PM -0700, Ankur Arora wrote:
>> >> > Switch out the conditional load interfaces used by rqspinlock
>> >> > to smp_cond_read_acquire_timeout().
>> >> > This interface handles the timeout check explicitly and does any
>> >> > necessary amortization, so use check_timeout() directly.
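
(As a concrete illustration of the quoted commit message: roughly the
shape the call site ends up with. The names and argument order below
are illustrative only, not the actual diff.)

	/*
	 * Sketch only: the timeout check moves out of the caller-side
	 * RES_CHECK_TIMEOUT() macro and into the primitive itself.
	 */
	val = smp_cond_load_acquire_timeout(&lock->val.counter,
			!(VAL & _Q_LOCKED_PENDING_MASK),
			check_timeout(lock, _Q_LOCKED_PENDING_MASK, &ts));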
>> >>
>> >> It's worth mentioning that the default smp_cond_load_acquire_timeout()
>> >> implementation (without hardware support) only spins 200 times instead
>> >> of 16K times in the rqspinlock code. That's probably fine but it would
>> >> be good to have confirmation from Kumar or Alexei.
>> >>
>> >
>> > This looks good, but I would still redefine the spin count from 200 to
>> > 16k for rqspinlock.c, especially because we need to keep
>> > RES_CHECK_TIMEOUT around which still uses 16k spins to amortize
>> > check_timeout.
>>
>> By my count that amounts to ~100us per check_timeout() invocation on the
>> x86 systems I've tested with cpu_relax(), which seems quite reasonable.
>>
>> 16k also seems safer on CPUs where cpu_relax() is basically a NOP.
>
> Does this spin count work for poll_idle()? I don't remember where the
> 200 value came from.
Just reusing the value of POLL_IDLE_RELAX_COUNT, which is defined as
200.
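
For reference, the generic fallback has roughly this shape -- a
simplified sketch, with the macro and count names illustrative rather
than the actual implementation:

	#ifndef SMP_TIMEOUT_SPIN_COUNT
	#define SMP_TIMEOUT_SPIN_COUNT	200	/* a caller like rqspinlock could bump this to 16K */
	#endif

	#define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)	\
	({									\
		typeof(ptr) __PTR = (ptr);					\
		__unqual_scalar_typeof(*ptr) VAL;				\
		u32 __n = 0;							\
										\
		for (;;) {							\
			VAL = READ_ONCE(*__PTR);				\
			if (cond_expr)						\
				break;						\
			cpu_relax();						\
			/* Amortize: only evaluate the (possibly expensive)	\
			 * timeout check every SMP_TIMEOUT_SPIN_COUNT spins. */\
			if (++__n < SMP_TIMEOUT_SPIN_COUNT)			\
				continue;					\
			__n = 0;						\
			if (time_check_expr)					\
				break;						\
		}								\
		(typeof(*ptr))VAL;						\
	})

The acquire flavour would add the acquire barrier on top of this (e.g.
via smp_acquire__after_ctrl_dep()); an arm64 version with hardware
support would wait (wfe/wfet) rather than spin.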
For the poll_idle() case I don't think the value of 200 makes sense
for all architectures, so they'll need to redefine it (before defining
ARCH_HAS_OPTIMIZED_POLL, which gates poll_idle()).
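
Something along these lines on the arch side -- purely illustrative,
assuming the count stays overridable from an arch header, with "foo"
a made-up architecture:

	/* arch/foo/include/asm/cpuidle.h -- hypothetical override */
	#define POLL_IDLE_RELAX_COUNT	(16 * 1024)

	/* ... and then select/define ARCH_HAS_OPTIMIZED_POLL for foo,
	 * which is what actually enables poll_idle().
	 */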
--
ankur