[PATCH v10 3/5] locking/qspinlock: Introduce CNA into the slow path of qspinlock

Waiman Long longman at redhat.com
Tue Jul 28 16:00:12 EDT 2020


On 4/3/20 4:59 PM, Alex Kogan wrote:
> In CNA, spinning threads are organized in two queues, a primary queue for
> threads running on the same node as the current lock holder, and a
> secondary queue for threads running on other nodes. After acquiring the
> MCS lock and before acquiring the spinlock, the lock holder scans the
> primary queue looking for a thread running on the same node (pre-scan). If
> found (call it thread T), all threads in the primary queue between the
> current lock holder and T are moved to the end of the secondary queue.
> If such T is not found, we make another scan of the primary queue when
> unlocking the MCS lock (post-scan), starting at the position where
> pre-scan stopped. If both scans fail to find such T, the MCS lock is
> passed to the first thread in the secondary queue. If the secondary queue
> is empty, the lock is passed to the next thread in the primary queue.
> For more details, see https://arxiv.org/abs/1810.05600.
>
> Note that this variant of CNA may introduce starvation by continuously
> passing the lock to threads running on the same node. This issue
> will be addressed later in the series.
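
To recap the handoff logic above in plain C, here is a minimal,
single-threaded sketch of the queue splitting. The structure and
function names (cna_node, cna_find_successor, the explicit secondary
queue head/tail) are made up for illustration and are not the code from
the patch, which operates on the MCS nodes inside the qspinlock slow
path:

/*
 * Illustrative sketch only -- not the code from this patch.
 */
struct cna_node {
        struct cna_node *next;
        int              numa_node;     /* NUMA node the waiter runs on */
};

/*
 * Scan the primary queue starting at @head for the first waiter running
 * on @my_node.  Every waiter skipped over is spliced onto the tail of
 * the secondary queue, preserving its relative order.  A NULL return
 * means no same-node waiter was found, so the lock goes to the head of
 * the secondary queue (or, if that is empty, to the next waiter in the
 * primary queue).
 */
static struct cna_node *
cna_find_successor(struct cna_node *head, int my_node,
                   struct cna_node **sec_head, struct cna_node **sec_tail)
{
        struct cna_node *cur = head;

        while (cur && cur->numa_node != my_node) {
                struct cna_node *skipped = cur;

                cur = cur->next;

                /* move the skipped waiter to the end of the secondary queue */
                skipped->next = NULL;
                if (*sec_tail)
                        (*sec_tail)->next = skipped;
                else
                        *sec_head = skipped;
                *sec_tail = skipped;
        }
        return cur;
}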
>
> Enabling CNA is controlled via a new configuration option
> (NUMA_AWARE_SPINLOCKS). By default, the CNA variant is patched in at
> boot time only if we run on a multi-node machine in a native environment
> and the new config is enabled. (For the time being, the patching requires
> CONFIG_PARAVIRT_SPINLOCKS to be enabled as well. However, this should be
> resolved once static_call() is available.) This default behavior can be
> overridden with the new kernel boot command-line option
> "numa_spinlock=on/off" (default is "auto").
>
> Signed-off-by: Alex Kogan <alex.kogan at oracle.com>
> Reviewed-by: Steve Sistare <steven.sistare at oracle.com>
> Reviewed-by: Waiman Long <longman at redhat.com>
> ---

There is also a concern that the worst-case latency for a lock transfer
can be close to O(n), which can be quite large on large SMP systems. I
have a patch on top of this one that modifies the current behavior to
limit the number of node scans after the lock is freed.
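
Roughly, the idea is to cap the number of waiters examined so that the
handoff cost is bounded by a constant rather than by the queue length.
Something along these lines, reusing the illustrative cna_node structure
from the sketch above (an illustration of the general idea only, not the
attached patch; CNA_MAX_SCAN is an arbitrary example limit):

#define CNA_MAX_SCAN    32

static struct cna_node *
cna_find_successor_bounded(struct cna_node *head, int my_node)
{
        struct cna_node *cur = head;
        int scanned = 0;

        while (cur && cur->numa_node != my_node) {
                if (++scanned >= CNA_MAX_SCAN)
                        return NULL;    /* give up, fall back to FIFO handoff */
                cur = cur->next;
        }
        return cur;
}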

Cheers,
Longman


-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0008-locking-qspinlock-Limit-CNA-node-scan-after-the-lock.patch
Type: text/x-patch
Size: 8054 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20200728/6dd0eec5/attachment.bin>

