[PATCH v2 0/5] Switch arm64 over to qrwlock

Waiman Long longman at redhat.com
Mon Oct 9 14:19:46 PDT 2017


On 10/06/2017 09:34 AM, Will Deacon wrote:
> Hi all,
>
> This is version two of the patches I posted yesterday:
>
>   http://lists.infradead.org/pipermail/linux-arm-kernel/2017-October/534666.html
>
> I'd normally leave it longer before posting again, but Peter had a good
> suggestion to rework the layout of the lock word, so I wanted to post a
> version that follows that approach.
>
> I've updated my branch if you're after the full patch stack:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git qrwlock
>
> As before, all comments (particularly related to testing and performance)
> welcome!
>
> Cheers,
>
> Will
>
> --->8
>
> Will Deacon (5):
>   kernel/locking: Use struct qrwlock instead of struct __qrwlock
>   locking/atomic: Add atomic_cond_read_acquire
>   kernel/locking: Use atomic_cond_read_acquire when spinning in qrwlock
>   arm64: locking: Move rwlock implementation over to qrwlocks
>   kernel/locking: Prevent slowpath writers getting held up by fastpath
>
>  arch/arm64/Kconfig                      |  17 ++++
>  arch/arm64/include/asm/Kbuild           |   1 +
>  arch/arm64/include/asm/spinlock.h       | 164 +-------------------------------
>  arch/arm64/include/asm/spinlock_types.h |   6 +-
>  include/asm-generic/atomic-long.h       |   3 +
>  include/asm-generic/qrwlock.h           |  20 +---
>  include/asm-generic/qrwlock_types.h     |  15 ++-
>  include/linux/atomic.h                  |   4 +
>  kernel/locking/qrwlock.c                |  83 +++-------------
>  9 files changed, 58 insertions(+), 255 deletions(-)
>
I ran some performance tests of your patches on a 1-socket Cavium
CN8880 system with 32 cores. I used my locking stress test, which
produced the following results with 16 locking threads at various mixes
of reader and writer threads on 4.14-rc4 based kernels. The numbers are
the minimum/average/maximum number of locking operations done per
locking thread in a 10-second period. A minimum of 1 means that at least
one thread could not acquire the lock during the test period.

                w/o qrwlock patch               with qrwlock patch
                -----------------               ------------------
16 readers   793,024/1,169,763/1,684,751  1,060,127/1,198,583/1,331,003
        
12 readers 1,162,760/1,641,714/2,162,939  1,685,334/2,099,088/2,338,461
 4 writers         1/        1/        1     25,540/  195,975/  392,232
 
 8 readers 2,135,670/2,391,612/2,737,564  2,985,686/3,359,048/3,870,423
 8 writers         1/   19,867/   88,173    119,078/  559,604/1,112,769
 
 4 readers 1,194,917/1,250,876/1,299,304  3,611,059/4,653,775/6,268,370
12 writers   176,156/1,088,513/2,594,534      7,664/  795,393/1,841,961

16 writers    35,007/1,094,608/1,954,457  1,618,915/1,633,077/1,645,637
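
For anyone who wants to produce the same kind of summary from their own
runs, here is a minimal user-space sketch of how min/average/max can be
derived from per-thread counters. It is not the stress test used above;
the names (struct lock_stats, summarize_counts) are made up for
illustration, and it assumes each thread has already recorded how many
lock operations it completed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of per-thread lock-operation counts. */
struct lock_stats {
	uint64_t min;
	uint64_t avg;
	uint64_t max;
};

/*
 * Reduce per-thread operation counts to min/average/max.
 * counts[i] is the number of lock operations completed by thread i
 * during the test period. n must be non-zero.
 */
static struct lock_stats summarize_counts(const uint64_t *counts, size_t n)
{
	struct lock_stats s = { counts[0], 0, counts[0] };
	uint64_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (counts[i] < s.min)
			s.min = counts[i];
		if (counts[i] > s.max)
			s.max = counts[i];
		sum += counts[i];
	}
	s.avg = sum / n;	/* integer average, truncated */
	return s;
}
```

Each row of the table would correspond to one call over the counters of
the reader threads or the writer threads in that mix.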

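The results show that qrwlock performs much better than the original
rwlock implementation in most of these mixes: writers are no longer
starved in the 12-reader and 8-reader mixes, and per-thread throughput
is far more even in the 16-writer case. The one exception is writer
throughput in the 4-reader/12-writer mix, which is lower with the patch.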
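
One rough way to put a number on the comparison is the max/min ratio of
per-thread operations within a row, where a value close to 1 means the
threads were served evenly. The sketch below is only an illustration;
spread_ratio is a hypothetical helper, not part of the stress test.

```c
#include <stdint.h>

/*
 * Ratio of the busiest thread's count to the least busy thread's count.
 * Returns -1.0 when min is 0, since the ratio is undefined there.
 */
static double spread_ratio(uint64_t min, uint64_t max)
{
	if (min == 0)
		return -1.0;
	return (double)max / (double)min;
}
```

Applied to the 16-writer row, spread_ratio(35007, 1954457) is the
figure without the patch and spread_ratio(1618915, 1645637) is the
figure with it.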

Tested-by: Waiman Long <longman at redhat.com>

Cheers,
Longman