[PATCH V3 3/5] riscv: atomic: Optimize memory barrier semantics of LRSC-pairs
guoren at kernel.org
Wed Apr 20 07:44:15 PDT 2022
From: Guo Ren <guoren at linux.alibaba.com>
The current implementation follows the same approach as commit
8e86f0b409a4 ("arm64: atomics: fix use of acquire + release for full
barrier semantics"): RISC-V can combine the acquire and release
semantics into the LR/SC instructions themselves, which eliminates the
trailing full fence and reduces the instruction cost of fully ordered
atomics. Here is the relevant text from the RISC-V ISA specification,
section 10.2 "Load-Reserved/Store-Conditional Instructions":
- .aq: The LR/SC sequence can be given acquire semantics by
setting the aq bit on the LR instruction.
- .rl: The LR/SC sequence can be given release semantics by
setting the rl bit on the SC instruction.
- .aqrl: Setting the aq bit on the LR instruction, and setting
both the aq and the rl bit on the SC instruction makes the
LR/SC sequence sequentially consistent, meaning that it
cannot be reordered with earlier or later memory operations
from the same hart.
Software should not set the rl bit on an LR instruction unless the
aq bit is also set, nor should software set the aq bit on an SC
instruction unless the rl bit is also set. LR.rl and SC.aq
instructions are not guaranteed to provide any stronger ordering than
those with both bits clear, but may result in lower performance.
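For illustration, here is a minimal standalone sketch (not part of the
patch) of the resulting pattern: a fully ordered 32-bit
compare-and-swap retry loop in which the trailing "fence rw, rw" has
been folded into sc.w.aqrl, mirroring the arch_cmpxchg() hunk below.
The function name cas_w_fully_ordered and the plain "r" constraints
are hypothetical simplifications of the kernel's "rJ"/%z form:

/*
 * Sketch only: a fully ordered 32-bit compare-and-swap built on an
 * LR/SC retry loop. As in this patch, full-barrier ordering is
 * provided by sc.w.aqrl, replacing the release-only SC followed by a
 * trailing "fence rw, rw".
 */
static inline unsigned int cas_w_fully_ordered(unsigned int *ptr,
					       unsigned int old,
					       unsigned int new)
{
	unsigned int ret, rc;

	__asm__ __volatile__ (
		"0:	lr.w	%0, %2\n"	/* load-reserved */
		"	bne	%0, %3, 1f\n"	/* value changed: fail */
		"	sc.w.aqrl %1, %4, %2\n"	/* store-conditional, aq+rl */
		"	bnez	%1, 0b\n"	/* reservation lost: retry */
		"1:\n"
		: "=&r" (ret), "=&r" (rc), "+A" (*ptr)
		: "r" (old), "r" (new)
		: "memory");

	return ret;	/* previous value; equals old on success */
}

Note that, as in the patch, the LR instruction carries no .aq
annotation; the ordering is carried entirely by the .aqrl
store-conditional. On failure the branch to 1: skips the SC, which
matches the existing code, where the failure path also skips the
trailing fence.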
Signed-off-by: Guo Ren <guoren at linux.alibaba.com>
Signed-off-by: Guo Ren <guoren at kernel.org>
Cc: Palmer Dabbelt <palmer at dabbelt.com>
Cc: Mark Rutland <mark.rutland at arm.com>
Cc: Dan Lustig <dlustig at nvidia.com>
Cc: Andrea Parri <parri.andrea at gmail.com>
---
arch/riscv/include/asm/atomic.h | 6 ++----
arch/riscv/include/asm/cmpxchg.h | 6 ++----
2 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index 20ce8b83bc18..4aaf5b01e7c6 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -382,9 +382,8 @@ static __always_inline int arch_atomic_sub_if_positive(atomic_t *v, int offset)
"0: lr.w %[p], %[c]\n"
" sub %[rc], %[p], %[o]\n"
" bltz %[rc], 1f\n"
- " sc.w.rl %[rc], %[rc], %[c]\n"
+ " sc.w.aqrl %[rc], %[rc], %[c]\n"
" bnez %[rc], 0b\n"
- " fence rw, rw\n"
"1:\n"
: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
: [o]"r" (offset)
@@ -404,9 +403,8 @@ static __always_inline s64 arch_atomic64_sub_if_positive(atomic64_t *v, s64 offs
"0: lr.d %[p], %[c]\n"
" sub %[rc], %[p], %[o]\n"
" bltz %[rc], 1f\n"
- " sc.d.rl %[rc], %[rc], %[c]\n"
+ " sc.d.aqrl %[rc], %[rc], %[c]\n"
" bnez %[rc], 0b\n"
- " fence rw, rw\n"
"1:\n"
: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
: [o]"r" (offset)
diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 1af8db92250b..9269fceb86e0 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -307,9 +307,8 @@
__asm__ __volatile__ ( \
"0: lr.w %0, %2\n" \
" bne %0, %z3, 1f\n" \
- " sc.w.rl %1, %z4, %2\n" \
+ " sc.w.aqrl %1, %z4, %2\n" \
" bnez %1, 0b\n" \
- " fence rw, rw\n" \
"1:\n" \
: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr) \
: "rJ" ((long)__old), "rJ" (__new) \
@@ -319,9 +318,8 @@
__asm__ __volatile__ ( \
"0: lr.d %0, %2\n" \
" bne %0, %z3, 1f\n" \
- " sc.d.rl %1, %z4, %2\n" \
+ " sc.d.aqrl %1, %z4, %2\n" \
" bnez %1, 0b\n" \
- " fence rw, rw\n" \
"1:\n" \
: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr) \
: "rJ" (__old), "rJ" (__new) \
--
2.25.1