[PATCH v2] Avoid memory barrier in read_seqcount() through load acquire
Thomas Gleixner
tglx at linutronix.de
Mon Sep 2 04:55:01 PDT 2024
On Wed, Aug 28 2024 at 10:15, Christoph Lameter wrote:
> On Fri, 23 Aug 2024, Thomas Gleixner wrote:
>
>> This all can be done without the extra copies of the counter
>> accessors. Uncompiled patch below.
>
> Great. Thanks. Tried it too initially but could not make it work right.
>
> One thing that we also want is the use of the smp_cond_load_acquire to
> have the cpu power down while waiting for a cacheline change.
>
> The code has several places where loops occur when the last bit is set in
> the seqcount.
>
> We could use smp_cond_load_acquire in load_sequence() but what do we do
> about the loops at the higher level? Also this does not sync with the lock
> checking logic.

Come on. It's not rocket science to figure that out.

Uncompiled delta patch below.

Thanks,

        tglx
---
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -23,6 +23,13 @@
#include <asm/processor.h>
+#ifdef CONFIG_ARCH_HAS_ACQUIRE_RELEASE
+# define USE_LOAD_ACQUIRE true
+# define USE_COND_LOAD_ACQUIRE !IS_ENABLED(CONFIG_PREEMPT_RT)
+#else
+# define USE_LOAD_ACQUIRE false
+# define USE_COND_LOAD_ACQUIRE false
+#endif
/*
* The seqlock seqcount_t interface does not prescribe a precise sequence of
* read begin/retry/end. For readers, typically there is a call to
@@ -134,10 +141,13 @@ static inline void seqcount_lockdep_read
 static __always_inline unsigned __seqprop_load_sequence(const seqcount_t *s, bool acquire)
 {
-	if (acquire && IS_ENABLED(CONFIG_ARCH_HAS_ACQUIRE_RELEASE))
-		return smp_load_acquire(&s->sequence);
-	else
+	if (!acquire || !USE_LOAD_ACQUIRE)
 		return READ_ONCE(s->sequence);
+
+	if (USE_COND_LOAD_ACQUIRE)
+		return smp_cond_load_acquire(&s->sequence, (s->sequence & 1) == 0);
+
+	return smp_load_acquire(&s->sequence);
 }
/*
@@ -283,8 +293,12 @@ SEQCOUNT_LOCKNAME(mutex, struct m
 ({									\
 	unsigned __seq;							\
 									\
-	while ((__seq = seqprop_sequence(s, acquire)) & 1)		\
-		cpu_relax();						\
+	if (acquire && USE_COND_LOAD_ACQUIRE) {				\
+		__seq = seqprop_sequence(s, acquire);			\
+	} else {							\
+		while ((__seq = seqprop_sequence(s, acquire)) & 1)	\
+			cpu_relax();					\
+	}								\
 									\
 	kcsan_atomic_next(KCSAN_SEQLOCK_REGION_MAX);			\
 	__seq;								\
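
For readers following along outside the kernel tree, the reader-side idea
in the patch above (load the sequence with acquire semantics and wait
while the low bit is set) can be approximated in user space with C11
atomics. This is only a sketch under stated assumptions, not the kernel
implementation: every demo_* name is hypothetical,
atomic_load_explicit(..., memory_order_acquire) stands in for
smp_load_acquire(), a single writer is assumed, and the sketch does not
model the low-power wait of smp_cond_load_acquire(), lockdep, or KCSAN.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for seqcount_t plus protected data. */
struct demo_shared {
	atomic_uint sequence;	/* odd while a write is in progress */
	atomic_int a, b;	/* payload, accessed with relaxed atomics */
};

static inline void demo_init(struct demo_shared *s)
{
	atomic_init(&s->sequence, 0);
	atomic_init(&s->a, 0);
	atomic_init(&s->b, 0);
}

/* Reader entry: acquire-load the counter, spin while a writer is active. */
static inline unsigned demo_read_begin(struct demo_shared *s)
{
	unsigned seq;

	while ((seq = atomic_load_explicit(&s->sequence,
					   memory_order_acquire)) & 1)
		;	/* the kernel would cpu_relax() or wait on the cacheline */
	return seq;
}

/* Reader exit: true if the counter moved and the snapshot must be redone. */
static inline bool demo_read_retry(struct demo_shared *s, unsigned start)
{
	/* Order the payload loads before the counter re-check. */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->sequence, memory_order_relaxed) != start;
}

/* Single-writer update: counter goes odd, payload changes, counter goes even. */
static inline void demo_publish(struct demo_shared *s, int a, int b)
{
	unsigned seq = atomic_load_explicit(&s->sequence, memory_order_relaxed);

	atomic_store_explicit(&s->sequence, seq + 1, memory_order_relaxed);
	/* Order the odd counter store before the payload stores. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->a, a, memory_order_relaxed);
	atomic_store_explicit(&s->b, b, memory_order_relaxed);
	/* Release store makes the payload visible before the even counter. */
	atomic_store_explicit(&s->sequence, seq + 2, memory_order_release);
}

/* Take a consistent snapshot of both payload fields. */
static inline void demo_snapshot(struct demo_shared *s, int *a, int *b)
{
	unsigned seq;

	do {
		seq = demo_read_begin(s);
		*a = atomic_load_explicit(&s->a, memory_order_relaxed);
		*b = atomic_load_explicit(&s->b, memory_order_relaxed);
	} while (demo_read_retry(s, seq));
}
```

The acquire load in demo_read_begin() is what lets the reader skip a
separate read barrier on entry; the retry check still needs its own
ordering, which the sketch expresses with an acquire fence.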