[PATCH v4 3/4] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

Arnd Bergmann arnd at arndb.de
Tue Mar 30 08:11:50 BST 2021


On Tue, Mar 30, 2021 at 4:26 AM Guo Ren <guoren at kernel.org> wrote:
> On Mon, Mar 29, 2021 at 9:56 PM Arnd Bergmann <arnd at arndb.de> wrote:
> > On Mon, Mar 29, 2021 at 2:52 PM Guo Ren <guoren at kernel.org> wrote:
> > > On Mon, Mar 29, 2021 at 7:31 PM Peter Zijlstra <peterz at infradead.org> wrote:
> > > >
> > > > What's the architectural guarantee on LL/SC progress for RISC-V ?
> >
> >    "When LR/SC is used for memory locations marked RsrvNonEventual,
> >      software should provide alternative fall-back mechanisms used when
> >      lack of progress is detected."
> >
> > My reading of this is that if the example you tried stalls, then either
> > the PMA is not RsrvEventual, and it is wrong to rely on ll/sc on this,
> > or that the PMA is marked RsrvEventual but the implementation is
> > buggy.
>
> Yes, PMA just defines physical memory region attributes, But in our
> processor, when MMU is enabled (satp's value register > 2) in s-mode,
> it will look at our custom PTE's attributes BIT(63) ref [1]:
>
>    PTE format:
>    | 63 | 62 | 61 | 60 | 59 | 58-8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
>      SO   C    B    SH   SE    RSW   D   A   G   U   X   W   R   V
>      ^    ^    ^    ^    ^
>    BIT(63): SO - Strong Order
>    BIT(62): C  - Cacheable
>    BIT(61): B  - Bufferable
>    BIT(60): SH - Shareable
>    BIT(59): SE - Security
>
> So the memory also could be RsrvNone/RsrvEventual.

I was not talking about RsrvNone, which would clearly mean that
you cannot use lr/sc at all (it would trap, right?), but about
"RsrvNonEventual", which would explain the behavior you described
in an earlier reply:

| u32 a = 0x55aa66bb;
| u16 *ptr = &a;
|
| CPU0                  CPU1
| =========             =========
| xchg16(ptr, new)      while(1)
|                           WRITE_ONCE(*(ptr + 1), x);
|
| When we use lr.w/sc.w implement xchg16, it'll cause CPU0 deadlock.

As I understand it, this example must not cause a deadlock on
a compliant hardware implementation when the underlying memory
has RsrvEventual behavior, but it could deadlock in the case of
RsrvNonEventual.
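
For illustration only, here is a minimal userspace sketch of how a 16-bit
exchange is commonly emulated on top of a 32-bit primitive. It uses C11
atomics rather than the kernel's helpers, and xchg16_via_cas32() and its
"half" parameter are names made up for this example. On RISC-V without a
native CAS, a compiler would typically lower the compare-exchange to an
lr.w/sc.w loop. The retry loop only terminates once the conditional store
succeeds, so a neighbour that keeps writing the other halfword can starve
it when the hardware does not guarantee eventual success.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper, not kernel code: exchange one 16-bit half of a
 * 32-bit word by retrying a 32-bit compare-exchange on the whole word.
 * half == 0 selects bits 15:0, half == 1 selects bits 31:16.
 * Returns the previous value of the selected half.
 */
static uint16_t xchg16_via_cas32(_Atomic uint32_t *word, unsigned int half,
                                 uint16_t newval)
{
    assert(half < 2);

    unsigned int shift = half * 16;
    uint32_t mask = (uint32_t)0xffff << shift;
    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
    uint32_t val;

    do {
        /* Keep the other half as last observed, replace only ours. */
        val = (old & ~mask) | ((uint32_t)newval << shift);
        /*
         * On failure 'old' is refreshed with the current contents, so
         * every write to the *other* half forces another iteration.
         */
    } while (!atomic_compare_exchange_weak_explicit(word, &old, val,
                                                    memory_order_acq_rel,
                                                    memory_order_relaxed));

    return (uint16_t)((old & mask) >> shift);
}
```

With RsrvEventual memory the loop above has to complete eventually; with
RsrvNonEventual nothing bounds the number of retries.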

> [1] https://github.com/c-sky/csky-linux/commit/e837aad23148542771794d8a2fcc52afd0fcbf88
>
> >
> > It also seems that the current "amoswap" based implementation
> > would be reliable independent of RsrvEventual/RsrvNonEventual.
>
> Yes, the hardware implementation of AMO could be different from LR/SC.
> AMO could use ACE snoop holding to lock the bus in hw coherency
> design, but LR/SC uses an exclusive monitor without locking the bus.
>
> RISC-V hasn't CAS instructions, and it uses LR/SC for cmpxchg. I don't
> think LR/SC would be slower than CAS, and CAS is just good for code
> size.

What I meant here is that the current spinlock uses a simple amoswap,
which presumably does not suffer from the lack of forward progress you
described.
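
As a rough sketch of what I mean by "simple amoswap", here is a
test-and-set lock written with C11 atomics. This is not the kernel's
arch_spin_lock(); the tas_* names are invented for the example. The
assumption is that an unconditional 32-bit atomic exchange is typically
compiled to a single amoswap.w on RISC-V, so there is no
reservation that another CPU's store could break.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type for illustration: 0 = unlocked, 1 = locked. */
typedef struct {
    atomic_uint locked;
} tas_lock_t;

/* One unconditional atomic swap; succeeds if the old value was 0. */
static bool tas_trylock(tas_lock_t *l)
{
    return atomic_exchange_explicit(&l->locked, 1u,
                                    memory_order_acquire) == 0;
}

static void tas_lock(tas_lock_t *l)
{
    for (;;) {
        if (tas_trylock(l))
            return;
        /* Spin on plain loads until the lock looks free again. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;
    }
}

static void tas_unlock(tas_lock_t *l)
{
    atomic_store_explicit(&l->locked, 0u, memory_order_release);
}
```

Each swap completes on its own, but such a lock is still unfair:
nothing orders the waiters.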

        Arnd


