[PATCH rdma-next 1/2] arm64/io: add memcpy_toio_64

Will Deacon will at kernel.org
Wed Jan 17 08:05:29 PST 2024


On Wed, Jan 17, 2024 at 11:28:22AM -0400, Jason Gunthorpe wrote:
> On Wed, Jan 17, 2024 at 02:07:16PM +0000, Mark Rutland wrote:
> 
> > > I believe this is for the same reason as doing so in all of our other IO
> > > accessors.
> > > 
> > > We've deliberately ensured that our IO accessors use a single base register
> > > with no offset as this is the only form that HW can represent in ESR_ELx.ISS.SRT
> > > when reporting a stage-2 abort, which a hypervisor may use for emulating IO.
> > 
> > FWIW, IIUC the immediate-offset forms *without* writeback can still be reported
> > usefully in ESR_ELx, so I believe that we could use the "o" constraint for the
> > __raw_write*() functions, e.g.
> > 
> > static __always_inline void __raw_writeq(u64 val, volatile void __iomem *addr)
> > {
> > 	asm volatile("str %x0, %1" : : "rZ" (val), "o" (*(volatile u64 *)addr));
> > }
> 
> "o" works well in the same simple memcpy loop:
> 
>         add     x2, x1, w2, uxtw 3
>         cmp     x1, x2
>         bcs     .L1
> .L3:
>         ldp     x10, x9, [x1]
>         ldp     x8, x7, [x1, 16]
>         ldp     x6, x5, [x1, 32]
>         ldp     x4, x3, [x1, 48]
>         str x10, [x0]
>         str x9, [x0, 8]
>         str x8, [x0, 16]
>         str x7, [x0, 24]
>         str x6, [x0, 32]
>         str x5, [x0, 40]
>         str x4, [x0, 48]
>         str x3, [x0, 56]
>         add     x1, x1, 64
>         add     x0, x0, 64
>         cmp     x2, x1
>         bhi     .L3
> .L1:
>         ret
> 
> Seems interesting to pursue?

I've seen the compiler struggle with plain "o" in the past ("Impossible
constraint in asm") so we might want "Qo" if we go down this route.
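
For illustration, a minimal sketch of what the "Qo" variant might look like,
based on Mark's __raw_writeq() example above. The names raw_writeq_sketch()
and copy_64bytes_sketch() are hypothetical, not existing kernel functions,
and the non-arm64 fallback is only there so the snippet builds elsewhere:

```c
#include <stdint.h>

/* Hypothetical stand-in for __raw_writeq(); not the kernel's definition. */
static inline void raw_writeq_sketch(uint64_t val, volatile void *addr)
{
#if defined(__aarch64__)
	/*
	 * "Qo" lets the compiler choose between "Q" (single base register,
	 * no offset) and "o" (offsettable address), so it has a fallback
	 * when it cannot satisfy plain "o".
	 */
	asm volatile("str %x0, %1"
		     : : "rZ" (val), "Qo" (*(volatile uint64_t *)addr));
#else
	/* Assumption: a plain volatile store is good enough off arm64. */
	*(volatile uint64_t *)addr = val;
#endif
}

/* Hypothetical helper: copy one 64-byte block as eight 64-bit stores. */
static inline void copy_64bytes_sketch(volatile void *to, const uint64_t *from)
{
	volatile uint64_t *dst = to;
	int i;

	for (i = 0; i < 8; i++)
		raw_writeq_sketch(from[i], &dst[i]);
}
```

Note that the memory operand is an input here, as in the example above, so
the compiler is not told that the destination changes; anything that reads
the destination back needs to do so through a volatile access.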

Will