[PATCH rdma-next 1/2] arm64/io: add memcpy_toio_64
Robin Murphy
robin.murphy at arm.com
Fri Nov 24 07:32:46 PST 2023
On 24/11/2023 1:45 pm, Jason Gunthorpe wrote:
> On Fri, Nov 24, 2023 at 12:58:11PM +0000, Robin Murphy wrote:
>>> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
>>> index 3b694511b98f..73ab91913790 100644
>>> --- a/arch/arm64/include/asm/io.h
>>> +++ b/arch/arm64/include/asm/io.h
>>> @@ -135,6 +135,26 @@ extern void __memset_io(volatile void __iomem *, int, size_t);
>>> #define memcpy_fromio(a,c,l) __memcpy_fromio((a),(c),(l))
>>> #define memcpy_toio(c,a,l) __memcpy_toio((c),(a),(l))
>>> +static inline void __memcpy_toio_64(volatile void __iomem *to, const void *from)
>>> +{
>>> +	const u64 *from64 = from;
>>> +
>>> +	/*
>>> +	 * Newer ARM core have sensitive write combining buffers, it is
>>> +	 * important that the stores be contiguous blocks of store instructions.
>>> +	 * Normal memcpy does not work reliably.
>>> +	 */
>>> +	asm volatile("stp %x0, %x1, [%8, #16 * 0]\n"
>>> +		     "stp %x2, %x3, [%8, #16 * 1]\n"
>>> +		     "stp %x4, %x5, [%8, #16 * 2]\n"
>>> +		     "stp %x6, %x7, [%8, #16 * 3]\n"
>>> +		     :
>>> +		     : "rZ"(from64[0]), "rZ"(from64[1]), "rZ"(from64[2]),
>>> +		       "rZ"(from64[3]), "rZ"(from64[4]), "rZ"(from64[5]),
>>> +		       "rZ"(from64[6]), "rZ"(from64[7]), "r"(to));
>>
>> Is this correct for big-endian? LDP/STP are kinda tricksy in that regard.
>
> Uh.. I didn't think about it at all..
>
> By no means do I have any skill reading the ARM documents, but I think
> it is OK, it says:
>
> Mem[address, dbytes, AccType_NORMAL] = data1;
> Mem[address+dbytes, dbytes, AccType_NORMAL] = data2;
>
> So I understand that as
>
> Mem[%8, #16 * 0, 8, AccType_NORMAL] = from64[0]
> Mem[%8, #16 * 0 + 8, 8, AccType_NORMAL] = from64[1]
> Mem[%8, #16 * 1, 8, AccType_NORMAL] = from64[2]
> Mem[%8, #16 * 1 + 8, 8, AccType_NORMAL] = from64[3]
> ..
>
> Which is the same on BE/LE?
>
> But I don't know the pitfall to watch for here. This is memcpy so we
> don't have to swap, the order of the bits in the register doesn't
> matter.

Indeed, you're right. Going all the way back to Armv7 LDRD/STRD, I always
get caught out: I remember the pseudocode path that does an
endian-dependent swap of the target registers, but forget that the swap
is there to *counteract* the byteswap in Mem[] itself.
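
To make the cancellation concrete, here is a minimal toy model in plain C.
It is a sketch under stated assumptions, not the architecture's normative
pseudocode: mem_write16(), store_pair(), load64() and
pair_lands_in_order() are hypothetical helpers invented for illustration.
The model assumes a combined 16-byte access whose big-endian form reverses
all 16 bytes, plus a register concatenation that is swapped on big-endian.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model of Mem[addr, 16] = value (an assumption, for illustration).
 * val[0] is the least significant byte of the 128-bit value; a
 * big-endian access reverses the byte order as it lands in memory.
 */
static void mem_write16(uint8_t *mem, const uint8_t val[16], int big_endian)
{
	for (int i = 0; i < 16; i++)
		mem[i] = big_endian ? val[15 - i] : val[i];
}

/*
 * Toy model of a combined pair store. The two registers are joined as
 * data2:data1 on little-endian (data1 in the low half) and data1:data2
 * on big-endian. This is the endian-dependent register swap.
 */
static void store_pair(uint8_t *mem, uint64_t data1, uint64_t data2,
		       int big_endian)
{
	uint64_t lo = big_endian ? data2 : data1;
	uint64_t hi = big_endian ? data1 : data2;
	uint8_t val[16];

	for (int i = 0; i < 8; i++) {
		val[i] = (uint8_t)(lo >> (8 * i));
		val[8 + i] = (uint8_t)(hi >> (8 * i));
	}
	mem_write16(mem, val, big_endian);
}

/* Read back one 64-bit element in the given endianness. */
static uint64_t load64(const uint8_t *mem, int big_endian)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)mem[big_endian ? 7 - i : i] << (8 * i);
	return v;
}

/* Returns 1 if data1 lands at addr and data2 at addr + 8. */
int pair_lands_in_order(int big_endian)
{
	const uint64_t data1 = 0x0102030405060708ULL;
	const uint64_t data2 = 0x1112131415161718ULL;
	uint8_t mem[16] = { 0 };

	store_pair(mem, data1, data2, big_endian);
	return load64(mem, big_endian) == data1 &&
	       load64(mem + 8, big_endian) == data2;
}

void check_both_endiannesses(void)
{
	assert(pair_lands_in_order(0));	/* little-endian */
	assert(pair_lands_in_order(1));	/* big-endian */
}
```

In this model the register swap and the whole-access byte reversal cancel,
so the first register always ends up at the lower address and each 64-bit
element keeps the CPU's own byte order. That is what a memcpy-style copy
needs.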
Cheers,
Robin.