[PATCH rdma-next 1/2] arm64/io: add memcpy_toio_64

Robin Murphy robin.murphy at arm.com
Fri Nov 24 07:32:46 PST 2023


On 24/11/2023 1:45 pm, Jason Gunthorpe wrote:
> On Fri, Nov 24, 2023 at 12:58:11PM +0000, Robin Murphy wrote:
>>> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
>>> index 3b694511b98f..73ab91913790 100644
>>> --- a/arch/arm64/include/asm/io.h
>>> +++ b/arch/arm64/include/asm/io.h
>>> @@ -135,6 +135,26 @@ extern void __memset_io(volatile void __iomem *, int, size_t);
>>>    #define memcpy_fromio(a,c,l)	__memcpy_fromio((a),(c),(l))
>>>    #define memcpy_toio(c,a,l)	__memcpy_toio((c),(a),(l))
>>> +static inline void __memcpy_toio_64(volatile void __iomem *to, const void *from)
>>> +{
>>> +	const u64 *from64 = from;
>>> +
>>> +	/*
>>> +	 * Newer ARM core have sensitive write combining buffers, it is
>>> +	 * important that the stores be contiguous blocks of store instructions.
>>> +	 * Normal memcpy does not work reliably.
>>> +	 */
>>> +	asm volatile("stp %x0, %x1, [%8, #16 * 0]\n"
>>> +		     "stp %x2, %x3, [%8, #16 * 1]\n"
>>> +		     "stp %x4, %x5, [%8, #16 * 2]\n"
>>> +		     "stp %x6, %x7, [%8, #16 * 3]\n"
>>> +		     :
>>> +		     : "rZ"(from64[0]), "rZ"(from64[1]), "rZ"(from64[2]),
>>> +		       "rZ"(from64[3]), "rZ"(from64[4]), "rZ"(from64[5]),
>>> +		       "rZ"(from64[6]), "rZ"(from64[7]), "r"(to));
>>
>> Is this correct for big-endian? LDP/STP are kinda tricksy in that regard.
> 
> Uh.. I didn't think about it at all..
> 
> By no means do I have any skill reading the ARM documents, but I think
> it is OK, it says:
> 
> Mem[address, dbytes, AccType_NORMAL] = data1;
> Mem[address+dbytes, dbytes, AccType_NORMAL] = data2;
> 
> So I understand that as
> 
> Mem[%8 + 16 * 0, 8, AccType_NORMAL] = from64[0]
> Mem[%8 + 16 * 0 + 8, 8, AccType_NORMAL] = from64[1]
> Mem[%8 + 16 * 1, 8, AccType_NORMAL] = from64[2]
> Mem[%8 + 16 * 1 + 8, 8, AccType_NORMAL] = from64[3]
> ..
> 
> Which is the same on BE/LE?
> 
> But I don't know the pitfall to watch for here. This is memcpy so we
> don't have to swap, the order of the bits in the register doesn't
> matter.

Indeed you're right - all the way back to Armv7 LDRD/STRD, I always get 
caught out by remembering the path which does an endian-dependent swap 
of the target registers, but forgetting that that's there to 
*counteract* the byteswap in Mem[] itself.
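
To illustrate the cancellation, here is a minimal user-space C sketch. It is
not kernel code and not the Arm pseudocode itself: it only models Mem[] as an
8-byte access whose byte order depends on a flag. The names mem_read64,
mem_write64, copy64_model, copy_is_byte_exact and the big_endian parameter
are all made up for this illustration. The four store pairs mirror the four
STPs in the patch, with the second register of each pair landing 8 bytes
after the first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a Mem[addr, 8] read: build the register value
 * from 8 memory bytes, in the byte order selected by big_endian. */
static uint64_t mem_read64(const uint8_t *addr, int big_endian)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++) {
		int shift = big_endian ? 8 * (7 - i) : 8 * i;

		v |= (uint64_t)addr[i] << shift;
	}
	return v;
}

/* Hypothetical model of a Mem[addr, 8] write: the exact inverse of
 * mem_read64() for the same big_endian setting. */
static void mem_write64(uint8_t *addr, uint64_t v, int big_endian)
{
	for (int i = 0; i < 8; i++) {
		int shift = big_endian ? 8 * (7 - i) : 8 * i;

		addr[i] = (uint8_t)(v >> shift);
	}
}

/* Model of the 64-byte copy: eight 8-byte loads, then four "STP" pairs.
 * Pair n writes its registers to to + 16 * n and to + 16 * n + 8. */
static void copy64_model(uint8_t *to, const uint8_t *from, int big_endian)
{
	uint64_t regs[8];

	assert(to != NULL && from != NULL);

	for (int i = 0; i < 8; i++)
		regs[i] = mem_read64(from + 8 * i, big_endian);

	for (int n = 0; n < 4; n++) {
		mem_write64(to + 16 * n, regs[2 * n], big_endian);
		mem_write64(to + 16 * n + 8, regs[2 * n + 1], big_endian);
	}
}

/* Returns 1 if the modelled copy reproduces the source byte for byte. */
static int copy_is_byte_exact(int big_endian)
{
	uint8_t src[64], dst[64];

	for (int i = 0; i < 64; i++)
		src[i] = (uint8_t)(i * 7 + 1);
	memset(dst, 0, sizeof(dst));

	copy64_model(dst, src, big_endian);
	return memcmp(src, dst, sizeof(src)) == 0;
}
```

In this model copy_is_byte_exact(0) and copy_is_byte_exact(1) both return 1.
The register values in between differ with the flag, but the load and the
store apply the same byte order, so for a plain copy the difference never
reaches memory.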

Cheers,
Robin.


