[PATCH rdma-next 1/2] arm64/io: add memcpy_toio_64

Niklas Schnelle schnelle at linux.ibm.com
Wed Jan 17 05:20:00 PST 2024


On Tue, 2024-01-16 at 13:33 -0400, Jason Gunthorpe wrote:
> On Tue, Nov 28, 2023 at 05:28:36PM +0100, Niklas Schnelle wrote:
> > On Mon, 2023-11-27 at 13:51 -0400, Jason Gunthorpe wrote:
> > > On Mon, Nov 27, 2023 at 06:43:11PM +0100, Niklas Schnelle wrote:
> > > 
> > > > Also it turns out the writeq() loop we had so far does not produce the
> > > > needed 64 byte TLP on s390 either so this actually makes us newly pass
> > > > this test.
> > > 
> > > Ooh, that is a significant problem - the userspace code won't be used
> > > unless this test passes. So we need this on S390 to fix a bug as well
> > > :\
> > 
> > Yes ;-(
> > 
> > In the meantime I also found out that zpci_write_block(dst, src, 64) is
> > not correct for all cases because the PCI store block requires the
> > (pseudo-)MMIO write not to cross a 4K boundary and we need src/dst to
> > be double word aligned. In rdma-core this is neatly handled by the
> > get_max_write_size() but the kernel variant of that
> > zpci_get_max_write_size() isn't just a lot harder to read and likely
> > less efficient but also too strict thus breaking the 64 byte write up
> > needlessly.
> > 
> > In total we have 5 conditions for the PCI block stores:
> > 
> > 1. The dst+len must not cross a 4K boundary in the (pseudo-)MMIO space
> > 2. The length must not exceed the maximum write size
> > 3. The length must be a multiple of 8
> > 4. The src needs to be double word aligned
> > 5. The dst needs to be double word aligned
> > 
> > So I think a good solution would be to improve zpci_memcpy_toio() with
> > an enhanced zpci_get_max_write_size() based on the code in rdma-core
> > extended to also handle the alignment and length restrictions which in
> > rdma-core are already assumed (see comment there). Then we can use
> > zpci_memcpy_toio(dst, src, 64) for memcpy_toio_64() and rely on the
> > compiler to optimize out the unnecessary checks (2, 3 and possibly 4,
> > 5).
> > 
> > So yeah this is getting a bit more complicated than originally
> > thought. Let me cook up a patch.
> 
> Did you come up with something?
> 
> Jason

Hi Jason,

Sorry I haven't replied. Yes, I have a fix for zpci_memcpy_toio()
titled "s390/pci: fix max size calculation in zpci_memcpy_toio()" that
I tested with this series plus the following define added to
arch/s390/include/asm/io.h:

#define memcpy_toio_64(dst, src) zpci_memcpy_toio(dst, src, 64)

It's already in the s390 tree's feature branch and linux-next.
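
For illustration only, the five block store conditions quoted above could
be checked roughly like this. This is a standalone userspace sketch, not
the code from the fix: sketch_max_write_size() and SKETCH_BOUNDARY_SIZE
are hypothetical names, the maximum write size is passed in by the caller,
and the fallback to a single naturally aligned store of at most 8 bytes
when a block store is not possible is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Condition 1: a write must not cross a 4K boundary in MMIO space. */
#define SKETCH_BOUNDARY_SIZE 4096u

/*
 * Hypothetical helper: return how many bytes the next store may cover.
 * 'max' is the maximum block write size (condition 2), supplied by the
 * caller; the sketch does not assume a particular value.
 */
static size_t sketch_max_write_size(uint64_t dst, uint64_t src,
				    size_t len, size_t max)
{
	size_t room = SKETCH_BOUNDARY_SIZE -
		      (size_t)(dst & (SKETCH_BOUNDARY_SIZE - 1));
	size_t size = len;
	size_t chunk = 1;

	assert(max > 0);

	if (size == 0)
		return 0;
	if (size > room)	/* condition 1: stay inside the 4K page */
		size = room;
	if (size > max)		/* condition 2: respect the maximum size */
		size = max;

	/* conditions 3, 4, 5: length multiple of 8, src/dst 8-byte aligned */
	if (size % 8 == 0 && src % 8 == 0 && dst % 8 == 0)
		return size;

	/*
	 * Assumed fallback: one naturally aligned store of 1, 2, 4 or 8
	 * bytes, the largest that fits in 'size' and in the alignment
	 * shared by src and dst.
	 */
	while (chunk < 8 && chunk * 2 <= size &&
	       ((src | dst) & (chunk * 2 - 1)) == 0)
		chunk *= 2;
	return chunk;
}
```

With a constant length of 64 and a constant maximum, the compiler can
drop the checks for conditions 2 and 3, which is the optimization
mentioned in the quoted text.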

Thanks,
Niklas


