[PATCH] arm64: Enable PCI write-combine resources under sysfs

Jason Gunthorpe jgg at nvidia.com
Thu Sep 10 19:29:38 EDT 2020

On Fri, Sep 11, 2020 at 07:46:47AM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2020-09-10 at 14:10 -0300, Jason Gunthorpe wrote:
> > Can you explain what this actually does on ARM? 
> > 
> > Can it ever speculate loads across page boundaries, or speculate loads
> > that never exist in the program? ie will we get random unpredictable
> > MemRds?
> Probably, at least on powerpc you will as well, that's the only way to
> get write combine.

If I remove the PROT_READ in the user space mmap will it block it?

Read TLPs are not harmful but I suspect they would cause an
undesirable random performance anomaly.
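
For reference, the PROT_READ question above could be tried with something
like the following. This is only a sketch: the helper name
map_bar_write_only() is made up for illustration, the caller is assumed to
pass the path of a sysfs resourceN_wc file and the BAR length, and whether
dropping PROT_READ actually stops speculative reads is exactly the open
question. The mmap(2) man page notes that on some architectures PROT_WRITE
implies PROT_READ, so the CPU may well still be allowed to read.

```c
/* Sketch only: map a file (e.g. a PCI sysfs resourceN_wc file) with
 * PROT_WRITE and no PROT_READ. The helper name is hypothetical. */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns the mapping, or MAP_FAILED on any error. */
void *map_bar_write_only(const char *path, size_t len)
{
	/* A MAP_SHARED mapping with PROT_WRITE needs the fd opened O_RDWR,
	 * even though PROT_READ is not requested. */
	int fd = open(path, O_RDWR);
	if (fd < 0)
		return MAP_FAILED;

	void *p = mmap(NULL, len, PROT_WRITE, MAP_SHARED, fd, 0);

	/* The mapping stays valid after the descriptor is closed. */
	close(fd);
	return p;
}
```

The caller would store to the returned pointer through a volatile pointer
and munmap() it when done.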

> > Does it/could it "combine writes"?
> I assume so for ARM, definitely for powerpc.

I know it works on various IBM PPC chips, we do test that.

> > > That's why I looped you in - that's what worries me about "enabling"
> > > arch_can_pci_mmap_wc() on arm64. If we enable it and we have perf
> > > regressions that's not OK.
> > > 
> > > Or we *can* enable arch_can_pci_mmap_wc() but force the mellanox
> > > driver (or more broadly all drivers following this message push
> > > semantics) to use "something else" for WC detection.
> > 
> > arch_can_pci_mmap_wc() really only controls the sysfs resource file
> > and it seems very unclear who in userspace uses that these days.
> dpdk under some circumstances afaik.

And something gross for DMA then? Not sure dpdk is useful without
DMA. Why not use CONFIG_VFIO_NOIOMMU for such a non-secure thing?

> > vfio is now the right way to do that stuff. I don't see an obvious
> > way to get WC memory in VFIO though...
> Which would be a performance issue on a number of things I suppose...

Almost nothing uses pci_iomap_wc(), so I'd be surprised if userspace
DPDK was an important user when an in-kernel driver for the same HW
doesn't use it?


