[PATCH] arm64: Enable PCI write-combine resources under sysfs

Jason Gunthorpe jgg at nvidia.com
Tue Sep 15 19:40:06 EDT 2020


On Wed, Sep 16, 2020 at 09:17:38AM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2020-09-15 at 08:05 -0300, Jason Gunthorpe wrote:
> > > To sum it up:
> > > 
> > > (1) RDMA drivers need a new mapping function/attribute to define their
> > >      message push model. Actually the message model is not necessarily related
> > >      to write combining a la x86, so we should probably come up with a better
> > >      and consistent naming. Enabling this patchset may trigger performance
> > >      regressions on mellanox drivers on arm64 - this ought to be
> > >      addressed.
> > 
> > It is pretty clear now that certain ARM chips that don't do write
> > combining with pgprot_writecombine will regress in performance if they
> > are running a certain uncommon Mellanox configuration. I suspect these
> > deployments are all running the out of tree patch for DEVICE_GRE
> > though.
> 
> I'm not sure I understand...
> 
> Today those ARM chips will not use pgprot_writecombine (at least not
> using that code path, they might still use it as the result of the
> other path in the driver that can enable it). 

Not quite: the upstream kernel will never use WC on those
devices. DEVICE_GRE is not supported upstream,
arch_can_pci_mmap_wc() is always false, and the WC tester will always
fail.
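
The selection between the two submission flows can be sketched roughly
as below. This is a simplified userspace model, not driver code:
struct wc_env, wc_supported() and pick_flow() are hypothetical names,
and it assumes the driver trusts its own WC tester when it can run one
and only falls back to the arch_can_pci_mmap_wc() answer when it cannot
self test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the inputs; none of these names exist in the kernel. */
struct wc_env {
	bool can_self_test;     /* driver is able to run its WC tester */
	bool self_test_passed;  /* tester observed combined writes */
	bool arch_can_mmap_wc;  /* stands in for arch_can_pci_mmap_wc() */
};

enum submit_flow { FLOW_NON_WC, FLOW_WC };

/* Assumed policy: a self test result wins, else trust the arch answer. */
static bool wc_supported(const struct wc_env *env)
{
	assert(env != NULL);
	if (env->can_self_test)
		return env->self_test_passed;
	return env->arch_can_mmap_wc;
}

static enum submit_flow pick_flow(const struct wc_env *env)
{
	return wc_supported(env) ? FLOW_WC : FLOW_NON_WC;
}
```

Under this model, upstream on the chips in question always lands on
FLOW_NON_WC, since both inputs say no. Once arch_can_pci_mmap_wc()
returns true, a driver that cannot self test would pick FLOW_WC even on
an implementation where the mapping does not actually combine writes.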

> With the patch, those devices will now use MT_DEVICE_NC.

Which doesn't do WC at all on some ARM implementations.
 
> Why would that be a regression ? 

Using the WC submission flow when it doesn't work costs something like
10% performance vs using the non-WC flow.

Like I said, the case where the driver can't self test probably
doesn't intersect with the ARM implementations that can't do write
combining, and if it did, the users probably run the out of tree
driver that has the hacky stuff to make it use DEVICE_GRE.

> BTW. Lorenzo, why don't we use MT_DEVICE_GRE for pgprot_writecombine ?
> It's not supported on some chips ?

It has alignment requirements that drivers don't meet. We need a new
concept of "write combining and I promise to do aligned access".

> What on earth is pgprot_device() ? This is new ? On ARM it will be
> MT_DEVICE_nGnRE, so it allows posted write. It seems to match what
> ioremap does. Should then ioremap use it as well ?
> 
> But it's only ever used for PCI mmap. Why is it different from
> pgprot_noncached() which disables posted writes (nE) ?
> 
> Because a whole lot of drivers will use pgprot_noncached() explicitly
> in either mmap or vmap, with the expectation that it's somewhat the
> same as what ioremap does...

*boggle*

Only sysfs uses pci_mmap_resource_range(); any other driver exposing
BAR pages, like VFIO, does not. It makes no sense at all that it is
different.

Delete the ill-defined pgprot_device()? Nobody has complained that
something is wrong with VFIO in the 6 years since it was added...
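
For reference, the difference raised above between pgprot_device()
(MT_DEVICE_nGnRE) and pgprot_noncached() (nE, no posted writes) can be
read straight off the attribute names. The sketch below decodes an ARM
device memory type suffix into its three properties. struct dev_attr
and decode_device_type() are hypothetical helpers for illustration
only, not kernel APIs, and the parser does not check that a given
combination is one the architecture actually defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical decoded form of a device memory type suffix. */
struct dev_attr {
	bool gathering;   /* G: accesses may be merged */
	bool reordering;  /* R: accesses may be reordered */
	bool early_ack;   /* E: early write acknowledgement (posted writes) */
};

/*
 * Decode a suffix such as "nGnRnE", "nGnRE" or "GRE".
 * A leading 'n' before a letter means that property is not permitted.
 * Returns false if the string is not of the form [n]G[n]R[n]E.
 */
static bool decode_device_type(const char *s, struct dev_attr *out)
{
	static const char letters[3] = { 'G', 'R', 'E' };
	bool *fields[3];

	assert(s != NULL && out != NULL);
	fields[0] = &out->gathering;
	fields[1] = &out->reordering;
	fields[2] = &out->early_ack;

	for (int i = 0; i < 3; i++) {
		bool allowed = true;

		if (*s == 'n') {
			allowed = false;
			s++;
		}
		if (*s != letters[i])
			return false;
		s++;
		*fields[i] = allowed;
	}
	return *s == '\0';
}
```

Decoding "nGnRE" gives early_ack set and the other two clear, which is
the posted-write behaviour attributed to pgprot_device() above, while
"nGnRnE" clears all three. "GRE" sets all three, gathering included.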

Jason


