[PATCH] arm64: Enable PCI write-combine resources under sysfs
Catalin Marinas
catalin.marinas at arm.com
Wed Sep 16 04:48:52 EDT 2020
On Wed, Sep 16, 2020 at 09:33:16AM +0100, Will Deacon wrote:
> On Tue, Sep 15, 2020 at 08:40:06PM -0300, Jason Gunthorpe wrote:
> > On Wed, Sep 16, 2020 at 09:17:38AM +1000, Benjamin Herrenschmidt wrote:
> > > On Tue, 2020-09-15 at 08:05 -0300, Jason Gunthorpe wrote:
> > > > > To sum it up:
> > > > >
> > > > > (1) RDMA drivers need a new mapping function/attribute to define their
> > > > > message push model. Actually the message model is not necessarily related
> > > > > to write combining a la x86, so we should probably come up with a better
> > > > > and consistent naming. Enabling this patchset may trigger performance
> > > > > regressions on Mellanox drivers on arm64 - this ought to be
> > > > > addressed.
> > > >
> > > > It is pretty clear now that certain ARM chips that don't do write
> > > > combining with pgprot_writecombine will regress in performance if they
> > > > are running a certain uncommon Mellanox configuration. I suspect these
> > > > deployments are all running the out of tree patch for DEVICE_GRE
> > > > though.
> > >
> > > I'm not sure I understand...
> > >
> > > Today those ARM chips will not use pgprot_writecombine (at least not
> > > using that code path, they might still use it as the result of the
> > > other path in the driver that can enable it).
> >
> > Not quite, upstream kernel will never use WC on those
> > devices. DEVICE_GRE is not supported in upstream,
> > arch_can_pci_mmap_wc() is always false and the WC tester will always
> > fail.
> >
> > > With the patch, those devices will now use MT_DEVICE_NC.
> >
> > Which doesn't do WC at all on some ARM implementations.
>
> Is that just TX2? I remember that thing being weird where GRE performed
> better than NC, but I thought that was a one off (and the thing is dead).

I recall something along these lines. Hopefully ARM updated the guidance
to licensees.

> NC is more permissive than GRE, so I think that's the right one to use; i.e.
> we go for the fewest number of restrictions on the hardware. If somebody
> screws up the uarch, that's up to them.

I agree, Normal NC is better as long as the BAR can tolerate read
side-effects.

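To make the userspace side concrete, here is a minimal sketch of how a program might pick up the sysfs write-combine resource once the architecture advertises arch_can_pci_mmap_wc(). It assumes the standard sysfs-pci layout (resourceN and, for prefetchable BARs, resourceN_wc). The helper names pci_resource_path() and pci_map_bar() are made up for illustration and are not part of any kernel or library API. On arm64 the _wc mapping would presumably end up as Normal NC, so it only makes sense for BARs that can tolerate read side-effects.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: build
 *   /sys/bus/pci/devices/<bdf>/resource<bar>[_wc]
 * into buf. Returns 0 on success, -1 if bdf is empty or the path
 * does not fit in len bytes.
 */
static int pci_resource_path(char *buf, size_t len, const char *bdf,
                             int bar, int want_wc)
{
    int n;

    assert(buf != NULL && bdf != NULL);
    if (strlen(bdf) == 0)
        return -1;

    n = snprintf(buf, len, "/sys/bus/pci/devices/%s/resource%d%s",
                 bdf, bar, want_wc ? "_wc" : "");
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: map `size` bytes of a BAR, preferring the
 * write-combine file and falling back to the plain resource file
 * when the _wc file cannot be opened (for example because the
 * architecture does not provide it or the BAR is not prefetchable).
 * *got_wc is set to 1 only if the _wc file was opened.
 * Returns MAP_FAILED on error.
 */
static void *pci_map_bar(const char *bdf, int bar, size_t size, int *got_wc)
{
    char path[128];
    int fd = -1;
    void *p;

    *got_wc = 0;

    if (pci_resource_path(path, sizeof(path), bdf, bar, 1) == 0)
        fd = open(path, O_RDWR | O_SYNC);

    if (fd >= 0) {
        *got_wc = 1;
    } else {
        if (pci_resource_path(path, sizeof(path), bdf, bar, 0) != 0)
            return MAP_FAILED;
        fd = open(path, O_RDWR | O_SYNC);
        if (fd < 0)
            return MAP_FAILED;
    }

    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return p;
}
```

A caller would pass the device address (e.g. "0000:01:00.0"), the BAR index and the length to map, then check got_wc to see which attribute it ended up with. Whether stores through the _wc mapping are actually combined is up to the implementation, as discussed above.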
--
Catalin