[PATCH] arm64: Enable PCI write-combine resources under sysfs

Clint Sbisa csbisa at amazon.com
Fri Sep 11 17:42:25 EDT 2020

On Fri, Sep 11, 2020 at 10:39:16AM +1000, Benjamin Herrenschmidt wrote:
> > > > > That's why I looped you in - that's what worries me about
> > > > > "enabling" arch_can_pci_mmap_wc() on arm64. If we enable it and we
> > > > > have perf regressions that's not OK.
> > > > >
> > > > > Or we *can* enable arch_can_pci_mmap_wc() but force the mellanox
> > > > > driver (or more broadly all drivers following this message push
> > > > > semantics) to use "something else" for WC detection.
> > > >
> > > > arch_can_pci_mmap_wc() really only controls the sysfs resource file
> > > > and it seems very unclear who in userspace uses that these days.
> > >
> > > dpdk under some circumstances afaik.
> >
> > And something gross for DMA then? Not sure dpdk is useful without
> > DMA. Why not use CONFIG_VFIO_NOIOMMU for such a non-secure thing?
> Clint, can you elaborate on the use case ?

The use case I'm targeting is the ENA PMD in DPDK. For performance reasons
(many of which are very similar to what Jason has described for mlx5), we need
to generate full-sized TLPs instead of many partial TLPs to improve efficiency.

Here's an excerpt describing the write-combine usage from

- Low Latency Queue (LLQ) mode or "push-mode".
  * In this mode the driver pushes the transmit descriptors and the
    first 128 bytes of the packet directly to the ENA device memory
    space. The rest of the packet payload is fetched by the
    device. For this operation mode, the driver uses a dedicated PCI
    device memory BAR, which is mapped with write-combine capability.

There's no DMA involved with this BAR: the driver writes a portion of the
packet contents in addition to the descriptors, which generally increases the
number of TLPs if write-combine isn't used. Furthermore, this BAR is only used
for writes and never for reads.
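
To make the user-space side concrete, here is a rough sketch (not the actual
DPDK or ENA code) of mapping a sysfs PCI resource file write-only and pushing
a descriptor plus packet header into it with sequential 64-bit stores. The
helper names map_bar_write_only() and push_bytes() are hypothetical, and the
sketch assumes that fstat() on the sysfs file reports the BAR length and that
the caller passes a path such as /sys/bus/pci/devices/<BDF>/resourceN_wc. A
real driver would also need the appropriate write barrier after the stores.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a PCI resource file (for example a
 * resourceN_wc file under sysfs) for writing only.
 * Assumption: fstat() reports the length of the BAR behind the file.
 * Returns NULL on failure; on success stores the mapped length in *len_out.
 */
static volatile uint8_t *map_bar_write_only(const char *path, size_t *len_out)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    /* The BAR is only ever written, so request PROT_WRITE alone. */
    p = mmap(NULL, (size_t)st.st_size, PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;
    *len_out = (size_t)st.st_size;
    return (volatile uint8_t *)p;
}

/*
 * Hypothetical helper: copy n bytes to the mapping using sequential,
 * naturally aligned 64-bit stores, the access pattern that gives
 * write-combining a chance to merge stores into larger TLPs.
 * Assumptions: dst is 8-byte aligned and n is a multiple of 8
 * (any trailing bytes are ignored).
 */
static void push_bytes(volatile uint8_t *dst, const uint8_t *src, size_t n)
{
    volatile uint64_t *d = (volatile uint64_t *)dst;

    for (size_t i = 0; i + 8 <= n; i += 8) {
        uint64_t w;

        memcpy(&w, src + i, sizeof(w)); /* alignment-safe load from src */
        d[i / 8] = w;
    }
}
```

Whether the stores really get combined depends on the memory attributes the
kernel applies to the mapping, which is the point under discussion in this
thread; the sketch only shows the access pattern.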

As Jason noted in the other reply to this email, the Linux ENA driver makes use
of WC by using devm_ioremap_wc(). The DPDK code here uses the same mechanism in
user-space to enable write-combining by mapping the resourceX_wc file if the

