[RFC 1/2] vfio/pci: keep the prefetchable attribute of a BAR region in VMA

Marc Zyngier maz at kernel.org
Fri Apr 30 16:31:02 BST 2021


On Fri, 30 Apr 2021 15:58:14 +0100,
Shanker R Donthineni <sdonthineni at nvidia.com> wrote:
> 
> Hi Marc,
> 
> On 4/30/21 6:47 AM, Marc Zyngier wrote:
> >
> >>>> We've two concerns here:
> >>>>    - Performance impacts for pass-through devices.
> >>>>    - The definition of ioremap_wc() function doesn't match the host
> >>>> kernel on ARM64
> >>> Performance I can understand, but I think you're also using it to mask
> >>> a driver bug which should be resolved first.  Thank
> >> We’ve already instrumented the driver code and found the code path
> >> for the unaligned accesses. We’ll fix this issue if it’s not
> >> following WC semantics.
> >>
> >> Fixing the performance concern will be under KVM stage-2 page-table
> >> control. We're looking for guidance on, or a solution for, updating
> >> the stage-2 PTE based on the PCI BAR attribute.
> > Before we start discussing the *how*, I'd like to clearly understand
> > what *arm64* memory attributes you are relying on. We already have
> > established that the unaligned access was a bug, which was the biggest
> > argument in favour of NORMAL_NC. What are the other requirements?
> Sorry, my earlier response was not complete...
> 
> The ARMv8 architecture has two features, Gathering and Reordering of
> transactions, that are very important from a performance point of
> view. Small inline packets for NICs and accesses to a GPU's frame
> buffer are CPU-bound operations. We want to take advantage of the
> GRE features to achieve higher performance.
> 
> Both of these features are disabled for prefetchable BARs in a VM
> because the memory type MT_DEVICE_nGnRE is enforced at stage-2.

Right, so Normal_NC is a red herring, and it is Device_GRE that you
really are after, right?
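
To make the distinction concrete, here is a rough sketch of how the
stage-1 and stage-2 memory types combine for the types discussed here.
It assumes no FWB and models only the "most restrictive type wins"
rule; the enum and helper names are made up for illustration and are
not kernel symbols.

```c
#include <stdbool.h>

/*
 * Illustrative ordering only (NOT the hardware encodings): a lower
 * value is a more restrictive memory type.
 */
enum mem_type {
	DEVICE_nGnRnE = 0,
	DEVICE_nGnRE  = 1,	/* what stage-2 enforces for BARs today */
	DEVICE_nGRE   = 2,
	DEVICE_GRE    = 3,
	NORMAL_NC     = 4,
};

/* Gathering is permitted for Device_GRE and for Normal memory. */
static inline bool allows_gathering(enum mem_type t)
{
	return t >= DEVICE_GRE;
}

/* Reordering is permitted for Device_nGRE and anything less strict. */
static inline bool allows_reordering(enum mem_type t)
{
	return t >= DEVICE_nGRE;
}

/*
 * Assumed combining rule (no FWB): the resulting type is the more
 * restrictive of the guest's stage-1 type and the host's stage-2 type.
 */
static inline enum mem_type s2_combine(enum mem_type s1, enum mem_type s2)
{
	return s1 < s2 ? s1 : s2;
}
```

With this model, a guest mapping a BAR as Device_GRE (or Normal_NC)
still ends up with Device_nGnRE while stage-2 enforces it, so neither
gathering nor reordering is available; relaxing stage-2 to Device_GRE
is what would let the guest's choice take effect.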

Now, I'm not convinced that we can do that directly from vfio in a
device-agnostic manner. It is userspace that places the device in the
guest's memory, and I have the ugly feeling that userspace needs to be
in control of memory attributes.

Otherwise, we change the behaviour for all existing devices that have
prefetchable BARs, and I don't think that's an acceptable move
(userspace ABI change).

	M.

-- 
Without deviation from the norm, progress is not possible.


