[PATCH rdma-next 1/2] arm64/io: add memcpy_toio_64

Jason Gunthorpe jgg at nvidia.com
Tue Dec 5 11:51:30 PST 2023


On Tue, Dec 05, 2023 at 07:34:45PM +0000, Catalin Marinas wrote:

> > 2) You want to #define __iowrite512_copy() to memcpy_toio() on ARM and
> >    implement some quad STP optimization for this case?
> 
> We can have the generic __iowrite512_copy() do memcpy_toio() and have
> the arm64 implement an optimised version.
> 
> What I'm not entirely sure of is the DGH (whatever the io_* barrier name
> is). I'd put it in the same __iowrite512_copy() function and remove it
> from the driver code. Otherwise when ST64B is added, we have an
> unnecessary DGH in the driver. If this does not match the other
> __iowrite*_copy() semantics, we can come up with another name. But start
> with this for now and document the function.

I think the __iowrite*_copy() helpers are only used for WC mappings, and
the DGH is functionally harmless for non-WC, so putting it inside the
function makes sense.

In this case we should just remove the DGH macro from the generic
architecture code and tell people to use iowrite - since we now
understand that callers basically have to in order to use DGH on new
ARM CPUs.
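
To make the shape of that proposal concrete, here is a rough userspace
model of it, not kernel code. Every model_* name is hypothetical. memcpy()
stands in for memcpy_toio(), and a counter stands in for the DGH hint.
Counting in 64-byte blocks and issuing the hint once per call are
assumptions made for illustration, not something settled in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical counter standing in for the DGH hint after a WC write. */
static unsigned int model_dgh_count;

static void model_flush_wc_hint(void)
{
	/* arm64 would issue DGH here; other arches would do nothing. */
	model_dgh_count++;
}

/* Generic fallback: a plain copy, standing in for memcpy_toio(). */
static void model_generic_copy_64(void *to, const void *from)
{
	memcpy(to, from, 64);
}

/*
 * Arch override hook: an architecture with a better 64-byte store
 * (quad STP, ST64B, ...) would define model_arch_copy_64 before this
 * point. Without one, the generic fallback is used.
 */
#ifndef model_arch_copy_64
#define model_arch_copy_64 model_generic_copy_64
#endif

/*
 * Copy 'count' 64-byte blocks, then issue the hint inside the helper so
 * that the caller (the driver) never has to.
 */
static void model_iowrite512_copy(void *to, const void *from, size_t count)
{
	uint8_t *dst = to;
	const uint8_t *src = from;
	size_t i;

	assert(to != NULL && from != NULL);

	for (i = 0; i < count; i++)
		model_arch_copy_64(dst + i * 64, src + i * 64);

	model_flush_wc_hint();
}
```

With the hint folded into the helper, an arch that later gains a real
64-byte store can drop the hint in its override without touching drivers.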

> > 3) A future ST64B and the x86 version would be put under
> >    __iowrite512_copy()?
> 
> Yes, arch-specific override.
> 
> > 4) A future ST64B would come with some kind of 'must do 64b copy or
> >    oops' to support the future HW that must have this instruction? eg
> >    we already see on Intel that HW must use ENQCMD and nothing else.
> 
> I don't agree with the oops part. We can't guarantee it on arm64, ST64B
> I think is optional in the architecture. If you do need such guarantees,
> we'd need the driver to probe for the feature (e.g. arch_has_...()) and
> invoke a new macro. 

Yes, exactly. The driver must check. The new macro should oops if it
is invoked incorrectly, the same way ENQCMD will oops if invoked
incorrectly on Intel.
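
As a rough userspace model of that contract (not kernel code): the
driver probes first, and the strict helper aborts if it is called
without the feature. All model_* names are hypothetical, abort() stands
in for an oops, and memcpy() stands in for the single 64-byte store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical probe result; a real arch would read a CPU feature bit. */
static bool model_has_st64b;

/* Stand-in for an arch_has_...() style feature check. */
static bool model_arch_has_copy64_atomic(void)
{
	return model_has_st64b;
}

/*
 * Strict helper: only valid when the feature is present. Calling it
 * without the feature is a caller bug, so it dies loudly.
 */
static void model_copy64_atomic(void *to, const void *from)
{
	if (!model_arch_has_copy64_atomic())
		abort();	/* stands in for an oops */
	memcpy(to, from, 64);
}

enum model_path { MODEL_PATH_ATOMIC, MODEL_PATH_UNSUPPORTED };

/*
 * Driver side: check the feature up front and refuse to drive hardware
 * that requires the 64-byte store when the CPU cannot provide it.
 */
static enum model_path model_driver_post(void *to, const void *from)
{
	assert(to != NULL && from != NULL);

	if (!model_arch_has_copy64_atomic())
		return MODEL_PATH_UNSUPPORTED;

	model_copy64_atomic(to, from);
	return MODEL_PATH_ATOMIC;
}
```

A driver that probes correctly never reaches the abort() path; it only
fires when the helper is called without the check.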

Jason


