[PATCH rdma-next 1/2] arm64/io: add memcpy_toio_64

Catalin Marinas catalin.marinas at arm.com
Wed Dec 6 03:09:18 PST 2023


On Tue, Dec 05, 2023 at 03:51:30PM -0400, Jason Gunthorpe wrote:
> On Tue, Dec 05, 2023 at 07:34:45PM +0000, Catalin Marinas wrote:
> > > 2) You want to #define __iowrite512_copy() to memcpy_toio() on ARM and
> > >    implement some quad STP optimization for this case?
> > 
> > We can have the generic __iowrite512_copy() do memcpy_toio() and have
> > the arm64 implement an optimised version.
> > 
> > What I'm not entirely sure of is the DGH (whatever the io_* barrier name
> > is). I'd put it in the same __iowrite512_copy() function and remove it
> > from the driver code. Otherwise when ST64B is added, we have an
> > unnecessary DGH in the driver. If this does not match the other
> > __iowrite*_copy() semantics, we can come up with another name. But start
> > with this for now and document the function.
> 
> I think the iowrite is only used for WC and the DGH is functionally
> harmless for non-WC, so it makes sense.
> 
> In this case we should just remove the DGH macro from the generic
> architecture code and tell people to use iowrite - since we now
> understand that callers basically have to in order to use DGH on new
> ARM CPUs.

That works for me, but what would the semantics be for __iowrite64_copy(),
for example? Is there a DGH at the end of the whole write or after each
iteration? I'd go with the former, since e.g. hns3_tx_push_bd() does
that (and doesn't seem to be a 64-byte copy). Similarly for
__iowrite512_copy(): if you want the DGH after each iteration, you should
only pass a count of 1.
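
To make the two options concrete, here is a rough userspace sketch, not
kernel code. All of the model_* names are made up for illustration,
model_dgh() only counts calls instead of issuing a real DGH, and the count
is assumed to be in units of 64-byte blocks.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the DGH hint: it only counts how often it runs. */
static unsigned int dgh_calls;

static void model_dgh(void)
{
	dgh_calls++;
}

/*
 * Model of a single 64-byte (512-bit) block write. A real arm64 version
 * might use four STPs or ST64B here; plain memcpy() is used as a placeholder.
 */
static void model_write_block512(void *to, const void *from)
{
	memcpy(to, from, 64);
}

/*
 * Option preferred above: copy 'count' 64-byte blocks and issue one DGH
 * at the end of the whole write.
 */
static void model_iowrite512_copy(void *to, const void *from, size_t count)
{
	unsigned char *dst = to;
	const unsigned char *src = from;
	size_t i;

	for (i = 0; i < count; i++)
		model_write_block512(dst + i * 64, src + i * 64);

	model_dgh();
}

/*
 * A caller that wants a DGH after every block gets it by passing a
 * count of 1 on each call.
 */
static void model_copy_dgh_per_block(void *to, const void *from, size_t count)
{
	unsigned char *dst = to;
	const unsigned char *src = from;
	size_t i;

	for (i = 0; i < count; i++)
		model_iowrite512_copy(dst + i * 64, src + i * 64, 1);
}
```

With two blocks, model_iowrite512_copy(dst, src, 2) bumps the counter once,
while model_copy_dgh_per_block(dst, src, 2) bumps it twice. Keeping the hint
inside the copy helper means a driver never has to issue it separately.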

-- 
Catalin


