[PATCH v4 5/7] arm64: Add support for FEAT_{LS64, LS64_V}

Catalin Marinas catalin.marinas at arm.com
Wed Sep 17 07:20:42 PDT 2025


On Wed, Sep 17, 2025 at 11:51:20AM +0800, Yicong Yang wrote:
> On 2025/9/16 22:56, Catalin Marinas wrote:
> > On Mon, Sep 15, 2025 at 04:29:25PM +0800, Yicong Yang wrote:
> >> In my understanding the hwcap only describes the capabilities of the CPU, not
> >> the whole system. Users should make sure the function works as expected if the
> >> CPU supports it and they're going to use it. Specifically, LS64 is intended for
> >> device memory only, so the user should take responsibility for using it on supported
> >> memory.
> > 
> > We have other cases like MTE where we avoid exposing the HWCAP to user
> > if we know the memory system does not support MTE, though we intercepted
> > this early and asked the (micro)architects to tie the CPU ID field to
> > what the system supports.
> 
> But we lack the same identification mechanism for the memory system as we have for the CPU, so
> it's just a restriction on the hardware vendor: if a certain feature is not supported by the whole
> system (SoC), do not advertise it in the CPU's ID field. Otherwise, I think what we currently
> do is that if capabilities mismatch or cannot work together as expected, an
> erratum/workaround is used to disable the feature or work around it on that particular
> platform.
> 
> This is also the case for LS64 but a bit more complex, since it involves the completer outside
> the SoC (the device), which could be a hotplugged one (PCIe). On the SoC side we can restrict
> advertising the feature to the case where it's fully supported (what we've already done on our hardware).

That's good to know. Hopefully other vendors do the same.

I think the ARM ARM would benefit from a note here that the system
designers should not advertise this if the interconnect does not support
it. I can raise this internally.

> > Arguably, the use of LD/ST64B* is fairly specialised and won't be used
> > on the general purpose RAM and by random applications. It needs a device
> > driver to create the NC/Device mapping and specific programs/libraries
> > to access it. I'm not sure the LS64 properties are guaranteed by the
> > device alone or the device together with the interconnect. I suspect the
> > latter and neither the kernel driver nor user space can tell. In the
> > best case, you get a fault and realise the system doesn't work as
> > expected. Worse is the non-atomicity with potentially silent corruption.
> 
> It will be the latter: both the interconnect and the target device need to
> support it. But I think the driver developer (kernel driver or userspace
> driver) must know the support status, otherwise they
> should not use it.
[...]
> My thought is that the driver developer should know whether their
> device supports it if they're going to use this. The information in the
> firmware table should be fine for platform devices, but it cannot describe
> hotpluggable ones like PCIe endpoint devices, which may
> not be listed in a firmware table.

There's a risk of such instructions ending up in more generic
copy_to/from_io implementations but there's not much we can do other than
not enabling the feature at all.

So, I think a HWCAP bit is useful but we need (a) clarification that the
CPUID field won't be set if the system doesn't support it and (b)
documentation for the Linux bit stating that it's a per-device capability
even if the CPU/system supports it (the HWCAP is only a prerequisite to be
able to use the instructions; the driver can fall back to non-atomic ops,
maybe with a DGH if it helps performance).
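
To illustrate (b), here is a rough sketch of the two-level check a user
space driver might do. Everything named here is hypothetical:
EXAMPLE_HWCAP_LS64 is a placeholder and not the bit from this series, and
device_supports_ls64 stands for whatever out-of-band knowledge the driver
has about its device. The fallback is plain 64-bit stores, so it is not
single-copy atomic over the 64 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit for illustration only; not the value from the patch. */
#define EXAMPLE_HWCAP_LS64	(1UL << 0)

enum copy_path {
	COPY_PATH_LS64,		/* single 64-byte access */
	COPY_PATH_FALLBACK,	/* eight separate 64-bit accesses */
};

/*
 * The HWCAP is only a prerequisite. The device (and the path to it) must
 * also be known to accept 64-byte accesses, which the HWCAP cannot tell.
 */
static enum copy_path choose_copy_path(unsigned long hwcaps,
				       bool device_supports_ls64)
{
	if ((hwcaps & EXAMPLE_HWCAP_LS64) && device_supports_ls64)
		return COPY_PATH_LS64;
	return COPY_PATH_FALLBACK;
}

/* Non-atomic fallback: the device may observe a partially written block. */
static void write64_fallback(volatile uint64_t *dst, const uint64_t src[8])
{
	assert(dst && src);
	for (size_t i = 0; i < 8; i++)
		dst[i] = src[i];
}
```

In a real program the hwcaps argument would come from getauxval() with
whichever AT_HWCAP* word the bit ends up in, and the LS64 path would issue
the actual instruction; both are left out of the sketch.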

An alternative would have been for the kernel driver to communicate to
the user that the device supports the 64-byte atomic accesses but I'm
not aware of any fairly generic way to do this.

-- 
Catalin


