[PATCH 0/4] arm64: Support the TSO memory model

Alex Bennée alex.bennee at linaro.org
Tue May 7 03:24:18 PDT 2024


Will Deacon <will at kernel.org> writes:

> Hi Hector,
>
> On Thu, Apr 11, 2024 at 09:51:19AM +0900, Hector Martin wrote:
>> x86 CPUs implement a stricter memory model (TSO) than ARM64. For this
>> reason, x86 emulation on baseline ARM64 systems requires very expensive
>> memory model emulation. Having hardware that supports this natively is
>> therefore very attractive. Such hardware, in fact, exists. This series
>> adds support for userspace to identify when TSO is available and
>> toggle it on, if supported.
>
> I'm probably going to make myself hugely unpopular here, but I have a
> strong objection to this patch series as it stands. I firmly believe
> that providing a prctl() to query and toggle the memory model to/from
> TSO is going to lead to subtle fragmentation of arm64 Linux userspace.
>
> It's not difficult to envisage this TSO switch being abused for native
> arm64 applications:
>
>   * A program no longer crashes when TSO is enabled, so the developer
>     just toggles TSO to meet a deadline.
>
>   * Some legacy x86 sources are being ported to arm64 but concurrency
>     is hard so the developer just enables TSO to (mostly) avoid thinking
>     about it.
>
>   * Some binaries in a distribution exhibit instability which goes away
>     in TSO mode, so a taskset-like program is used to run them with TSO
>     enabled.

These all just seem like cases of engineers hiding from their very real
problems. I don't know if it's really the kernel's place to avoid giving
them the foot gun. Would it assuage your concerns at all if we set a
taint flag so bug reports/core dumps indicated we were in a
non-architectural memory mode?

> In all these cases, we end up with native arm64 applications that will
> either fail to load or will crash in subtle ways on CPUs without the TSO
> feature. Assuming that the application cannot be fixed, a better
> approach would be to recompile using stronger instructions (e.g.
> LDAR/STLR) so that at least the resulting binary is portable. Now, it's
> true that some existing CPUs are TSO by design (this is a perfectly
> valid implementation of the arm64 memory model), but I think there's a
> big difference between quietly providing more ordering guarantees than
> software may be relying on and providing a mechanism to discover,
> request and ultimately rely upon the stronger behaviour.

I think the main use case here is for emulation. When we run x86-on-arm
in QEMU we do currently insert lots of extra barrier instructions on
every load and store. If we can probe and set a TSO mode I can assure
you we'll do the right thing ;-)
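
For illustration only (this is not code from the series or from QEMU),
here is a minimal sketch of the portable alternative Will describes:
expressing the ordering in the source with C11 acquire/release atomics,
which arm64 compilers typically lower to LDAR/STLR-style instructions.
The names payload, ready and run_message_passing() are made up for the
example, and POSIX threads are assumed.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;       /* plain data, published via the flag below */
static atomic_int ready;  /* flag that guards the payload */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Release store: the write to payload is ordered before the flag. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    int *out = arg;

    /* Acquire load: the read of payload cannot be hoisted above it. */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;
    *out = payload;
    return NULL;
}

/* Hypothetical helper: runs one producer/consumer pair and returns the
 * value the consumer observed, or -1 if a thread could not be created. */
int run_message_passing(void)
{
    pthread_t p, c;
    int seen = 0;

    payload = 0;
    atomic_store(&ready, 0);

    if (pthread_create(&p, NULL, producer, NULL) != 0)
        return -1;
    if (pthread_create(&c, NULL, consumer, &seen) != 0) {
        pthread_join(p, NULL);
        return -1;
    }
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

With the release/acquire pair in place the consumer is guaranteed to
read 42 on any conforming implementation, whether or not the CPU
happens to be TSO. Code that uses plain or relaxed accesses for the
flag is what would come to depend on a TSO switch.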

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro


