[PATCH 0/4] arm64: Support the TSO memory model

Sat Apr 20 04:37:25 PDT 2024

On Fri, 19 Apr 2024 17:58:09 +0100,
Will Deacon <will at kernel.org> wrote:
> 
> On Thu, Apr 11, 2024 at 11:19:13PM +0900, Hector Martin wrote:
> > On 2024/04/11 22:28, Will Deacon wrote:
> > >   * Some binaries in a distribution exhibit instability which goes away
> > >     in TSO mode, so a taskset-like program is used to run them with TSO
> > >     enabled.
> > 
> > Since the flag is cleared on execve, this third one isn't generally
> > possible as far as I know.
> 
> Ah ok, I'd missed that. Thanks.
> 
> > > In all these cases, we end up with native arm64 applications that will
> > > either fail to load or will crash in subtle ways on CPUs without the TSO
> > > feature. Assuming that the application cannot be fixed, a better
> > > approach would be to recompile using stronger instructions (e.g.
> > > LDAR/STLR) so that at least the resulting binary is portable. Now, it's
> > > true that some existing CPUs are TSO by design (this is a perfectly
> > > valid implementation of the arm64 memory model), but I think there's a
> > > big difference between quietly providing more ordering guarantees than
> > > software may be relying on and providing a mechanism to discover,
> > > request and ultimately rely upon the stronger behaviour.
> > 
> > The problem is "just" using stronger instructions is much more
> > expensive, as emulators have demonstrated. If TSO didn't serve a
> > practical purpose I wouldn't be submitting this, but it does. This is
> > basically non-negotiable for x86 emulation; if this is rejected
> > upstream, it will forever live as a downstream patch used by the entire
> > gaming-on-Mac-Linux ecosystem (and this is an ecosystem we are very
> > explicitly targeting, given our efforts with microVMs for 4K page size
> > support and the upcoming Vulkan drivers).
> 
> These microVMs sound quite interesting. What exactly are they? Are you
> running them under KVM?
> 
> Ignoring the mechanism for the time being, would it solve your problem
> if you were able to run specific microVMs in TSO mode, or do you *really*
> need the VM to have finer-grained control than that? If the whole VM is
> running in TSO mode, then my concerns largely disappear, as that's
> indistinguishable from running on a hardware implementation that happens
> to be TSO.

Since KVM has been mentioned a few times, I'll give my take on this.

Since day 1, it was a conscious decision for KVM/arm64 to emulate the
architecture, and only that -- this is complicated enough. Meaning
that no implementation-defined features should be explicitly exposed
to the guest. So I have no plan to expose any such feature for
userspace to configure TSO or anything else of the sort.

However, that doesn't preclude VMs from running in TSO mode if the HW
is configured as such at boot time. From what I have understood, this
is a per translation regime setting (EL1 and EL2 have separate knobs).

So it should be possible to set ACTLR_EL1.TSO=1 from firmware (using
the non-architected ACTLR_EL12 accessor), and let things work without
touching anything else (KVM doesn't context switch this register and
traps accesses to it). This would keep KVM out of the loop, the host
side would be unaffected, and only VMs would pay the overhead of TSO.

I appreciate that this is not the ideal situation, and very much an
all-or-nothing approach. But that's what we can reasonably manage from
an upstream perspective given the variability of the arm64 ecosystem.

	M.

-- 
Without deviation from the norm, progress is not possible.