[PATCH 1/6] arm64/sme: Flush foreign register state in do_sme_acc()
Mark Brown
broonie at kernel.org
Tue Dec 3 09:24:39 PST 2024
On Tue, Dec 03, 2024 at 05:00:08PM +0000, Dave Martin wrote:
> On Tue, Dec 03, 2024 at 04:00:45PM +0000, Mark Brown wrote:
> > It's to ensure that the last recorded CPU for the current task is
> > invalid so that if the state was loaded on another CPU and we switch
> > back to that CPU we reload the state from memory, we need to at least
> > trigger configuration of the SME VL.
> OK, so the logic here is something like:
> Disregarding SME, the FPSIMD/SVE regs are up to date, which is fine
> because SME is trapped.
> When we take the SME trap, we suddenly have some work to do in order to
> make sure that the SME-specific parts of the register state are up to
> date, so we need to mark the state as stale before setting TIF_SME and
> returning.
We know that the only bit of register state which is not up to date at
this point is the SME vector length; we don't configure that for tasks
that do not have SME. SVCR is always configured, since we have to exit
streaming mode for FPSIMD and SVE to work properly, so we know it's
already 0. All the other SME-specific state is gated by controls in
SVCR.
> fpsimd_flush_task_state() means that we do the necessary work when re-
> entering userspace, but is there a problem with simply marking all the
> FPSIMD/vector state as stale? If FPSR or FPCR is dirty for example, it
> now looks like they won't get written back to thread struct if there is
> a context switch before current re-enters userspace?
> Maybe the other flags distinguish these cases -- I haven't fully got my
> head around it.
We are doing fpsimd_flush_task_state() in the TIF_FOREIGN_FPSTATE case
so we know there is no dirty state in the registers.
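As a rough illustration of why the last recorded CPU needs invalidating,
here is a toy userspace model, not the kernel code. The struct, the
last_loaded[] array and every helper name below are made up for the
sketch; only the idea of a per-task "last CPU" record checked against a
per-CPU "last loaded" record is taken from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* toy value; doubles as the "no CPU" marker */

struct task {
	int fpsimd_cpu;		/* CPU whose registers last held our state */
	bool foreign_fpstate;	/* stands in for TIF_FOREIGN_FPSTATE */
};

/* Per-CPU record of whose state currently sits in the registers. */
static const struct task *last_loaded[NR_CPUS];

/* Load the task's saved state into the registers of @cpu. */
static void load_state(struct task *t, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	last_loaded[cpu] = t;
	t->fpsimd_cpu = cpu;
	t->foreign_fpstate = false;
}

/* Stand-in for fpsimd_flush_task_state(): forget the last CPU. */
static void flush_task_state(struct task *t)
{
	t->fpsimd_cpu = NR_CPUS;
	t->foreign_fpstate = true;
}

/* Scheduling in on @cpu: registers are reusable only if both records agree. */
static void sched_in(struct task *t, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	t->foreign_fpstate = !(last_loaded[cpu] == t && t->fpsimd_cpu == cpu);
}

/*
 * State is loaded on CPU 0 (without the SME VL), the task migrates to
 * CPU 1 and takes the SME trap there, then migrates back to CPU 0
 * before returning to userspace. Returns true if CPU 0 would reload
 * the state from memory.
 */
static bool reloads_on_return(bool flush_in_trap)
{
	struct task t = { .fpsimd_cpu = NR_CPUS, .foreign_fpstate = true };
	int i;

	for (i = 0; i < NR_CPUS; i++)
		last_loaded[i] = NULL;

	load_state(&t, 0);
	sched_in(&t, 1);		/* foreign on CPU 1: state lives in memory */
	if (flush_in_trap && t.foreign_fpstate)
		flush_task_state(&t);	/* the foreign-state case in the trap */
	sched_in(&t, 0);
	return t.foreign_fpstate;
}
```

In this model reloads_on_return(false) comes back false, i.e. CPU 0
would reuse registers that were set up before SME was enabled, while
reloads_on_return(true) forces the reload and with it the chance to
configure the SME VL.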
> (Actually, the ARM ARM says (IMHTLZ) that toggling PSTATE.SM by any
> means causes FPSR to become 0x800009f. I'm not sure where that fits in
> -- do we handle that anywhere? I guess the "soft" SM toggling via
Urgh, not seen that one. That needs handling in the signal entry path
and ptrace. That will have been defined while the feature was being
implemented. It's not relevant here, though, since we are in the SME
access trap: we might be trapping due to an SMSTART or equivalent
operation, but that SMSTART has not yet run at the point where we return
to userspace.
> ptrace, signal delivery or maybe exec, ought to set this? Not sure how
> that interacts with the expected behaviour of the fenv(3) API... Hmm.
> I see no corresponding statement about FPCR.)
Fun. I'm not sure how the ABI is defined there by libc.