[PATCH] arm64/signal: Avoid corruption of SME state when entering signal handler
Mark Brown
broonie at kernel.org
Wed Oct 30 11:10:28 PDT 2024
On Wed, Oct 30, 2024 at 05:34:16PM +0000, Mark Rutland wrote:
> I originally just had a few comments on the commit message, but I
> believe I've found a logic issue in this patch, and more general issue
> throughout our FPSIMD/SVE/SME manipulation -- more details below.
I'm fairly sure there's at least one other issue lurking somewhere with
TIF_SVE tearing, yes. I've not been able to get that to reproduce, and
I've probably stared at this code too much to see it by pure inspection
however it looks like you might've spotted the issue here.
> On Wed, Oct 23, 2024 at 10:31:24PM +0100, Mark Brown wrote:
> It would be nice to have the signature of the failure as well, e.g.
> | This is intermittently detected by the fp-stress test, which
> | intermittently reports "ZA-VL-*-*: Bad SVCR: 0".
That's a common one for timing reasons, but it does also manifest with
other outputs (eg, if we turn off ZA while trying to execute
instructions that access ZA).
> I don't think this is correct in the TIF_FOREIGN_FPSTATE case. We don't
> unbind the saved state from another CPU it might still be resident on,
> and so IIUC there's a race whereby the updates to the saved state can
> end up discarded:
...
> ... and either:
> * A subsequent return to userspace will see TIF_FOREIGN_FPSTATE is
> clear and not restore the in-memory state.
> * A subsequent context-switch will see TIF_FOREIGN_FPSTATE is clear an
> save the (stale) HW state again.
> It looks like we have a similar pattern all over the place, e.g. in
> do_sve_acc():
Yes, indeed - I think that's a separate bug caused by the recalcuation
of TIF_FOREIGN_FPSTATE.
> This is going to need a careful audit and a proper series of
> fixes that can be backported to stable.
It feels like a separate thing at any rate. We can do a simple and
robust but performance impacting fix by having fpsimd_thread_switch()
only ever set TIF_FOREIGN_FPSTATE, never clear it. That'd cause extra
reloads in the case where we switch to a thread but stay in kernel mode
which probably happens often enough to be palatable.
Otherwise I'm not sure it's *too* hard, TIF_FOREIGN_FPSTATE is a bit of
a giveaway for places that could have issues.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20241030/1a42e26d/attachment.sig>
More information about the linux-arm-kernel
mailing list