[PATCH 1/8] KVM: arm64: Make EL2 exception entry and exit context-synchronization events

Fuad Tabba tabba at google.com
Thu Apr 30 05:18:48 PDT 2026


Hi Will,


On Thu, 30 Apr 2026 at 10:08, Will Deacon <will at kernel.org> wrote:
>
> On Tue, Apr 28, 2026 at 11:30:01AM +0100, Fuad Tabba wrote:
> > SCTLR_EL2.EIS and SCTLR_EL2.EOS control whether exception entry and
> > exit at EL2 are Context Synchronisation Events (CSEs). Per ARM DDI
> > 0487 M.b, EIS is governed by D1.4.2 rule RBBSRF (p. D1-7205) and EOS
> > by D1.4.4.1 rule RBWCFK (p. D1-7209). D24.2.175 (p. D24-9754):
> >
> >   - !FEAT_ExS: the bit is RES1, so the entry/exit is unconditionally
> >     a CSE.
> >   - FEAT_ExS: the reset value is architecturally UNKNOWN; software
> >     must set the bit to make the entry/exit a CSE.
> >
> > INIT_SCTLR_EL2_MMU_ON in arch/arm64/include/asm/sysreg.h sets neither
> > bit. KVM/arm64 hot paths rely on ERET from EL2 being a CSE, and on
> > synchronous EL1->EL2 entry being a CSE, to elide explicit ISBs after
> > MSRs to context-switching system registers (HCR_EL2, HFGxTR_EL2,
> > HCRX_EL2, ZCR_EL2, CPACR_EL1, CPTR_EL2, SCTLR_EL1, ptrauth keys,
> > etc.); examples include the activate-traps path,
> > ptrauth_switch_to_guest, and the FPSIMD trap re-enable in
> > kvm_hyp_handle_fpsimd. On FEAT_ExS hardware those reliances are not
> > architecturally backed unless EOS=1 (and, for entry, EIS=1), and
> > whether they hold today depends on firmware initialisation outside
> > the kernel's control.
> >
> > Make the guarantee explicit: include SCTLR_ELx_EIS | SCTLR_ELx_EOS in
> > INIT_SCTLR_EL2_MMU_ON so that EL2 exception entry and exit are
> > unconditionally CSEs regardless of whether FEAT_ExS is implemented.
> > This matches the pairing in arch/arm64/kvm/config.c which treats EIS
> > and EOS together as RES1 under !FEAT_ExS.
> >
> > INIT_SCTLR_EL2_MMU_OFF is left unchanged: that path is used during
> > very early EL2 init and the EL2 MMU-off transition, neither of which
> > relies on these bits in the same way.
> >
> > Fixes: fe2c8d19189e ("KVM: arm64: Turn SCTLR_ELx_FLAGS into INIT_SCTLR_EL2_MMU_ON")
>
> I don't think this Fixes: tag is accurate:
>
> 1. That commit doesn't do anything with EIS/EOS afaict.
> 2. Back in 5.12 (when that thing landed), SCTLR_EL2_RES1 did actually
>    include EIS and EOS
>
> so I think the issue here might be that the auto-generated sysreg file
> quietly changes the RES1 definitions as bits get allocated, but the
> macros using the RES1 definition don't get updated. That's a pretty
> horrible pit that it feels like we might keep falling into :/
>
> Looking at 0a35bd285f43 ("arm64: Convert SCTLR_EL2 to sysreg
> infrastructure"), I think we ended up dropping a whole bunch of fields
> from the RES1 mask (which became 0!). Have you checked all of those?

You're right, fe2c8d19189e didn't touch EIS/EOS: the SCTLR_EL2_RES1
mask it pulled into INIT_SCTLR_EL2_MMU_ON already contained
BIT(11)/BIT(22). Looking at it, I _think_ it's this one:

  0a35bd285f43 ("arm64: Convert SCTLR_EL2 to sysreg infrastructure")

After that commit SCTLR_EL2_RES1 is auto-generated. Because the
sysreg tooling can only model unconditional RES1, and EIS/EOS are
RES1 only when !FEAT_ExS, the generated mask is UL(0). I'll fix the
Fixes: tag in v2.

On the wider question of the other bits dropped from the old mask,
I went through them against DDI 0487 M.b §D24.2.175. The summary
(SCTLR_EL2 with E2H=0):

  bit  field    E2H=0 status                  kernel cares?
  -------------------------------------------------------------
   4   SA0      RES1 unconditionally          no
   5   CP15BEN  RES1 unconditionally          no
  11   EOS      RES1 iff !FEAT_ExS, else RW   yes (this fix)
  16   nTWI     RES1 unconditionally          no
  18   nTWE     RES1 unconditionally          no
  22   EIS      RES1 iff !FEAT_ExS, else RW   yes (this fix)
  23   SPAN     RES1 unconditionally          no
  28   nTLSMD   RES1 unconditionally          no
  29   LSMAOE   RES1 unconditionally          no

The seven non-EIS/EOS bits all fall under the "Otherwise: Reserved,
RES1" clause for the E2H=0 layout, with no feature guard. Writing 0
to them is a no-op, so I think dropping them from the mask is
harmless. EIS and EOS are the only two positions where the bit
becomes RW (with an UNKNOWN reset value) on FEAT_ExS hardware and
where the kernel actively relies on the value being 1, which is what
this patch addresses.
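
To make the table concrete, here is a small standalone sketch (not
kernel code; every EX_* / ex_* name is made up for illustration) that
rebuilds the old hand-rolled mask from the bit positions above and
splits the dropped bits into the two FEAT_ExS-dependent ones and the
seven unconditional ones. It assumes the generated mask is 0, as
described earlier in this mail.

```c
#include <assert.h>
#include <stdint.h>

#define BIT_UL(n)		(UINT64_C(1) << (n))

/* Illustrative names only; these are not the kernel's macros. */
#define EX_SCTLR_EOS		BIT_UL(11)
#define EX_SCTLR_EIS		BIT_UL(22)

/* Old hand-rolled mask, from the table: bits 4,5,11,16,18,22,23,28,29. */
#define EX_OLD_RES1	(BIT_UL(4)  | BIT_UL(5)  | BIT_UL(11) | \
			 BIT_UL(16) | BIT_UL(18) | BIT_UL(22) | \
			 BIT_UL(23) | BIT_UL(28) | BIT_UL(29))

/* Assumption: the auto-generated mask after the conversion is 0. */
#define EX_GEN_RES1	UINT64_C(0)

/* Bits present in the old mask that the generated mask no longer sets. */
static inline uint64_t ex_dropped_bits(void)
{
	return EX_OLD_RES1 & ~EX_GEN_RES1;
}

/* Dropped bits that become RW (UNKNOWN at reset) with FEAT_ExS. */
static inline uint64_t ex_dropped_conditional(void)
{
	return ex_dropped_bits() & (EX_SCTLR_EIS | EX_SCTLR_EOS);
}

/* Dropped bits that are RES1 with no feature guard at E2H=0. */
static inline uint64_t ex_dropped_unconditional(void)
{
	return ex_dropped_bits() & ~(EX_SCTLR_EIS | EX_SCTLR_EOS);
}

/* Compile-time sanity check on the reconstructed mask value. */
_Static_assert(EX_OLD_RES1 == UINT64_C(0x30C50830),
	       "old RES1 mask does not match the table");
```

Only the result of ex_dropped_conditional() needs to be set explicitly
by software; the rest read as 1 regardless of what is written.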

I agree the auto-generator silently zeroing previously hand-rolled
RES1 masks is a real problem. Happy to look at either teaching the
sysreg infrastructure to express conditional RES1 (so config.c's
AS_RES1/FEAT_X facts can flow back into the header masks), or at
least adding a build-time check that flags any auto-generated
<REG>_RES1 that shrinks. After this series, though. Let me know if
you'd like me to take a stab.
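
For the build-time check, a rough sketch of the idea in plain C is
below. This is only an assumption about what such a check could look
like: the baseline value, the allow-list and all EX_* / ex_* names are
hypothetical, and nothing like this exists in the tree today.

```c
#include <assert.h>
#include <stdint.h>

#define BIT_UL(n)	(UINT64_C(1) << (n))

/* Hypothetical inputs: a recorded baseline and a freshly generated mask. */
#define EX_BASELINE_RES1	UINT64_C(0x30C50830)
#define EX_GENERATED_RES1	UINT64_C(0)

/* Hypothetical allow-list: dropped bits that have been reviewed as harmless. */
#define EX_REVIEWED_DROPS	(BIT_UL(4)  | BIT_UL(5)  | BIT_UL(16) | \
				 BIT_UL(18) | BIT_UL(23) | BIT_UL(28) | \
				 BIT_UL(29))

/* Bits that left the RES1 mask without having been reviewed. */
#define EX_RES1_SHRINK(base, gen, reviewed) \
	((base) & ~(gen) & ~(reviewed))

static inline uint64_t ex_unreviewed_drops(void)
{
	return EX_RES1_SHRINK(EX_BASELINE_RES1, EX_GENERATED_RES1,
			      EX_REVIEWED_DROPS);
}

/*
 * A real check would assert that the shrink set is empty, e.g.:
 *
 *   _Static_assert(EX_RES1_SHRINK(base, gen, reviewed) == 0, "...");
 *
 * With the values above that assertion would fire for bits 11 and 22
 * (EOS and EIS), so it is left as a comment to keep this compilable.
 */
```

With these inputs ex_unreviewed_drops() returns only the EOS and EIS
bits, which is the case the check would be meant to catch.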

What I plan to change in v2:

1. Fixes: 0a35bd285f43 ("arm64: Convert SCTLR_EL2 to sysreg infrastructure").
2. Add one paragraph in the commit message explaining that the bug
   landed when SCTLR_EL2_RES1 was auto-generated to UL(0), with a
   one-line justification that the other seven dropped bits are
   unconditionally RES1 at E2H=0 and so harmless.
3. Code diff unchanged (still just adding SCTLR_ELx_EIS |
   SCTLR_ELx_EOS to INIT_SCTLR_EL2_MMU_ON).

What do you think?

Cheers,
/fuad

> Will


