[PATCH 0/8] KVM: arm64: EL2 synchronisation and pKVM stage-2 error propagation fixes
Fuad Tabba
tabba at google.com
Tue Apr 28 03:30:00 PDT 2026
Hi folks,

This is yet another series of fixes I'd like to land before posting a
follow-up to Will's pKVM infrastructure series [1].

I found these while developing KVM and arm64 system guides for
review-prompts [2], an open-source set of AI-assisted review prompts
used by sashiko [3]. While writing the guides I tried to find cases
that would be easy to miss or trip up an LLM, and stumbled on these
bugs. A local run with the updated guides flagged all of them
correctly (some of the commit messages incorporate feedback from that
run, e.g., the impact of WARN_ON() in hyp). I plan to upstream the
guides once they are complete.

The patches fall into three groups:

EL2 context-synchronisation (patches 1-2):

  Patch 1 sets SCTLR_EL2.EIS and SCTLR_EL2.EOS in
  INIT_SCTLR_EL2_MMU_ON. On FEAT_ExS hardware these bits are
  UNKNOWN at reset; without them EL2 exception entry and exit are
  not architecturally guaranteed to be Context Synchronisation
  Events. KVM/arm64 hot paths rely on that guarantee implicitly to
  elide explicit ISBs after MSRs to context-switching sysregs.

  Patch 2 adds an explicit ISB after write_sysreg_hcr() on the
  __deactivate_traps() path. The activate path is covered by the
  ERET that follows (a CSE, guaranteed by patch 1); on the
  deactivate path, subsequent EL2 sysreg accesses run before any
  natural CSE.
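As a rough illustration of what patch 1 changes, here is a minimal
user-space sketch (not the kernel code). The bit positions follow the
Arm ARM layout of SCTLR_ELx (EOS is bit 11, EIS is bit 22); the macro
and helper names are assumptions made for this example, and the real
change lives in the INIT_SCTLR_EL2_MMU_ON definition.

```c
#include <assert.h>
#include <stdint.h>

/* SCTLR_ELx bit positions per the Arm ARM (FEAT_ExS). Names are illustrative. */
#define SCTLR_ELx_EOS	(UINT64_C(1) << 11)	/* exception exit is a CSE */
#define SCTLR_ELx_EIS	(UINT64_C(1) << 22)	/* exception entry is a CSE */

static_assert(SCTLR_ELx_EIS != SCTLR_ELx_EOS, "EIS and EOS are distinct bits");

/*
 * HYPOTHETICAL helper: take whatever value the init path would otherwise
 * program into SCTLR_EL2 and force both bits on, so the value no longer
 * depends on what the hardware left in them at reset.
 */
static inline uint64_t sctlr_el2_with_cse_bits(uint64_t init_val)
{
	return init_val | SCTLR_ELx_EIS | SCTLR_ELx_EOS;
}

/* Returns 1 only if both exception entry and exit are guaranteed CSEs. */
static inline int el2_exceptions_are_cse(uint64_t sctlr)
{
	const uint64_t need = SCTLR_ELx_EIS | SCTLR_ELx_EOS;

	return (sctlr & need) == need;
}
```

With only one of the two bits set the check fails, which mirrors the
point above: code that elides an ISB is relying on both entry and exit
being synchronising.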

Minor fixes (patches 3-4):

  Patch 3 guards the VHE hyp panic path against a NULL vcpu pointer;
  the nVHE counterpart already has this guard.

  Patch 4 fixes a parameter-name typo in the __deactivate_fgt() macro
  that causes it to silently capture a variable from the enclosing
  scope rather than use its declared parameter.
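The class of bug behind the __deactivate_fgt() typo can be reproduced
in a few lines of plain C. This is only a sketch: the struct, macro and
variable names below are hypothetical and are not the identifiers used
in the kernel.

```c
#include <assert.h>	/* for assert()-based checks by the caller */

struct ctxt {
	int regs[4];
};

/*
 * BUGGY: the parameter is spelled "htcxt" but the body says "hctxt".
 * The argument is silently dropped and the body binds to whatever
 * variable named hctxt happens to be in scope at the call site.
 */
#define read_reg_buggy(htcxt, idx)	((hctxt)->regs[(idx)])

/* FIXED: parameter name and body agree, so the argument is used. */
#define read_reg_fixed(hctxt, idx)	((hctxt)->regs[(idx)])

static int demo_buggy(struct ctxt *hctxt, struct ctxt *other)
{
	(void)other;	/* the buggy macro never references its argument */
	return read_reg_buggy(other, 0);	/* actually reads hctxt->regs[0] */
}

static int demo_fixed(struct ctxt *hctxt, struct ctxt *other)
{
	(void)hctxt;
	return read_reg_fixed(other, 0);	/* reads other->regs[0] */
}
```

Both variants compile without a diagnostic as long as a variable with
the misspelled-away name exists at the call site, which is why this
kind of typo survives until a caller passes something else.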

pKVM stage-2 error propagation (patches 5-8):

  At EL2 in nVHE/pKVM, WARN_ON() is not warn-and-continue: it
  expands to a BRK that enters the invalid-host-el2 vector and
  branches to hyp_panic(), which is __noreturn.

  Four pKVM memory-transition functions wrapped the return value of
  kvm_pgtable_stage2_map() in WARN_ON() and discarded it. For the
  share and donation paths the map can fail via -ENOMEM when the
  vcpu memcache is exhausted, converting a recoverable hypercall
  error into a fatal hyp panic. The four patches capture and
  propagate the return value, with appropriate stage-2 unmap and
  host-side rollback for the reachable failure cases.
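The shape of the fix can be sketched as a small user-space model. Every
name here is hypothetical (model_stage2_map() merely stands in for
kvm_pgtable_stage2_map(), and the state struct is invented for the
example); the actual patches operate on the pKVM page-state tracking in
mem_protect.c.

```c
#include <assert.h>	/* for assert()-based checks by the caller */
#include <errno.h>

/* HYPOTHETICAL model of the state touched by a host->guest share. */
struct page_state {
	int host_shared;	/* host-side ownership transition done */
	int guest_mapped;	/* guest stage-2 mapping installed */
};

/* Stand-in for the vcpu memcache: pages left for page-table allocation. */
static int memcache_pages;

/* Stand-in for kvm_pgtable_stage2_map(): fails once the memcache is empty. */
static int model_stage2_map(struct page_state *p)
{
	if (memcache_pages == 0)
		return -ENOMEM;
	memcache_pages--;
	p->guest_mapped = 1;
	return 0;
}

/*
 * Old pattern (shown as a comment only): WARN_ON(model_stage2_map(p));
 * At EL2 that would panic on -ENOMEM instead of returning.
 *
 * New pattern: capture the error, undo the host-side transition, and
 * hand the error back to the caller of the hypercall.
 */
static int model_share_page(struct page_state *p)
{
	int ret;

	p->host_shared = 1;
	ret = model_stage2_map(p);
	if (ret) {
		p->host_shared = 0;	/* roll back the host-side change */
		return ret;
	}
	return 0;
}
```

With memcache_pages set to 0, model_share_page() returns -ENOMEM and
leaves both fields clear; after a top-up the same call succeeds, which
is the recoverable behaviour the series is aiming for.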
Cheers,
/fuad
[1] https://lore.kernel.org/all/20260105154939.11041-1-will@kernel.org/
[2] https://github.com/masoncl/review-prompts
[3] https://sashiko.dev/
Fuad Tabba (8):
KVM: arm64: Make EL2 exception entry and exit context-synchronization
events
KVM: arm64: Synchronise HCR_EL2 writes on the guest exit path
KVM: arm64: Guard against NULL vcpu on VHE hyp panic path
KVM: arm64: Fix __deactivate_fgt macro parameter typo
KVM: arm64: Propagate stage-2 map failure on host->guest share
KVM: arm64: Propagate stage-2 map failure on host->guest donation
KVM: arm64: Propagate stage-2 map failure on guest->host share
KVM: arm64: Propagate stage-2 map failure on guest->host unshare
arch/arm64/include/asm/sysreg.h | 2 +-
arch/arm64/kvm/hyp/include/hyp/switch.h | 2 +-
arch/arm64/kvm/hyp/nvhe/mem_protect.c | 99 +++++++++++++++++++++----
arch/arm64/kvm/hyp/nvhe/switch.c | 11 +++
arch/arm64/kvm/hyp/vhe/switch.c | 14 +++-
5 files changed, 111 insertions(+), 17 deletions(-)
--
2.54.0.545.g6539524ca2-goog