[PATCH] KVM: arm64: Invert KVM_PGTABLE_WALK_HANDLE_FAULT to fix pKVM walkers

Marc Zyngier <maz at kernel.org>
Sun Nov 30 11:25:59 PST 2025


On Fri, 28 Nov 2025 14:17:10 +0000,
Will Deacon <will at kernel.org> wrote:
> 
> Commit ddcadb297ce5 ("KVM: arm64: Ignore EAGAIN for walks outside of a
> fault") introduced a new walker flag ('KVM_PGTABLE_WALK_HANDLE_FAULT')
> to KVM's page-table code. When set, the walk logic maintains its
> previous behaviour of terminating a walk as soon as the visitor callback
> returns an error. However, when the flag is clear, the walk will
> continue if the visitor returns -EAGAIN and the error is then suppressed
> and returned as zero to the caller.
> 
> Clearing the flag is beneficial when write-protecting a range of IPAs
> with kvm_pgtable_stage2_wrprotect() but is not useful in any other
> cases, either because we are operating on a single page (e.g.
> kvm_pgtable_stage2_mkyoung() or kvm_phys_addr_ioremap()) or because the
> early termination is desirable (e.g. when mapping pages from a fault in
> user_mem_abort()).
> 
> Subsequently, commit e912efed485a ("KVM: arm64: Introduce the EL1 pKVM
> MMU") hooked up pKVM's hypercall interface to the MMU code at EL1 but
> failed to propagate any of the walker flags. As a result, page-table
> walks at EL2 fail to set KVM_PGTABLE_WALK_HANDLE_FAULT even when the
> early termination semantics are desirable on the fault handling path.
> 
> Rather than complicate the pKVM hypercall interface, invert the flag so
> that the whole thing can be simplified and only pass the new flag
> ('KVM_PGTABLE_WALK_IGNORE_EAGAIN') from the wrprotect code.
> 
> Cc: Fuad Tabba <tabba at google.com>
> Cc: Quentin Perret <qperret at google.com>
> Cc: Marc Zyngier <maz at kernel.org>
> Cc: Oliver Upton <oupton at kernel.org>
> Fixes: fce886a60207 ("KVM: arm64: Plumb the pKVM MMU in KVM")
> Signed-off-by: Will Deacon <will at kernel.org>
> ---
> 
> I found this by inspection and it's a bit fiddly to see what could
> actually go wrong in practice because the 'mappings' tree will return
> -EAGAIN if it finds a pre-existing entry. The permission relaxing path
> looks more problematic, as we'll return 0 instead of -EAGAIN and I
> think we can mark the page dirty twice etc.
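
For readers following the thread, the inverted flag semantics described in
the quoted commit message can be modelled roughly as below. This is a
self-contained toy sketch, not the kernel's walker: every toy_* name, the
flag value and the flat index-based "table" are assumptions made purely for
illustration. By default the walk stops at the first error; only a caller
that passes the ignore flag has -EAGAIN skipped and reported as zero.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for KVM_PGTABLE_WALK_IGNORE_EAGAIN; value is illustrative. */
#define TOY_WALK_IGNORE_EAGAIN (1u << 0)

/* Hypothetical visitor callback: returns 0 or a negative errno. */
typedef int (*toy_visitor_fn)(size_t idx, void *arg);

/* Toy state: counts visits and marks one entry as "contended". */
struct toy_state {
	size_t visited;
	size_t contended_idx;
};

/* Visitor that reports -EAGAIN for the contended entry only. */
static int toy_visitor(size_t idx, void *arg)
{
	struct toy_state *s = arg;

	s->visited++;
	return idx == s->contended_idx ? -EAGAIN : 0;
}

/*
 * Walk nr_entries entries. Without the flag, any error ends the walk and is
 * returned. With the flag, -EAGAIN is skipped and the walk carries on, so a
 * walk that only ever saw -EAGAIN returns 0.
 */
static int toy_walk(size_t nr_entries, unsigned int flags,
		    toy_visitor_fn visit, void *arg)
{
	for (size_t i = 0; i < nr_entries; i++) {
		int ret = visit(i, arg);

		if (ret == -EAGAIN && (flags & TOY_WALK_IGNORE_EAGAIN))
			continue;
		if (ret)
			return ret;
	}

	return 0;
}
```

In this model a fault-style caller passes no flags and sees -EAGAIN after
the second entry, whereas a wrprotect-style caller passes
TOY_WALK_IGNORE_EAGAIN, visits every entry and gets 0 back.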

I don't really like the new name of the flag, but unsurprisingly, I
also can't come up with anything better.

I otherwise quite like the fact that it becomes an opt-in behaviour.

Reviewed-by: Marc Zyngier <maz at kernel.org>

	M.

-- 
Jazz isn't dead. It just smells funny.