[RFC PATCH 1/2] KVM: arm64: Introduce S2 walker SKIP return options

Leonardo Bras leo.bras at arm.com
Mon May 18 06:45:59 PDT 2026


Hello Oliver, Will,
Thanks for reviewing!

On Mon, May 18, 2026 at 09:52:16AM +0100, Will Deacon wrote:
> On Mon, May 18, 2026 at 12:22:47AM -0700, Oliver Upton wrote:
> > On Fri, May 15, 2026 at 08:59:02PM +0100, Leonardo Bras wrote:
> > > Introduce S2 walker return values:
> > > - SKIP_CHILDREN: skip walking the children of the current node
> > > - SKIP_SIBLINGS: skip walking the siblings of the current node
> > > 
> > > Also, modify __kvm_pgtable_visit() to honor the hint in the above return
> > > values. Current walkers should not be impacted.
> > 
> > I'd rather see something based around new walk flags than introducing an
> > entirely new mechanic around return values.
> > 
> > e.g. you could split the LEAF flag into separate flags for blocks v.
> > pages:
> > 
> > 	KVM_PGTABLE_WALK_PAGE,
> > 	KVM_PGTABLE_WALK_BLOCK,
> > 	KVM_PGTABLE_WALK_LEAF	= KVM_PGTABLE_WALK_PAGE |
> > 				  KVM_PGTABLE_WALK_BLOCK,
> > 
> > and then let __kvm_pgtable_visit() decide how to steer the walk. You may
> > need some special handling to get the address arithmetic right when
> > skipping over a table of page descriptors.

I am probably not getting the whole inner workings of this solution, but
IIUC the idea would be to walk the blocks, but not the pages, right?

Blocks meaning level-2 entries and pages meaning level-3 entries?
 
> I was wondering along similar lines, but maybe it would be useful just
> to pass a maximum level to the walker logic? That feels like the most
> general case without complicating the existing logic.

This proposal seems simpler for me to understand, and indeed looks like a
better solution than what I have proposed. It takes care of the
'already split' case with better performance, as it doesn't even walk a
single level-3 entry.

In the 'splitting' case, it also works flawlessly if the memory is given in
level-2 blocks. There is only one case that I would like to address here:

- Memory is given in level-1 blocks (say 1GB)
- Walker flag says 'walk down to level-2 only'
- The split walker on level-1 will break the block down to (up to) level-3
  entries.
- The walker will continue to be called on level-2 entries, even though it's
  not necessary.

To solve this, I would like to suggest a new flag that skips a table
that has just been created. This could be easily implemented in
__kvm_pgtable_visit() on top of the max level flags suggested.

enum kvm_pgtable_walk_flags {
[...]
	KVM_PGTABLE_WALK_SKIP_LEVEL3		= BIT(7),
	KVM_PGTABLE_WALK_SKIP_LEVEL2		= BIT(8),
	KVM_PGTABLE_WALK_SKIP_LEVEL1		= BIT(9),
	KVM_PGTABLE_WALK_SKIP_NEW_TABLE		= BIT(10),
};
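
As a rough illustration of the intended behaviour, here is a toy user-space
model, not kernel code. Every toy_* / TOY_* name is hypothetical, the fan-out
is reduced to 4 entries per table, and toy_split() stands in for the split
walker callback. It only counts how many entries the walk looks at per level:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_FANOUT	4	/* toy value; real tables hold far more entries */
#define TOY_LAST_LEVEL	3
#define TOY_POOL_SIZE	32

/* Hypothetical flags mirroring the proposal; not existing kernel API. */
enum toy_walk_flags {
	TOY_WALK_SKIP_LEVEL3	= 1u << 7,
	TOY_WALK_SKIP_LEVEL2	= 1u << 8,
	TOY_WALK_SKIP_LEVEL1	= 1u << 9,
	TOY_WALK_SKIP_NEW_TABLE	= 1u << 10,
};

struct toy_pte {
	struct toy_pte *table;	/* NULL means leaf (block or page) */
};

static struct toy_pte toy_pool[TOY_POOL_SIZE][TOY_FANOUT];
static size_t toy_pool_used;

static struct toy_pte *toy_alloc_table(void)
{
	assert(toy_pool_used < TOY_POOL_SIZE);
	memset(toy_pool[toy_pool_used], 0, sizeof(toy_pool[0]));
	return toy_pool[toy_pool_used++];
}

/* Split a leaf at 'level' all the way down to level-3 entries. */
static void toy_split(struct toy_pte *pte, int level)
{
	if (level >= TOY_LAST_LEVEL)
		return;
	pte->table = toy_alloc_table();
	for (int i = 0; i < TOY_FANOUT; i++)
		toy_split(&pte->table[i], level + 1);
}

/* A skipped level is never entered, so the levels below it are not reached. */
static bool toy_level_skipped(unsigned int flags, int level)
{
	switch (level) {
	case 1: return flags & TOY_WALK_SKIP_LEVEL1;
	case 2: return flags & TOY_WALK_SKIP_LEVEL2;
	case 3: return flags & TOY_WALK_SKIP_LEVEL3;
	default: return false;
	}
}

static void toy_visit(struct toy_pte *pte, int level, unsigned int flags,
		      unsigned int visits[TOY_LAST_LEVEL + 1])
{
	bool was_leaf = !pte->table;

	assert(level >= 1 && level <= TOY_LAST_LEVEL);
	if (toy_level_skipped(flags, level))
		return;

	visits[level]++;
	if (was_leaf)
		toy_split(pte, level);	/* stands in for the leaf callback */

	if (!pte->table)
		return;
	/* The callback just installed this table: optionally do not descend. */
	if (was_leaf && (flags & TOY_WALK_SKIP_NEW_TABLE))
		return;

	for (int i = 0; i < TOY_FANOUT; i++)
		toy_visit(&pte->table[i], level + 1, flags, visits);
}

/* Walk a root table of level-1 blocks; visits[n] counts level-n entries seen. */
void toy_run(unsigned int flags, unsigned int visits[TOY_LAST_LEVEL + 1])
{
	struct toy_pte *root;

	toy_pool_used = 0;
	memset(visits, 0, sizeof(unsigned int) * (TOY_LAST_LEVEL + 1));
	root = toy_alloc_table();
	for (int i = 0; i < TOY_FANOUT; i++)
		toy_visit(&root[i], 1, flags, visits);
}
```

With a root table of four level-1 blocks, toy_run(0, v) looks at 4/16/64
entries on levels 1/2/3. With TOY_WALK_SKIP_LEVEL3 alone it looks at 4/16/0,
and adding TOY_WALK_SKIP_NEW_TABLE brings that down to 4/0/0, which is the
case described above.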

How does that sound?

Thanks!
Leo


