[PATCH v5 02/25] KVM: arm64: Allow attaching of non-coalescable pages to a hyp pool

Quentin Perret qperret at google.com
Fri Oct 28 02:28:15 PDT 2022


Hey Oliver,

On Friday 28 Oct 2022 at 00:17:40 (+0000), Oliver Upton wrote:
> On Thu, Oct 20, 2022 at 02:38:04PM +0100, Will Deacon wrote:
> > From: Quentin Perret <qperret at google.com>
> > 
> > All the contiguous pages used to initialize a 'struct hyp_pool' are
> > considered coalescable, which means that the hyp page allocator will
> > actively try to merge them with their buddies on the hyp_put_page() path.
> > However, using hyp_put_page() on a page that is not part of the initial
> > memory range given to a hyp_pool() is currently unsupported.
> > 
> > In order to allow dynamically extending hyp pools at run-time, add a
> > check to __hyp_attach_page() to allow inserting 'external' pages into
> > the free-list of order 0. This will be necessary to allow lazy donation
> > of pages from the host to the hypervisor when allocating guest stage-2
> > page-table pages at EL2.
> 
> Is it ever going to be the case that we wind up mixing static and
> dynamic memory within the same buddy allocator? Reading ahead a bit it
> would seem pKVM uses separate allocators (i.e. pkvm_hyp_vm::pool for
> donated memory) but just wanted to make sure.

So, in the code we have that builds on top of this, yes, but to be
frank it's a bit of a corner case. The per-guest pool is initially
populated with a physically contiguous memory range that covers the
need to allocate the guest's stage-2 PGD, which may be up to 16
contiguous pages IIRC. But aside from that, all subsequent allocations
will be at order 0 granularity, and those pages are added to the pool
dynamically.
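
To make the mixed case concrete, here is a rough user-space sketch of the
check, not the actual hyp code: a page whose physical address falls outside
the pool's initial range is treated as 'external' and goes straight onto the
order-0 free list. The names (pool_model, page_model, attach_page,
page_is_external) and the MAX_ORDER value are made up for illustration, and
the buddy-merging step for in-range pages is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define MAX_ORDER  5	/* hypothetical: order 4 covers a 16-page PGD */

/* Hypothetical stand-in for a page descriptor. */
struct page_model {
	struct page_model *next;
	uint64_t phys;
	unsigned short order;
};

/* Hypothetical stand-in for a pool: one free list per order plus the
 * physical range the pool was initialised with. */
struct pool_model {
	struct page_model *free_area[MAX_ORDER];
	uint64_t range_start;	/* inclusive */
	uint64_t range_end;	/* exclusive */
};

/* A page is 'external' if it lies outside the initial range. */
static bool page_is_external(const struct pool_model *pool, uint64_t phys)
{
	return phys < pool->range_start || phys >= pool->range_end;
}

/* Put a free page on the list matching its order. */
static void attach_page(struct pool_model *pool, struct page_model *p)
{
	assert(p->order < MAX_ORDER);

	if (page_is_external(pool, p->phys)) {
		/* Donated pages are order 0 and are never coalesced. */
		assert(p->order == 0);
	}
	/* For in-range pages a buddy allocator would first try to merge
	 * with free buddies and bump p->order; omitted in this sketch. */

	p->next = pool->free_area[p->order];
	pool->free_area[p->order] = p;
}
```

With a 16-page initial range, a page donated later from elsewhere in memory
ends up on free_area[0] and is never considered for merging.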

> I suppose what I'm getting at is the fact that the pool range makes
> little sense in this case. Adding a field to hyp_pool describing the
> type of pool that it is would make this more readable, such that we know
> a pool contains only donated memory, and thus zero order pages should
> never be coalesced.

Right. In fact I think it would be fair to say that the buddy allocator
is overkill for what we need so far. The only high-order allocations we
do are for concatenated stage-2 PGDs (host and guest), but these could
be easily special-cased, and we'd be left with only page-size allocs for
which a simple free list would do just fine. I've considered removing
the buddy allocator TBH, but it doesn't really hurt to have it, and it
might turn out to be useful in the future (e.g. SMMU support might
require high-order allocs IIUC, and the ability to coalesce would come
in handy then).
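
For reference, the 'simple free list' alternative could look roughly like
the sketch below. This is only an illustration of the idea, not code from
the series: page_freelist, freelist_put and freelist_get are hypothetical
names, and the list is threaded through the free pages themselves, so it
needs no metadata beyond a head pointer and a counter.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096

/* A free page stores the link to the next free page in its first bytes. */
struct free_page {
	struct free_page *next;
};

/* Hypothetical page-granular allocator state. */
struct page_freelist {
	struct free_page *head;
	size_t nr_free;
};

/* Return a page-sized, suitably aligned buffer to the list (LIFO). */
static void freelist_put(struct page_freelist *fl, void *page)
{
	struct free_page *p = page;

	assert(page != NULL);
	p->next = fl->head;
	fl->head = p;
	fl->nr_free++;
}

/* Take one page from the list, or NULL if the list is empty. */
static void *freelist_get(struct page_freelist *fl)
{
	struct free_page *p = fl->head;

	if (!p)
		return NULL;

	fl->head = p->next;
	fl->nr_free--;
	return p;
}
```

Both operations are O(1), but nothing here can hand out physically
contiguous multi-page blocks, which is why the PGDs would need
special-casing.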


