[PATCH v3 29/32] KVM: arm64: Wrap the host with a stage 2

Will Deacon will at kernel.org
Mon Mar 8 13:52:34 GMT 2021


On Mon, Mar 08, 2021 at 01:38:07PM +0000, Quentin Perret wrote:
> On Monday 08 Mar 2021 at 12:46:07 (+0000), Will Deacon wrote:
> > > > > +static int host_stage2_idmap(u64 addr)
> > > > > +{
> > > > > +	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_W;
> > > > > +	struct kvm_mem_range range;
> > > > > +	bool is_memory = find_mem_range(addr, &range);
> > > > > +	struct hyp_pool *pool = is_memory ? &host_s2_mem : &host_s2_dev;
> > > > > +	int ret;
> > > > > +
> > > > > +	if (is_memory)
> > > > > +		prot |= KVM_PGTABLE_PROT_X;
> > > > > +
> > > > > +	hyp_spin_lock(&host_kvm.lock);
> > > > > +	ret = kvm_pgtable_stage2_idmap_greedy(&host_kvm.pgt, addr, prot,
> > > > > +					      &range, pool);
> > > > > +	if (is_memory || ret != -ENOMEM)
> > > > > +		goto unlock;
> > > > > +	host_stage2_unmap_dev_all();
> > > > > +	ret = kvm_pgtable_stage2_idmap_greedy(&host_kvm.pgt, addr, prot,
> > > > > +					      &range, pool);
> > > > 
> > > > I find this _really_ hard to reason about, as range is passed by reference
> > > > and we don't reset it after the first call returns -ENOMEM for an MMIO
> > > > mapping. Maybe some commentary on why it's still valid here?
> > > 
> > > Sure, I'll add something. FWIW, that is intended -- -ENOMEM can only be
> > > caused by the call to kvm_pgtable_stage2_map() which leaves the range
> > > untouched. So, as long as we don't release the lock, the range returned
> > > by the first call to kvm_pgtable_stage2_idmap_greedy() should still be
> > > valid. I suppose I could call kvm_pgtable_stage2_map() directly the
> > > second time to make it obvious but I thought this would expose the
> > > internal of kvm_pgtable_stage2_idmap_greedy() a little bit too much.
> > 
> > I can see it both ways, but updating the kerneldoc for
> > kvm_pgtable_stage2_idmap_greedy() to state in which cases the range is
> > updated and how would be helpful. It just says "negative error code on
> > failure" at the moment.
> 
> Alternatively I could expose the 'reduce' path (maybe with another name
> e.g. stage2_find_compatible_range() or so) and remove the
> kvm_pgtable_stage2_idmap_greedy() wrapper. So it'd be the caller's
> responsibility to not release the lock in between
> stage2_find_compatible_range() and kvm_pgtable_stage2_map() for
> instance, but that sounds reasonable to me. And that would make it
> explicit it's the _map() path that failed with -ENOMEM, and that the
> range can be re-used the second time.
> 
> Thoughts?

I suppose it depends on whether or not you reckon this could be optimised
into a single-pass of the page-table. If not, then splitting it up makes
sense to me (and actually, it's not like this has tonnes of callers so
even if we changed things in future it wouldn't be too hard to fix them up).

Will



More information about the linux-arm-kernel mailing list