[RFC PATCH v2 6/8] KVM: arm64: Only write protect selected PTE

Catalin Marinas catalin.marinas at arm.com
Tue Sep 26 08:58:03 PDT 2023


On Fri, Sep 22, 2023 at 04:59:08PM +0000, Oliver Upton wrote:
> On Fri, Sep 22, 2023 at 05:00:40PM +0100, Catalin Marinas wrote:
> > On Fri, Aug 25, 2023 at 10:35:26AM +0100, Shameer Kolothum wrote:
> > > From: Keqian Zhu <zhukeqian1 at huawei.com>
> > > 
> > > This function write protects all PTEs between the ffs and fls of mask.
> > > There may be unset bits between this range. It works well under pure
> > > software dirty log, as software dirty log is not working during this
> > > process.
> > > 
> > > But it will unexpectly clear dirty status of PTE when hardware dirty
> > > log is enabled. So change it to only write protect selected PTE.
> > 
> > Ah, I did wonder about losing the dirty status. The equivalent to S1
> > would be for kvm_pgtable_stage2_wrprotect() to set a software dirty bit.
> > 
> > I'm only superficially familiar with how KVM does dirty tracking for
> > live migration. Does it need to first write-protect the pages and
> > disable DBM? Is DBM re-enabled later? Or does stage2_wp_range() with
> > your patches leave the DBM on? If the latter, the 'wp' aspect is a bit
> > confusing since DBM basically means writeable (and maybe clean). So
> > better to have something like stage2_clean_range().
> 
> KVM has never enabled DBM and we solely rely on write-protection faults
> for dirty tracking. IOW, we do not have a writable-clean state for
> stage-2 PTEs (yet).

When I did the stage 2 AF support I left out DBM as it was unlikely
to be of any use in the real world. Now with dirty tracking for
migration, we may have a better use for this feature.

What I find confusing with these patches is that stage2_wp_range() is
supposed to make a stage 2 pte read-only, as the name implies. However,
if the pte was writeable, it leaves it writeable, clean with DBM
enabled. Doesn't the change to kvm_pgtable_stage2_wrprotect() in patch 4
break other uses of stage2_wp_range()? E.g. kvm_mmu_wp_memory_region()?

Unless I misunderstood, I'd rather change
kvm_arch_mmu_enable_log_dirty_pt_masked() to call a new function,
stage2_clean_range(), which clears S2AP[1] together with setting DBM if
previously writeable. But we should not confuse this with
write-protecting or change the write-protecting functions to mark a pte
writeable+clean.

-- 
Catalin



More information about the linux-arm-kernel mailing list