On Fri, Sep 22, 2023 at 04:59:08PM +0000, Oliver Upton wrote:
> On Fri, Sep 22, 2023 at 05:00:40PM +0100, Catalin Marinas wrote:
> > On Fri, Aug 25, 2023 at 10:35:26AM +0100, Shameer Kolothum wrote:
> > > From: Keqian Zhu <zhukeqian1@xxxxxxxxxx>
> > >
> > > This function write protects all PTEs between the ffs and fls of mask.
> > > There may be unset bits between this range. It works well under pure
> > > software dirty log, as software dirty log is not working during this
> > > process.
> > >
> > > But it will unexpectly clear dirty status of PTE when hardware dirty
> > > log is enabled. So change it to only write protect selected PTE.
> >
> > Ah, I did wonder about losing the dirty status. The equivalent to S1
> > would be for kvm_pgtable_stage2_wrprotect() to set a software dirty bit.
> >
> > I'm only superficially familiar with how KVM does dirty tracking for
> > live migration. Does it need to first write-protect the pages and
> > disable DBM? Is DBM re-enabled later? Or does stage2_wp_range() with
> > your patches leave the DBM on? If the latter, the 'wp' aspect is a bit
> > confusing since DBM basically means writeable (and maybe clean). So
> > better to have something like stage2_clean_range().
>
> KVM has never enabled DBM and we solely rely on write-protection faults
> for dirty tracking. IOW, we do not have a writable-clean state for
> stage-2 PTEs (yet).

When I did the stage 2 AF support I left out DBM as it was unlikely to
be of any use in the real world. Now with dirty tracking for migration,
we may have a better use for this feature.

What I find confusing with these patches is that stage2_wp_range() is
supposed to make a stage 2 pte read-only, as the name implies. However,
if the pte was writeable, it leaves it writeable, clean with DBM
enabled. Doesn't the change to kvm_pgtable_stage2_wrprotect() in patch 4
break other uses of stage2_wp_range()? E.g. kvm_mmu_wp_memory_region()?

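To make the distinction concrete, here is a rough userspace sketch of the
two operations on a toy stage 2 pte. It is not the KVM code: the toy_*
names and TOY_* macros are made up for illustration, only two bits of the
descriptor are modelled (S2AP[1] at bit 7, DBM at bit 51), and it assumes
hardware dirty state management is enabled so that a write to a pte with
DBM set and S2AP[1] clear is handled by the hardware rather than faulting.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the Arm stage 2 descriptor. */
#define TOY_S2AP_W	(UINT64_C(1) << 7)	/* S2AP[1]: write permission */
#define TOY_DBM		(UINT64_C(1) << 51)	/* DBM: hw may set S2AP[1] on write */

/* A guest write goes through without a permission fault reaching KVM. */
static inline bool toy_guest_can_write(uint64_t pte)
{
	return (pte & (TOY_S2AP_W | TOY_DBM)) != 0;
}

/* With DBM in use, S2AP[1] set means the page has been written (dirty). */
static inline bool toy_is_dirty(uint64_t pte)
{
	return (pte & TOY_S2AP_W) != 0;
}

/* Strict write-protect: the pte is read-only afterwards, whatever it was. */
static inline uint64_t toy_wrprotect(uint64_t pte)
{
	return pte & ~(TOY_S2AP_W | TOY_DBM);
}

/*
 * Clean: if the pte was writeable, clear S2AP[1] and set DBM, giving a
 * writeable+clean pte. Anything else is left untouched.
 */
static inline uint64_t toy_clean(uint64_t pte)
{
	if (pte & TOY_S2AP_W)
		pte = (pte & ~TOY_S2AP_W) | TOY_DBM;
	return pte;
}
```

In this model a caller that expects read-only ptes gets them from
toy_wrprotect() but not from toy_clean(), which is why I would keep the
two as separate operations.
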
Unless I misunderstood, I'd rather change
kvm_arch_mmu_enable_log_dirty_pt_masked() to call a new function,
stage2_clean_range(), which clears S2AP[1] together with setting DBM if
previously writeable. But we should not confuse this with
write-protecting or change the write-protecting functions to mark a pte
writeable+clean.

-- 
Catalin