> On Feb 20, 2022, at 10:23 PM, Peter Xu <peterx@xxxxxxxxxx> wrote: > > On Thu, Feb 17, 2022 at 08:00:12PM -0800, Nadav Amit wrote: >> >>> On Feb 17, 2022, at 6:23 PM, Nadav Amit <nadav.amit@xxxxxxxxx> wrote: >>> >>>> PS: I always think here the VM_SOFTDIRTY check is wrong, IMHO it should be: >>>> >>>> if (dirty_accountable && pte_dirty(ptent) && >>>> (pte_soft_dirty(ptent) || >>>> (vma->vm_flags & VM_SOFTDIRTY))) { >>>> ptent = pte_mkwrite(ptent); >>>> } >> >> I know it is off-topic (not directly related to my patch), but >> I tried to understand the logic - both of the existing code and of >> your suggested change - and I failed. >> >> IIUC dirty_accountable (whose value is taken from >> vma_wants_writenotify()) means that the writes *should* be tracked, >> and therefore the page should remain read-only. > > Right. > >> >> So basically the condition should have been based on >> !dirty_accountable, i.e. the inverted value of dirty_accountable. >> >> The problem is that dirty_accountable also reflects VM_SOFTDIRTY >> considerations, so looking on the PTE does not tell you whether >> the PTE should remain write-protected even if it is dirty. > > My understanding is that the dirty bits (especially if both set) means > we've tracked dirty on this pte already so we don't need to, hence we can > set the dirty bit here. E.g., continuous mprotect(RO), mprotect(RW) upon a > full dirty pte. > > When something wants to enable tracking again, it needs to clear the dirty > bit, either the real one or soft-dirty one. So it's a pure performance > enhancement to conditionally set write bit here, when we're sure we won't > need any further tracking on this pte. > > One thing to mention is that this path only applies to VM_SHARED|VM_WRITE, > because that's what checked the first in vma_wants_writenotify(): > > /* If it was private or non-writable, the write bit is already clear */ > if ((vm_flags & (VM_WRITE|VM_SHARED)) != ((VM_WRITE|VM_SHARED))) > return 0; > > IOW private mappings are not optimized in current tree yet. > > Peter Collingbourne proposed a patch some time ago to optimize it but it > didn't get merged somehow. Meanwhile even with his latest version it > should still miss the thp case, so if to reference the private optimization > Andrea's tree would be the best: > > https://github.com/aagit/aa/commit/fadb5e04d94472614c76819acd979b2f60e4eff6 > > Hope it clarifies things a bit. Thanks, Thanks for the clarification. That’s what I suspected - I did not encounter it since I only used private anonymous mappings. I will try to create a test-case and send an additional fix for this issue. Regards, Nadav