On Mon, Sep 09, 2024 at 08:08:16PM -0400, Peter Xu wrote: > On Mon, Sep 09, 2024 at 04:15:39PM -0700, Andrew Morton wrote: > > On Mon, 9 Sep 2024 18:43:22 -0400 Peter Xu <peterx@xxxxxxxxxx> wrote: > > > > > > > > Do we need the logic to clear dirty bit in the child as that in > > > > > > __copy_present_ptes()? (and also for the pmd's case). > > > > > > > > > > > > e.g. > > > > > > if (vma->vm_flags & VM_SHARED) > > > > > > pud = pud_mkclean(pud); > > > > > > > > > > Yeah, good question. I remember I thought about that when initially > > > > > working on these lines, but I forgot the details, or maybe I simply tried > > > > > to stick with the current code base, as the dirty bit used to be kept even > > > > > in the child here. > > > > > > > > > > I'd expect there's only performance differences, but still sounds like I'd > > > > > better leave that to whoever knows the best on the implications, then draft > > > > > it as a separate patch but only when needed. > > > > > > > > Sorry, but this vaguensss simply leaves me with nowhere to go. > > > > > > > > I'll drop the series - let's revisit after -rc1 please. > > > > > > Andrew, would you please explain why it needs to be dropped? > > > > > > I meant in the reply that I think we should leave that as is, and I think > > > so far nobody in real life should care much on this bit, so I think it's > > > fine to leave the dirty bit as-is. > > > > > > I still think whoever has a better use of the dirty bit and would like to > > > change the behavior should find the use case and work on top, but only if > > > necessary. > > > > Well. "I'd expect there's only performance differences" means to me > > "there might be correctness issues, I don't know". Is it or is it not > > merely a performance thing? > > There should have no correctness issue pending. It can only be about > performance, and AFAIU what this patch does is exactly the way where it > shouldn't ever change performance either, as it didn't change how dirty bit > was processed (just like before this patch), not to mention correctness (in > regards to dirty bits). > > I can provide some more details. > > Here the question we're discussing is "whether we should clear the dirty > bit in the child for a pgtable entry when it's VM_SHARED". Yan observed > that we don't do the same thing for pte/pmd/pud, which is true. > > Before this patch: > > - For pte: we clear dirty bit if VM_SHARED in child when copy > - For pmd/pud: we never clear dirty bit in the child when copy Hi Peter, Not sure if I missed anything. It looks that before this patch, pmd/pud are alawys write protected without checking "is_cow_mapping(vma->vm_flags) && pud_write(pud)". pud_wrprotect() clears dirty bit by moving the dirty value to the software bit. And I have a question that why previously pmd/pud are always write protected. Thanks Yan > > The behavior of clearing dirty bit for VM_SHARED in child for pte level > originates to the 1st commit that git history starts. So we always do so > for 19 years. > > That makes sense to me, because clearing dirty bit in pte normally requires > a SetDirty on the folio, e.g. in unmap path: > > if (pte_dirty(pteval)) > folio_mark_dirty(folio); > > Hence cleared dirty bit in the child should avoid some extra overheads when > the pte maps a file cache, so clean pte can at least help us to avoid calls > into e.g. mapping's dirty_folio() functions (in which it should normally > check folio_test_set_dirty() again anyway, and parent pte still have the > dirty bit set so we won't miss setting folio dirty): > > folio_mark_dirty(): > if (folio_test_reclaim(folio)) > folio_clear_reclaim(folio); > return mapping->a_ops->dirty_folio(mapping, folio); > > However there's the other side of thing where when the dirty bit is missing > I _think_ it also means when the child writes to the cleaned pte, it'll > require (e.g. on hardware accelerated archs) MMU setting dirty bit which is > slower than if we don't clear the dirty bit... and on software emulated > dirty bits it could even require a page fault, IIUC. > > In short, personally I don't know what's the best to do, on keep / remove > the dirty bit even if it's safe either way: there are pros and cons on > different decisions. > > That's why I said I'm not sure which is the best way. I had a feeling that > most of the people didn't even notice this, and we kept running this code > for the past 19 years just all fine.. > > OTOH, we don't do the same for pmds/puds (in which case we persist dirty > bits always in child), and I didn't check whether it's intended, or why. > It'll have similar reasoning as above discussion on pte, or even more I > overlooked. > > So again, the safest approach here is in terms of dirty bit we keep what we > do as before. And that's what this patch does as of now. > > IOW, if I'll need a repost, I'll repost exactly the same thing (with the > fixup I sent later, which is already in mm-unstable). > > Thanks, > > -- > Peter Xu > >