On Sat, 8 Jul 2023 at 12:12, Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
>  kernel/fork.c | 1 +
>  1 file changed, 1 insertion(+)

I ended up editing your explanation a lot.

I'm not convinced that the bug has much to do with the delayed TLB
flushing.

I think it's more fundamental than some TLB coherence issue: our VM
copying simply expects to not have any unrelated concurrent page fault
activity, and various random internal data structures simply rely on
that.

I made up an example that I'm not sure is relevant to any of the
particular failures, but that I think is a non-TLB case: the parent
'vma->anon_vma' chain is copied by dup_mmap() in anon_vma_fork(), and
it's possible that the parent vma didn't have any anon_vma associated
with it at that point.

But a concurrent page fault to the same vma - even *before* the page
tables have been copied, and when the TLB is still entirely coherent -
could then cause an anon_vma_prepare() on that parent vma, and
associate one of the pages with that anon_vma.

Then the page table copy happens, and that page gets marked read-only
again, and is added to both the parent and the child vmas, but the
child vma never got associated with the parent's new anon_vma, because
it didn't exist when anon_vma_fork() happened.

Does this ever happen? I have no idea. But it would seem to be an
example that really has nothing to do with any TLB state, and is just
simply "we cannot handle concurrent page faults while we're busy
copying the mm".

Again - maybe I messed up, but it really feels like the missing
vma_start_write() was more fundamental, and not some "TLB coherency"
issue.

               Linus
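
The ordering in the made-up anon_vma example above can be sketched as a
toy user-space model. This is not kernel code: every toy_* name is
hypothetical, the structs are stand-ins with one page per vma, and the
real anon_vma_fork() builds a chain for the child rather than copying a
single pointer. The sketch only shows why a fault that lands between
the anon_vma copy and the page table copy leaves the child mapping a
page whose anon_vma it was never linked to, with no TLB state involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; NOT the kernel's struct anon_vma / vm_area_struct. */
struct toy_anon_vma { int id; };

struct toy_vma {
	struct toy_anon_vma *anon_vma;   /* NULL until the first anonymous fault */
	struct toy_anon_vma *page_owner; /* anon_vma of the one mapped page, if any */
	bool page_mapped;
};

/* Models the anon_vma copy step: the child sees only what the parent has *now*. */
static void toy_anon_vma_fork(struct toy_vma *child, const struct toy_vma *parent)
{
	child->anon_vma = parent->anon_vma;
}

/* Models a parent page fault: prepare an anon_vma if missing, then map a page. */
static void toy_parent_fault(struct toy_vma *parent, struct toy_anon_vma *fresh)
{
	if (!parent->anon_vma)
		parent->anon_vma = fresh;
	parent->page_mapped = true;
	parent->page_owner = parent->anon_vma;
}

/* Models the page table copy: the child ends up sharing the parent's page. */
static void toy_copy_page_range(struct toy_vma *child, const struct toy_vma *parent)
{
	child->page_mapped = parent->page_mapped;
	child->page_owner = parent->page_owner;
}

/*
 * Returns true if the child maps a page whose anon_vma it is not linked to.
 * fault_races_with_fork selects whether the fault lands between the
 * anon_vma copy and the page table copy, or completes before the fork starts.
 */
bool toy_fork_leaves_child_unlinked(bool fault_races_with_fork)
{
	struct toy_anon_vma fresh = { .id = 1 };
	struct toy_vma parent = { 0 }, child = { 0 };

	if (!fault_races_with_fork)
		toy_parent_fault(&parent, &fresh);  /* fault finished before fork */

	toy_anon_vma_fork(&child, &parent);
	if (fault_races_with_fork)
		toy_parent_fault(&parent, &fresh);  /* fault slips in mid-copy */
	toy_copy_page_range(&child, &parent);

	return child.page_mapped && child.anon_vma != child.page_owner;
}
```

Calling toy_fork_leaves_child_unlinked(true) models the racing fault and
toy_fork_leaves_child_unlinked(false) models a fault that completed
before the fork began. Excluding faults on the parent vma for the
duration of the copy removes the first ordering from the model entirely.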