> On Dec 22, 2020, at 11:44 AM, Andrea Arcangeli <aarcange@xxxxxxxxxx> wrote: > > On Mon, Dec 21, 2020 at 02:55:12PM -0800, Nadav Amit wrote: >> wouldn’t mmap_write_downgrade() be executed before mprotect_fixup() (so > > I assume you mean "in" mprotect_fixup, after change_protection. > > If you would downgrade the mmap_lock to read there, then it'd severely > slowdown the non contention case, if there's more than vma that needs > change_protection. > > You'd need to throw away the prev->vm_next info and you'd need to do a > new find_vma after droping the mmap_lock for reading and re-taking the > mmap_lock for writing at every iteration of the loop. > > To do less harm to the non-contention case you could perhaps walk > vma->vm_next and check if it's outside the mprotect range and only > downgrade in such case. So let's assume we intend to optimize with > mmap_write_downgrade only the last vma. … I read in detail your response and you make great points. To be fair, I did not think it through and just tried to make a point that not taking mmap_lock for write is an unrelated optimization. So you make a great point that it is actually related and can only(?) benefit uffd and arguably soft-dirty, to which I am going to add mmap_write_lock(). Yet, my confidence in doing the optimization that you suggested (keep using mmap_read_lock()) as part of the bug fix is very low and just got lower since we discussed. So I do want in the future to try to avoid the overheads I introduce (sorry), but it requires a systematic solution and some thought. Perhaps any change to PTE in a page-table should increase a page-table generation that we would save in the page-table page-struct (next to the PTL) and pte_same() would also look at the original and updated "page-table generation” when it checks if a PTE changed. So if a PTE went through not-writable -> writable -> not-writable cycle without a TLB flush this can go detected. Yet, there is still a question of how to detect pending TLB flushes in finer granularity to avoid frequent unwarranted TLB flushes while a TLB flush is pending. It all requires some thought, and the fact that soft-dirty appears to be broken too indicates that bugs can get unnoticed for some time. Sorry for being a chicken, but I prefer to be safe than sorry.