On Tue, Dec 10, 2024 at 03:37:50PM -0800, Suren Baghdasaryan wrote: > > Replace vm_lock with vm_refcnt. Replace vm_detached with vm_refcnt == 0 > > -- that is, attach sets refcount to 1 to indicate it is part of the mas, > > detached is the final 'put'. > > I need to double-check if we ever write-lock a detached vma. I don't > think we do but better be safe. If we do then that wait-until() should > accept 0x8000'0001 as well. vma_start_write() __is_vma_write_locked() mmap_assert_write_locked(vma->vm_mm); So this really should hold afaict. > > RCU lookup does the inc_not_zero thing, when increment succeeds, compare > > mm/addr to validate. > > > > vma_start_write() already relies on mmap_lock being held for writing, > > and thus does not have to worry about writer-vs-writer contention, that > > is fully resolved by mmap_sem. This means we only need to wait for > > readers to drop out. > > > > vma_start_write() > > add(0x8000'0001); // could fetch_add and double check the high > > // bit wasn't already set. > > wait-until(refcnt == 0x8000'0002); // mas + writer ref > > WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq); > > sub(0x8000'0000); > > > > vma_end_write() > > put(); > > We don't really have vma_end_write(). Instead it's vma_end_write_all() > which increments mm_lock_seq unlocking all write-locked VMAs. > Therefore in vma_start_write() I think we can sub(0x8000'0001) at the > end. Right, I know you don't, but you should :-), I've suggested adding this before. > > vma_start_read() then becomes something like: > > > > if (vm_lock_seq == mm_lock_seq) > > return false; > > > > cnt = fetch_inc(1); > > if (cnt & msb || vm_lock_seq == mm_lock_seq) { > > put(); > > return false; > > } > > > > return true; > > > > vma_end_read() then becomes: > > put(); > > > > > > and the down_read() from uffffffd requires mmap_read_lock() and thus > > does not have to worry about writers, it can simpy be inc() and put(), > > no? > > I think your proposal should work. Let me try to code it and see if > something breaks. Btw, for the wait-until() and put() you can use rcuwait; that is the simplest wait form we have. It's suitable because we only ever have the one waiter.