On Mon, Sep 05, 2022 at 11:32:48AM -0700, Suren Baghdasaryan wrote:
> On Mon, Sep 5, 2022 at 5:32 AM 'Michal Hocko' via kernel-team
> <kernel-team@xxxxxxxxxxx> wrote:
> >
> > Unless I am missing something, this is not based on the Maple tree
> > rewrite, right? Does the change in the data structure make any
> > difference to the approach? I remember discussions at LSFMM where it
> > has been pointed out that some issues with the vma tree are
> > considerably simpler to handle with the maple tree.
>
> Correct, this does not use the Maple tree yet, but once the Maple tree
> transition happens and it supports RCU-safe lookups, my code in
> find_vma_under_rcu() becomes really simple.
>
> > On Thu 01-09-22 10:34:48, Suren Baghdasaryan wrote:
> > [...]
> > > One notable way the implementation deviates from the proposal is
> > > the way VMAs are marked as locked, because during some mm updates
> > > multiple VMAs need to be locked until the end of the update
> > > (e.g. vma_merge, split_vma, etc.).
> >
> > I think it would be really helpful to spell out those issues in
> > greater detail. Not everybody is aware of those vma-related
> > subtleties.
>
> Ack. I'll expand the description of the cases when multiple VMAs need
> to be locked in the same update. The main difficulties are:
> 1. Multiple VMAs might need to be locked within one
> mmap_write_lock/mmap_write_unlock session (will call it an update
> transaction).
> 2. Figuring out when it's safe to unlock a previously locked VMA is
> tricky because that might be happening in different functions and at
> different call levels.
>
> So, instead of the usual lock/unlock pattern, the proposed solution
> marks a VMA as locked and provides an efficient way to:
> 1. Identify locked VMAs.
> 2. Unlock all locked VMAs in bulk.
>
> We also postpone unlocking the locked VMAs until the end of the update
> transaction, when we do mmap_write_unlock.
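For anyone following along, the mark-and-bulk-unlock pattern described
above roughly looks like the sketch below. This is a user-space
illustration with made-up struct and function names (struct vma,
vma_mark_locked, vma_mark_unlocked_all), not the actual patchset code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real mm/vma data structures. */
struct vma {
	bool vm_locked;		/* marked locked for the current update transaction */
	struct vma *next;
};

struct mm {
	struct vma *vmas;	/* all VMAs in this address space */
};

/*
 * Mark a VMA as locked. There is no matching per-VMA unlock call:
 * the VMA stays marked until the end of the update transaction, which
 * sidesteps the "who unlocks, and at which call level?" problem.
 */
static void vma_mark_locked(struct vma *vma)
{
	vma->vm_locked = true;
}

/*
 * Called once, from the transaction end (i.e. mmap_write_unlock):
 * drop the mark on every locked VMA in bulk.
 */
static void vma_mark_unlocked_all(struct mm *mm)
{
	for (struct vma *v = mm->vmas; v; v = v->next)
		v->vm_locked = false;
}
```

The real code would of course need proper synchronization against
readers (the sketch only shows the lifecycle: mark anywhere during the
transaction, clear everything in one place at the end).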
> Potentially this keeps a VMA locked for longer than is absolutely
> necessary, but it results in a big reduction of code complexity.

Correct me if I'm wrong, but it looks like any time multiple VMAs need
to be locked we need mmap_lock anyways, which is what makes your
approach so sweet. If, however, we ever want to lock multiple VMAs
without taking mmap_lock, then deadlock avoidance algorithms aren't
that bad - there's the ww_mutex approach, which is simple and works
well when there isn't much expected contention (the advantage of the
ww_mutex approach is that it doesn't have to track all held locks).

I've also written full cycle detection; that approach gets you fewer
restarts, at the cost of needing a list of all currently held locks.