* Luigi Rizzo <lrizzo@xxxxxxxxxx> [210803 19:58]:
> On Wed, Aug 4, 2021 at 1:35 AM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> >
> > On Tue, Aug 03, 2021 at 11:07:35PM +0000, Liam Howlett wrote:
> > > * Luigi Rizzo <lrizzo@xxxxxxxxxx> [210803 17:49]:
> > > > On Tue, Aug 3, 2021 at 6:08 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> > > > >
> > > > > On Sat, Jul 31, 2021 at 10:53:41AM -0700, Luigi Rizzo wrote:
> > > > > > find_vma() and variants need protection when used.
> > > > > > This patch adds mmap_assert_lock() calls in the functions.
> > > > > >
> > > > > > To make sure the invariant is satisfied, we also need to add a
> > > > > > mmap_read_lock() around the get_user_pages_remote() call in
> > > > > > get_arg_page(). The lock is not strictly necessary because the mm
> > > > > > has been newly created, but the extra cost is limited because
> > > > > > the same mutex was also acquired shortly before in __bprm_mm_init(),
> > > > > > so it is hot and uncontended.
> > > > > >
> > > > > > Signed-off-by: Luigi Rizzo <lrizzo@xxxxxxxxxx>
> > > > > >  fs/exec.c | 2 ++
> > > > > >  mm/mmap.c | 2 ++
> > > > > >  2 files changed, 4 insertions(+)
> > > > > >
> > > > > > diff --git a/fs/exec.c b/fs/exec.c
> > > > > > index 38f63451b928..ac7603e985b4 100644
> > > > > > +++ b/fs/exec.c
> > > > > > @@ -217,8 +217,10 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
> > > > > >  	 * We are doing an exec().  'current' is the process
> > > > > >  	 * doing the exec and bprm->mm is the new process's mm.
> > > > > >  	 */
> > > > > > +	mmap_read_lock(bprm->mm);
> > > > > >  	ret = get_user_pages_remote(bprm->mm, pos, 1, gup_flags,
> > > > > >  			&page, NULL, NULL);
> > > > > > +	mmap_read_unlock(bprm->mm);
> > > > > >  	if (ret <= 0)
> > > > > >  		return NULL;
> > > > >
> > > > > Wasn't Jann Horn working on something like this too?
> > > > >
> > > > > https://lore.kernel.org/linux-mm/20201016225713.1971256-1-jannh@xxxxxxxxxx/
> > > > >
> > > > > IIRC it was very tricky here, are you sure it is OK to obtain this lock
> > > > > here?
> > > >
> > > > I cannot comment on Jann's patch series but no other thread knows
> > > > about this mm at this point in the code so the lock is definitely
> > > > safe to acquire (shortly before there was also a write lock acquired
> > > > on the same mm, in the same conditions).
> > >
> > > If there is no other code that knows about this mm, then does one need
> > > the lock at all?  Is this just to satisfy the new check you added?
> > >
> > > If you want to make this change, I would suggest writing it in a way to
> > > ensure the call to expand_downwards() in the same function also holds
> > > the lock.  I believe this is technically required as well?  What do you
> > > think?
> >
> > This is essentially what Jann was doing. Since the mm is newly created
> > we can create it write locked and then we can add proper locking tests
> > to many of the functions called along this path.

That sounds good.  Jann has left the patch as pending a fix since
November 2020.  Can't the removal of the lock/unlock be added to the
next iteration of the patch?  Was there a v4 of that patch?

> >
> > Adding useless locks around each troublesome callsite just seems
> > really confusing to me.
>
> Uhm... by that reasoning, even creating the mm locked (and unlocking
> at the end) is equally unnecessary.

I think taking the lock is clearer than leaving it the way it's
currently written.  It is confusing, though, to see the lock taken
after calling expand_downwards(), whose comments explicitly say the
lock is required.  This should at least have a comment about early
creation not requiring the lock.

>
> My goal was to add asserts and invariants that are easy
> to understand and get right, rather than optimize a path
> that does not appear to be critical.
>
> Adding one read lock pair around the one function we annotate
> is easy to understand and clearly a leaf lock.
>
> Having alloc_bprm return a locked object is a bit unconventional,
> and also passing it to other methods raises the question of whether
> they take other lock possibly causing lock order reversals
> in the future.

We are (probably?) okay, as the usual order right now is to take the
mmap sem before the pte and interval tree locks.  It's also just for
the setup, so unless there is a special case that could cause
trouble... or maybe I should ask which cases will cause trouble?

Thanks,
Liam