On Fri, Oct 18, 2024 at 04:47:10PM +0200, Roberto Sassu wrote: > From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> > > Commit ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in > remap_file_pages()") fixed a security issue, it added an LSM check when > trying to remap file pages, so that LSMs have the opportunity to evaluate > such action like for other memory operations such as mmap() and mprotect(). > > However, that commit called security_mmap_file() inside the mmap_lock lock, > while the other calls do it before taking the lock, after commit > 8b3ec6814c83 ("take security_mmap_file() outside of ->mmap_sem"). > > This caused lock inversion issue with IMA which was taking the mmap_lock > and i_mutex lock in the opposite way when the remap_file_pages() system > call was called. > > Solve the issue by splitting the critical region in remap_file_pages() in > two regions: the first takes a read lock of mmap_lock and retrieves the VMA > and the file associated, and calculate the 'prot' and 'flags' variable; the > second takes a write lock on mmap_lock, checks that the VMA flags and the > VMA file descriptor are the same as the ones obtained in the first critical > region (otherwise the system call fails), and calls do_mmap(). > > In between, after releasing the read lock and taking the write lock, call > security_mmap_file(), and solve the lock inversion issue. Great description! > > Cc: stable@xxxxxxxxxxxxxxx > Fixes: ea7e2d5e49c0 ("mm: call the security_mmap_file() LSM hook in remap_file_pages()") > Reported-by: syzbot+91ae49e1c1a2634d20c0@xxxxxxxxxxxxxxxxxxxxxxxxx > Closes: https://lore.kernel.org/linux-security-module/66f7b10e.050a0220.46d20.0036.GAE@xxxxxxxxxx/ > Reviewed-by: Roberto Sassu <roberto.sassu@xxxxxxxxxx> (Calculate prot and flags earlier) > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> Other than some nits below: Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> I think you're definitely good to un-RFC here. > --- > mm/mmap.c | 62 ++++++++++++++++++++++++++++++++++++++++--------------- > 1 file changed, 45 insertions(+), 17 deletions(-) > > diff --git a/mm/mmap.c b/mm/mmap.c > index 9c0fb43064b5..762944427e03 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -1640,6 +1640,7 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, > unsigned long populate = 0; > unsigned long ret = -EINVAL; > struct file *file; > + vm_flags_t vm_flags; > > pr_warn_once("%s (%d) uses deprecated remap_file_pages() syscall. See Documentation/mm/remap_file_pages.rst.\n", > current->comm, current->pid); > @@ -1656,12 +1657,53 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, > if (pgoff + (size >> PAGE_SHIFT) < pgoff) > return ret; > > - if (mmap_write_lock_killable(mm)) > + if (mmap_read_lock_killable(mm)) > + return -EINTR; I'm kinda verbose generally, but I'd love a comment like: /* * Look up VMA under read lock first so we can perform the security * without holding locks (which can be problematic). We reacquire a * write lock later and check nothing changed underneath us. */ > + > + vma = vma_lookup(mm, start); > + > + if (!vma || !(vma->vm_flags & VM_SHARED)) { > + mmap_read_unlock(mm); > + return -EINVAL; > + } > + > + prot |= vma->vm_flags & VM_READ ? PROT_READ : 0; > + prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0; > + prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0; > + > + flags &= MAP_NONBLOCK; > + flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE; > + if (vma->vm_flags & VM_LOCKED) > + flags |= MAP_LOCKED; > + > + /* Save vm_flags used to calculate prot and flags, and recheck later. */ > + vm_flags = vma->vm_flags; > + file = get_file(vma->vm_file); > + > + mmap_read_unlock(mm); > + Maybe worth adding a comment to explain why you're doing this without the lock so somebody looking at this later can understand the dance? > + ret = security_mmap_file(file, prot, flags); > + if (ret) { > + fput(file); > + return ret; > + } > + > + ret = -EINVAL; > + Again, being verbose, I'd put something here like: /* OK security check passed, take write lock + let it rip */ > + if (mmap_write_lock_killable(mm)) { > + fput(file); > return -EINTR; > + } > > vma = vma_lookup(mm, start); > > - if (!vma || !(vma->vm_flags & VM_SHARED)) > + if (!vma) > + goto out; > + I'd also add something like: /* Make sure things didn't change under us. */ > + if (vma->vm_flags != vm_flags) > + goto out; > + And drop this newline to group them together (super nitty I know, sorry!) > + if (vma->vm_file != file) > goto out; > > if (start + size > vma->vm_end) { > @@ -1689,25 +1731,11 @@ SYSCALL_DEFINE5(remap_file_pages, unsigned long, start, unsigned long, size, > goto out; > } > > - prot |= vma->vm_flags & VM_READ ? PROT_READ : 0; > - prot |= vma->vm_flags & VM_WRITE ? PROT_WRITE : 0; > - prot |= vma->vm_flags & VM_EXEC ? PROT_EXEC : 0; > - > - flags &= MAP_NONBLOCK; > - flags |= MAP_SHARED | MAP_FIXED | MAP_POPULATE; > - if (vma->vm_flags & VM_LOCKED) > - flags |= MAP_LOCKED; > - > - file = get_file(vma->vm_file); > - ret = security_mmap_file(vma->vm_file, prot, flags); > - if (ret) > - goto out_fput; > ret = do_mmap(vma->vm_file, start, size, > prot, flags, 0, pgoff, &populate, NULL); > -out_fput: > - fput(file); > out: > mmap_write_unlock(mm); > + fput(file); > if (populate) > mm_populate(ret, populate); > if (!IS_ERR_VALUE(ret)) > -- > 2.34.1 > These are just nits, this looks good to me!