On Mon, Jan 06, 2025 at 12:06:52PM +0100, gregkh@xxxxxxxxxxxxxxxxxxx wrote: > > The patch below does not apply to the 6.6-stable tree. > If someone wants it applied there, or to any other stable or longterm > tree, then please email the backport, including the original git commit > id to <stable@xxxxxxxxxxxxxxx>. I actually intentionally didn't add Cc: Stable there as I knew this needed manual backport (to >=6.6.y... - the feature is only introduced in 6.6!) but I guess it was added on. Now the auto-scripts fired anyway, can you confirm whether 6.12 got it or not? To save me the effort of backporting this to 6.12 as well if I don't need to. I'll send out a manual backport for 6.6.y shortly. Older stable kernels obviously don't need this. Thanks! > > To reproduce the conflict and resubmit, you may use the following commands: > > git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y > git checkout FETCH_HEAD > git cherry-pick -x 8ec396d05d1b737c87311fb7311f753b02c2a6b1 > # <resolve conflicts, build, test, etc.> > git commit -s > git send-email --to '<stable@xxxxxxxxxxxxxxx>' --in-reply-to '2025010652-resemble-faceplate-702c@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^.. > > Possible dependencies: > > > > thanks, > > greg k-h > > ------------------ original commit in Linus's tree ------------------ > > From 8ec396d05d1b737c87311fb7311f753b02c2a6b1 Mon Sep 17 00:00:00 2001 > From: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> > Date: Thu, 28 Nov 2024 15:06:17 +0000 > Subject: [PATCH] mm: reinstate ability to map write-sealed memfd mappings > read-only > > Patch series "mm: reinstate ability to map write-sealed memfd mappings > read-only". > > In commit 158978945f31 ("mm: perform the mapping_map_writable() check > after call_mmap()") (and preceding changes in the same series) it became > possible to mmap() F_SEAL_WRITE sealed memfd mappings read-only. > > Commit 5de195060b2e ("mm: resolve faulty mmap_region() error path > behaviour") unintentionally undid this logic by moving the > mapping_map_writable() check before the shmem_mmap() hook is invoked, > thereby regressing this change. > > This series reworks how we both permit write-sealed mappings being mapped > read-only and disallow mprotect() from undoing the write-seal, fixing this > regression. > > We also add a regression test to ensure that we do not accidentally > regress this in future. > > Thanks to Julian Orth for reporting this regression. > > > This patch (of 2): > > In commit 158978945f31 ("mm: perform the mapping_map_writable() check > after call_mmap()") (and preceding changes in the same series) it became > possible to mmap() F_SEAL_WRITE sealed memfd mappings read-only. > > This was previously unnecessarily disallowed, despite the man page > documentation indicating that it would be, thereby limiting the usefulness > of F_SEAL_WRITE logic. > > We fixed this by adapting logic that existed for the F_SEAL_FUTURE_WRITE > seal (one which disallows future writes to the memfd) to also be used for > F_SEAL_WRITE. > > For background - the F_SEAL_FUTURE_WRITE seal clears VM_MAYWRITE for a > read-only mapping to disallow mprotect() from overriding the seal - an > operation performed by seal_check_write(), invoked from shmem_mmap(), the > f_op->mmap() hook used by shmem mappings. > > By extending this to F_SEAL_WRITE and critically - checking > mapping_map_writable() to determine if we may map the memfd AFTER we > invoke shmem_mmap() - the desired logic becomes possible. This is because > mapping_map_writable() explicitly checks for VM_MAYWRITE, which we will > have cleared. > > Commit 5de195060b2e ("mm: resolve faulty mmap_region() error path > behaviour") unintentionally undid this logic by moving the > mapping_map_writable() check before the shmem_mmap() hook is invoked, > thereby regressing this change. > > We reinstate this functionality by moving the check out of shmem_mmap() > and instead performing it in do_mmap() at the point at which VMA flags are > being determined, which seems in any case to be a more appropriate place > in which to make this determination. > > In order to achieve this we rework memfd seal logic to allow us access to > this information using existing logic and eliminate the clearing of > VM_MAYWRITE from seal_check_write() which we are performing in do_mmap() > instead. > > Link: https://lkml.kernel.org/r/99fc35d2c62bd2e05571cf60d9f8b843c56069e0.1732804776.git.lorenzo.stoakes@xxxxxxxxxx > Fixes: 5de195060b2e ("mm: resolve faulty mmap_region() error path behaviour") > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> > Reported-by: Julian Orth <ju.orth@xxxxxxxxx> > Closes: https://lore.kernel.org/all/CAHijbEUMhvJTN9Xw1GmbM266FXXv=U7s4L_Jem5x3AaPZxrYpQ@xxxxxxxxxxxxxx/ > Cc: Jann Horn <jannh@xxxxxxxxxx> > Cc: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> > Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> > Cc: Shuah Khan <shuah@xxxxxxxxxx> > Cc: Vlastimil Babka <vbabka@xxxxxxx> > Cc: <stable@xxxxxxxxxxxxxxx> > Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > > diff --git a/include/linux/memfd.h b/include/linux/memfd.h > index 3f2cf339ceaf..d437e3070850 100644 > --- a/include/linux/memfd.h > +++ b/include/linux/memfd.h > @@ -7,6 +7,7 @@ > #ifdef CONFIG_MEMFD_CREATE > extern long memfd_fcntl(struct file *file, unsigned int cmd, unsigned int arg); > struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx); > +unsigned int *memfd_file_seals_ptr(struct file *file); > #else > static inline long memfd_fcntl(struct file *f, unsigned int c, unsigned int a) > { > @@ -16,6 +17,19 @@ static inline struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx) > { > return ERR_PTR(-EINVAL); > } > + > +static inline unsigned int *memfd_file_seals_ptr(struct file *file) > +{ > + return NULL; > +} > #endif > > +/* Retrieve memfd seals associated with the file, if any. */ > +static inline unsigned int memfd_file_seals(struct file *file) > +{ > + unsigned int *sealsp = memfd_file_seals_ptr(file); > + > + return sealsp ? *sealsp : 0; > +} > + > #endif /* __LINUX_MEMFD_H */ > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 338a76ce9083..fb397918c43d 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -4101,6 +4101,37 @@ void mem_dump_obj(void *object); > static inline void mem_dump_obj(void *object) {} > #endif > > +static inline bool is_write_sealed(int seals) > +{ > + return seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE); > +} > + > +/** > + * is_readonly_sealed - Checks whether write-sealed but mapped read-only, > + * in which case writes should be disallowing moving > + * forwards. > + * @seals: the seals to check > + * @vm_flags: the VMA flags to check > + * > + * Returns whether readonly sealed, in which case writess should be disallowed > + * going forward. > + */ > +static inline bool is_readonly_sealed(int seals, vm_flags_t vm_flags) > +{ > + /* > + * Since an F_SEAL_[FUTURE_]WRITE sealed memfd can be mapped as > + * MAP_SHARED and read-only, take care to not allow mprotect to > + * revert protections on such mappings. Do this only for shared > + * mappings. For private mappings, don't need to mask > + * VM_MAYWRITE as we still want them to be COW-writable. > + */ > + if (is_write_sealed(seals) && > + ((vm_flags & (VM_SHARED | VM_WRITE)) == VM_SHARED)) > + return true; > + > + return false; > +} > + > /** > * seal_check_write - Check for F_SEAL_WRITE or F_SEAL_FUTURE_WRITE flags and > * handle them. > @@ -4112,24 +4143,15 @@ static inline void mem_dump_obj(void *object) {} > */ > static inline int seal_check_write(int seals, struct vm_area_struct *vma) > { > - if (seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE)) { > - /* > - * New PROT_WRITE and MAP_SHARED mmaps are not allowed when > - * write seals are active. > - */ > - if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_WRITE)) > - return -EPERM; > + if (!is_write_sealed(seals)) > + return 0; > > - /* > - * Since an F_SEAL_[FUTURE_]WRITE sealed memfd can be mapped as > - * MAP_SHARED and read-only, take care to not allow mprotect to > - * revert protections on such mappings. Do this only for shared > - * mappings. For private mappings, don't need to mask > - * VM_MAYWRITE as we still want them to be COW-writable. > - */ > - if (vma->vm_flags & VM_SHARED) > - vm_flags_clear(vma, VM_MAYWRITE); > - } > + /* > + * New PROT_WRITE and MAP_SHARED mmaps are not allowed when > + * write seals are active. > + */ > + if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_WRITE)) > + return -EPERM; > > return 0; > } > diff --git a/mm/memfd.c b/mm/memfd.c > index c17c3ea701a1..35a370d75c9a 100644 > --- a/mm/memfd.c > +++ b/mm/memfd.c > @@ -170,7 +170,7 @@ static int memfd_wait_for_pins(struct address_space *mapping) > return error; > } > > -static unsigned int *memfd_file_seals_ptr(struct file *file) > +unsigned int *memfd_file_seals_ptr(struct file *file) > { > if (shmem_file(file)) > return &SHMEM_I(file_inode(file))->seals; > diff --git a/mm/mmap.c b/mm/mmap.c > index d32b7e701058..16f8e8be01f8 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -47,6 +47,7 @@ > #include <linux/oom.h> > #include <linux/sched/mm.h> > #include <linux/ksm.h> > +#include <linux/memfd.h> > > #include <linux/uaccess.h> > #include <asm/cacheflush.h> > @@ -368,6 +369,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr, > > if (file) { > struct inode *inode = file_inode(file); > + unsigned int seals = memfd_file_seals(file); > unsigned long flags_mask; > > if (!file_mmap_ok(file, inode, pgoff, len)) > @@ -408,6 +410,8 @@ unsigned long do_mmap(struct file *file, unsigned long addr, > vm_flags |= VM_SHARED | VM_MAYSHARE; > if (!(file->f_mode & FMODE_WRITE)) > vm_flags &= ~(VM_MAYWRITE | VM_SHARED); > + else if (is_readonly_sealed(seals, vm_flags)) > + vm_flags &= ~VM_MAYWRITE; > fallthrough; > case MAP_PRIVATE: > if (!(file->f_mode & FMODE_READ)) >