On Mon, Jan 06, 2025 at 11:19:17AM +0000, Lorenzo Stoakes wrote: > On Mon, Jan 06, 2025 at 12:06:52PM +0100, gregkh@xxxxxxxxxxxxxxxxxxx wrote: > > > > The patch below does not apply to the 6.6-stable tree. > > If someone wants it applied there, or to any other stable or longterm > > tree, then please email the backport, including the original git commit > > id to <stable@xxxxxxxxxxxxxxx>. > > I actually intentionally didn't add Cc: Stable there as I knew this needed > manual backport (to >=6.6.y... - the feature is only introduced in 6.6!) but I > guess it was added on. > > Now the auto-scripts fired anyway, can you confirm whether 6.12 got it or > not? To save me the effort of backporting this to 6.12 as well if I don't > need to. > > I'll send out a manual backport for 6.6.y shortly. Older stable kernels > obviously don't need this. Correction - the feature I am fixing landed in 6.7, so no need for a backport to 6.6 then :) Only 6.12 requires a backport. > > Thanks! > > > > > To reproduce the conflict and resubmit, you may use the following commands: > > > > git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y > > git checkout FETCH_HEAD > > git cherry-pick -x 8ec396d05d1b737c87311fb7311f753b02c2a6b1 > > # <resolve conflicts, build, test, etc.> > > git commit -s > > git send-email --to '<stable@xxxxxxxxxxxxxxx>' --in-reply-to '2025010652-resemble-faceplate-702c@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^.. > > > > Possible dependencies: > > > > > > > > thanks, > > > > greg k-h > > > > ------------------ original commit in Linus's tree ------------------ > > > > From 8ec396d05d1b737c87311fb7311f753b02c2a6b1 Mon Sep 17 00:00:00 2001 > > From: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> > > Date: Thu, 28 Nov 2024 15:06:17 +0000 > > Subject: [PATCH] mm: reinstate ability to map write-sealed memfd mappings > > read-only > > > > Patch series "mm: reinstate ability to map write-sealed memfd mappings > > read-only". > > > > In commit 158978945f31 ("mm: perform the mapping_map_writable() check > > after call_mmap()") (and preceding changes in the same series) it became > > possible to mmap() F_SEAL_WRITE sealed memfd mappings read-only. > > > > Commit 5de195060b2e ("mm: resolve faulty mmap_region() error path > > behaviour") unintentionally undid this logic by moving the > > mapping_map_writable() check before the shmem_mmap() hook is invoked, > > thereby regressing this change. > > > > This series reworks how we both permit write-sealed mappings being mapped > > read-only and disallow mprotect() from undoing the write-seal, fixing this > > regression. > > > > We also add a regression test to ensure that we do not accidentally > > regress this in future. > > > > Thanks to Julian Orth for reporting this regression. > > > > > > This patch (of 2): > > > > In commit 158978945f31 ("mm: perform the mapping_map_writable() check > > after call_mmap()") (and preceding changes in the same series) it became > > possible to mmap() F_SEAL_WRITE sealed memfd mappings read-only. > > > > This was previously unnecessarily disallowed, despite the man page > > documentation indicating that it would be, thereby limiting the usefulness > > of F_SEAL_WRITE logic. > > > > We fixed this by adapting logic that existed for the F_SEAL_FUTURE_WRITE > > seal (one which disallows future writes to the memfd) to also be used for > > F_SEAL_WRITE. > > > > For background - the F_SEAL_FUTURE_WRITE seal clears VM_MAYWRITE for a > > read-only mapping to disallow mprotect() from overriding the seal - an > > operation performed by seal_check_write(), invoked from shmem_mmap(), the > > f_op->mmap() hook used by shmem mappings. > > > > By extending this to F_SEAL_WRITE and critically - checking > > mapping_map_writable() to determine if we may map the memfd AFTER we > > invoke shmem_mmap() - the desired logic becomes possible. This is because > > mapping_map_writable() explicitly checks for VM_MAYWRITE, which we will > > have cleared. > > > > Commit 5de195060b2e ("mm: resolve faulty mmap_region() error path > > behaviour") unintentionally undid this logic by moving the > > mapping_map_writable() check before the shmem_mmap() hook is invoked, > > thereby regressing this change. > > > > We reinstate this functionality by moving the check out of shmem_mmap() > > and instead performing it in do_mmap() at the point at which VMA flags are > > being determined, which seems in any case to be a more appropriate place > > in which to make this determination. > > > > In order to achieve this we rework memfd seal logic to allow us access to > > this information using existing logic and eliminate the clearing of > > VM_MAYWRITE from seal_check_write() which we are performing in do_mmap() > > instead. > > > > Link: https://lkml.kernel.org/r/99fc35d2c62bd2e05571cf60d9f8b843c56069e0.1732804776.git.lorenzo.stoakes@xxxxxxxxxx > > Fixes: 5de195060b2e ("mm: resolve faulty mmap_region() error path behaviour") > > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> > > Reported-by: Julian Orth <ju.orth@xxxxxxxxx> > > Closes: https://lore.kernel.org/all/CAHijbEUMhvJTN9Xw1GmbM266FXXv=U7s4L_Jem5x3AaPZxrYpQ@xxxxxxxxxxxxxx/ > > Cc: Jann Horn <jannh@xxxxxxxxxx> > > Cc: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> > > Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> > > Cc: Shuah Khan <shuah@xxxxxxxxxx> > > Cc: Vlastimil Babka <vbabka@xxxxxxx> > > Cc: <stable@xxxxxxxxxxxxxxx> > > Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > > > > diff --git a/include/linux/memfd.h b/include/linux/memfd.h > > index 3f2cf339ceaf..d437e3070850 100644 > > --- a/include/linux/memfd.h > > +++ b/include/linux/memfd.h > > @@ -7,6 +7,7 @@ > > #ifdef CONFIG_MEMFD_CREATE > > extern long memfd_fcntl(struct file *file, unsigned int cmd, unsigned int arg); > > struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx); > > +unsigned int *memfd_file_seals_ptr(struct file *file); > > #else > > static inline long memfd_fcntl(struct file *f, unsigned int c, unsigned int a) > > { > > @@ -16,6 +17,19 @@ static inline struct folio *memfd_alloc_folio(struct file *memfd, pgoff_t idx) > > { > > return ERR_PTR(-EINVAL); > > } > > + > > +static inline unsigned int *memfd_file_seals_ptr(struct file *file) > > +{ > > + return NULL; > > +} > > #endif > > > > +/* Retrieve memfd seals associated with the file, if any. */ > > +static inline unsigned int memfd_file_seals(struct file *file) > > +{ > > + unsigned int *sealsp = memfd_file_seals_ptr(file); > > + > > + return sealsp ? *sealsp : 0; > > +} > > + > > #endif /* __LINUX_MEMFD_H */ > > diff --git a/include/linux/mm.h b/include/linux/mm.h > > index 338a76ce9083..fb397918c43d 100644 > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -4101,6 +4101,37 @@ void mem_dump_obj(void *object); > > static inline void mem_dump_obj(void *object) {} > > #endif > > > > +static inline bool is_write_sealed(int seals) > > +{ > > + return seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE); > > +} > > + > > +/** > > + * is_readonly_sealed - Checks whether write-sealed but mapped read-only, > > + * in which case writes should be disallowing moving > > + * forwards. > > + * @seals: the seals to check > > + * @vm_flags: the VMA flags to check > > + * > > + * Returns whether readonly sealed, in which case writess should be disallowed > > + * going forward. > > + */ > > +static inline bool is_readonly_sealed(int seals, vm_flags_t vm_flags) > > +{ > > + /* > > + * Since an F_SEAL_[FUTURE_]WRITE sealed memfd can be mapped as > > + * MAP_SHARED and read-only, take care to not allow mprotect to > > + * revert protections on such mappings. Do this only for shared > > + * mappings. For private mappings, don't need to mask > > + * VM_MAYWRITE as we still want them to be COW-writable. > > + */ > > + if (is_write_sealed(seals) && > > + ((vm_flags & (VM_SHARED | VM_WRITE)) == VM_SHARED)) > > + return true; > > + > > + return false; > > +} > > + > > /** > > * seal_check_write - Check for F_SEAL_WRITE or F_SEAL_FUTURE_WRITE flags and > > * handle them. > > @@ -4112,24 +4143,15 @@ static inline void mem_dump_obj(void *object) {} > > */ > > static inline int seal_check_write(int seals, struct vm_area_struct *vma) > > { > > - if (seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE)) { > > - /* > > - * New PROT_WRITE and MAP_SHARED mmaps are not allowed when > > - * write seals are active. > > - */ > > - if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_WRITE)) > > - return -EPERM; > > + if (!is_write_sealed(seals)) > > + return 0; > > > > - /* > > - * Since an F_SEAL_[FUTURE_]WRITE sealed memfd can be mapped as > > - * MAP_SHARED and read-only, take care to not allow mprotect to > > - * revert protections on such mappings. Do this only for shared > > - * mappings. For private mappings, don't need to mask > > - * VM_MAYWRITE as we still want them to be COW-writable. > > - */ > > - if (vma->vm_flags & VM_SHARED) > > - vm_flags_clear(vma, VM_MAYWRITE); > > - } > > + /* > > + * New PROT_WRITE and MAP_SHARED mmaps are not allowed when > > + * write seals are active. > > + */ > > + if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_WRITE)) > > + return -EPERM; > > > > return 0; > > } > > diff --git a/mm/memfd.c b/mm/memfd.c > > index c17c3ea701a1..35a370d75c9a 100644 > > --- a/mm/memfd.c > > +++ b/mm/memfd.c > > @@ -170,7 +170,7 @@ static int memfd_wait_for_pins(struct address_space *mapping) > > return error; > > } > > > > -static unsigned int *memfd_file_seals_ptr(struct file *file) > > +unsigned int *memfd_file_seals_ptr(struct file *file) > > { > > if (shmem_file(file)) > > return &SHMEM_I(file_inode(file))->seals; > > diff --git a/mm/mmap.c b/mm/mmap.c > > index d32b7e701058..16f8e8be01f8 100644 > > --- a/mm/mmap.c > > +++ b/mm/mmap.c > > @@ -47,6 +47,7 @@ > > #include <linux/oom.h> > > #include <linux/sched/mm.h> > > #include <linux/ksm.h> > > +#include <linux/memfd.h> > > > > #include <linux/uaccess.h> > > #include <asm/cacheflush.h> > > @@ -368,6 +369,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr, > > > > if (file) { > > struct inode *inode = file_inode(file); > > + unsigned int seals = memfd_file_seals(file); > > unsigned long flags_mask; > > > > if (!file_mmap_ok(file, inode, pgoff, len)) > > @@ -408,6 +410,8 @@ unsigned long do_mmap(struct file *file, unsigned long addr, > > vm_flags |= VM_SHARED | VM_MAYSHARE; > > if (!(file->f_mode & FMODE_WRITE)) > > vm_flags &= ~(VM_MAYWRITE | VM_SHARED); > > + else if (is_readonly_sealed(seals, vm_flags)) > > + vm_flags &= ~VM_MAYWRITE; > > fallthrough; > > case MAP_PRIVATE: > > if (!(file->f_mode & FMODE_READ)) > >