On Fri, Dec 06, 2024 at 06:19:49PM +0000, Lorenzo Stoakes wrote: > On Thu, Dec 05, 2024 at 05:09:22PM -0800, Isaac J. Manjarres wrote: > > diff --git a/mm/mmap.c b/mm/mmap.c > > index b1b2a24ef82e..c7b96b057fda 100644 > > --- a/mm/mmap.c > > +++ b/mm/mmap.c > > @@ -375,6 +375,17 @@ unsigned long do_mmap(struct file *file, unsigned long addr, > > if (!file_mmap_ok(file, inode, pgoff, len)) > > return -EOVERFLOW; > > > > Not maybe in favour of _where_ in the logic we check this and definitely > not in expanding this do_mmap() stuff much further. > > See comment at bottom though... I have a cunning plan :) > > > + if (is_exec_sealed(seals)) { > > Are we intentionally disallowing a MAP_PRIVATE memfd's mapping's execution? > I've not tested this scenario so don't know if we somehow disallow this in > another way but note on write checks we only care about shared mappings. > > I mean one could argue that a MAP_PRIVATE situation is the same as copying > the data into an anon buffer and doing what you want with it, here you > could argue the same... > > So probably we should only care about VM_SHARED? Thanks for taking a look at this! I'd originally implemented it for just the VM_SHARED case, but after discussing it with Kalesh, I changed it to disallow executable mappings for both MAP_SHARED and MAP_PRIVATE. Our thought was that write sealing didn't apply in the MAP_PRIVATE case to support COW with MAP_PRIVATE. There's nothing similar to COW with execution, so I decided to prevent it for both cases; it also retains the same behavior as the ashmem driver. > > + /* No new executable mappings if the file is exec sealed. */ > > + if (prot & PROT_EXEC) > > Seems strange to reference a prot flag rather than vma flag, we should have > that set up by now. That makes sense. I can change this to check for VM_EXEC in v2 of this series. > > + return -EACCES; > > + /* > > + * Prevent an initially non-executable mapping from > > + * later becoming executable via mprotect(). > > + */ > > + vm_flags &= ~VM_MAYEXEC; > > + } > > + > > You know, I'm in two minds about this... I explicitly moved logic to > do_mmap() in [0] to workaround a chicken-and-egg scenario with having > accidentally undone the ability to mmap() read-only F_WRITE_SEALed > mappings, which meant I 'may as well' move the 'future proofing' clearing > of VM_MAYWRITE for F_SEAL_FUTURE_WRITE too. > > But now I feel that the use of shmem_mmap() and hugetlbfs_file_mmap() to do > _any_ of this is pretty odious in general, we may as well do it all > upfront. > > [0]:https://lore.kernel.org/all/cover.1732804776.git.lorenzo.stoakes@xxxxxxxxxx/ I agree. I really like the idea of handling the future proofing and error checking in one place. It makes understanding how these seals work a lot easier, and has the added benefit of only worrying about the check once rather than having to duplicate it in both shmem_mmap() and hugetlbfs_file_mmap(). > > flags_mask = LEGACY_MAP_MASK; > > if (file->f_op->fop_flags & FOP_MMAP_SYNC) > > flags_mask |= MAP_SYNC; > > -- > > 2.47.0.338.g60cca15819-goog > > > > So actually - overall - could you hold off a bit on this until I've had a > think and can perhaps send a patch that refactors this? > > Then your patch can build on top of that one and we can handle this all in > one place and sanely :) > > Sorry you've kicked off thought processes here and that's often a dangerous > thing :P Thanks again for reviewing these patches! Happy that I was able to get the gears turning :) I'm really interested in helping with this, so is there any forum you'd like to use for collaborating on this or any way I can help? I'm also more than happy to test out any patches that you'd like! Thanks, Isaac