On Fri, Dec 06, 2024 at 09:14:58PM +0000, Lorenzo Stoakes wrote: > On Fri, Dec 06, 2024 at 12:48:09PM -0800, Isaac Manjarres wrote: > > On Fri, Dec 06, 2024 at 06:19:49PM +0000, Lorenzo Stoakes wrote: > > > On Thu, Dec 05, 2024 at 05:09:22PM -0800, Isaac J. Manjarres wrote: > > > > diff --git a/mm/mmap.c b/mm/mmap.c > > > > index b1b2a24ef82e..c7b96b057fda 100644 > > > > --- a/mm/mmap.c > > > > +++ b/mm/mmap.c > > > > @@ -375,6 +375,17 @@ unsigned long do_mmap(struct file *file, unsigned long addr, > > > > if (!file_mmap_ok(file, inode, pgoff, len)) > > > > return -EOVERFLOW; > > > > > > > > > > Not maybe in favour of _where_ in the logic we check this and definitely > > > not in expanding this do_mmap() stuff much further. > > > > > > See comment at bottom though... I have a cunning plan :) > > > > > > > + if (is_exec_sealed(seals)) { > > > > > > Are we intentionally disallowing a MAP_PRIVATE memfd's mapping's execution? > > > I've not tested this scenario so don't know if we somehow disallow this in > > > another way but note on write checks we only care about shared mappings. > > > > > > I mean one could argue that a MAP_PRIVATE situation is the same as copying > > > the data into an anon buffer and doing what you want with it, here you > > > could argue the same... > > > > > > So probably we should only care about VM_SHARED? > > > > Thanks for taking a look at this! > > > > I'd originally implemented it for just the VM_SHARED case, but after > > discussing it with Kalesh, I changed it to disallow executable > > mappings for both MAP_SHARED and MAP_PRIVATE. > > > > Our thought was that write sealing didn't apply in the MAP_PRIVATE case > > to support COW with MAP_PRIVATE. There's nothing similar to COW with > > execution, so I decided to prevent it for both cases; it also retains > > the same behavior as the ashmem driver. > > Hm, yeah I'm not sure that's really justified, I mean what's to stop a > caller from just mapping their own memory mapping executable, copying the > data and executing? > That's a fair point. In that case, I think it makes sense to enforce the seal only when the mapping is shared. The case I'm trying to address is when a process (A) allocates a memfd that is meant to be read and written by itself and another process (B). A shares the buffer with B, but B injects code into the buffer, and compromises A such that A maps the buffer with PROT_EXEC and runs the code that B injected into it. If A used F_SEAL_FUTURE_EXEC prior to sharing the buffer, then it could reduce the attack surface on itself in this scenario. > There's also further concerns around execution restriction for instance in > memfd_add_seals(): > > /* > * SEAL_EXEC implys SEAL_WRITE, making W^X from the start. > */ > if (seals & F_SEAL_EXEC && inode->i_mode & 0111) > seals |= F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE|F_SEAL_FUTURE_WRITE; > > So you probably want to change this to include F_SEAL_FUTURE_EXEC, and note Do you mean adding a case where if F_SEAL_FUTURE_EXEC is in the seals, then we should clear the X bits of the file and use F_SEAL_EXEC as well? I don't think the case in the if condition should imply F_SEAL_FUTURE_EXEC, since the file is still executable in this case? > your proposal interacts negatively with the whole > MFD_NOEXEC_SCOPE_NOEXEC_ENFORCED mode set in vm.memfd_noeec - any system > with this set to '2' will literally not allow you to do what you want if > set to 2. > > See https://origin.kernel.org/doc/html/latest/userspace-api/mfd_noexec.html Sorry, I didn't follow how this would impact MFD_NOEXEC_SCOPE_NOEXEC_ENFORCED. Could you please clarify that? > > Thanks again for reviewing these patches! Happy that I was able to get > > the gears turning :) > > > > I'm really interested in helping with this, so is there any forum you'd > > like to use for collaborating on this or any way I can help? > > > > I'm also more than happy to test out any patches that you'd like! > > Thanks, I'm just going to post to the mailing list, this is the discussion > forum I'm making use of for this :) > > I will cc- you on my patch and definitely I'd appreciate testing thanks! > > But yeah, to be clear I'm not done with reviewing this, I need more time to > digest what you're trying to do here, but you definitely need to think > about the exec limitations. Thanks for sending out the patch. I took a look and tested it out and it definitely makes implementing this a lot nicer! Thanks, Isaac