On Mon, Jan 6, 2025 at 5:26 PM Isaac Manjarres <isaacmanjarres@xxxxxxxxxx> wrote:
>
> On Mon, Jan 06, 2025 at 09:35:09AM -0800, Jeff Xu wrote:
> > + Kees because this is related to W^X memfd and security.
> >
> > On Fri, Jan 3, 2025 at 7:04 AM Jann Horn <jannh@xxxxxxxxxx> wrote:
> > >
> > > On Fri, Jan 3, 2025 at 12:32 AM Isaac J. Manjarres
> > > <isaacmanjarres@xxxxxxxxxx> wrote:
> > > > Android currently uses the ashmem driver [1] for creating shared memory
> > > > regions between processes. Ashmem buffers can initially be mapped with
> > > > PROT_READ, PROT_WRITE, and PROT_EXEC. Processes can then use the
> > > > ASHMEM_SET_PROT_MASK ioctl command to restrict--never add--the
> > > > permissions that the buffer can be mapped with.
> > > >
> > > > Processes can remove the ability to map ashmem buffers as executable to
> > > > ensure that those buffers cannot be exploited to run unintended code.
> > >
> > > Is there really code out there that first maps an ashmem buffer with
> > > PROT_EXEC, then uses the ioctl to remove execute permission for future
> > > mappings? I don't see why anyone would do that.
> > >
> > > > For instance, suppose process A allocates a memfd that is meant to be
> > > > read and written by itself and another process, call it B.
> > > >
> > > > Process A shares the buffer with process B, but process B injects code
> > > > into the buffer, and compromises process A, such that it makes A map
> > > > the buffer with PROT_EXEC. This provides an opportunity for process A
> > > > to run the code that process B injected into the buffer.
> > > >
> > > > If process A had the ability to seal the buffer against future
> > > > executable mappings before sharing the buffer with process B, this
> > > > attack would not be possible.
> > >
> > > I think if you want to enforce such restrictions in a scenario where
> > > the attacker can already make the target process perform
> > > semi-arbitrary syscalls, it would probably be more reliable to enforce
> > > rules on executable mappings with something like SELinux policy and/or
> > > F_SEAL_EXEC.
> > >
> > I would like to second on the suggestion of making this as part of F_SEAL_EXEC.
>
> Thanks for taking a look at this patch Jeff! Can you please elaborate
> some more on how F_SEAL_EXEC should be extended to restricting executable
> mappings?
>
> I understand that if a memfd file is non-executable (either because it
> was made non-executable via fchmod() or by being created with
> MFD_NOEXEC_SEAL) one could argue that applying F_SEAL_EXEC to that file
> would also mean preventing any executable mappings. However, it is not
> clear to me if we should tie a file's executable permissions to whether
> or not if it can be mapped as executable. For example, shared object
> files don't have to have executable permissions, but processes should
> be able to map them as executable.
>
> The case where we apply F_SEAL_EXEC on an executable memfd also seems
> awkward to me, since memfds can be mapped as executable by default
> so what would happen in that scenario?
>
> I also shared the same concerns in my response to Jann in [1].
>
Apologies for not being clear. I meant the following: when
1> the memfd is created with MFD_NOEXEC_SEAL, or
2> the memfd is no-exec (NX) and F_SEAL_EXEC is set,
we could also block the memfd from being mapped as executable.

MFD_NOEXEC_SEAL/F_SEAL_EXEC was added in 6fd7353829ca, which is about
2 years old. I'm not sure any application creates a MFD_NOEXEC_SEAL
memfd and still wants to mmap it as executable memory; that would be a
strange use case. It is more logical that applications want to block
both execve() and mmap() for a non-executable memfd.
Therefore I think we could reuse the F_SEAL_EXEC bit + NX state for
this feature, for simplicity.

> > > > diff --git a/mm/memfd.c b/mm/memfd.c
> > > > index 5f5a23c9051d..cfd62454df5e 100644
> > > > --- a/mm/memfd.c
> > > > +++ b/mm/memfd.c
> > > > @@ -184,6 +184,7 @@ static unsigned int *memfd_file_seals_ptr(struct file *file)
> > > >  }
> > > >
> > > >  #define F_ALL_SEALS (F_SEAL_SEAL | \
> > > > +		     F_SEAL_FUTURE_EXEC |\
> > > >  		     F_SEAL_EXEC | \
> > > >  		     F_SEAL_SHRINK | \
> > > >  		     F_SEAL_GROW | \
> > > > @@ -357,14 +358,50 @@ static int check_write_seal(unsigned long *vm_flags_ptr)
> > > >  	return 0;
> > > >  }
> > > >
> > > > +static inline bool is_exec_sealed(unsigned int seals)
> > > > +{
> > > > +	return seals & F_SEAL_FUTURE_EXEC;
> > > > +}
> > > > +
> > > > +static int check_exec_seal(unsigned long *vm_flags_ptr)
> > > > +{
> > > > +	unsigned long vm_flags = *vm_flags_ptr;
> > > > +	unsigned long mask = vm_flags & (VM_SHARED | VM_EXEC);
> > > > +
> > > > +	/* Executability is not a concern for private mappings. */
> > > > +	if (!(mask & VM_SHARED))
> > > > +		return 0;
> > >
> > > Why is it not a concern for private mappings?
> > >
> > > > +	/*
> > > > +	 * New PROT_EXEC and MAP_SHARED mmaps are not allowed when exec seal
> > > > +	 * is active.
> > > > +	 */
> > > > +	if (mask & VM_EXEC)
> > > > +		return -EPERM;
> > > > +
> > > > +	/*
> > > > +	 * Prevent mprotect() from making an exec-sealed mapping executable in
> > > > +	 * the future.
> > > > +	 */
> > > > +	*vm_flags_ptr &= ~VM_MAYEXEC;
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > >  int memfd_check_seals_mmap(struct file *file, unsigned long *vm_flags_ptr)
> > > >  {
> > > >  	int err = 0;
> > > >  	unsigned int *seals_ptr = memfd_file_seals_ptr(file);
> > > >  	unsigned int seals = seals_ptr ? *seals_ptr : 0;
> > > >
> > > > -	if (is_write_sealed(seals))
> > > > +	if (is_write_sealed(seals)) {
> > > >  		err = check_write_seal(vm_flags_ptr);
> > > > +		if (err)
> > > > +			return err;
> > > > +	}
> > > > +
> > > > +	if (is_exec_sealed(seals))
> > > > +		err = check_exec_seal(vm_flags_ptr);
> > > >
> > memfd_check_seals_mmap is only for mmap() path, right ?
> >
> > How about the mprotect() path ? i.e. An attacker can first create a
> > RW VMA mapping for the memfd and later mprotect the VMA to be
> > executable.
> >
> > Similar to the check_write_seal call , we might want to block mprotect
> > for write seal as well.
> >
>
> So when memfd_check_seals_mmap() is called, if the file is exec_sealed,
> check_exec_seal() will not only just check that VM_EXEC is not set,
> but it will also clear VM_MAYEXEC, which will prevent the mapping from
> being changed to executable via mprotect() later.
>
Thanks for the clarification.

The name check_exec_seal() is misleading: "check" implies a read-only
operation, but this function also updates the flags. Maybe rename it to
check_and_update_exec_seal or something like that?

Do you know which code checks the VM_MAYEXEC flag in the mprotect code
path? It isn't obvious to me: when I grep for VM_MAYEXEC in the mm
directory, it shows only one place in mprotect.c, and that one doesn't
do the work.

~/mm/mm$ grep VM_MAYEXEC *
mmap.c: mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
mmap.c: vm_flags &= ~VM_MAYEXEC;
mprotect.c: if (rier && (vma->vm_flags & VM_MAYEXEC))
nommu.c: vm_flags |= VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
nommu.c: vm_flags |= VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;

Thanks
-Jeff

> [1] https://lore.kernel.org/all/Z3x_8uFn2e0EpDqM@xxxxxxxxxx/
>
> Thanks,
> Isaac
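
A note on the VM_MAYEXEC question above, as a rough user-space sketch
rather than kernel code. As far as I can tell, the mprotect path never
names VM_MAYEXEC directly: do_mprotect_pkey() in mm/mprotect.c shifts
the VM_MAY* bits right by 4 so they line up with the VM_* bits, and
rejects any requested access bit whose VM_MAY* counterpart is clear.
That would explain why grep does not find it. The flag values below are
meant to mirror include/linux/mm.h and are only illustrative, and the
sim_* helpers are hypothetical names invented for this example.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative copies of the low VM_* bits; not pulled from kernel headers. */
#define VM_READ      0x01UL
#define VM_WRITE     0x02UL
#define VM_EXEC      0x04UL
#define VM_SHARED    0x08UL
#define VM_MAYREAD   0x10UL
#define VM_MAYWRITE  0x20UL
#define VM_MAYEXEC   0x40UL
#define VM_MAYSHARE  0x80UL
#define VM_ACCESS_FLAGS (VM_READ | VM_WRITE | VM_EXEC)

/*
 * Hypothetical model of the patch's check_exec_seal(): reject a shared
 * executable mapping, otherwise drop VM_MAYEXEC from a shared mapping.
 * Private mappings are left untouched, as in the quoted patch.
 */
static int sim_check_exec_seal(unsigned long *vm_flags_ptr)
{
	unsigned long mask = *vm_flags_ptr & (VM_SHARED | VM_EXEC);

	if (!(mask & VM_SHARED))
		return 0;
	if (mask & VM_EXEC)
		return -EPERM;
	*vm_flags_ptr &= ~VM_MAYEXEC;
	return 0;
}

/*
 * Hypothetical model of the permission test in the mprotect path.
 * prot is expressed directly in VM_READ/VM_WRITE/VM_EXEC bits here.
 */
static int sim_mprotect(unsigned long *vm_flags_ptr, unsigned long prot)
{
	unsigned long newflags;

	assert((prot & ~VM_ACCESS_FLAGS) == 0);

	/* Keep every non-access bit of the VMA, replace the access bits. */
	newflags = prot | (*vm_flags_ptr & ~VM_ACCESS_FLAGS);

	/* newflags >> 4 shifts VM_MAY* into the place of VM_*. */
	if ((newflags & ~(newflags >> 4)) & VM_ACCESS_FLAGS)
		return -EACCES;

	*vm_flags_ptr = newflags;
	return 0;
}
```

In this model, a shared RW mapping that went through sim_check_exec_seal()
has VM_MAYEXEC cleared, so a later sim_mprotect() asking for VM_EXEC
fails with -EACCES, while a private mapping keeps VM_MAYEXEC and can
still be made executable.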